feat: adding a new systems+AI conference, ACM CAIS (https://www.caisconf.org/), which provides artifact evaluation (#152)

bastoica · web-flow · commit f8abb708d086 · 2026-02-23T10:14:16.000+01:00
diff --git a/_conferences/cais.md b/_conferences/cais.md
@@ -0,0 +1,8 @@
+---
+title: CAIS
+---
+
+The [ACM Conference on AI and Agentic Systems](https://www.caisconf.org/)
+is the premier venue for rigorous, reproducible research on compound
+AI architectures, optimization, and deployment.
+Research artifacts were first evaluated in 2026.
diff --git a/_conferences/cais2026/aec-call.md b/_conferences/cais2026/aec-call.md
@@ -0,0 +1,22 @@
+---
+title: Call for AEC Members
+order: 60
+---
+
+We are looking for reviewers who can provide detailed, constructive evaluations of 1-2 artifacts during the ACM CAIS'26 AE [review period](call). AEC members evaluate artifact quality, reproducibility, and relevance; provide clear feedback to help authors improve their work; and participate in committee discussions on artifact badge decisions. In return, they gain early insight into emerging research, opportunities to connect with peers in the community, and recognition for their service on the ACM CAIS [website](https://www.caisconf.org/) and, potentially, in the conference proceedings.
+
+How to Apply
+------------
+
+If you are interested in taking part in the AEC, please nominate yourself using [this form](https://forms.gle/tc7xca6PaYFqyYqW7) by **Sunday, April 12, 2026**.
+
+You can reach us at [aec-chairs@caisconf.org](mailto:aec-chairs@caisconf.org) with any questions.
+
+AEC Responsibilities
+--------------------
+
+As an AEC member, you will contribute to the reproducibility of experimental results in systems research by evaluating artifacts associated with papers accepted for publication at ACM CAIS'26. For each artifact, you will be asked to evaluate its public availability, functionality, and ability to reproduce the results reported in the paper. You will be able to discuss artifacts with other AEC members and interact anonymously with the authors when necessary. Finally, you will submit a review with constructive feedback, discuss the artifact with fellow reviewers, and help determine artifact evaluation badge outcomes.
+
+We expect each AEC member to evaluate 1-2 artifacts, and we estimate that each evaluation will take around 10-20 hours. AEC members are expected to allocate time to indicate their artifact preferences and, once assigned, read the corresponding papers, evaluate the artifacts, and participate in online discussions through each notification deadline. Please ensure that you have sufficient time and availability during the AEC [reviewing period](call) and can carry out the evaluation confidentially and independently, without sharing artifacts or related information outside the AEC.
+
+We expect evaluators to be able to use conference-provided cloud resources when available, particularly for artifacts that require substantial compute capacity or specialized hardware. Evaluators may also use their own machines when artifacts support local execution (including via containers or virtual machines, where appropriate). If neither option is suitable, authors may provide remote access to their own systems (e.g., via SSH) with proper anonymization. Commercial cloud services should be used only as a last resort and only with prior coordination with the authors and conference organizers to minimize unnecessary costs.
diff --git a/_conferences/cais2026/badges.md b/_conferences/cais2026/badges.md
@@ -0,0 +1,69 @@
+---
+title: Badges
+order: 10
+---
+
+<style>
+img { width: 4em; }
+</style>
+
+CAIS is an ACM conference and thus uses [ACM's badges](https://www.acm.org/publications/policies/artifact-review-and-badging-current). Authors can apply for, and be awarded, one of the following three badge combinations:
+
+| ![Available](/images/acm_available_1.1.png) ![Functional](/images/acm_functional_1.1.png) ![Reproduced](/images/acm_reproduced_1.1.png)<br>**Available, Functional, and Reproduced** | For the vast majority of software artifacts,<br>and for hardware artifacts whenever possible.  |
+| ![Available](/images/acm_available_1.1.png) ![Functional](/images/acm_functional_1.1.png)<br>**Available and Functional** | For data sets,<br>as well as artifacts that require custom environments authors can't give access to. |
+| ![Functional](/images/acm_functional_1.1.png) ![Reproduced](/images/acm_reproduced_1.1.png)<br>**Functional and Reproduced** | For software and hardware artifacts that the authors cannot make public. |
+
+Artifacts submitted for the first combination may instead be awarded the third if the availability evaluation fails, or the second if the reproducibility evaluation fails. Authors cannot apply for other combinations of badges, as these make little sense (e.g., an artifact that is not public, does not appear functional, but outputs the expected measurements).
+
+
+## Checklists
+
+To provide a fair evaluation across artifacts, help authors prepare, and enable evaluators to work efficiently, each badge has an associated checklist.
+
+### "Available" checklist
+
+- The artifact is available on a **public archive with irrevocable versioning and long-term storage**, such as Zenodo but not GitHub
+- The artifact has a **license that allows comparison and extension**, such as the [CC-BY](https://creativecommons.org/licenses/by/4.0/) or [MIT](https://opensource.org/license/mit/) licenses
+- The artifact has a **"read me" file referencing the paper**
+
+Note that these criteria must be met *at the time artifact evaluation finishes*, but development may take place on a platform like GitHub. Authors only need to save their artifact to long-term storage once evaluators are otherwise satisfied. However, promises of future availability are *not* acceptable, such as uploading the artifact to a private repository with the goal of "eventually" making it public.
+
+### "Functional" checklist
+
+- The artifact has a **"read me" file** with:
+  - A description of each artifact component and how it relates to the paper
+  - A description of the exact environment the authors used, such as OS version and hardware
+  - If the artifact includes code that deliberately performs malicious or destructive operations, appropriate warnings and context
+- The artifact includes **all code and data relevant to the paper**, and only those
+  - The artifact must not include obsolete or unrelated code or data
+  - If existing code or data has been modified, the artifact should clearly separate the modifications from the original
+  - If the paper makes soundness claims, such as proofs, there should be simple scripts to verify these, such as listing proof assumptions
+  - If the paper makes quantifiable claims, such as code size per module, there should be simple scripts to output these
+- For data, **modifications made to the raw data are documented**
+  - For instance, whether parts of the raw data were anonymized or discarded
+- For executable artifacts, the "read me" file also contains **documentation** to:
+  - Run and extend a "minimal working example"
+  - Compile and execute the artifact, including pre-installation steps
+  - Configure the artifact, such as selecting IP addresses or disks
+  - Know the expected resource use per kind of experiment, such as "5 minutes, 10 GB of disk space"
+  - Know what unusual behavior to expect, such as warning messages emitted by another system used as baseline for experiments
+- For executable artifacts, the artifact includes a **precise list of dependencies**:
+  - Whenever possible, it should be usable by a package manager
+  - Exotic dependencies must have associated automation to download and build them
+  - OS-level dependencies must involve a VM/container, accompanied by a script to generate the VM/container
+  - Proprietary dependencies must have associated instructions to obtain them along with "mock" versions to demonstrate their use
+- The artifact includes an **example input and configuration for each kind of experiment** in the paper
+  - Authors are encouraged, but not required, to provide inputs, configurations, and outputs for all experiments described in the paper
+
+Manual work such as writing configuration files must be minimized. There must be no redundant manual steps such as writing the same configuration values in multiple places, as this inevitably leads to human error. Also, artifacts must be usable in environments other than the authors', though software may require specific hardware (e.g., a particular network card or a specific GPU).
+
+### "Reproduced" checklist
+
+- The artifact includes a **single script to run each experiment** and output results, given the necessary input and configuration
+  - The scripts must be documented so that researchers can check that they correspond to the claims; merely producing the right output is not enough
+  - The scripts must handle common edge cases in a reasonable fashion, such as forgetting arguments or running the same script twice
+- The artifact includes a **script to convert each experiment's results into human-readable ones** as close to the paper presentation as possible
+  - For simple results presentation such as tables, this and the previous script can be merged into one
+  - The artifact may contain separate installation steps for the dependencies of plotting scripts, subject to the same criteria
+
+The expected workflow for an evaluator or a researcher looking to reuse the artifact is to install the artifact using a handful of commands, run experiments with minimal effort (ideally running a single script per experiment or group of experiments), and display experimental data as necessary. In the absence of issues requiring in-depth debugging, active time must not exceed a few minutes.
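
As an illustration of the "Reproduced" checklist above (not part of the commit), here is a minimal sketch of a per-experiment entry point that handles the two edge cases the checklist names: a forgotten argument and running the same script twice. The experiment name `throughput`, its placeholder numbers, the `--out` and `--force` flags, and the `results` directory are all hypothetical choices made for this example.

```python
import argparse
import json
import sys
from pathlib import Path

# Hypothetical registry: one entry per experiment in the paper.
# The name and the numbers below are placeholders, not real measurements.
EXPERIMENTS = {
    "throughput": lambda: {"requests": 1000, "seconds": 4.0},
}


def run_experiment(name, out_dir, force=False):
    """Run one experiment and store its results as JSON.

    Returns "ran" if results were written, or "skipped" if results from a
    previous run already exist and force is False (safe to run twice).
    """
    out_file = Path(out_dir) / f"{name}.json"
    if out_file.exists() and not force:
        return "skipped"
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(json.dumps(EXPERIMENTS[name](), indent=2))
    return "ran"


def main(argv):
    """Parse arguments and run the requested experiment; return an exit code."""
    parser = argparse.ArgumentParser(description="Run one experiment.")
    parser.add_argument("experiment", nargs="?", help="experiment name")
    parser.add_argument("--out", default="results", help="output directory")
    parser.add_argument("--force", action="store_true",
                        help="overwrite results from a previous run")
    args = parser.parse_args(argv)

    # A forgotten or misspelled experiment name yields a clear message
    # listing the valid choices instead of a stack trace.
    if args.experiment not in EXPERIMENTS:
        names = ", ".join(sorted(EXPERIMENTS))
        print(f"error: choose an experiment from: {names}", file=sys.stderr)
        return 2

    status = run_experiment(args.experiment, args.out, args.force)
    print(f"{args.experiment}: {status}")
    return 0
```

Calling `main(["throughput"])` writes `results/throughput.json`; a second call reports `skipped` unless `--force` is passed. To use this as a standalone script, one would typically end the file with `sys.exit(main(sys.argv[1:]))`.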
diff --git a/_conferences/cais2026/call.md b/_conferences/cais2026/call.md
@@ -0,0 +1,51 @@
+---
+title: Call for Artifacts
+order: 20
+---
+
+All papers accepted at ACM CAIS'26 are *encouraged to participate in the artifact evaluation* process.
+
+Artifacts must be consistent with the paper, as complete as possible, reasonably well documented, and easy to reuse. These goals are reflected in the three [badges](badges) that can be awarded to each paper: Available, Functional, and Results Reproduced (or Reproduced, for short). The purpose of the AEC is to help authors meet these goals and to award badges to artifacts that satisfy the criteria. Note that for ACM CAIS'26, the AE is *single-blind* (see below).
+
+Questions about artifact evaluation can be directed to [aec-chairs@caisconf.org](mailto:aec-chairs@caisconf.org).
+
+To be considered in the AE process, at least one contact author for the submission must be reachable and respond to questions in a timely manner during the evaluation period, allowing sufficient back-and-forth between the AEC and the authors. Please check the AEC timeline and important dates [here](dates).
+
+
+## Registration and Submission
+
+**Link to HotCRP portal:** [https://acm-cais26-ae.hotcrp.com/](https://acm-cais26-ae.hotcrp.com/)
+
+Please submit your artifacts to the [AE HotCRP portal](https://acm-cais26-ae.hotcrp.com/) by providing a URL or a packaged artifact, selecting which artifact badges you are applying for, and providing an artifact appendix that describes the artifact.
+
+The effort that you put into packaging your artifacts has a direct impact on the committee's ability to make well-informed decisions.
+Please package your artifacts with care to make it as straightforward and easy as possible for the AEC to understand and evaluate their quality.
+
+*Note*: If you need permission from your organization's legal or IT department to publish your artifact or give evaluators access to custom hardware, submit that request as soon as possible; otherwise, evaluators may not have sufficient time to audit your artifact.
+
+
+## Process
+
+Authors are invited to submit artifacts shortly after their papers are accepted. Because the time between paper acceptance and artifact submission is short, the AEC Chairs encourage authors to begin preparing their artifacts while their papers are still under review.
+
+At the time of artifact submission, authors choose which badges they want to pursue. Please read the instructions and criteria [here](badges).
+
+After the artifact submission deadline, evaluators will review each artifact using the corresponding paper and artifact appendix as guides. Evaluators may communicate with authors *exclusively through HotCRP* (to preserve anonymity) to resolve minor issues and ask clarifying questions throughout the evaluation process. Evaluation begins with a "kick-the-tires" period, during which evaluators confirm that they can access their assigned artifacts and perform basic operations, such as compiling and running a minimal working example. Artifact evaluations include feedback, allowing authors to improve both their artifacts and their papers based on that feedback.
+
+
+## Artifact Details
+
+Artifacts can include software, datasets, survey results, test suites, mechanized proofs, and similar materials. Pen-and-paper proofs are not accepted, as evaluators often lack the time and expertise to review them carefully. Physical objects, such as computer hardware, also cannot be accepted because they are difficult to make available to evaluators. To the extent possible, artifacts should be able to run on commodity hardware (e.g., laptop or desktop systems). If this requirement cannot be met, please contact the AEC Chairs in advance so that arrangements can be made for evaluators to access special hardware (e.g., your own). More detailed artifact packaging instructions are available [here](packaging).
+
+When submitting your artifact, please specify which combination of the [three badges](badges) you are applying for. For the Functional and Reproduced badges, AEC members will attempt to use your artifact to run the experiments described in your paper.
+
+Submitting an artifact for evaluation does **not** give the AEC permission to make its contents public or to retain any part of it after the evaluation. Thus, authors are free to include proprietary models, data files, or code in their artifacts. Participating in artifact evaluation does not require public release of the artifact, though public release is highly encouraged.
+
+AEC members may contact authors during the evaluation period, for example, to ask for help if they are unable to get the artifact to work, and authors are expected to respond to such requests. However, your goal as an author should be to present and document your artifact so that AEC members can use it and complete the evaluation successfully on their own (ideally without needing to interact with the authors). To ensure that your instructions are complete, we recommend testing them on a fresh setup before submission, following exactly the instructions you provide.
+
+
+## Reviewing and Anonymity
+
+Artifact evaluation is "***single-blind***", meaning that the identities of authors will be known to reviewers, but authors will not know which Artifact Evaluation Committee (AEC) members reviewed their artifacts. 
+
+To maintain reviewer anonymity, authors should not embed analytics or other tracking tools in the websites hosting their artifacts for the duration of the artifact evaluation period. In cases where tracking is unavoidable, authors should notify the AEC Chairs in advance so that AEC members can take adequate precautions.
diff --git a/_conferences/cais2026/dates.md b/_conferences/cais2026/dates.md
@@ -0,0 +1,11 @@
+---
+title: Important Dates
+order: 25
+---
+
+#### Important dates
+- Acceptance notification to paper authors: **Tue, April 21, 2026**
+- Artifact submission deadline: **Thu, April 23, 2026 (AoE)**
+- Kick-the-tires response period: **Thu, April 23 - Mon, April 27, 2026**
+- Camera-ready deadline: **Mon, April 27, 2026 (AoE)**
+- Artifact decisions announced: **Fri, May 8, 2026**
diff --git a/_conferences/cais2026/index.md b/_conferences/cais2026/index.md
@@ -0,0 +1,11 @@
+---
+title: Artifact Evaluation
+order: 0
+---
+
+
+The Artifact Evaluation (AE) process is a community service intended to strengthen the long-term value of accepted papers by encouraging authors to provide substantial supplementary materials for review. These artifacts enable future researchers to reproduce, compare, and extend prior work more effectively. ACM CAIS invites authors of accepted papers to submit their artifacts for evaluation. Research artifacts are digital objects developed by the authors for use in the reported study or generated during the experimental process.
+
+Artifact evaluation is optional at [ACM CAIS'26](https://www.caisconf.org/). Authors of accepted papers may submit their artifacts shortly after receiving the acceptance notification, but they are strongly encouraged to prepare these materials in advance. If an artifact passes evaluation, the paper will receive one or more official ACM AE badges displayed on its first page, and the final camera-ready version must include an appendix describing the artifact. For more information about the AE timeline, see the [important dates](dates) page.
+
+The AE process for ACM CAIS'26 is *single-blind*, meaning that reviewers will know the identities of the authors, but authors will not know which Artifact Evaluation Committee (AEC) members reviewed their artifacts. To preserve reviewer anonymity, authors should not embed analytics or other tracking tools in websites hosting their artifacts during the artifact evaluation period. In cases where tracking is unavoidable, authors should notify the AEC Chairs in advance so that AEC members can take appropriate precautions.
diff --git a/_conferences/cais2026/organizers.md b/_conferences/cais2026/organizers.md
@@ -0,0 +1,14 @@
+---
+title: Organizers
+order: 50
+---
+
+## Artifact Evaluation Chairs
+
+* [Bogdan "Bo" Stoica](https://bastoica.github.io/) (University of Illinois Urbana-Champaign, USA)
+
+You can reach the AEC chairs at [aec-chairs@caisconf.org](mailto:aec-chairs@caisconf.org).
+
+## Artifact Evaluation Committee
+
+*TBD*
diff --git a/_conferences/cais2026/packaging.md b/_conferences/cais2026/packaging.md
@@ -0,0 +1,30 @@
+---
+title: Artifact Packaging
+order: 30
+---
+
+Artifacts must be packaged to ease evaluation and use, including [instructions](#instructions) for the artifact and an [artifact appendix](#artifact-appendix).
+Packaging is not only about evaluation but about future use of the artifact by other researchers who may want to build on top of it or use it as a baseline.
+
+If you have further questions about how best to package your artifact, contact the AEC chairs at [aec-chairs@caisconf.org](mailto:aec-chairs@caisconf.org).
+
+
+## Instructions
+
+Follow the packaging guide for artifact authors [here](/packaging-guide).
+
+Also see the [badges](badges) page for more details on what the instructions should contain.
+
+
+## Artifact Appendix
+
+
+The artifact appendix is a self-contained document of up to **2 pages** that provides a roadmap for evaluators. In particular, it should include a description of the hardware, software, and configuration requirements, as well as a **list of the major claims** made in the paper that can be reproduced using your artifact.
+
+Linking the paper's claims to the artifact is a necessary step that enables evaluators to reproduce your results. It is especially important to state your paper's key results and claims clearly. You should also make your claims about the artifact concrete. This is particularly important if those claims differ from the expectations set by your paper.
+
+The AEC members will still evaluate your artifact relative to your paper, but your explanation can help set expectations up front, especially in cases that might otherwise frustrate evaluators. For example, inform the AEC about difficulties they might encounter when using the artifact or about the artifact's maturity relative to the content of the paper.
+
+An artifact appendix must provide details on the time and hardware resources (e.g., storage) required to run the experiments in your paper. If possible, the appendix should also describe how to compare the results of a reproduced experiment with those reported in the paper (e.g., by providing access to the underlying dataset).
+
+The artifact appendix is intended to be published alongside your artifact, using the official ACM [LaTeX template](https://www.acm.org/publications/proceedings-template). For further information, please check the ACM CAIS'26 [formatting requirements](https://www.caisconf.org/pages/cfp/).
diff --git a/_conferences/cais2026/results.md b/_conferences/cais2026/results.md