Sync EuroSys 2026 badges with new definition used for SOSP 2025 onwards (#132)

SolalPirelli · web-flow · commit f93820649650 · 2025-08-08T17:27:46.000+02:00
diff --git a/_conferences/eurosys2026/badges.md b/_conferences/eurosys2026/badges.md
@@ -4,64 +4,73 @@ order: 10
 ---
 
 <style>
-img { width: 10em; }
+img { width: 4em; }
 </style>
 
-Submitted artifacts can select to be evaluated against the following badges,
-which are defined in the [ACM Artifact Review and Badging policy v1.1](https://www.acm.org/publications/policies/artifact-review-and-badging-current):
+EuroSys is an ACM conference and thus uses [ACM's badges](https://www.acm.org/publications/policies/artifact-review-and-badging-current).
 
-| ![Artifacts Available (V1.1)](../../images/acm_available_1.1.png) | **Artifacts Available**<br>Author-created artifacts relevant to this paper have been placed on a publicly accessible archival repository. A DOI or link to this repository along with a unique identifier for the object is provided.  |
-| ![Artifacts Evaluated - Functional (V1.1)](../../images/acm_functional_1.1.png) | **Artifacts Evaluated - Functional**<br>The artifacts associated with the research are found to be documented, consistent, complete, exercisable, and include appropriate evidence of verification and validation. |
-| ![Artifacts Reproduced (v1.1)](../../images/acm_reproduced_1.1.png) | **Results Reproduced**<br>The main results of the paper have been obtained in a subsequent study by a person or team other than the authors, using, in part, artifacts provided by the author. |
+Authors can apply for, and be awarded, one of three combinations for their artifacts and associated papers:
 
+| ![Available](/images/acm_available_1.1.png) ![Functional](/images/acm_functional_1.1.png) ![Reproduced](/images/acm_reproduced_1.1.png)<br>**Available, Functional, and Reproduced** | For the vast majority of software artifacts,<br>and for hardware artifacts whenever possible.  |
+| ![Available](/images/acm_available_1.1.png) ![Functional](/images/acm_functional_1.1.png)<br>**Available and Functional** | For data sets,<br>as well as artifacts that require custom environments authors can't give access to. |
+| ![Functional](/images/acm_functional_1.1.png) ![Reproduced](/images/acm_reproduced_1.1.png)<br>**Functional and Reproduced** | For software and hardware artifacts that the authors cannot make public. |
 
-## Checklists
-
-Unfortunately, artifacts sometimes miss badges because they were not tested on a clean setup, or not documented enough, or because running experiments is too error-prone due to complex manual steps.
-Below we provide checklists for authors to minimize the risk of an artifact unnecessarily missing a badge.
-
-
-
-### Artifact Available
-
-- The artifact is available on a public website with a specific version such as a git commit
-- The artifact has a "read me" file with a reference to the paper
-- The artifact has an associated license and ideally one that at least allows use for comparison purposes
-
+Artifacts submitted for the first combination may instead be awarded "Functional and Reproduced" if the availability evaluation fails, or "Available and Functional" if the reproducibility evaluation fails.
+Authors cannot apply for other combinations of badges because these make little sense, for example "an artifact that is not public, does not appear functional, but outputs the right numbers".
 
-Artifacts must meet these criteria _at the time of evaluation_.
-Promises of future availability, such as artifacts "temporarily" gated behind credentials given to evaluators, do not qualify for the badge.
 
+## Checklists
 
-### Artifact Functional
-
-- The artifact has a "read me" file with high-level documentation:
-  - A description, such as which folders correspond to code, benchmarks, data, ...
-  - A list of supported environments, including OS, specific hardware if necessary, ...
-  - Compilation and running instructions, including dependencies and pre-installation steps,
-    with a reasonable degree of automation such as scripts to download and build exotic dependencies
-  - Configuration instructions, such as selecting IP addresses or disks
-  - Usage instructions, such as analyzing a new data set
-  - Instructions for a "minimal working example"
-- The artifact has documentation explaining the high-level organization of modules, and code comments explaining non-obvious code,
-  such that other researchers can fully understand it
-- The artifact contains all components the paper describes using the same terminology as the paper, and no obsolete code/data
-- If the artifact includes a container/VM, it must also contain a script to create it from scratch
-
-Artifacts must be usable on other machines than the authors', though they may require hardware such as specific network cards.
-Information such as IP addresses must not be hardcoded.
-
-
-### Results Reproduced
-
-- The artifact has a "read me" file that documents:
-  - The exact environment the authors used, including OS version and any special hardware
-  - The exact commands to run to reproduce each claim from the paper
-  - The approximate resources used per claim, such as "5 minutes, 1 GB of disk space"
-  - The scripts to reproduce claims are documented, allowing researchers to ensure they correspond to the claims;
-    merely producing the right output is not enough
-- The artifact's external dependencies are fetched from well-known sources such as official websites or GitHub repositories
-  - Changes to such dependencies should be clearly separated, such as using a patch file or a repository fork with a clear commit history
-
-The amount of manual work, such as writing configuration files, should be reasonably minimized.
-In particular, there should be no redundant manual steps such as writing the same configuration values in multiple places, as this inevitably leads to human error.
+To provide fair evaluation across artifacts, to help authors prepare, and to help evaluators work efficiently, each badge has an associated checklist.
+
+### "Available" checklist
+
+- The artifact is available on a **public archive with irrevocable versioning and long-term storage**, such as Zenodo but not GitHub
+- The artifact has a **license that allows comparison and extension**, such as the [CC-BY](https://creativecommons.org/licenses/by/4.0/) or [MIT](https://opensource.org/license/mit/) licenses
+- The artifact has a **"read me" file referencing the paper**
+
+These criteria must be met *at the time artifact evaluation finishes*.  
+Authors only need to put the data in long-term storage once evaluators are otherwise satisfied. Development may take place on a platform like GitHub.  
+Promises of future availability are *not* acceptable, such as uploading the artifact to a private repository with the goal of "eventually" making it public.
+
+### "Functional" checklist
+
+- The artifact has a **"read me" file** with:
+  - A description of each artifact component and how it relates to the paper
+  - A description of the exact environment the authors used, such as OS version and hardware
+  - If the artifact includes code that deliberately performs malicious or destructive operations, appropriate warnings and context
+- The artifact includes **all code and data relevant to the paper**, and only those
+  - The artifact must not include obsolete or unrelated code or data
+  - If existing code or data has been modified, the artifact should clearly separate the modifications from the original
+  - If the paper makes soundness claims, such as proofs, there should be simple scripts to verify these, such as listing proof assumptions
+  - If the paper makes quantifiable claims, such as code size per module, there should be simple scripts to output these
+- For data, **modifications made to the raw data are documented**
+  - For instance, whether parts of the raw data were anonymized or discarded
+- For executable artifacts, the "read me" file also contains **documentation** to:
+  - Run and extend a "minimal working example"
+  - Compile and execute the artifact, including pre-installation steps
+  - Configure the artifact, such as selecting IP addresses or disks
+  - Know the expected resource use per kind of experiment, such as "5 minutes, 10 GB of disk space"
+  - Know what unusual behavior to expect, such as warning messages emitted by another system used as baseline for experiments
+- For executable artifacts, the artifact includes a **precise list of dependencies**:
+  - Whenever possible, the list should be in a format that a package manager can consume directly
+  - Exotic dependencies must have associated automation to download and build them
+  - OS-level dependencies must involve a VM/container, accompanied by a script to generate the VM/container
+  - Proprietary dependencies must have associated instructions to obtain them along with "mock" versions to demonstrate their use
+- The artifact includes an **example input and configuration for each kind of experiment** in the paper
+  - Authors are encouraged, but not required, to provide inputs, configurations, and outputs for all experiments described in the paper
+
+Artifacts must be usable in environments other than the authors', though software may require specific hardware such as one model of network card.  
+Manual work such as writing configuration files must be minimized. There must be no redundant manual steps such as writing the same configuration values in multiple places, as this inevitably leads to human error.
+
+### "Reproduced" checklist
+
+- The artifact includes a **single script to run each experiment** and output results, given the necessary input and configuration
+  - The scripts must be documented, allowing researchers to ensure they correspond to the claims; merely producing the right output is not enough
+  - The scripts must handle common edge cases in a reasonable fashion, such as forgetting arguments or running the same script twice
+- The artifact includes a **script to convert each experiment's results into human-readable ones** as close to the paper presentation as possible
+  - For simple results presentation such as tables, this and the previous script can be merged into one
+  - The artifact may contain separate installation steps for the dependencies of plotting scripts, subject to the same criteria
+
+The expected workflow for an evaluator or a researcher looking to reuse the artifact is to install the artifact using a handful of commands, run experiments with one command each, and plot data as necessary.  
+In the absence of problems requiring debugging, active time must not exceed a few minutes.
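
Editorial illustration (not part of this commit): the "Reproduced" checklist above asks for a single script per experiment that copes with forgotten arguments and with being run twice. A minimal sketch of what such a script could look like follows. The `--config`, `--output`, and `--force` flags, the JSON configuration layout, and the placeholder `run_experiment` function are all hypothetical names chosen for this example, not anything the checklist prescribes.

```python
"""Hypothetical single-command experiment runner (illustrative sketch only)."""
import argparse
import json
import sys
from pathlib import Path


def run_experiment(config: dict) -> dict:
    """Placeholder for the real experiment; here it only sums the inputs."""
    values = config.get("values", [])
    return {"name": config.get("name", "unnamed"), "total": sum(values)}


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(
        description="Run one experiment and save its results."
    )
    # Both arguments are required, so forgetting one yields a usage message
    # and a non-zero exit code instead of a stack trace.
    parser.add_argument("--config", required=True, type=Path,
                        help="JSON experiment configuration")
    parser.add_argument("--output", required=True, type=Path,
                        help="where to write the JSON results")
    parser.add_argument("--force", action="store_true",
                        help="overwrite existing results")
    args = parser.parse_args(argv)

    if not args.config.is_file():
        print(f"error: config file not found: {args.config}", file=sys.stderr)
        return 2

    # Running the script twice must not silently clobber earlier results.
    if args.output.exists() and not args.force:
        print(f"results already exist at {args.output}; pass --force to rerun")
        return 0

    results = run_experiment(json.loads(args.config.read_text()))
    args.output.parent.mkdir(parents=True, exist_ok=True)
    args.output.write_text(json.dumps(results, indent=2) + "\n")
    print(f"wrote results to {args.output}")
    return 0
```

A real script would end with the usual `if __name__ == "__main__": sys.exit(main())` guard so that each experiment is a single command. Skipping an existing output by default is only one reasonable choice; writing to a timestamped directory would satisfy the same checklist item.
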
diff --git a/_conferences/eurosys2026/packaging.md b/_conferences/eurosys2026/packaging.md
@@ -6,18 +6,14 @@ order: 30
 Artifacts must be packaged to ease evaluation and use, including [instructions](#instructions) for the artifact and an [artifact appendix](#artifact-appendix).
 Packaging is not only about evaluation but about future use of the artifact by other researchers who may want to build on top of it or use it as a baseline.
 
-We provide a few guidelines about packaging the artifact below.
 If you have further questions about how best to package your artifact, contact the AEC chairs at [aec-2026@eurosys.org](mailto:aec-2026@eurosys.org).
 
-*Note*: Some artifacts may attempt to perform malicious or destructive operations by design.
-These cases should be boldly and explicitly flagged in detail at submission time (in the artifact instructions and appendix) so the AEC can take appropriate precautions before installing and running these artifacts. Please contact the AEC chairs if you believe that your artifacts fall into this category.
-
 
 ## Instructions
 
-Your artifact package must include an obvious "read me" document containing suitable instructions and documentation.
-A tool without a quick tutorial is generally very difficult to use. Similarly, a dataset is useless without some explanation on how to browse the data.
-Please see the [badges](badges) page for more details on what the instructions should contain.
+Follow the packaging guide for artifact authors [here](/packaging-guide).
+
+Also see the [badges](badges) page for more details on what the instructions should contain.
 
 
 ## Artifact Appendix
@@ -39,78 +35,3 @@ The intention for the artifact appendix is to be published in conjunction with y
 A template for the artifact appendix can be found here: [LaTeX Template](appendix/EuroSys26_ArtifactAppendix_template.tex) (to be used in conjunction with the EuroSys'26 template for research papers).
 
 **Artifact Appendices are limited to 2 pages.**
-
-## Formats
-
-Authors should consider one of the following methods to package the software components of their artifacts
-(although the AEC is open to other reasonable formats as well):
-
-- *Container/virtual machine:* This is the preferred method. We recommend using a format that is easy for evaluators to work with, such as Docker images or an OVF virtual machine (e.g., to run in VirtualBox).
-  An AWS EC2 instance is also possible. In any case, the Dockerfile or script used to initialize the virtual machine should be available.
-  A Docker image or virtual machine should already be set up with the right toolchain and runtime environment. For example:
-    - For raw data, the container/VM would contain the data and the analysis scripts.
-    - For mechanized proofs, the container/VM would contain the right version of the relevant theorem prover
-    - For a mobile phone application, the VM would have a phone emulator installed
-
-- *Source code:* If your artifact has few dependencies and can be installed easily on several operating systems,
-  you may submit source code and build scripts. However, if your artifact has a long list of dependencies, please use one of the other formats below.
-
-- *Live instance on the web:* Ensure that it is available during the artifact evaluation process.
-
-- *Internet-accessible hardware:* If your artifact requires special hardware (e.g., SGX or another trusted execution environment), or if your artifact is actually a piece of hardware, please make sure that evaluators can access the device. SSH or VPN-based access to the device might be an option. Authors must ensure anonymity of the reviewers while evaluating the artifacts. No web-forms or access requests that require the reviewers personal details are an acceptable way for giving access to remote infrastructure. An example approach for SSH access could be that of creating a single user (e.g., `eurosys-aec-review`) on the target machine and then collecting reviewers' SSH keys for granting access. No one technique would serve all purposes, and we leave the final choice to the authors. Should you have issues in granting access to your remote infrastructure, please contact the AEC chairs at [aec-2026@eurosys.org](mailto:aec-2026@eurosys.org) as soon as possible.
-
-- *Screencast:* A detailed screencast of the tool along with the results can be an option if one of the following special cases applies:
-   - The artifact needs proprietary/commercial software or proprietary data that is not easily available or cannot be distributed to the committee.
-   - The artifact requires significant computation resources (e.g., more than 24 hours of execution time to produce the results) or requires huge data sets.
-   - The artifact requires specific hardware or software that is not generally available in a typical lab and where no access can be provided in a reasonable way.
-   - If your artifact falls in this category please contact the AEC chairs at [aec-2026@eurosys.org](mailto:aec-2026@eurosys.org) as soon as possible.
-
-
-
-## Tools
-
-[GitHub](https://github.com) and [GitLab](https://gitlab.com) are good options to host a Git repository for your artifact.
-[Zenodo](https://zenodo.org) provides long-term storage and DOI for a specific version, which is useful once your artifact has been evaluated.
-[Docker](https://docs.docker.com/get-started/overview/) allows you to create a lightweight container with all of your artifact's dependencies, and even write scripts to manage multiple containers locally instead of using a cloud provider.
-
-Please see the [tips](#tips) section for specific tips.
-
-There are also a growing number of tools and mechanisms that are designed specifically to meet the needs of research reproducibility; authors may want to consider using such tools when appropriate. A partial list includes:
-- *[Chameleon Jupyter Notebooks](https://chameleoncloud.readthedocs.io/en/latest/technical/sharing.html):*
-  A framework for packaging and sharing research artifacts on the [Chameleon Cloud](https://www.chameleoncloud.org/) facility. Chameleon Cloud supports a wide range of [hardware and software environments](https://chameleoncloud.org/hardware/) (please **contact the AEC Chairs beforehand** if you plan to use Chameleon Cloud). Check out a demo video [here](https://www.youtube.com/watch?v=grUOrKkiYuQ).
-- *[CloudLab Profiles](https://docs.cloudlab.us/repeatable-research.html):* A mechanism for encapsulating and sharing research environments on the [CloudLab](https://cloudlab.us) facility (please **contact the AEC Chairs beforehand** if you plan to use CloudLab)
-- *[Popper](https://getpopper.io/):* A container-native system for automating workflow
-
-## Tips
-
-*We thank the EuroSys'22 AEC chairs for the materials below.*
-
-The following guides will be useful when preparing your artifact:
-- [HOWTO for AEC Submitters](https://docs.google.com/document/d/1pqzPtLVIvwLwJsZwCb2r7yzWMaifudHe1Xvn42T4CcA/edit),  
-  by Dan Barowy, Charlie Curtsinger, Emma Tosch, John Vilk, and Emery Berger
-- [Artifact Evaluation: Tips for Authors](https://blog.padhye.org/Artifact-Evaluation-Tips-for-Authors/),  
-  by Rohan Padhye
-
-You may also find these examples of past artifacts useful:
-- [Bundler](https://github.com/bundler-project/evaluation), a middlebox and its multi-machine benchmarks (EuroSys'21)
-- [Serval](https://unsat.cs.washington.edu/projects/serval/sosp19-artifact.html), a verification tool with correct and buggy code to test it (SOSP'19)
-- [TinyNF](https://github.com/dslab-epfl/tinynf), a network driver with low-level multi-machine benchmarks (OSDI'20)
-
-
-Here are some general tips to make life easier for both artifact authors and evaluators:
-
-- **Automate as much as possible**, you will save a lot of time in the end from not having to re-run experiments that suffered from human error.
-  This is feasible even for artifacts that need multiple nodes or to replicate configuration in multiple places.
-  See [this repository](https://github.com/SolalPirelli/docker-artifact-eval) for an example of artifact automation with Docker.
-
-- **Try out your own artifact on a blank environment**, following the steps you documented.
-  One lightweight way to do this is to create a Docker container from a base OS image, such as `ubuntu:latest`.
-  You can also use a virtual machine or even provision a real machine if you have the infrastructure to do so.
-
-- **Log both successes and failures**, so that users know that something happened.
-  Avoid logging unnecessary or confusing information, such as verbose output or failures that are actually expected.
-  Log potential issues, such as an optional but recommended library not being present.
-
-- **Measure resource use** using tools such as `mpstat`, `iostat`, `vmstat`, and `ifstat` to measure CPU, I/O, memory, and network use respectively on Linux,
-  or `/usr/bin/time -v` to measures the time and memory used by a command also on Linux.
-  This lets users know what to expect.
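
Editorial illustration (not part of this commit): the new "Functional" checklist in `badges.md` still asks authors to document the expected resource use per kind of experiment. Besides the command-line tools named in the removed tip, a rough measurement can be sketched with Python's standard library. This assumes a Unix-like system, since the `resource` module is unavailable on Windows; the `measure` helper and the sample command are hypothetical.

```python
import resource
import subprocess
import sys
import time


def measure(command):
    """Run a command; return (exit code, wall-clock seconds, peak RSS of children)."""
    start = time.perf_counter()
    completed = subprocess.run(command, check=False)
    elapsed = time.perf_counter() - start
    # ru_maxrss is the largest peak among all waited-for child processes so far.
    # It is reported in kilobytes on Linux and in bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, elapsed, peak_rss


# Sample command: a short Python one-liner standing in for an experiment.
code, seconds, peak = measure([sys.executable, "-c", "print(sum(range(10**6)))"])
print(f"exit={code} time={seconds:.2f}s peak_rss={peak}")
```

Because `RUSAGE_CHILDREN` aggregates over every child the process has waited for, this sketch is best used with one measured command per run.
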