jpata/particleflow

Summary

ML-based particle flow (MLPF) focuses on developing full event reconstruction for particle detectors using computationally scalable and flexible machine learning models. The project aims to improve particle flow reconstruction across various detector environments, including CMS, as well as future detectors via Key4HEP. We build on existing, open-source simulation software by the experimental collaborations.

High-level overview


Publications

Below is the development timeline of MLPF by our team, ranging from initial proofs of concept to full detector simulations and fine-tuning studies.

2021: First full-event GNN demonstration of MLPF

2021: First demonstration in CMS Run 3

2022: Improved performance in CMS Run 3

  • Detector Performance Note: CERN-CMS-DP-2022-061
  • Focus: We showed that training against a generator-level target can improve performance in CMS.

2024: Improved performance with full simulation for future colliders

2025: Fine-tuning across detectors

2026: CMS Run 3 full results


Datasets

Software & Dataset Compatibility

Use the version of the jpata/particleflow software that matches your dataset version, as listed below.

Code Version   CMS Dataset   CLIC Dataset   CLD Dataset
1.9.0          2.4.0         2.2.0          NA
2.0.0          2.4.0         2.3.0          NA
2.1.0          2.5.0         2.5.0          NA
2.2.0          2.5.0         2.5.0          2.5.0
2.3.0          2.5.0         2.5.0          2.5.0
2.4.0          2.6.0         2.5.0          2.5.0
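
The compatibility table above can also be encoded as a small lookup, which is handy in job scripts that need to fail early on a mismatched dataset. This is a minimal sketch: the function name mlpf_dataset_versions is hypothetical and not part of the repository, and the values are copied from the table.

```shell
# Hypothetical helper (not part of jpata/particleflow).
# Prints the "CMS CLIC CLD" dataset versions for a given code version,
# as listed in the compatibility table. "NA" means no dataset is listed.
mlpf_dataset_versions() {
  case "$1" in
    1.9.0)       echo "2.4.0 2.2.0 NA" ;;
    2.0.0)       echo "2.4.0 2.3.0 NA" ;;
    2.1.0)       echo "2.5.0 2.5.0 NA" ;;
    2.2.0|2.3.0) echo "2.5.0 2.5.0 2.5.0" ;;
    2.4.0)       echo "2.6.0 2.5.0 2.5.0" ;;
    *)           echo "unknown code version: $1" >&2; return 1 ;;
  esac
}

# Example use:
#   set -- $(mlpf_dataset_versions 2.4.0)
#   echo "CMS=$1 CLIC=$2 CLD=$3"
```

An unknown version returns a non-zero status, so a calling script can stop before it starts a long job.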

Getting Started with Pixi & Snakemake

The full data generation, model training, and validation workflow is managed using Pixi for environment management and Snakemake for job orchestration.

1. Install Pixi

curl -fsSL https://pixi.sh/install.sh | bash
# Restart your shell or source your .bashrc
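
After restarting the shell, it can help to confirm that the pixi executable is actually on your PATH before moving on. A minimal sketch, assuming a POSIX shell; require_cmd is a hypothetical helper, not something the installer provides.

```shell
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "missing command: $1" >&2
    return 1
  }
}

# Example use:
#   require_cmd pixi && pixi --version
```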

2. Initialize Your Site

Configure the environment for your specific cluster. This sets up the necessary Snakemake profiles and site defaults.

  • Tallinn (Slurm):
pixi run -e tallinn init
  • lxplus (HTCondor):
pixi run -e lxplus init

3. Generate the Workflow

Generate the Snakefile for a production campaign corresponding to your site.

PROD=cms_run3 STEPS=gen,post,tfds,train pixi run -e lxplus generate

You can inspect snakemake_jobs/cms_run3/Snakefile and the related scripts to understand the workflow.
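
One quick way to get an overview of the generated workflow is to list the rule names it declares. The sketch below relies only on the standard Snakemake syntax of `rule <name>:` at the start of a line; list_snakemake_rules is a hypothetical helper, and it will not show other constructs (for example checkpoints) if the generated file uses them.

```shell
# Hypothetical helper: print the name of every `rule <name>:` declaration
# found at the start of a line in the given Snakefile.
list_snakemake_rules() {
  sed -n 's/^rule[[:space:]]\{1,\}\([A-Za-z_][A-Za-z0-9_]*\)[[:space:]]*:.*/\1/p' "$1"
}

# Example use:
#   list_snakemake_rules snakemake_jobs/cms_run3/Snakefile
```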

4. Execute the Workflow

Launch the workflow on the batch system. Run it inside a tmux or screen session so that it keeps running if your SSH connection drops.

PROD=cms_run3 STEPS=gen,post,tfds,train pixi run -e lxplus run
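
As a sketch of the tmux approach, the run command can be started in a detached session and re-attached later. The helper names and the session name are hypothetical; the command string is the one shown above, and only standard tmux subcommands (new-session, attach) are used.

```shell
# Hypothetical helper: build the run command for a site, production, and steps.
#   $1 = Pixi environment (site), $2 = PROD, $3 = STEPS
mlpf_run_cmd() {
  printf 'PROD=%s STEPS=%s pixi run -e %s run\n' "$2" "$3" "$1"
}

# Hypothetical helper: start the run in a detached tmux session.
#   $1 = session name, remaining arguments as for mlpf_run_cmd
mlpf_launch_detached() {
  tmux new-session -d -s "$1" "$(mlpf_run_cmd "$2" "$3" "$4")"
}

# Example use (session name "mlpf" is an arbitrary choice):
#   mlpf_launch_detached mlpf lxplus cms_run3 gen,post,tfds,train
#   tmux attach -t mlpf      # re-attach later; detach again with Ctrl-b d
```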

5. Validation & Plots

To run the validation plotting workflow:

PROD=cms_run3 pixi run -e lxplus validation

Citations and Reuse

You are welcome to reuse the code in accordance with the LICENSE.

How to Cite

  1. Academic Work: Please cite the specific papers listed in the Publications section above relevant to the method you are using (e.g., initial GNN idea, fine-tuning, or specific detector studies).
  2. Code Usage: If you use the code significantly for research, please cite the specific tagged version from Zenodo.
  3. Dataset Usage: Cite the appropriate dataset via the Zenodo link and the corresponding paper.

Contact

For collaboration ideas that do not fit into the categories above, please get in touch via GitHub Discussions.

About

Machine-learned, GPU-accelerated particle flow reconstruction
