ML-based particle flow (MLPF) focuses on developing full event reconstruction for particle detectors using computationally scalable and flexible machine learning models. The project aims to improve particle-flow reconstruction across various detector environments, including CMS and future detectors supported via Key4HEP. We build on existing, open-source simulation software from the experimental collaborations.
Below is the development timeline of MLPF by our team, ranging from initial proofs of concept to full detector simulations and fine-tuning studies.
2021: First full-event GNN demonstration of MLPF
- Paper: MLPF: efficient machine-learned particle-flow reconstruction using graph neural networks (Eur. Phys. J. C)
- Focus: Initial idea with a GNN and scalable graph building.
- Code: v1.1
- Dataset: Zenodo Record
2021: First demonstration in CMS Run 3
- Paper: Machine Learning for Particle Flow Reconstruction at CMS (J. Phys. Conf. Ser.)
- Focus: First demonstration of feasibility within CMS.
- Detector Performance Note: CERN-CMS-DP-2021-030
2022: Improved performance in CMS Run 3
- Detector Performance Note: CERN-CMS-DP-2022-061
- Focus: We showed that training against a generator-level target can improve performance in CMS.
2024: Improved performance with full simulation for future colliders
- Paper: Improved particle-flow event reconstruction with scalable neural networks for current and future particle detectors (Communications Physics)
- Focus: Improved event-level performance in full simulation for future colliders.
- Code: v1.6.2
- Results: Zenodo Record
2025: Fine-tuning across detectors
- Paper: Fine-tuning machine-learned particle-flow reconstruction for new detector geometries in future colliders (Phys. Rev. D)
- Focus: Showed that fine-tuning a pretrained model reduces the required amount of training data by 10x.
- Code: v2.3.0
2026: CMS Run 3 full results
- Detector Performance Note: CERN-CMS-DP-2025-033
- Focus: Improved jet performance over the baseline; first validation on real data.
- Paper: CMS Run 3 paper (submitted to EPJC)
- Code: v2.4.0
Please ensure you use the correct version of the jpata/particleflow software with the corresponding dataset version.
| Code Version | CMS Dataset | CLIC Dataset | CLD Dataset |
|---|---|---|---|
| 1.9.0 | 2.4.0 | 2.2.0 | NA |
| 2.0.0 | 2.4.0 | 2.3.0 | NA |
| 2.1.0 | 2.5.0 | 2.5.0 | NA |
| 2.2.0 | 2.5.0 | 2.5.0 | 2.5.0 |
| 2.3.0 | 2.5.0 | 2.5.0 | 2.5.0 |
| 2.4.0 | 2.6.0 | 2.5.0 | 2.5.0 |
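The CMS column of the table above can be encoded as a small lookup helper, for example to select the matching dataset for a given code tag in a processing script. This is a sketch; the function name `cms_dataset_for` is hypothetical and not part of the repository:

```shell
# Hypothetical helper: map a jpata/particleflow code version to the
# compatible CMS dataset version, following the compatibility table above.
cms_dataset_for() {
  case "$1" in
    1.9.0|2.0.0)       echo "2.4.0" ;;
    2.1.0|2.2.0|2.3.0) echo "2.5.0" ;;
    2.4.0)             echo "2.6.0" ;;
    *) echo "unknown code version: $1" >&2; return 1 ;;
  esac
}

cms_dataset_for 2.4.0  # prints 2.6.0
```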
The full data generation, model training, and validation workflow is managed using Pixi for environment management and Snakemake for job orchestration.
```bash
curl -fsSL https://pixi.sh/install.sh | bash
# Restart your shell or source your .bashrc
```

Configure the environment for your specific cluster. This sets up the necessary Snakemake profiles and site defaults.
- Tallinn (Slurm):

  ```bash
  pixi run -e tallinn init
  ```

- lxplus (HTCondor):

  ```bash
  pixi run -e lxplus init
  ```

Generate the Snakefile for a production campaign corresponding to your site.
```bash
PROD=cms_run3 STEPS=gen,post,tfds,train pixi run -e lxplus generate
```

You can inspect snakemake_jobs/cms_run3/Snakefile and the related scripts to understand the workflow.
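The STEPS variable is a comma-separated list of pipeline stages (gen, post, tfds, train). As a rough sketch using only standard shell tools (the `list_steps` helper is hypothetical, and the echo stands in for the real per-stage work), such a list can be split and iterated like this:

```shell
# Hypothetical sketch: split a comma-separated STEPS list into one
# pipeline stage per line, then loop over the stages in order.
list_steps() {
  echo "$1" | tr ',' '\n'
}

for step in $(list_steps gen,post,tfds,train); do
  echo "running stage: $step"
done
```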
Launch the workflow on the batch system. It is recommended to run this inside a tmux or screen session.
```bash
PROD=cms_run3 STEPS=gen,post,tfds,train pixi run -e lxplus run
```

To run the validation plotting workflow:
```bash
PROD=cms_run3 pixi run -e lxplus validation
```

You are welcome to reuse the code in accordance with the LICENSE.
How to Cite
- Academic Work: Please cite the specific papers listed in the Publications section above relevant to the method you are using (e.g., initial GNN idea, fine-tuning, or specific detector studies).
- Code Usage: If you use the code significantly for research, please cite the specific tagged version from Zenodo.
- Dataset Usage: Cite the appropriate dataset via the Zenodo link and the corresponding paper.
Contact
For collaboration ideas that do not fit into the categories above, please get in touch via GitHub Discussions.
