First implementation of exporting input variables and infer the ParT network with onnxruntime by suehara · Pull Request #77 · lcfiplus/LCFIPlus

suehara · 2025-12-10T06:23:52Z

BEGINRELEASENOTES
This is automatic generation written in ReleaseNotes.md

2025-12-05 SUEHARA Taikan
- Implement backward compatibility for MC-PFO assignment
  - Add default parameter to InitMCPPFOCollections for backward compatibility
  - Use simple method (first element) when Track-MC relation is not available
  - Use improved method (weight-based, multi-track support) when available
  - Maintain full backward compatibility with upstream v00-11
2025-12-03 SUEHARA Taikan
- Add optional ONNX support with backward compatibility
  - Enable ONNX Runtime support as optional feature (ENABLE_ONNX CMake option, default: OFF)
  - Make onnxruntime and nlohmann_json optional dependencies
  - Conditionally compile ONNX-related source files
  - Full backward compatibility when ENABLE_ONNX=OFF
SUEHARA Taikan and collaborators (2024-2025)
- Machine Learning and ONNX integration
  - Add MLInputGenerator, MLMakeNtuple, MLInferenceWeaver for ML-based flavor tagging
  - Add WeaverInterface and ONNXRuntime for ONNX model inference
  - Add DNNProvider2 for DNN-based vertex finding
  - Implement event-based classification with jets
  - Add dEdx support for particle identification
- Flavor tagging improvements
  - Improve PFA-track assignment and track-MC assignment
  - Implement true jet flavor assignment from MC
  - Add MC-to-jet assignment algorithm (AssignJetsToMC)
  - Bugfixes on MC flavor assignment
  - Add sorted track and neutral accessors (getAllTracksSorted, getNeutralsSorted)
- Code quality and compatibility
  - Update C++ standard to C++17 with CMake 3.5+ requirement
  - Compatibility fixes for key4hep environment and onnxruntime
  - Various bugfixes in weaver output and neutral PF candidate masking
  - Add event-based input support

ENDRELEASENOTES

Enable ONNX Runtime support as an optional feature that can be enabled via ENABLE_ONNX CMake option (default: OFF).

Changes:
- Add ENABLE_ONNX CMake option (default OFF) for backward compatibility
- Make onnxruntime and nlohmann_json optional dependencies
- Conditionally compile ONNX-related source files based on ENABLE_ONNX
- Conditionally include ONNX-related headers in ROOT dictionary
- Remove ONNX source files from build when ENABLE_ONNX=OFF

When ENABLE_ONNX=OFF (default):
- No ONNXRuntime dependency required
- ONNX-related files not compiled (ONNXRuntime.cc, MLInferenceWeaver.cc, etc.)
- Full backward compatibility with existing builds

When ENABLE_ONNX=ON:
- Requires onnxruntime and nlohmann_json
- Enables ML inference features via ONNX

Both build configurations tested successfully.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
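For readers unfamiliar with optional build features, the following is a minimal C++ sketch of how code can adapt to such a switch. It assumes the CMake option is forwarded as a compile definition; the macro name `LCFIPLUS_ENABLE_ONNX` and both helper functions are hypothetical and not taken from the PR, which according to the commit message excludes whole source files from the build instead.

```cpp
#include <string>

// Hypothetical macro: assumed to be defined by the build system
// only when the project is configured with ENABLE_ONNX=ON.
// #define LCFIPLUS_ENABLE_ONNX

// Reports whether the ONNX-based inference code was compiled in.
inline bool onnxInferenceAvailable() {
#ifdef LCFIPLUS_ENABLE_ONNX
  return true;
#else
  return false;
#endif
}

// Returns a short, human-readable description of the build configuration.
inline std::string buildDescription() {
  return onnxInferenceAvailable() ? "ONNX inference enabled"
                                  : "ONNX inference disabled (default)";
}
```

With the macro left undefined, as in a default build, `onnxInferenceAvailable()` returns `false` and no ONNX headers are needed.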

Add default parameter to InitMCPPFOCollections to maintain backward compatibility with upstream while supporting improved Track-MC relation based assignment.

Changes:
- Add default parameter mctrkColName="" to InitMCPPFOCollections()
- Use simple method (first element) when Track-MC relation is not available
- Use improved method (weight-based, multi-track support) when Track-MC relation is available
- Maintain full backward compatibility with upstream v00-11

Implementation details:
- navTrks.size() == 0: Use simple PFO-MC relation[0] (upstream compatible)
- navTrks.size() > 0: Use improved Track[max Pt] -> MC[max weight] method

Benefits:
- No code changes required for existing users
- Gradual migration path via steering file parameter
- Same binary supports both modes
- Transparent fallback to compatible mode

Testing:
- Successfully built with ONNX disabled (26M library)
- Successfully built with ONNX enabled (34M library)
- Zero compilation errors
- API symbols correctly exported

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
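The two assignment modes described in this commit can be sketched as below. This is only an illustration under stated assumptions: `McLink`, `TrackInfo` and `assignMc` are hypothetical stand-ins, not the LCIO or LCFIPlus types, and MC particles are reduced to integer ids.

```cpp
#include <algorithm>
#include <vector>

// Stand-in types for illustration only (not the LCIO/LCFIPlus classes).
struct McLink { int mcId; float weight; };                   // one Track -> MC relation entry
struct TrackInfo { double pt; std::vector<McLink> links; };  // a track with its MC links

// Picks the MC particle id for one PFO.
// - No Track-MC relation available: use the first PFO-MC relation entry.
// - Otherwise: take the highest-pT track, then its highest-weight MC link.
// Returns -1 if nothing can be assigned.
inline int assignMc(const std::vector<int>& pfoMcRelation,
                    const std::vector<TrackInfo>& tracksWithLinks) {
  if (tracksWithLinks.empty()) {
    return pfoMcRelation.empty() ? -1 : pfoMcRelation.front();
  }
  auto leading = std::max_element(
      tracksWithLinks.begin(), tracksWithLinks.end(),
      [](const TrackInfo& a, const TrackInfo& b) { return a.pt < b.pt; });
  if (leading->links.empty()) return -1;
  auto best = std::max_element(
      leading->links.begin(), leading->links.end(),
      [](const McLink& a, const McLink& b) { return a.weight < b.weight; });
  return best->mcId;
}
```

Keeping both branches in one function mirrors the "same binary supports both modes" point: the caller does not need to know which relation collections were present.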

Merge official LCFIPlus v00-11 release with development branch containing ML/ONNX support and flavor tagging improvements.

From upstream v00-11:
- Add key4hep-build CI workflow for automated testing
- Make building against ROOT 6.38 possible
- Explicitly include LCIO headers for better compatibility
- Fix exception handling (use reference instead of value)
- Backport key4hep-spack patches

From development branch:
- Optional ONNX Runtime support (ENABLE_ONNX CMake option, default: OFF)
- ML-based flavor tagging infrastructure (MLInputGenerator, MLMakeNtuple, MLInferenceWeaver)
- Improved MC-PFO assignment with backward compatibility
- DNN-based vertex finding (DNNProvider2, VertexFinderDNN)
- Event-based classification support
- Enhanced flavor tagging algorithms
- C++17 standard with CMake 3.5+ requirement

Merge resolution:
- Keep upstream CMAKE requirements where possible
- Restore ${LCIO_INCLUDE_DIRS} in ROOT_DICT_INCLUDE_DIRS (from upstream)
- Add ROOT_DICT_CINT_DEFINITIONS (from upstream)
- Maintain ONNX conditional compilation (development branch)
- Keep both upstream v00-11 and development additions in release notes
- Preserve key4hep CI workflow from upstream

Backward compatibility:
- Default behavior matches upstream when ONNX disabled
- Optional features enabled via CMake flags and steering parameters
- No breaking changes to existing physics logic

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Move "Development branch additions" section to the top of the release notes, making it the first section readers see. This better reflects the chronological order and highlights the most recent changes.

Changes:
- Move development branch section from subsection (##) to main section (#)
- Place it before v00-11 upstream release notes
- Improves readability by showing newest changes first

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…nderdnn

Replace unsafe memset() calls with proper C++ value-initialization.

- DNNProvider2.cc: Use DNNData() constructor instead of memset
- vertexfinderdnn.cc: Use TracksData() constructor instead of memset

This fixes compiler warnings about clearing objects with non-trivial types (std::vector members) and ensures proper initialization.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
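To illustrate why value-initialization is the safer replacement, here is a small sketch. The struct `RecordData` is a hypothetical stand-in; the actual layouts of DNNData and TracksData are not shown in this PR page.

```cpp
#include <vector>

// Stand-in for a data record that mixes plain fields with a std::vector member.
struct RecordData {
  double d0;
  double z0;
  int nHits;
  std::vector<float> features;
};

// Value-initialization zeroes the scalar members and default-constructs the
// vector. memset(&r, 0, sizeof(r)) would instead overwrite the vector's
// internals, which is undefined behaviour for a non-trivial type.
inline RecordData makeCleanRecord() {
  return RecordData();
}
```

A caller can then write `RecordData r = makeCleanRecord();` and rely on all scalar fields being zero and `features` being empty.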

Comment out unused variable 'nall' in the process() method to clean up the code. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

tmadlener

I have started to have a look here and I was wondering whether you could add some example steering files for

- Running data collection for training
- Running the inference with the trained model

That would also help me judge if all of the classes are in the end necessary. It's not entirely clear to me if all of the different ways of doing inference are necessary or if we could drop them.

There are a few comments below which I found while having a first quick look.

tmadlener · 2025-12-15T10:38:40Z

src/lcfiplus.cc

+  vector<const Track*> tracks_unsorted = getAllTracks(withoutV0);
+
+  // order particles by energy
+  vector<std::pair<float, int> > order_tr;
+  order_tr.resize(tracks_unsorted.size());
+  for(size_t i=0; i<tracks_unsorted.size(); ++i){
+    order_tr[i] = std::pair<float, int>(tracks_unsorted[i]->E(), i);
+  }
+  std::sort(order_tr.begin(),order_tr.end(),[](std::pair<float,int>a,std::pair<float,int> b){
+    return a.first > b.first;
+  });
+
+  vector<const Track*> tracks;
+  tracks.resize(tracks_unsorted.size());
+
+  for (size_t i=0; i<tracks_unsorted.size(); ++i) {
+    tracks[i] = tracks_unsorted[order_tr[i].second];
+  }
+  return tracks;


Suggested change

auto tracks_unsorted = getAllTracks(withoutV0);
std::ranges::sort(tracks_unsorted, std::greater{}, &Track::E);
return tracks_unsorted;

Since we are building with C++20 in any case, we can also use std::ranges::sort and avoid creating an intermediate vector just for sorting. Passing std::greater{} keeps the descending energy order of the original code. The same applies to the neutrals below.

tmadlener · 2025-12-15T10:41:21Z

src/LcfiplusProcessor.cc

+  registerInputCollection(LCIO::LCRELATION, "MCTrackRelation", "Relation between MC and tracks, usually better in terms of assignment of tracks",
+                          _mctrkRelationName, std::string(""));


Will these relations be persisted to files in the end? Maybe I am just missing it, but are the FromType and ToType parameters set on this collection so that it is easily possible to identify what is being linked in this collection?

tmadlener · 2025-12-15T12:16:42Z

include/MLInputGenerator.h

+namespace MLInputGenerator {
+
+  extern map<string, variant<


Why not make this a class and mark the map as static? As far as I can tell that should have the same effect?
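A minimal sketch of the class-based alternative is shown below. The accessor name `getCalcInput()` appears in a later commit message of this PR; everything else (`InputTable`, `TrackStub`, the single `std::function` signature in place of the `variant`, and the two example variables) is a simplifying assumption for illustration.

```cpp
#include <functional>
#include <map>
#include <string>

// Stand-in for a reconstructed track; the real class is lcfiplus::Track.
struct TrackStub { double px, py, pz; };

// The variable table is a private static behind a public accessor,
// instead of a namespace-level extern map.
class InputTable {
 public:
  using Calc = std::function<double(const TrackStub&)>;

  // Returns the shared table, building it on first use.
  static const std::map<std::string, Calc>& getCalcInput() {
    static const std::map<std::string, Calc> table = buildTable();
    return table;
  }

 private:
  // Fills the table with two example variables.
  static std::map<std::string, Calc> buildTable() {
    std::map<std::string, Calc> t;
    t["px"] = [](const TrackStub& trk) { return trk.px; };
    t["pt2"] = [](const TrackStub& trk) { return trk.px * trk.px + trk.py * trk.py; };
    return t;
  }
};
```

The function-local static gives thread-safe, on-demand initialization and removes the need for a separate initialized flag.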

tmadlener · 2025-12-15T12:19:30Z

src/MLInputGenerator.cc

+
+  // copied from FCCANalyses/analyzers/dataframe/src/ReconstructedParticle2Track.cc
+  float calc_dxy(float D0_wrt0, float Z0_wrt0, float phi0_wrt0, TVector3 p, TVector3 privtx, int charge){
+    double Bz = 3.5;


This should probably be an input parameter, and configurable somehow in the long run.
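As an illustration of passing the field strength in rather than hardcoding it, here is a small sketch. It is not the calc_dxy function from the diff; `helixRadius` is a hypothetical helper that uses the standard relation R = pT / (0.299792458 |q| B) with pT in GeV, B in Tesla and R in metres.

```cpp
#include <cmath>
#include <cstdlib>

// Radius of curvature (in metres) of a charged track in a solenoid field.
// pt in GeV, bz in Tesla (must be non-zero), charge in units of e (non-zero).
inline double helixRadius(double pt, double bz, int charge = 1) {
  const double kappa = 0.299792458;  // GeV / (T * m) for unit charge
  return pt / (kappa * std::abs(charge) * std::fabs(bz));
}
```

The caller decides where the field value comes from, for example a steering parameter or a global detector setting, so the same function serves detectors with different fields.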

tmadlener · 2025-12-15T12:45:56Z

src/MLMakeNtuple.cc

+       || std::holds_alternative<function<double(const Track*, const Vertex*)> >(v.second)
+       || std::holds_alternative<function<double(const Neutral*)> >(v.second)
+       || std::holds_alternative<function<double(const Neutral*, const Vertex*)> >(v.second)){
+	_tree->Branch( key.c_str(), &_data.newDataVec(key) );


Does this construction work properly? I remember that in the past, adding a vector to a tree that later had to be re-allocated because it grew made the pointer stored here invalid. I don't see any explicit calls to reserve or resize for these vectors that would make them "stable" when adding elements.
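Whether the stored pointer stays valid depends on which address is handed to the tree and on the container that owns the vectors. The sketch below assumes the tree keeps the address of the vector object itself and that the vectors live in a `std::map`; `BranchBuffers` is a hypothetical holder, since the implementation behind `_data.newDataVec` is not shown here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical holder: one vector per branch name, stored in a std::map.
// std::map never relocates its elements on insertion, so the address of
// each vector object stays valid for the lifetime of the map entry, even
// when the vector's own element buffer is reallocated as it grows.
class BranchBuffers {
 public:
  // Creates (or returns) the vector registered under `key`.
  std::vector<double>& newDataVec(const std::string& key) { return buffers_[key]; }

 private:
  std::map<std::string, std::vector<double>> buffers_;
};
```

If the vectors were instead elements of a `std::vector<std::vector<double>>`, adding a new branch could move all of them and invalidate previously registered addresses, which is the failure mode described in the comment above.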

tmadlener · 2025-12-15T12:48:23Z

src/MLMakeNtuple.cc

+  if (_outEvent && _outEventNoJets) {
+    cout << "Skipping due to ambiguous setting: MLMakeNtuple.EventClassification and MLMakeNtuple.EventClassificationNoJets are both turned on" << endl;
+    return;
+  }


I think this should be checked in the initialization and it should also be a hard error already there, not something that will just create a lot of log messages but still continues to run.
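A sketch of such an initialization-time check follows. `NtupleConfig` and `validateConfig` are hypothetical names; only the two parameter names in the error message come from the diff above.

```cpp
#include <stdexcept>

// Hypothetical configuration holder mirroring the two steering flags.
struct NtupleConfig {
  bool eventClassification = false;
  bool eventClassificationNoJets = false;
};

// Validates the configuration once, at initialization time.
// Throws instead of logging so that a misconfigured job stops immediately.
inline void validateConfig(const NtupleConfig& cfg) {
  if (cfg.eventClassification && cfg.eventClassificationNoJets) {
    throw std::invalid_argument(
        "MLMakeNtuple.EventClassification and "
        "MLMakeNtuple.EventClassificationNoJets cannot both be enabled");
  }
}
```

Calling this from the algorithm's init step means the per-event code no longer needs to test for the conflict at all.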

tmadlener · 2025-12-15T13:00:56Z

src/MLMakeNtuple.cc

+  TrackVec &tracks_orig = event->getTracks();
+  NeutralVec &neutrals_orig = event->getNeutrals();
+
+  vector<const Track *> tracks(tracks_orig.size());
+  vector<const Neutral *> neutrals(neutrals_orig.size());
+
+  std::partial_sort_copy(tracks_orig.begin(),tracks_orig.end(),tracks.begin(), tracks.end(), [](const Track *a, const Track *b){
+    return a->E() > b->E();
+  });
+  std::partial_sort_copy(neutrals_orig.begin(),neutrals_orig.end(),neutrals.begin(), neutrals.end(),[](const Neutral *a, const Neutral *b){
+    return a->E() > b->E();
+  });


What is the reason to partial_sort_copy here? As far as I can see, tracks_orig and neutrals_orig are only used here and no longer after sorting has happened, and even though it says partial_sort here the effect will be that the whole vector will be sorted. sort is usually quicker than partial_sort. So I would propose something along the lines of

auto tracks = event->getTracks(); // NOTE: the explicit copy we do here by omitting '&'
std::ranges::sort(tracks, std::greater{}, &Track::E);

tmadlener · 2025-12-15T13:05:52Z

include/MLInferenceTorch.h

Is this header used anywhere? It looks like it can be removed?

tmadlener · 2025-12-15T13:08:03Z

doc/ReleaseNotes.md

+# Development branch additions (merged 2025-12-06)
+
+* 2025-12-05 SUEHARA Taikan
+  - Implement backward compatibility for MC-PFO assignment
+    - Add default parameter to InitMCPPFOCollections for backward compatibility
+    - Use simple method (first element) when Track-MC relation is not available
+    - Use improved method (weight-based, multi-track support) when available
+    - Maintain full backward compatibility with upstream v00-11
+
+* 2025-12-03 SUEHARA Taikan
+  - Add optional ONNX support with backward compatibility
+    - Enable ONNX Runtime support as optional feature (ENABLE_ONNX CMake option, default: OFF)
+    - Make onnxruntime and nlohmann_json optional dependencies
+    - Conditionally compile ONNX-related source files
+    - Full backward compatibility when ENABLE_ONNX=OFF
+
+* SUEHARA Taikan and collaborators (2024-2025)
+  - Machine Learning and ONNX integration
+    - Add MLInputGenerator, MLMakeNtuple, MLInferenceWeaver for ML-based flavor tagging
+    - Add WeaverInterface and ONNXRuntime for ONNX model inference
+    - Add DNNProvider2 for DNN-based vertex finding
+    - Implement event-based classification with jets
+    - Add dEdx support for particle identification
+
+  - Flavor tagging improvements
+    - Improve PFA-track assignment and track-MC assignment
+    - Implement true jet flavor assignment from MC
+    - Add MC-to-jet assignment algorithm (AssignJetsToMC)
+    - Bugfixes on MC flavor assignment
+    - Add sorted track and neutral accessors (getAllTracksSorted, getNeutralsSorted)
+
+  - Code quality and compatibility
+    - Update C++ standard to C++17 with CMake 3.5+ requirement
+    - Compatibility fixes for key4hep environment and onnxruntime
+    - Various bugfixes in weaver output and neutral PF candidate masking
+    - Add event-based input support
+


Suggested change: remove the lines added above from doc/ReleaseNotes.md.

These will be automatically added by our tagging script if you put them between the BEGINRELEASENOTES and ENDRELEASENOTES in the PR as you have already done.

Replace hardcoded 3.5 Tesla with Globals::Instance()->getBField() in MLInputGenerator and DNNProvider2 to support different experimental setups and allow user configuration via steering files.

Affected files:
- src/MLInputGenerator.cc (calc_dxy, calc_dz)
- src/DNNProvider2.cc (calc_dxy, calc_dz)

Addresses PR lcfiplus#77 review comment from tmadlener regarding hardcoded magnetic field value that should be configurable.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Move EventClassification conflict check from process() to init() to throw an error early instead of silently skipping events at runtime. This ensures users are immediately notified of configuration errors during initialization rather than discovering the issue after processing begins.

Affected files:
- src/MLMakeNtuple.cc (init, process)

Addresses PR lcfiplus#77 review comment from tmadlener requesting that ambiguous configuration settings should throw errors during init rather than emit warnings at runtime.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Replace partial_sort_copy with direct std::sort for better readability and performance. Use vector range constructor instead of resize + partial_sort_copy for cleaner code.

Changes:
- src/MLMakeNtuple.cc: Use vector(begin, end) constructor and std::sort
- include/VertexFinderTearDown.h: Same improvement for Chi2 track sorting

This addresses PR lcfiplus#77 review comment suggesting to use direct sorting instead of partial_sort_copy with intermediate vectors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
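The pattern named in this commit (range constructor followed by std::sort) can be sketched as follows. `Particle` is a hypothetical stand-in for the Track and Neutral classes, modelling only the energy accessor, and `sortedByEnergy` is an illustrative helper rather than code from the PR.

```cpp
#include <algorithm>
#include <vector>

// Minimal stand-in for a particle class; only E() is modelled.
struct Particle {
  double energy;
  double E() const { return energy; }
};

// Copies the pointers with the range constructor, then sorts the copy
// by descending energy. The input container is left untouched.
inline std::vector<const Particle*> sortedByEnergy(
    const std::vector<const Particle*>& original) {
  std::vector<const Particle*> sorted(original.begin(), original.end());
  std::sort(sorted.begin(), sorted.end(),
            [](const Particle* a, const Particle* b) { return a->E() > b->E(); });
  return sorted;
}
```

This version compiles as C++11 or later; under C++20 the std::sort call could be replaced by std::ranges::sort with a projection, as proposed in the review.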

Add type information (MCParticle to Track/ReconstructedParticle) to MCPFORelation and MCTrackRelation parameter descriptions for better clarity.

Changes:
- src/LcfiplusProcessor.cc: Update parameter descriptions

Addresses PR lcfiplus#77 review comment requesting clearer type information for relation collection parameters.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Implement EventStore flag management to enable ParticleID output for jets read from LCIO files. This fixes the issue where jets imported from LCIO were not written back with their assigned ParticleID values.

Changes:
- Add EventStore::AddFlags() and RemoveFlags() for runtime flag management
- Mark jet collections as PERSIST in MLInferenceWeaver and FlavorTag init()
- Update WriteJets() to add ParticleIDs to existing LCIO collections
- Maintain backward compatibility for internally created jets

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
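Runtime flag management of this kind is usually a per-collection bitmask. The sketch below is an assumption about the general shape only: `FlagRegistry`, the flag constants and their values are hypothetical, and the real EventStore::AddFlags() and RemoveFlags() signatures are not shown in this PR page.

```cpp
#include <map>
#include <string>

// Hypothetical flag values; the real EventStore constants may differ.
enum StoreFlag : unsigned {
  kPersist = 1u << 0,      // write the collection to the output
  kDoNotDelete = 1u << 1   // keep the collection alive between events
};

// Keeps one bitmask per collection name and allows changing it at runtime.
class FlagRegistry {
 public:
  // Sets the given bits for the named collection.
  void AddFlags(const std::string& name, unsigned flags) { flags_[name] |= flags; }
  // Clears the given bits for the named collection.
  void RemoveFlags(const std::string& name, unsigned flags) { flags_[name] &= ~flags; }
  // Returns the current bitmask (0 for unknown collections).
  unsigned Flags(const std::string& name) const {
    auto it = flags_.find(name);
    return it == flags_.end() ? 0u : it->second;
  }
  // True if the collection is marked for output.
  bool IsPersistent(const std::string& name) const { return (Flags(name) & kPersist) != 0; }

 private:
  std::map<std::string, unsigned> flags_;
};
```

An algorithm's init step could then mark an input jet collection as persistent without touching how the collection was originally registered.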

Add optional parameter to create a new jet collection with ParticleID instead of modifying jets read from LCIO files. This avoids errors when trying to add ParticleID to read-only LCIO collections.

Changes:
- Add UpdateJetCollectionName parameter (default: empty string)
- When empty, use original behavior (add ParticleID to existing jets)
- When specified, create new jet collection with copied jets and ParticleID
- Use EventStore::Register() to create new collection with PERSIST flag
- Copy jets using Jet copy constructor without vertex extraction

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Add sample steering files for ML inference and training data collection:
- ml_inference_test.xml: Example for running MLInferenceWeaver
- ml_training_data_collection.xml: Example for MLMakeNtuple data collection

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Refactor MLInputGenerator to use a static class with private members instead of namespace-level variables. This improves encapsulation and follows C++ best practices.

Changes:
- Convert calcInput map and _initialized flag to private static members
- Add public getCalcInput() accessor method
- Convert helper functions to public static methods
- Update MLMakeNtuple.cc and MLInferenceWeaver.cc to use new API

Addresses review comment: lcfiplus#77...

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

SUEHARA Taikan and others added 30 commits February 26, 2019 10:40

VertexAnalysis on testproc updated

9b22996

Merge branch 'master' of https://github.com/lcfiplus/LCFIPlus

77c77f8

DNNProvider2 (version 1)

3387685

DNNProvider2 modified

caad803

Merge branch 'master' of https://github.com/suehara/LCFIPlus

329a268

add vertexfinderdnn.h

f301eca

add vertexfinderdnn.cc/h

a4ae6de

Add dEdx

9eb2a30

new version

788bdec

Merge /home/ilc/rtagami/LCFIPlus

8e77718

Improve PFA-track assignment and track-MC assignment

e8476b5

DNNProvider2

f0f9627

add MLInputGenerator and TorchInference (WIP)

0b33110

implemented MLInputGenerator template functions for calculating ML input variables

9da8573

Changed functors types to use std::variant

e710595

implemented MLMakeNtuple as a new algorithm for ROOT format output of ML input variables

a9162ab

File renamed

892cb66

Added ONNX/Weaver interface from FCCAnalyses

e7fbd75

MLInferenceWeaver added

b850fc8

Implement MLInferenceWeaver

6c4c9ff

fix prefix

6f10600

merged variable changes from ongoing strange tagging study

b7278bb

variable names are automatically read from JSON

2cf41b4

clean up code

ec3dba3

WeaverReader implemented

c780a83

suppress warning

ebb58d3

Fix variable computation for weaver input

f828463

Changes to make input variables match the previously trained weights

d06c8a2

Use new PID

c38bceb

CMakeLists modified

08d2926

tomohikosan and others added 21 commits January 21, 2025 17:38

dump weaver input

3dddeb3

fix neu_pfcand_mask

1c3f06a

Merge branch 'onnx'

63a7e4b

Label on MLMakeNtuple, modification on FlavtagReader

0255ebe

Merge branch 'onnx' of 202.13.202.54:~/LCFIPlus.ml

703ac5c

merge error fixed

ce5ac05

Remove data dump option

af1a9cd

True jet flavor assignment implemented

93bfb94

bugfix on MC flavor assignment

6f85374

Change of MLMakeNtuple for event categorization

9296e21

add event based classification with jets

f911b72

Add event-based input

bae2ea9

SGV new training

ec3a38e

Compatibility for key4hep onnxruntime

e4384a5

Remove unused variable in DNNProvider2

700465b

Remove onnx weight files

7a26be3

tmadlener reviewed Dec 15, 2025

suehara and others added 8 commits December 25, 2025 15:04

suehara wants to merge 68 commits into lcfiplus:master from suehara:master

3 participants
