Skip to content

Add TrackingValidation workflow for tracking performance studies#4

Open
ArinaPon wants to merge 15 commits intokey4hep:mainfrom
ArinaPon:pr-tracking-validation
Open

Add TrackingValidation workflow for tracking performance studies#4
ArinaPon wants to merge 15 commits intokey4hep:mainfrom
ArinaPon:pr-tracking-validation

Conversation

@ArinaPon
Copy link
Copy Markdown

BEGINRELEASENOTES

  • Add TrackingValidation consumer for track-finder and track-fitter performance studies
  • Add finder-level validation through particle-to-track and track-to-particle association trees
  • Provide fitter-level validation with residuals computed with respect to MC truth
  • Support optional comparison to perfectly associated fitted tracks
  • Add summary plots written to the output ROOT file (tracking efficiency, d0 resolution, momentum resolution, transverse-momentum resolution)
  • Support configurable validation modes (full, finder-only, fitter-only)
  • Support configurable finder-efficiency definitions and purity threshold
  • Add helper and plotting utilities used by the validation workflow
  • Add test steering and end-to-end test for reconstruction and validation
  • Add documentation in the README
    ENDRELEASENOTES

Hello ! This PR introduces the TrackingValidation workflow for studying tracking performance in a flexible and reusable way.

The implementation includes:

  • validation of track-finder output through truth-based hit associations,
  • validation of fitted tracks against MC truth,
  • optional comparison with perfect tracking, i.e. tracks fitted using the correct simhits from the particle truth information,
  • summary performance plots stored directly in the output ROOT file.

The test setup was also designed to run the full chain, with configurable options to enable or disable simulation, digitization, track finding, track fitting, perfect tracking, validation, and DCH usage.

A follow-up improvement that I plan to add is a more robust treatment of the residual distributions, using double-Gaussian approach for a more stable description of core and tail behaviour.

@andread3vita
Copy link
Copy Markdown

andread3vita commented Mar 20, 2026

Thank you @ArinaPon for this PR! Just a comment on the tests: @BrieucF, should we also include the CI tests here? Of course they will fail because we don't have the fitter and perfect track finder yet, but as soon as the other PR is accepted, they should succeed

@BrieucF
Copy link
Copy Markdown
Collaborator

BrieucF commented Mar 23, 2026

Hi! I would indeed prefer that we add the CI tests directly. And that we get the track fitting merged before to merge this one.

@BrieucF
Copy link
Copy Markdown
Collaborator

BrieucF commented Mar 23, 2026

We also need to add the truth track finding to k4RecTracker

Copy link
Copy Markdown
Collaborator

@BrieucF BrieucF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for this! I had a first quick look and made already some comments. I propose to wait to have everything needed in the nightlies before to do a more thorough review. Thanks again!

MODEL_FILE="${1:-}"

# Optional shell-level control of the simulation step
RUN_SIM="${RUN_SIM:-1}"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(the same comment applies to the other variables) Shall we ass some more "scope" to these environment variables? I mean that similar names might be used from somewhere else and we might end-up with conflicts. Something like TrackingPerf_RUN_SIM?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the custom shell override variables RUN_SIM and INPUT_FILE_OVERRIDE to TRACKINGPERF_RUN_SIM and TRACKINGPERF_INPUT_FILE_OVERRIDE to avoid possible conflicts, thank you.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To minimize maintenance, it would be nice to take this file directly from $FCCCONFIG/FullSim/IDEA/IDEA_o1_v03/run_digi_reco.py and apply there the needed modifications

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I understand the maintenance point. I’d like to discuss this part with Andrea first and then come back with the best way to align it with the standard steering.

Copy link
Copy Markdown
Member

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prompted by your presentation today, I had a quick look and I have left a few comments below.

It's mostly minor things, but I think the considerations about the inputs might be worth thinking about also in light of Davids comments at the meeting.

Some others are probably mainly on a "good to know" for now basis, but should probably be fixed at some point (e.g. the threading issues the current code has).

Comment on lines +52 to +53
/// Build a unique integer key from a podio object identifier
uint64_t oidKey(const podio::ObjectID& id);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

podio::ObjectID is hashable (since podio v01-03) and it should be possible to directly use it in e.g. std::unordered_map as a key type. There should be no need for constructing a uint64_t for this purpose.

float refX, float refY, float refZ);

/// Retrieve the track state stored at the interaction point
bool getAtIPState(const edm4hep::Track& trk, edm4hep::TrackState& out);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could return a std::optional instead of using an in-out parameter via reference. I have also opened key4hep/EDM4hep#493 which should make this obsolete.

Comment on lines +139 to +140
const std::vector<const edm4hep::TrackerHitSimTrackerHitLinkCollection*>& planarLinksVec,
const std::vector<const edm4hep::TrackerHitSimTrackerHitLinkCollection*>& dchLinksVec,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at how this is handled below I don't see the need for two dedicated inputs here. Both of them go through the same loop and populate the same maps. Since the steering already has a --useDCH flag in any case already, just building the combined vector in python and then passing only one should work the same, but use less code below.

If one wants to take this one step further, one could in principle even make this a single collection argument only since there is a CollectionMerger which would allow to merge the collections up-front (again in python). The approach of using ObjectIDs to identify the elements will still work unchanged. However, that will only make sense if there is a canonical set of merged collections, and I don't know if that is the case here.

const edm4hep::TrackCollection& fittedTracks,
const std::vector<const edm4hep::TrackCollection*>& perfectFittedTracksVec) const override {

const int event = m_evt++;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need / want to link back this event to the event number from the input file? In that case you would probably have to add an EventHeaderCollection input and take the event number from there. If this is ever run on multiple threads, the current implementation will essentially make a random event assignment.

Comment on lines +731 to +737
mutable AssocTree m_finder_p2t;
mutable AssocTree m_finder_t2p;
mutable AssocTree m_perf_p2t;
mutable AssocTree m_perf_t2p;

mutable FitterTree m_fit_vs_mc;
mutable FitterTree m_fit_vs_perfect;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These do not look like they are synchronized in any way. Trying to run this algorithm on multiple threads will probably lead to a crash (or potentially just silently written garbage data). I am not sure marking the algorithm as non re-entrant will fix the problem because Gaudi might than just create two instances which both try to write to the same file.

Comment on lines +35 to +37
float safeAtan2(float y, float x) {
return std::atan2(y, x);
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't do any safety checks on top of std::atan2 (as the docstring claims)?

Comment on lines +26 to +37
parser.add_argument("--runDigi", type=int, default=1,
help="0=skip digitization, 1=run digitization")
parser.add_argument("--runFinder", type=int, default=1,
help="0=skip track finder, 1=run track finder")
parser.add_argument("--runFitter", type=int, default=1,
help="0=skip reco fitter, 1=run reco fitter")
parser.add_argument("--runPerfectTracking", type=int, default=1,
help="0=skip perfect tracking/perfect fitter, 1=run them")
parser.add_argument("--runValidation", type=int, default=1,
help="0=skip validation, 1=run TrackingValidation")
parser.add_argument("--useDCH", type=int, default=1,
help="0=disable DCH collections, 1=use DCH collections")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could potentially be

parser.add_argument("--runXXX", action="store_true", default=False, help="Do the XXX")

The main disadvantage of that is that explicitly specifying the other value becomes harder (would need more python code).

Comment on lines +40 to +43
parser.add_argument("--mode", type=int, default=0,
help="Validation mode: 0=Full, 1=FinderOnly, 2=FitterOnly")
parser.add_argument("--doPerfectFit", type=int, default=1,
help="0=do not fill fitter_vs_perfect, 1=fill fitter_vs_perfect")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a choices argument here would make it harder to provide non-supported values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants