Add TrackingValidation workflow for tracking performance studies#4
Add TrackingValidation workflow for tracking performance studies#4ArinaPon wants to merge 15 commits intokey4hep:mainfrom
Conversation
|
Hi! I would indeed prefer that we add the CI tests directly. And that we get the track fitting merged before to merge this one. |
|
We also need to add the truth track finding to k4RecTracker |
BrieucF
left a comment
There was a problem hiding this comment.
Thank you very much for this! I had a first quick look and made already some comments. I propose to wait to have everything needed in the nightlies before to do a more thorough review. Thanks again!
| MODEL_FILE="${1:-}" | ||
|
|
||
| # Optional shell-level control of the simulation step | ||
| RUN_SIM="${RUN_SIM:-1}" |
There was a problem hiding this comment.
(the same comment applies to the other variables) Shall we ass some more "scope" to these environment variables? I mean that similar names might be used from somewhere else and we might end-up with conflicts. Something like TrackingPerf_RUN_SIM?
There was a problem hiding this comment.
I updated the custom shell override variables RUN_SIM and INPUT_FILE_OVERRIDE to TRACKINGPERF_RUN_SIM and TRACKINGPERF_INPUT_FILE_OVERRIDE to avoid possible conflicts, thank you.
There was a problem hiding this comment.
To minimize maintenance, it would be nice to take this file directly from $FCCCONFIG/FullSim/IDEA/IDEA_o1_v03/run_digi_reco.py and apply there the needed modifications
There was a problem hiding this comment.
Thanks, I understand the maintenance point. I’d like to discuss this part with Andrea first and then come back with the best way to align it with the standard steering.
tmadlener
left a comment
There was a problem hiding this comment.
Prompted by your presentation today, I had a quick look and I have left a few comments below.
It's mostly minor things, but I think the considerations about the inputs might be worth thinking about also in light of Davids comments at the meeting.
Some others are probably mainly on a "good to know" for now basis, but should probably be fixed at some point (e.g. the threading issues the current code has).
| /// Build a unique integer key from a podio object identifier | ||
| uint64_t oidKey(const podio::ObjectID& id); |
There was a problem hiding this comment.
podio::ObjectID is hashable (since podio v01-03) and it should be possible to directly use it in e.g. std::unordered_map as a key type. There should be no need for constructing a uint64_t for this purpose.
| float refX, float refY, float refZ); | ||
|
|
||
| /// Retrieve the track state stored at the interaction point | ||
| bool getAtIPState(const edm4hep::Track& trk, edm4hep::TrackState& out); |
There was a problem hiding this comment.
This could return a std::optional instead of using an in-out parameter via reference. I have also opened key4hep/EDM4hep#493 which should make this obsolete.
| const std::vector<const edm4hep::TrackerHitSimTrackerHitLinkCollection*>& planarLinksVec, | ||
| const std::vector<const edm4hep::TrackerHitSimTrackerHitLinkCollection*>& dchLinksVec, |
There was a problem hiding this comment.
Looking at how this is handled below I don't see the need for two dedicated inputs here. Both of them go through the same loop and populate the same maps. Since the steering already has a --useDCH flag in any case already, just building the combined vector in python and then passing only one should work the same, but use less code below.
If one wants to take this one step further, one could in principle even make this a single collection argument only since there is a CollectionMerger which would allow to merge the collections up-front (again in python). The approach of using ObjectIDs to identify the elements will still work unchanged. However, that will only make sense if there is a canonical set of merged collections, and I don't know if that is the case here.
| const edm4hep::TrackCollection& fittedTracks, | ||
| const std::vector<const edm4hep::TrackCollection*>& perfectFittedTracksVec) const override { | ||
|
|
||
| const int event = m_evt++; |
There was a problem hiding this comment.
Do you need / want to link back this event to the event number from the input file? In that case you would probably have to add an EventHeaderCollection input and take the event number from there. If this is ever run on multiple threads, the current implementation will essentially make a random event assignment.
| mutable AssocTree m_finder_p2t; | ||
| mutable AssocTree m_finder_t2p; | ||
| mutable AssocTree m_perf_p2t; | ||
| mutable AssocTree m_perf_t2p; | ||
|
|
||
| mutable FitterTree m_fit_vs_mc; | ||
| mutable FitterTree m_fit_vs_perfect; |
There was a problem hiding this comment.
These do not look like they are synchronized in any way. Trying to run this algorithm on multiple threads will probably lead to a crash (or potentially just silently written garbage data). I am not sure marking the algorithm as non re-entrant will fix the problem because Gaudi might than just create two instances which both try to write to the same file.
| float safeAtan2(float y, float x) { | ||
| return std::atan2(y, x); | ||
| } |
There was a problem hiding this comment.
This doesn't do any safety checks on top of std::atan2 (as the docstring claims)?
| parser.add_argument("--runDigi", type=int, default=1, | ||
| help="0=skip digitization, 1=run digitization") | ||
| parser.add_argument("--runFinder", type=int, default=1, | ||
| help="0=skip track finder, 1=run track finder") | ||
| parser.add_argument("--runFitter", type=int, default=1, | ||
| help="0=skip reco fitter, 1=run reco fitter") | ||
| parser.add_argument("--runPerfectTracking", type=int, default=1, | ||
| help="0=skip perfect tracking/perfect fitter, 1=run them") | ||
| parser.add_argument("--runValidation", type=int, default=1, | ||
| help="0=skip validation, 1=run TrackingValidation") | ||
| parser.add_argument("--useDCH", type=int, default=1, | ||
| help="0=disable DCH collections, 1=use DCH collections") |
There was a problem hiding this comment.
These could potentially be
parser.add_argument("--runXXX", action="store_true", default=False, help="Do the XXX")The main disadvantage of that is that explicitly specifying the other value becomes harder (would need more python code).
| parser.add_argument("--mode", type=int, default=0, | ||
| help="Validation mode: 0=Full, 1=FinderOnly, 2=FitterOnly") | ||
| parser.add_argument("--doPerfectFit", type=int, default=1, | ||
| help="0=do not fill fitter_vs_perfect, 1=fill fitter_vs_perfect") |
There was a problem hiding this comment.
Using a choices argument here would make it harder to provide non-supported values.
BEGINRELEASENOTES
ENDRELEASENOTES
Hello ! This PR introduces the TrackingValidation workflow for studying tracking performance in a flexible and reusable way.
The implementation includes:
The test setup was also designed to run the full chain, with configurable options to enable or disable simulation, digitization, track finding, track fitting, perfect tracking, validation, and DCH usage.
A follow-up improvement that I plan to add is a more robust treatment of the residual distributions, using double-Gaussian approach for a more stable description of core and tail behaviour.