Bug report
Steps to reproduce
- Run two or more gateway instances (primary + one or more peers) with aggregation enabled on the primary.
- Declare a shared parent Component (e.g. `robot-alpha`) across all manifests, with each ECU declaring its own child Component via `parent_component_id`. This is the topology used by the `multi_ecu_aggregation` demo.
- After discovery, issue sub-resource requests through the primary, for example:
  - `GET /api/v1/components/{leaf_ecu_id}/logs`
  - `GET /api/v1/components/{leaf_ecu_id}/hosts`
  - `GET /api/v1/components/{leaf_ecu_id}/data`
Expected behavior
The aggregation layer should treat Components with the same symmetry it applies to Areas:
- A leaf Component maps to exactly one ECU. When the same ID is announced by a peer, the peer owns the runtime state (data, logs, hosts, operations, faults), and the primary should transparently forward every request (detail endpoint and all sub-resources) to that peer.
- A hierarchical parent Component (referenced as `parent_component_id` by any other Component in the merged set) has no runtime state of its own; it only groups its children. The primary should serve it locally with the merged view, exactly like an Area whose Components come from different peers.
- When more than one peer legitimately or accidentally announces the same leaf ID, the primary should flag the collision to operators without taking the gateway down.
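The expected classification above can be sketched as a small decision function. The type and function names below are illustrative only, not the real ros2_medkit API:

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Illustrative-only types; not the actual ros2_medkit merge structures.
struct MergedComponent {
  std::string id;
  std::vector<std::string> announcing_peers;  // peers that announced this id
};

enum class Routing { ServeLocallyMerged, ForwardToOwningPeer, LeafCollision };

// A Component is a hierarchical parent when any Component in the merged set
// names it via parent_component_id; parents carry no runtime state and are
// served locally with the merged view. A leaf owned by exactly one peer is
// forwarded to that peer; a leaf announced by several peers is a collision
// to flag without taking the gateway down.
Routing classify(const MergedComponent& c,
                 const std::unordered_set<std::string>& referenced_parents) {
  if (referenced_parents.count(c.id) != 0) {
    return Routing::ServeLocallyMerged;
  }
  if (c.announcing_peers.empty()) {
    return Routing::ServeLocallyMerged;  // local-only leaf, no forwarding
  }
  if (c.announcing_peers.size() == 1) {
    return Routing::ForwardToOwningPeer;
  }
  return Routing::LeafCollision;  // warn operators, keep serving
}
```

A leaf announced by two peers is deliberately not routed to either one; it is surfaced as a collision instead.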
Actual behavior
`EntityMerger::merge_components` previously skipped inserting a routing entry for any collision-merged Component, treating the merged entity as locally owned. That assumption is wrong for leaves: with no routing entry, sub-resource requests were handled on the primary, which has no runtime state for the peer's leaf, so `/logs` came back empty, `/hosts` missed peer apps, and `/data` had no resources.
The naive fix (route every collision to the peer) then broke the hierarchical case: `GET /components/{parent_id}` forwarded to a random peer's view and the merged cache was lost. Sub-components that arrive from different peers (typical for the multi-ECU demo) compounded the problem.
Additionally, there was no operator-visible signal when two peers declared the same leaf ID, and `source` on the merged entity could not express the fact that several contributors fed into a single view.
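One way to let `source` express multiple contributors is to store them as a de-duplicated list. The `Provenance` type below is a hypothetical sketch, not the actual ros2_medkit schema:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical provenance record; field names are illustrative. Instead of a
// single source string, the merged entity keeps every contributor so clients
// can see that several peers fed into a single view.
struct Provenance {
  std::vector<std::string> contributors;  // e.g. {"local", "peer-1"}
};

// Record one more contributor, skipping duplicates so repeated announcements
// from the same peer do not inflate the list.
void add_contributor(Provenance& p, const std::string& who) {
  if (std::find(p.contributors.begin(), p.contributors.end(), who) ==
      p.contributors.end()) {
    p.contributors.push_back(who);
  }
}
```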
Environment
- ros2_medkit version: main (reproduces on current tip)
- ROS 2 distro: Jazzy / Humble / Rolling
- OS: Linux (Ubuntu 22.04 / 24.04)
Additional information
The aggregation design doc previously described "Remote-only Components get a routing table entry", which matched the implementation but did not reflect the ECU-ownership model the merge logic otherwise followed. The fix needs to classify merged Components into hierarchical parents (served locally with merged view) and leaves (routed to the owning peer), emit structured warnings on multi-peer leaf collisions, and expose provenance to clients so they can distinguish local-only, peer-only, and merged entities.
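The local-only / peer-only / merged distinction the fix calls for could be expressed as a simple client-visible provenance tag; the names here are assumed for illustration:

```cpp
#include <cassert>

// Sketch of client-visible provenance (names assumed, not the real API):
// local-only entities came from the primary's own manifest, peer-only from a
// single peer, and merged from the primary plus peers or from several peers.
enum class Source { LocalOnly, PeerOnly, Merged };

Source provenance(bool announced_locally, int peer_count) {
  if (announced_locally && peer_count == 0) {
    return Source::LocalOnly;
  }
  if (!announced_locally && peer_count == 1) {
    return Source::PeerOnly;
  }
  return Source::Merged;
}
```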