168 mid level diagrams for pipelines #169
Much better, but I would also highlight the modular organization of the "raw data", and ideally, the acquisition too.
Is this what you meant?
bruno-f-cruz
left a comment
- What is aind schematized metadata? why not call it aind-data-schema?
- Metadata is not split by modality in s3/docdb. At that point, there is already a "merged" metadata.
- I thought the thing you felt was missing from the previous diagram was a specific callout to which files are generated where. This diagram has the same issue, no? It should have a Rig schematic somewhere that says acquisition.json/instrument.json (these can be split by modality). The pipelines should say processing.json/qc.json, etc.
- You should not call out "plots" specifically; these should always be called "artifacts". What if people want to save tables, sounds, videos, etc.?
- It is not clear to me what processing.json -> Aggregate processing -> bracket is doing. Can you describe it to see if there is a better way to go about this?
- Overall, I think the pink box is a bit too chaotic and could use more structure.
I think it would help if you added a few bullet points on what you think this diagram should be describing and which major features of the architecture you want to highlight. I don't think the current diagram is super faithful to what I feel the current architecture is.
I updated the PR description with my goals. To reiterate, the goal is to create a mid-level diagram to illustrate how pipeline processing is done.
@bruno-f-cruz - I tried to do a better job on points 5 and 6
Almost there. I think it is still misleading to have metadata per modality. That is something that SciComp has made quite a big deal of in the past: when data lands in the public bucket, metadata must be merged.
But there will be metadata per modality. I am trying to represent the general metadata in each modality that gets used to make the final schema.
I don't follow. Formally, metadata merging is a strictly destructive operation. After merging, how do you know which metadata corresponds to which modality? For instance, how do you know to which modality a specific piece of hardware belongs? |
Don't you merge your acquisition files? |
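To make the "merging is destructive" point concrete, here is a minimal sketch. It assumes a simplified, hypothetical metadata layout (plain dictionaries with a `devices` list per modality); the field names and device names are illustrative only and are not taken from aind-data-schema. Two different per-modality assignments collapse to the same merged result, so the modality of a given device cannot be recovered afterwards.

```python
# Hypothetical per-modality metadata. Field and device names are
# illustrative assumptions, not the real aind-data-schema layout.
per_modality = {
    "ecephys": {"devices": ["probe_A", "camera_1"]},
    "behavior": {"devices": ["camera_1", "lick_sensor"]},
}

# A different assignment of the same hardware to modalities.
other_assignment = {
    "ecephys": {"devices": ["probe_A"]},
    "behavior": {"devices": ["camera_1", "lick_sensor"]},
}


def merge_devices(metadata_by_modality):
    """Merge per-modality device lists into one flat, sorted, de-duplicated list."""
    merged = set()
    for metadata in metadata_by_modality.values():
        merged.update(metadata["devices"])
    return {"devices": sorted(merged)}


merged = merge_devices(per_modality)
merged_other = merge_devices(other_assignment)

# Both inputs produce the same merged record, so the merge cannot be inverted.
print(merged == merged_other)
```

Keeping the modality recoverable after the merge would require carrying it explicitly, for example as a field on each device entry.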
@bruno-f-cruz removed the modality metadata
This one needs to be removed too. Also, if I understand correctly, the dashed line represents a zoomed-in version of the blue boxes. You may want to call that out with some sort of visual indication (using color, an arrow, a name such as "Pipeline Modality X", etc.). It is also worth making it clear that the pipelines should ideally run per modality rather than coupled to ALL modalities as the diagram currently shows.
@bruno-f-cruz - can you take one last look and approve?

What this should show: