When writing output files, we default to having a column named "group" in the header (https://github.com/PankratzLab/kd-match/blob/435d6d159e399723974071eaddfe3cdb0792cf9d/src/main/java/org/pankratzlab/kdmatch/KDMatch.java#L114).
The group column corresponds to a category that must be matched within. For example, if cases and controls had to be the same sex, the group would be either male/female or 1/2, etc. If they had to match on sex, sequencing center, and ancestry, the values would be appended and the groups would look like 2_UMGC_AFR or 1_UMGC_AFR. In this example, we could consider changing the group header to something like "Group_Matched_within_Sex_SequencingCenter_Ancestry" (or whatever the corresponding input files contained in the header). But after writing that out, it starts to look like a very long header, so maybe it isn't the best way forward.
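To make the naming idea concrete, here is a minimal sketch of how such a header could be derived from the names of the matching factors. The class `GroupHeaderSketch` and the methods `groupValue` and `groupHeader` are hypothetical and not part of kd-match; the "_" separator and the "Group_Matched_within_" prefix are taken from the example above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of kd-match: illustrates the proposed header naming only.
public class GroupHeaderSketch {

  // Joins one sample's factor values with "_", e.g. ["2", "UMGC", "AFR"] -> "2_UMGC_AFR".
  static String groupValue(List<String> factorValues) {
    return String.join("_", factorValues);
  }

  // Builds a descriptive header from the factor column names.
  // Falls back to the current default, "group", when no factor names are given.
  static String groupHeader(List<String> factorNames) {
    if (factorNames.isEmpty()) {
      return "group";
    }
    return "Group_Matched_within_" + String.join("_", factorNames);
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("Sex", "SequencingCenter", "Ancestry");
    // prints "Group_Matched_within_Sex_SequencingCenter_Ancestry"
    System.out.println(groupHeader(names));
    // prints "2_UMGC_AFR"
    System.out.println(groupValue(Arrays.asList("2", "UMGC", "AFR")));
  }
}
```

The header grows by one token per matching factor, which is the length concern raised above; keeping "group" as the fallback would leave existing output unchanged when no factor names are available.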
The updated header would potentially be passed to the methods writing the output files here
    KDMatch.writeToFile(naiveMatches.stream(), outputBaseFileName,
        setConvert.stream().toArray(String[]::new),
        setConvert.stream().toArray(String[]::new), initialNumSelect);
or here
    KDMatch.writeToFile(optimizedMatches.stream(), outputOptFileName,
        setConvert.stream().toArray(String[]::new),
        setConvert.stream().toArray(String[]::new), finalNumSelect);