Does not work with UNIQUEKEY on longitudinal data #22

For longitudinal data, the dbGaP Submission Guide indicates the following:

> A second subject phenotypes DS may include all the variables that change per event or time for a person. For example, when a dataset has a single SUBJECT_ID listed multiple times due to measures collected at different events, this would be considered a longitudinal dataset. To make a row unique, unique (composite) keys should have scientific significance and aid in searching for covariate data. Unique keys should not be marked for every single variable in the dataset. Going back to the example, in the corresponding DD, mark an "X" under the [UNIQUEKEY](https://www.ncbi.nlm.nih.gov/gap/docs/submissionguide/#uniqkey) column for the variables SUBJECT_ID + EVENT. This means that for each subject at some particular event, there are some set of relevant data collected.

However, when gaptools is run on a longitudinal file with two X's in the UNIQUEKEY column, as directed, it incorrectly reports a duplicated-ID error:

ERROR: E0102_Duplicated_Id (n=71) 
DESCRIPTION: IDs are duplicated. Each person should only have a single subject ID; each sample ID should be represented in a single row. Remove repeating IDs.
 Example(s): <ul>
 <li>SUBJECT_ID | Rows</li>
 <li>SUBJ060 | 2,8,120</li>
 <li>SUBJ054 | 3,7</li>
 <li>SUBJ102 | 4,6,98</li>
 <li>SUBJ002 | 5,9</li>
 <li>SUBJ052 | 10,14,138</li>
 <li>SUBJ088 | 11,17,141</li>
 <li>SUBJ092 | 12,15,99</li>
 <li>SUBJ074 | 13,16,205</li>
 <li>SUBJ037 | 18,20</li>
 <li>SUBJ038 | 19,21</li></ul>
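
To illustrate the behavior I would expect, here is a minimal sketch in Python of a duplicate check keyed on the composite (SUBJECT_ID, EVENT) rather than on SUBJECT_ID alone. This is not gaptools code: the function name `find_duplicate_keys`, the sample rows, and the EVENT values are all hypothetical, and I am assuming that row 1 of the data file is the header.

```python
from collections import defaultdict


def find_duplicate_keys(rows, key_columns):
    """Return {key_tuple: [row_numbers]} for keys that occur more than once.

    Hypothetical helper for illustration only; not part of gaptools.
    """
    seen = defaultdict(list)
    # Row numbers start at 2, assuming row 1 of the file is the header.
    for row_number, row in enumerate(rows, start=2):
        key = tuple(row[col] for col in key_columns)
        seen[key].append(row_number)
    return {key: nums for key, nums in seen.items() if len(nums) > 1}


# Made-up longitudinal rows: one subject measured at two events.
rows = [
    {"SUBJECT_ID": "SUBJ060", "EVENT": "baseline"},
    {"SUBJECT_ID": "SUBJ054", "EVENT": "baseline"},
    {"SUBJECT_ID": "SUBJ060", "EVENT": "year1"},
]

# Keyed on SUBJECT_ID alone, the repeated subject is flagged.
by_subject = find_duplicate_keys(rows, ["SUBJECT_ID"])

# Keyed on the variables marked in UNIQUEKEY, every row is unique.
by_composite = find_duplicate_keys(rows, ["SUBJECT_ID", "EVENT"])

print(by_subject)
print(by_composite)
```

With the UNIQUEKEY variables used as the key, a repeated SUBJECT_ID would only be flagged if the same subject appeared twice for the same EVENT.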

Thank you,
Dan Weeks
