NAs in feature matrix #2

When generating the feature matrix for RIVER, how do the developers handle cases where no variant near a particular gene has a CADD annotation for features such as TFBS or EncOCCombPVal? glmnet cannot handle NAs, but in my dataset 95% of genes have at least one missing feature annotation, so removing such cases would discard most of the data.

Example:
|                           | cHmmTx| cHmmTssBiv| cHmmHet| cHmmBivFlnk| cHmmTxFlnk| TFBS| EncOCCombPVal|
|:--------------------------|------:|----------:|-------:|-----------:|----------:|----:|-------------:|
|GTEX-111YS:ENSG00000007923 |  0.016|          0|       0|           0|      0.000|   NA|            NA|
|GTEX-117YW:ENSG00000007923 |  0.000|          0|       0|           0|      0.000|   NA|            NA|
|GTEX-1192X:ENSG00000007923 |  0.000|          0|       0|           0|      0.000|   NA|            NA|
|GTEX-11EM3:ENSG00000007923 |  0.000|          0|       0|           0|      0.008|   NA|            NA|
|GTEX-11EQ8:ENSG00000007923 |  0.000|          0|       0|           0|      0.000|   NA|            NA|
|GTEX-11EQ9:ENSG00000007923 |  0.016|          0|       0|           0|      0.000|   NA|            NA|
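
To make the problem concrete, here is a minimal Python sketch (standard library only) of the kind of workaround I am considering. It is not the RIVER developers' method, which is exactly what I am asking about. The helper names `fraction_incomplete` and `impute` are hypothetical, the rows are a subset of the table above with `None` standing in for NA, and the `fallback` of 0.0 for a column with no observed values is a placeholder assumption rather than a known CADD default.

```python
from statistics import median

# Subset of the rows and columns from the table above; None stands in for NA.
FEATURES = ["cHmmTx", "cHmmTxFlnk", "TFBS", "EncOCCombPVal"]
rows = {
    "GTEX-111YS:ENSG00000007923": [0.016, 0.000, None, None],
    "GTEX-117YW:ENSG00000007923": [0.000, 0.000, None, None],
    "GTEX-11EM3:ENSG00000007923": [0.000, 0.008, None, None],
}


def fraction_incomplete(rows):
    """Return the fraction of rows with at least one missing feature."""
    incomplete = sum(
        1 for values in rows.values() if any(v is None for v in values)
    )
    return incomplete / len(rows)


def impute(rows, fallback=0.0):
    """Fill each missing value with the median of its column.

    If a column has no observed values at all, use `fallback`
    (a placeholder assumption, not a documented default).
    The input mapping is left unmodified.
    """
    n_cols = len(next(iter(rows.values())))
    fills = []
    for j in range(n_cols):
        observed = [v[j] for v in rows.values() if v[j] is not None]
        fills.append(median(observed) if observed else fallback)
    return {
        key: [fills[j] if v is None else v for j, v in enumerate(values)]
        for key, values in rows.items()
    }


filled = impute(rows)
print(fraction_incomplete(rows))    # every toy row has a missing value
print(fraction_incomplete(filled))  # no missing values remain
```

Whether a per-column median, a fixed default, or something else entirely is appropriate for these annotations is the open question; the sketch only shows that the rows could be kept rather than dropped.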
