NPCNet: Navigator-Driven Pseudo Text for Deep Clustering of Early Sepsis Phenotyping

Data Preparation

All CSV files should be placed in the same directory as main.py.

1. texts.csv

This file contains the pseudo texts (text), which preserve the chronological order of clinical examinations, along with the corresponding tokenized inputs (token and mask) required by NPCNet.
In the example below, the maximum number of examinations across all patients is four, so shorter sequences are padded with 0s until every sequence has the same length. The mask marks padded positions with 0 so that they are ignored during training.
Example:

subject_id| event|                                                                      text|                          token|                mask
     12345|     1| glucose glucose-6 FiO2 FiO2-9 lactate lactate-8 bicarbonate bicarbonate-9| 1 9 309 23 155 26 193 25 184 2| 1 1 1 1 1 1 1 1 1 1
     12457|     1|         lactate lactate-8 chloride chloride-4 glucose glucose-1 BUN BUN-4| 1 26 193 10 297 9 304 11 209 2| 1 1 1 1 1 1 1 1 1 1
     12457|     2|                                         hemoglobin hemoglobin-4 BUN BUN-9|      1 31 258 11 214 2 0 0 0 0| 1 1 1 1 1 1 0 0 0 0
     12548|     1|                                  glucose glucose-10 FiO2 FiO2-2 pO2 pO2-2|    1 9 313 23 149 22 139 2 0 0| 1 1 1 1 1 1 1 1 0 0
     12548|     2|                       bands bands-10 lymphocytes lymphocytes-10 WBC WBC-1|       1 4 56 17 88 18 89 2 0 0| 1 1 1 1 1 1 1 1 0 0
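The padding and mask columns above can be reproduced with a few lines of Python. This is a minimal sketch, not the repository's preprocessing code: the helper name `pad_with_mask` is hypothetical, and it assumes that 0 is the padding id and that padding is appended on the right, as the example rows suggest.

```python
from typing import List, Tuple

PAD_ID = 0  # assumption: 0 is the padding token id, as in the example table


def pad_with_mask(
    sequences: List[List[int]], pad_id: int = PAD_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every token sequence to the longest length and build 0/1 masks."""
    max_len = max(len(seq) for seq in sequences)
    tokens, masks = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        tokens.append(seq + [pad_id] * n_pad)        # real tokens, then padding
        masks.append([1] * len(seq) + [0] * n_pad)   # 1 = real token, 0 = padding
    return tokens, masks


# Token ids copied from the first three rows of the example (padding removed).
raw = [
    [1, 9, 309, 23, 155, 26, 193, 25, 184, 2],
    [1, 26, 193, 10, 297, 9, 304, 11, 209, 2],
    [1, 31, 258, 11, 214, 2],
]
tokens, masks = pad_with_mask(raw)
print(" ".join(map(str, tokens[2])))
print(" ".join(map(str, masks[2])))
```

Joining each list with spaces, as in the two print calls, gives the space-separated form used in the token and mask columns.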

2. single_points.csv

This file contains the static variables listed in static_var_cols (see config.py) and the target navigator (the target column).
Example:

subject_id| event| gender| anchor_age| comorbidity_1| ...| comorbidity_n| target
     12345|     1|      M|         68|             1| ...|             0|      1
     12457|     1|      M|         80|             0| ...|             0|      1
     12457|     2|      F|         34|             0| ...|             0|      0
     12548|     1|      M|         57|             1| ...|             1|      0
     12548|     2|      F|         63|             1| ...|             0|      0
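Both example files identify a row by subject_id and event, so it may be worth checking that the two files cover the same rows before training. The sketch below is an optional sanity check, not part of NPCNet: the function names are hypothetical, and it assumes comma-delimited files with subject_id and event header columns.

```python
import csv
import io
from typing import List, Set, Tuple

Key = Tuple[str, str]


def read_keys(handle) -> Set[Key]:
    """Collect the (subject_id, event) pair of every row in a CSV file."""
    return {(row["subject_id"], row["event"]) for row in csv.DictReader(handle)}


def key_mismatch(texts_handle, points_handle) -> Tuple[List[Key], List[Key]]:
    """Return the keys that appear in only one of the two files."""
    text_keys = read_keys(texts_handle)
    point_keys = read_keys(points_handle)
    return sorted(text_keys - point_keys), sorted(point_keys - text_keys)


# In-memory stand-ins for texts.csv and single_points.csv (columns trimmed).
texts = io.StringIO(
    "subject_id,event,text\n"
    "12345,1,glucose glucose-6\n"
    "12457,1,lactate lactate-8\n"
    "12457,2,BUN BUN-9\n"
)
points = io.StringIO(
    "subject_id,event,target\n"
    "12345,1,1\n"
    "12457,1,1\n"
)

only_in_texts, only_in_points = key_mismatch(texts, points)
print(only_in_texts)   # keys with pseudo text but no static variables
print(only_in_points)  # keys with static variables but no pseudo text
```

With real files, pass handles opened with `open("texts.csv", newline="")` and `open("single_points.csv", newline="")` instead of the in-memory strings.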

Getting Started

1. Clone the repository:

git clone https://github.com/DHLab-TSENG/NPCNet.git

2. Change directory to the project directory:

cd NPCNet

3. Install dependencies:

pip install -r requirements.txt

4. Modify config.py to match your dataset:

Update hyperparameters in args as needed, and adjust the static variables to match your dataset:

static_var_cols: names of the static variables in your single_points.csv
static_var_catnums: number of categories for each variable in static_var_cols

Example:

static_var_cols = ["gender", "comorbidity_1", ..., "comorbidity_n", "age"]
static_var_catnums = [2, 2, ..., 2, 10]
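Because the two lists are matched by position, a mismatch between them is easy to introduce. The following sketch shows one way to check them against the data. It assumes that each entry of static_var_catnums is the number of distinct values the corresponding column may take; `check_static_vars` is a hypothetical helper, not a function from the repository.

```python
from typing import Dict, List, Sequence


def check_static_vars(
    rows: Sequence[Dict[str, str]],
    cols: Sequence[str],
    catnums: Sequence[int],
) -> List[str]:
    """Return a message for every column with more observed categories than declared."""
    if len(cols) != len(catnums):
        raise ValueError("static_var_cols and static_var_catnums must have the same length")
    problems = []
    for col, declared in zip(cols, catnums):
        observed = {row[col] for row in rows}  # distinct values found in the data
        if len(observed) > declared:
            problems.append(f"{col}: {len(observed)} observed categories, {declared} declared")
    return problems


# gender and comorbidity_1 values copied from the single_points.csv example.
rows = [
    {"gender": "M", "comorbidity_1": "1"},
    {"gender": "M", "comorbidity_1": "0"},
    {"gender": "F", "comorbidity_1": "0"},
    {"gender": "M", "comorbidity_1": "1"},
    {"gender": "F", "comorbidity_1": "1"},
]
ok = check_static_vars(rows, ["gender", "comorbidity_1"], [2, 2])
bad = check_static_vars(rows, ["gender", "comorbidity_1"], [1, 2])
print(ok)
print(bad)
```

The check only flags columns with too many observed categories, since a small sample may legitimately contain fewer categories than the full dataset.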

5. Run the model:

After placing texts.csv and single_points.csv in the same directory as main.py, run the following command:

python main.py
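Before launching a run, it can help to confirm that both input files sit next to main.py. The snippet below is an optional pre-flight check under that assumption; `missing_inputs` is a hypothetical helper and is not provided by the repository. A temporary directory stands in for the project directory so the example is self-contained.

```python
import tempfile
from pathlib import Path
from typing import List

# The two input files described under Data Preparation.
REQUIRED_FILES = ("texts.csv", "single_points.csv")


def missing_inputs(project_dir: Path) -> List[str]:
    """Return the required input files that are absent from project_dir."""
    return [name for name in REQUIRED_FILES if not (project_dir / name).is_file()]


# Demonstration: a directory that contains only texts.csv.
with tempfile.TemporaryDirectory() as tmp:
    project_dir = Path(tmp)
    (project_dir / "texts.csv").write_text("subject_id,event,text,token,mask\n")
    missing = missing_inputs(project_dir)

print(missing)
```

In practice, call `missing_inputs(Path("."))` from the directory that contains main.py and fix any reported file before running the model.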

6. Get the output.csv:

This file contains the deep representation (embedding) of each patient event and its cluster assignment.
Example:

subject_id| event| embedding_1| ...| embedding_n| cluster
     12345|     1|       0.007| ...|      -0.158|       2
     12457|     1|       0.218| ...|      -0.232|       0
     12457|     2|       0.148| ...|      -0.205|       1
     12548|     1|       0.169| ...|      -0.222|       2
     12548|     2|       0.148| ...|      -0.208|       3
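A quick first look at the result is the number of rows assigned to each cluster. The sketch below assumes output.csv is comma-delimited with a cluster header column, as in the example; `cluster_sizes` is a hypothetical helper, and the sample rows reuse the example values with only the first embedding column kept.

```python
import csv
import io
from collections import Counter


def cluster_sizes(handle) -> Counter:
    """Count how many rows of an output file fall into each cluster."""
    return Counter(int(row["cluster"]) for row in csv.DictReader(handle))


# In-memory stand-in for output.csv, using the rows from the example.
sample = io.StringIO(
    "subject_id,event,embedding_1,cluster\n"
    "12345,1,0.007,2\n"
    "12457,1,0.218,0\n"
    "12457,2,0.148,1\n"
    "12548,1,0.169,2\n"
    "12548,2,0.148,3\n"
)
sizes = cluster_sizes(sample)
for cluster, count in sorted(sizes.items()):
    print(cluster, count)
```

For a real run, replace the in-memory sample with `open("output.csv", newline="")`.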
