The code for training models and generating new samples using trained models is found in the Graph_Framework folder.
- `configs` is for configuration files for `main.py`, `generate.py`, and `process_dataset`.
- `datasets` is for Python files for each dataset.
- `diffusion` contains the implemented types of diffusion and noise schedules.
- `models` is for all the models used in the framework; each model must follow the structure specified below.
- `runs` contains the runs, with one folder for each name specified in the config file used.
- `utils` contains files with predefined helper functions that are used by the framework and are available to users.
The following setup requirements must be met to use the framework.
For an example, see the `models/modelname` folder.
- Create a folder with the name of the model in the `models` folder.
- The name of the folder must match the lowercase name of the model class, e.g. `models/transformer` for a class called `Transformer`.
- Create these files in the folder:
  - `model.py`: the file with the model class
  - `train.py`: the file with methods called during training
  - `sample.py`: the file with methods called during generation
- Add the following methods in the `train.py` file:
  - `loss_fn(x_0, con_diffusion, cat_diffusion, model) -> loss`
  - `val_fn(val_dataset, con_diffusion, cat_diffusion, model, decode_atom, decode_bond, epoch, config) -> None`
- Add the following methods in the `sample.py` file:
  - `sample_batch(n_samples, dataset, model, device) -> x_T`
  - `sample_reverse(con_diffusion, cat_diffusion, model, t, x_t) -> x_t-1`
  - `sample_mols(x_0, dataset) -> mols`
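
The required functions can be sketched as an interface skeleton. Only the function names and parameter lists below come from the requirements above. The docstrings reflect the usual diffusion reading of the names (`x_0` as clean data, `x_T` as the fully noised starting point) and are assumptions, as is the `generate` helper, which is a hypothetical illustration of how the three `sample.py` functions might be chained and is not taken from the framework.

```python
# --- train.py (skeleton) ------------------------------------------------------

def loss_fn(x_0, con_diffusion, cat_diffusion, model):
    """Return the training loss for the batch x_0 (body is model-specific)."""
    raise NotImplementedError("model-specific training loss goes here")


def val_fn(val_dataset, con_diffusion, cat_diffusion, model,
           decode_atom, decode_bond, epoch, config):
    """Run validation for the given epoch; returns None."""
    raise NotImplementedError("model-specific validation goes here")


# --- sample.py (skeleton) -----------------------------------------------------

def sample_batch(n_samples, dataset, model, device):
    """Return the starting batch x_T for n_samples samples."""
    raise NotImplementedError("model-specific initial sampling goes here")


def sample_reverse(con_diffusion, cat_diffusion, model, t, x_t):
    """Return x_{t-1}, the result of one reverse step applied to x_t."""
    raise NotImplementedError("model-specific reverse step goes here")


def sample_mols(x_0, dataset):
    """Convert the final samples x_0 into molecules."""
    raise NotImplementedError("model-specific decoding goes here")


# --- hypothetical driver (NOT part of the framework) --------------------------

def generate(sample, con_diffusion, cat_diffusion, model, dataset,
             n_samples, n_steps, device="cpu"):
    """Chain the three sample.py functions into one generation run.

    `sample` is any object exposing sample_batch, sample_reverse and
    sample_mols. `n_steps` and the loop bounds (t = n_steps, ..., 1) are
    assumptions made for this sketch.
    """
    x_t = sample.sample_batch(n_samples, dataset, model, device)  # x_T
    for t in range(n_steps, 0, -1):
        x_t = sample.sample_reverse(con_diffusion, cat_diffusion, model, t, x_t)
    return sample.sample_mols(x_t, dataset)  # x_0 -> mols
```

Passing the `sample` functions in as an object keeps the driver independent of any one model folder, which mirrors the one-folder-per-model layout described above.
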
- Create a file in the `datasets` folder that contains a class for a PyTorch dataloader.
- The name of the file must match the lowercase name of the class, e.g. `dataset.py` for the class `Dataset`.
- The dataset class must include two accessible dictionaries, `decode_atom` and `decode_bond`, that are used to go from an integer to the correct atom or bond.
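
A minimal sketch of a dataset class that meets these requirements is shown below. The class name `Toy` (which would live in a hypothetical `datasets/toy.py`), the molecule encoding, the `decode` helper, and the contents of the two dictionaries are all assumptions made for illustration. The sketch avoids importing `torch` so that it runs standalone; in the framework the class would presumably subclass `torch.utils.data.Dataset`.

```python
class Toy:
    """Hypothetical map-style dataset for a file named datasets/toy.py."""

    # Integer code -> atom symbol (illustrative values, not the framework's).
    decode_atom = {0: "C", 1: "N", 2: "O"}
    # Integer code -> bond label (illustrative values, not the framework's).
    decode_bond = {0: "none", 1: "single", 2: "double"}

    def __init__(self, molecules):
        # Assumed toy encoding: (list of atom codes, {(i, j): bond code}).
        self.molecules = list(molecules)

    def __len__(self):
        return len(self.molecules)

    def __getitem__(self, index):
        return self.molecules[index]


def decode(dataset, atoms, bonds):
    """Map integer codes back to atoms and bonds using the two dictionaries."""
    atom_symbols = [dataset.decode_atom[a] for a in atoms]
    bond_labels = {pair: dataset.decode_bond[b] for pair, b in bonds.items()}
    return atom_symbols, bond_labels


# One molecule with three heavy atoms (C, C, O) joined by two single bonds.
dataset = Toy([([0, 0, 2], {(0, 1): 1, (1, 2): 1})])
atoms, bonds = dataset[0]
atom_symbols, bond_labels = decode(dataset, atoms, bonds)
```

The class name lowercased (`toy`) matches the assumed file name, which is the naming rule stated above.
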
In the `configs` folder:
- See `template_train.yml` for a config file for training
- See `template_generate.yml` for a config file for generation