Add pLDDT analysis script for FoldX5 vs FoldX5.1 classification agree… #70
angelikivliora wants to merge 3 commits into
| except Exception as e: | ||
| print(f" could not process {protein_name}: {e}") | ||
| continue |
Here it would make more sense to exit with an error; otherwise you silently skip over cases that might have problems.
Updated to raise a RuntimeError for unexpected failures, while still skipping (with a warning) proteins that fail because of a missing column. I checked all proteins in the FoldX5.1 dataset and found 7 cases where an expected column is missing from the CSV: 6 proteins (CNOT3, DEFB108B, FUS, GPC3, MLF1, ZNF738) are missing the pLDDT column, and 1 protein (STKLD1) is missing the ddG column. I treat these as expected failures and skip them, while anything else raises an error.
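For readers following the thread, the skip-or-fail behaviour described above could look roughly like the sketch below. It is not the code from this PR: the column names `pLDDT` and `ddG` as CSV headers, the `MissingColumnError` class, and the `read_protein_rows` / `process_all` helpers are all assumptions made for illustration, and it uses only the standard library.

```python
import csv
import io
import warnings

# Assumed header names; the real CSV columns in the dataset may differ.
REQUIRED_COLUMNS = ("pLDDT", "ddG")


class MissingColumnError(Exception):
    """An expected column is absent from a protein's CSV (known, skippable case)."""


def read_protein_rows(protein_name, csv_text):
    """Parse one protein's CSV text and check that the required columns exist."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise MissingColumnError(f"{protein_name}: missing column(s) {missing}")
    return [
        {"pLDDT": float(row["pLDDT"]), "ddG": float(row["ddG"])} for row in reader
    ]


def process_all(tables):
    """Return (parsed rows per protein, names of skipped proteins)."""
    parsed, skipped = {}, []
    for protein_name, csv_text in tables.items():
        try:
            parsed[protein_name] = read_protein_rows(protein_name, csv_text)
        except MissingColumnError as exc:
            # Expected failure: warn and move on to the next protein.
            warnings.warn(f"skipping {exc}")
            skipped.append(protein_name)
        except Exception as exc:
            # Anything else is unexpected: stop instead of silently continuing.
            raise RuntimeError(f"could not process {protein_name}") from exc
    return parsed, skipped
```

Keeping the expected failure as its own exception type means the broad `except Exception` branch only ever sees genuinely unexpected problems, which addresses the concern about silently skipping problematic cases.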
| plot_plddt_distribution(df_mutations, args.output_dir) | ||
| plot_plddt_scatter(df_mutations, args.output_dir) | ||
| print("\n[3] per-protein summary...") | ||
| plot_per_protein_delta(df_mutations, args.output_dir) |
More of a note for the future:
if a function is only called once, it usually doesn't need to be a function unless it has a more general use, e.g. being imported elsewhere.
Overusing function encapsulation can complicate things and make the script more opaque.
I adapted the code following this comment.
| # Matching is done by UniProt ID + protein name | ||
| # RMSD is computed only on overlapping residues between the two structures | ||
| # Run: python rmsd_matrix.py --foldx5_dir /path/to/folder --foldx51_dir /path/to/folder --output_dir ./rmsd_results | ||
| # Run: python rmsd_matrix.py -f /data/user/shared_projects/mavisp_ensemble_sim_length/foldx5.1_evaluation/foldx5_initial_structures -i /data/user/shared_projects/mavisp_ensemble_sim_length/foldx5.1_evaluation/data_collection_foldx5.1 -o ./rmsd_results |
This should be in a README file, not in the code.
I removed it, and also removed the scatter, violin, and bar plots as discussed previously.
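As a side note on the header comment quoted in this thread, which says RMSD is computed only on residues that overlap between the two structures, that idea can be sketched as follows. This is an illustrative assumption rather than the logic in `rmsd_matrix.py`: the function name, the dict-of-coordinates input, and the choice of one atom per residue (e.g. C-alpha) are hypothetical, and the sketch assumes the structures are already superimposed, so it does no fitting.

```python
import math


def rmsd_on_shared_residues(coords_a, coords_b):
    """RMSD over residue numbers present in both structures.

    coords_a, coords_b: dict mapping residue number -> (x, y, z),
    one representative atom per residue (assumed, e.g. C-alpha).
    Assumes the structures are already superimposed; no fitting is done.
    Returns (rmsd, number of shared residues).
    """
    shared = sorted(coords_a.keys() & coords_b.keys())
    if not shared:
        raise ValueError("no overlapping residues between the two structures")
    squared = sum(math.dist(coords_a[r], coords_b[r]) ** 2 for r in shared)
    return math.sqrt(squared / len(shared)), len(shared)
```

Returning the number of shared residues alongside the RMSD makes it easier to spot pairs where the value rests on very few positions.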
… functions in plddt script
No description provided.