A skillset for teaching AI agents to generate valid InChI identifiers from MDL MOL files using pure algorithms — without external cheminformatics libraries.
This project collects attempts at building an InChI skillset for AI agents. There are currently three:
- mol-to-inchi-ver1/ — First attempt at the skill
- mol-to-inchi-ver2/ — Second attempt (refined approach)
- mol-to-inchi-ver3/ — Third attempt (skill-creator and Python code allowed; the self-improving skill did not work)
InChI-skill-set/
├── mol-to-inchi-ver1/ # First attempt
├── mol-to-inchi-ver2/ # Second attempt
├── mol-to-inchi-ver3/ # Third attempt
├── src/formula/ # Python code for the Hill formula
├── data/ # .mol and .inchi files
├── docs/ # Documentation about the InChI codebase
└── README.md
The data/ folder contains paired .mol molecule files and their corresponding .inchi reference files for testing and validation.
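The paired files can be used as a simple regression check. The sketch below is a minimal illustration, not part of the repository: it assumes each pair shares a file stem (for example a hypothetical `ethanol.mol` and `ethanol.inchi`) and that each .inchi file holds a single InChI string. The names `paired_files`, `check`, and `generate` are hypothetical.

```python
from pathlib import Path


def paired_files(data_dir):
    """Return (mol_path, inchi_path) tuples for files sharing a stem (assumed naming)."""
    data_dir = Path(data_dir)
    pairs = []
    for mol_path in sorted(data_dir.glob("*.mol")):
        inchi_path = mol_path.with_suffix(".inchi")
        if inchi_path.exists():
            pairs.append((mol_path, inchi_path))
    return pairs


def check(generate, data_dir="data"):
    """Run generate(mol_text) on every pair and return the names of mismatching .mol files.

    `generate` stands in for whatever MOL-to-InChI routine is under test.
    """
    failures = []
    for mol_path, inchi_path in paired_files(data_dir):
        expected = inchi_path.read_text().strip()
        actual = generate(mol_path.read_text())
        if actual != expected:
            failures.append(mol_path.name)
    return failures
```

Calling `check(my_generator)` from the repository root would then list every molecule whose generated identifier differs from its reference.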
The docs/ folder contains documentation about the InChI codebase.
The src/formula/ folder contains AI-generated Python code, written to test whether code can be generated that replicates the InChI C code for the Hill formula.
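For orientation, the Hill ordering itself is short to state: if carbon is present, list C first, H second, and the remaining elements alphabetically; otherwise list every element alphabetically, and omit a count of 1. The sketch below illustrates only that rule and is not the code in src/formula/ or the InChI C implementation. The function names are hypothetical, the parser assumes the V2000 fixed-column layout, and implicit hydrogens (which a MOL atom block usually does not list) are not added.

```python
from collections import Counter


def hill_formula(counts):
    """Return a Hill-order formula string from an element -> count mapping."""
    counts = {el: n for el, n in counts.items() if n > 0}
    if "C" in counts:
        # Carbon first, hydrogen second, remaining elements alphabetically.
        order = ["C"] + (["H"] if "H" in counts else [])
        order += sorted(el for el in counts if el not in ("C", "H"))
    else:
        # No carbon: every element, hydrogen included, alphabetically.
        order = sorted(counts)
    return "".join(el + (str(counts[el]) if counts[el] > 1 else "") for el in order)


def atom_symbols_v2000(mol_text):
    """Read element symbols from a V2000 atom block (assumed fixed-column layout)."""
    lines = mol_text.splitlines()
    n_atoms = int(lines[3][0:3])  # counts line: first three columns hold the atom count
    # Atom line: three 10-character coordinates, one space, then a 3-character symbol.
    return [line[31:34].strip() for line in lines[4:4 + n_atoms]]


def heavy_atom_formula(mol_text):
    """Hill formula of the explicitly listed atoms only; implicit H is NOT added."""
    return hill_formula(Counter(atom_symbols_v2000(mol_text)))
```

A complete formula layer would also need hydrogen counts derived from valence rules and charges, which this sketch deliberately leaves out.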
This skillset teaches generation without cheminformatics libraries. Do NOT use:
- OpenBabel
- Datamol
- RDKit
- CDK
- Any other Python chemistry library
- InChI binaries
Only implement the algorithms described in the skill documentation.
MIT