AItom consists of four core layers:
- LLM-Based Structured Extraction
- Ontology-Aligned Knowledge Graph Construction
- Graph Retrieval-Augmented Generation (Graph RAG)
- Transformer-MLP Architecture for Safety Check
The system enables an end-to-end pipeline:
Raw Literature (PMID: 35614129)
↓
Ontology Design
↓
LLM Extraction
↓
Ontology Mapping
↓
Graph Database
↓
Graph Retrieval
↓
LLM Generation (Graph RAG) + Safety Check (Transformer-MLP)
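
The last two stages of the pipeline (graph retrieval, then LLM generation) can be sketched roughly as follows. This is a minimal illustration and not AItom's actual implementation: the in-memory triple list stands in for the graph database, `retrieve_subgraph` and `build_prompt` are hypothetical helper names, and the node identifiers are placeholders rather than records from the dataset. The LLM call itself is left out.

```python
from collections import deque

# Toy triples (subject, relation, object). Placeholder identifiers, not real data.
TRIPLES = [
    ("material_1", "hasSynthesisMethod", "method_1"),
    ("method_1", "consistOfStep", "step_1"),
    ("step_1", "usesPrecursor", "precursor_A"),
    ("step_1", "usesSolvent", "solvent_B"),
    ("step_1", "nextStep", "step_2"),
    ("step_2", "producesProduct", "product_C"),
]

def retrieve_subgraph(triples, start, max_hops=2):
    """Breadth-first collection of outgoing triples within max_hops of start."""
    seen, found = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand nodes at the hop limit
        for s, r, o in triples:
            if s == node:
                found.append((s, r, o))
                if o not in seen:
                    seen.add(o)
                    queue.append((o, depth + 1))
    return found

def build_prompt(question, facts):
    """Serialize retrieved triples as plain-text context ahead of the question."""
    lines = [f"{s} {r} {o}" for s, r, o in facts]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"

facts = retrieve_subgraph(TRIPLES, "material_1", max_hops=4)
prompt = build_prompt("How is material_1 synthesized?", facts)
```

The resulting `prompt` string would then be passed to whichever LLM handles generation. Limiting `max_hops` is one simple way to keep the context small.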
- "Dataset of solution-based inorganic materials synthesis procedures extracted from the scientific literature"
- PMID: 35614129
- Protege software
Node
ChemicalEntity
InorganicMaterial
Precursor
Solvent
Media
Abrasive
Product
Additive
Process
SynthesisMethod
SynthesisStep
ConditionSet
Condition
Edge
usesPrecursor (SynthesisStep → Precursor)
usesSolvent (SynthesisStep → Solvent)
producesProduct (SynthesisStep → Product)
usesAdditive (SynthesisStep → Additive)
usesMedia (SynthesisStep → Media)
usesAbrasive (SynthesisStep → Abrasive)
hasSynthesisMethod (InorganicMaterial → SynthesisMethod)
performedUnder (SynthesisStep → Condition)
nextStep (SynthesisStep → SynthesisStep)
consistOfStep (SynthesisMethod → SynthesisStep)
hasName (ChemicalEntity → xsd:string)
hasAcronym (InorganicMaterial → xsd:string)
hasPhase (InorganicMaterial → xsd:string)
isOxygenDeficiency (InorganicMaterial → xsd:float)
hasReaction (InorganicMaterial → xsd:string)
hasID (SynthesisMethod → xsd:integer)
hasTemperature (Condition → xsd:string)
hasTime (Condition → xsd:string)
haspH (Condition → xsd:string)
hasPressure (Condition → xsd:string)
hasAction (SynthesisStep → xsd:string)
hasNote (SynthesisStep → xsd:string)
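
One way to use the schema above during ontology mapping is to check each extracted relation and attribute against its declared domain and range before writing it to the graph. The sketch below is an assumption about how such a check could look, not the project's code: `validate_edge` and `validate_attribute` are hypothetical names, Python types stand in for the XSD datatypes, and classes are matched exactly (no subclass reasoning, since the class hierarchy is not spelled out here).

```python
# Object properties: relation -> (domain class, range class), copied from the edge list.
OBJECT_PROPERTIES = {
    "usesPrecursor": ("SynthesisStep", "Precursor"),
    "usesSolvent": ("SynthesisStep", "Solvent"),
    "producesProduct": ("SynthesisStep", "Product"),
    "usesAdditive": ("SynthesisStep", "Additive"),
    "usesMedia": ("SynthesisStep", "Media"),
    "usesAbrasive": ("SynthesisStep", "Abrasive"),
    "hasSynthesisMethod": ("InorganicMaterial", "SynthesisMethod"),
    "performedUnder": ("SynthesisStep", "Condition"),
    "nextStep": ("SynthesisStep", "SynthesisStep"),
    "consistOfStep": ("SynthesisMethod", "SynthesisStep"),
}

# Data properties: property -> (domain class, Python stand-in for the XSD type).
DATA_PROPERTIES = {
    "hasName": ("ChemicalEntity", str),
    "hasAcronym": ("InorganicMaterial", str),
    "hasPhase": ("InorganicMaterial", str),
    "isOxygenDeficiency": ("InorganicMaterial", float),
    "hasReaction": ("InorganicMaterial", str),
    "hasID": ("SynthesisMethod", int),
    "hasTemperature": ("Condition", str),
    "hasTime": ("Condition", str),
    "haspH": ("Condition", str),
    "hasPressure": ("Condition", str),
    "hasAction": ("SynthesisStep", str),
    "hasNote": ("SynthesisStep", str),
}

def validate_edge(relation, subject_class, object_class):
    """True if the relation exists and both endpoint classes match the schema."""
    return OBJECT_PROPERTIES.get(relation) == (subject_class, object_class)

def validate_attribute(prop, subject_class, value):
    """True if the property exists, the domain matches, and the value has the expected type."""
    spec = DATA_PROPERTIES.get(prop)
    if spec is None:
        return False
    domain, py_type = spec
    return subject_class == domain and isinstance(value, py_type)
```

For example, `validate_edge("usesPrecursor", "SynthesisStep", "Solvent")` returns `False`, which flags an extraction that attached a solvent through the wrong relation.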
- Transformer + MLP Architecture
- Transformer: CrabNet
Select the top 12 properties (using LightGBM)
↓
Load 12 CrabNet checkpoints
↓
Concatenate the 12 embedding vectors into a single embedding vector
↓
MLP Design
↓
Safe / Unsafe Prediction
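
The concatenate-then-classify step above can be illustrated with a small, dependency-free sketch. Everything numeric here is an assumption made for illustration: the embeddings are random stand-ins for what the 12 CrabNet checkpoints would produce, the embedding size, hidden width, and 0.5 threshold are arbitrary, and the MLP weights are untrained. `fuse_embeddings` and `mlp_forward` are hypothetical names.

```python
import math
import random

N_MODELS = 12   # one embedding per selected property, as described above
EMBED_DIM = 4   # assumed embedding size, for illustration only
HIDDEN = 8      # assumed hidden-layer width

def fuse_embeddings(embeddings):
    """Concatenate the per-checkpoint vectors into one flat feature vector."""
    fused = []
    for vec in embeddings:
        fused.extend(vec)
    return fused

def mlp_forward(x, w1, b1, w2, b2):
    """One ReLU hidden layer followed by a sigmoid, read here as P(unsafe)."""
    hidden = [
        max(0.0, sum(xi * wi for xi, wi in zip(x, weights)) + bias)
        for weights, bias in zip(w1, b1)
    ]
    logit = sum(h * w for h, w in zip(hidden, w2)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)

# Random stand-ins for the 12 embedding vectors.
embeddings = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(N_MODELS)]
x = fuse_embeddings(embeddings)

# Untrained weights: w1 holds one weight vector per hidden unit.
w1 = [[rng.gauss(0, 0.1) for _ in range(len(x))] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [rng.gauss(0, 0.1) for _ in range(HIDDEN)]
b2 = 0.0

p_unsafe = mlp_forward(x, w1, b1, w2, b2)
label = "Unsafe" if p_unsafe >= 0.5 else "Safe"
```

In a trained system the weights would come from fitting the MLP on labeled safe/unsafe examples; the sketch only shows how 12 separate embeddings become one input vector of length `N_MODELS * EMBED_DIM`.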

