goldilocks-models

ML and LLM training, tracking, and evaluation for the UKRI Goldilocks DFT-recommendation system.

This package consumes ML-ready Parquet from goldilocks-data and ships versioned per-task model artefacts plus manifests to goldilocks-core, which orchestrates recommendation at inference time.
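As a rough illustration of what a versioned per-task manifest could carry, here is a minimal sketch. The class `ArtefactManifest`, its field names, and the helper functions are all hypothetical and are not taken from PLAN.md or the package itself. The idea shown is only that a manifest pairs a task name and version with a checksum the consumer can verify.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ArtefactManifest:
    """Hypothetical manifest; field names are illustrative, not the project's schema."""

    task: str             # e.g. "kpoints"
    model_version: str    # version string of the trained artefact
    artefact_sha256: str  # checksum so the consumer can verify the file it receives
    training_data: str    # identifier of the Parquet dataset used for training


def build_manifest(task, model_version, artefact_bytes, training_data):
    """Hash the serialized artefact and bundle it with its metadata."""
    digest = hashlib.sha256(artefact_bytes).hexdigest()
    return ArtefactManifest(task, model_version, digest, training_data)


def to_json(manifest):
    """Serialize with sorted keys so the output is stable across runs."""
    return json.dumps(asdict(manifest), sort_keys=True, indent=2)


# Placeholder bytes stand in for a real serialized model.
manifest = build_manifest("kpoints", "0.1.0", b"fake model weights", "example-dataset-id")
print(to_json(manifest))
```

A consumer such as goldilocks-core could, under the same assumptions, recompute the SHA-256 of the artefact it receives and compare it with `artefact_sha256` before loading the model.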

Documentation

PLAN.md — full design source of truth (contract, architecture, decisions, roadmap).
docs/ — derived chapters, added on demand.

Phase 1 scope

DFT code: Quantum ESPRESSO (pw.x)
Calculation type: SCF only
Structures: Materials Cloud MC3D PBEsol v2
Pseudopotentials: PseudoDojo NC + PAW-JTH (15 active families)
Active task: kpoints (k-mesh recommendation as kindex regression)
Other tasks (ecutwfc / smearing / pseudo / xc / resources / explanation) are placeholders awaiting upstream data sweeps.

Quick start

uv sync                         # core deps only
uv sync --extra nn              # plus PyTorch + Lightning
uv sync --extra gnn             # plus PyTorch Geometric
uv sync --extra llm             # plus HuggingFace transformers + PEFT
uv sync --all-extras            # everything

Repository layout

src/goldilocks_models/
├── data/        # Parquet IO, feature engineering, splits, dataset wrappers
├── tasks/       # 7 prediction problems — what to predict
├── models/      # 4 algorithm families — how to predict
├── training/    # train loops, callbacks, losses
├── evaluation/  # metrics and slice reports
├── tracking/    # MLflow adapter
├── registry/    # versioned artefacts + manifests for handoff to goldilocks-core
└── cli/         # gm train / eval / predict / register
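
The split between tasks/ (what to predict) and models/ (how to predict) can be pictured with the sketch below. It is an assumption-laden illustration, not the package's API: `Task`, `Model`, `MeanBaseline`, `train`, and the `kindex` and `n_atoms` column names are all hypothetical. It shows only that a task can name a target while any model with `fit` and `predict` can be plugged in.

```python
from dataclasses import dataclass
from typing import Protocol


class Model(Protocol):
    """Hypothetical interface any algorithm family would satisfy."""

    def fit(self, X, y) -> None: ...
    def predict(self, X) -> list: ...


@dataclass(frozen=True)
class Task:
    """Hypothetical task description: a name plus the column to predict."""

    name: str
    target_column: str


class MeanBaseline:
    """Trivial regressor that always predicts the mean of the training targets."""

    def __init__(self):
        self._mean = 0.0

    def fit(self, X, y):
        self._mean = sum(y) / len(y)

    def predict(self, X):
        return [self._mean for _ in X]


def train(task, model, rows):
    """Split each row into features and target according to the task, then fit."""
    X = [{k: v for k, v in row.items() if k != task.target_column} for row in rows]
    y = [row[task.target_column] for row in rows]
    model.fit(X, y)
    return model


# Toy rows standing in for Parquet records; values are made up for illustration.
rows = [{"n_atoms": 2, "kindex": 2.0}, {"n_atoms": 8, "kindex": 4.0}]
kpoints = Task(name="kpoints", target_column="kindex")
fitted = train(kpoints, MeanBaseline(), rows)
```

Under this layout, swapping in a different algorithm family changes only the object passed to `train`. The task definition stays the same.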

Sibling repositories

| Repo | Role | Relationship |
| --- | --- | --- |
| `goldilocks-data` | DFT sweeps + Parquet datasets | input |
| `goldilocks-models` (here) | ML / LLM training | - |
| `goldilocks-core` | Recommendation + parsing + LLM explanation | output (models + manifests) |
| `goldilocks-webapp` | Frontend | indirect (via core) |

UKRI Goldilocks grant EP/Z530657/1.
