reFlow

A Metal Soul In My Hand — A feature-decoupled Transformer architecture with native interpretability.

reFlow factorizes the embedding matrix $E \in \mathbb{R}^{V \times d}$ into a Recipe Matrix $W_{recipe} \in \mathbb{R}^{V \times S}$ and a Signal Basis Matrix $W_{basis} \in \mathbb{R}^{S \times d}$, forcing the model to maintain a set of continuous, low-redundancy signal bases in latent space. The same factored product $W_{recipe} \times W_{basis}$ serves as both the input embedding and the output projection, forming an end-to-end signal-manifold computation loop without a separate LM head.
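To make the weight tying concrete, here is a minimal NumPy sketch of the factorization with toy sizes. The class name `FactoredEmbedding`, the initialization scale, and the dimensions are illustrative assumptions; this is not the code in models/reflow.py.

```python
import numpy as np

class FactoredEmbedding:
    """Toy tied embedding: E = W_recipe @ W_basis (illustrative only)."""

    def __init__(self, vocab_size, n_signals, d_model, seed=0):
        rng = np.random.default_rng(seed)
        # W_recipe: one row of signal weights per token, shape (V, S).
        self.w_recipe = rng.normal(0.0, 0.02, size=(vocab_size, n_signals))
        # W_basis: one d-dimensional vector per signal, shape (S, d).
        self.w_basis = rng.normal(0.0, 0.02, size=(n_signals, d_model))

    def embed(self, token_ids):
        # Input side: look up recipe rows, then mix the signal bases.
        return self.w_recipe[token_ids] @ self.w_basis  # (T, d)

    def logits(self, hidden):
        # Output side: project hidden states onto the same factored product,
        # so no separate LM head is needed.
        return (hidden @ self.w_basis.T) @ self.w_recipe.T  # (T, V)


emb = FactoredEmbedding(vocab_size=50, n_signals=8, d_model=16)
x = emb.embed(np.array([3, 7, 7]))
print(x.shape)              # (3, 16)
print(emb.logits(x).shape)  # (3, 50)
```

Computing the logits in two steps exposes an intermediate vector with one entry per signal, rather than going straight from hidden state to vocabulary.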

Key Results

Convergence. At matched depth and scale (36 layers, ~515M parameters), reFlow-1-Big reaches a validation loss within ~1% of GPT-2-New (514M). Across three scale points, reFlow-1-Small (46.47M), reFlow-1 (463.67M), and reFlow-1-Big (515.06M), validation loss decreases monotonically with parameter count (3.55 → 3.01 → 2.92), in line with scaling-law behavior.

Emergent Interpretable Structure (pure language modeling objective, no auxiliary loss):

Recipe-space semantic algebra: king + woman − man → queen (rank #1), 3/3 tests passed
Natural sparsity: each token activates ~11% of signals (mean 117/1024), Gini coefficient 0.085
Causal traceability: single-signal ablation collapses target probability from 8.31% to 0.03%
Information crystallization boundary: semantic interventions are effective at L0–L12 but inert beyond L18
Hard sparsity (Top-64) systematically destroys recipe-space semantic structure (algebra 3/3 → 0/3, silhouette +0.11 → −0.02)
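For reference, a ReLU + Top-K operation of the kind named in the last item can be sketched as follows. This is a generic illustration on a toy vector; the function name `relu_topk` and where the operation sits inside the model are assumptions, not taken from models/reflow-topk.py.

```python
import numpy as np

def relu_topk(recipe, k):
    """Apply ReLU, then keep the k largest entries per row and zero the rest."""
    activated = np.maximum(recipe, 0.0)
    if k >= activated.shape[-1]:
        return activated
    # Indices of the k largest entries in each row (order within the k is arbitrary).
    top_idx = np.argpartition(activated, -k, axis=-1)[..., -k:]
    top_val = np.take_along_axis(activated, top_idx, axis=-1)
    sparse = np.zeros_like(activated)
    np.put_along_axis(sparse, top_idx, top_val, axis=-1)
    return sparse


row = np.array([[0.5, -1.0, 2.0, 0.1, 1.5, -0.3]])
print(relu_topk(row, k=2))  # keeps 2.0 and 1.5, zeros everything else
```

With k=64 over 1024 signals, at most 6.25% of the entries in a row can be nonzero, compared with the ~11% reported above for the unconstrained model.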

Paper: English (PDF) | Chinese (PDF). Covers the theoretical derivation, 12 interpretability experiments, and scaling/ablation analysis.

Pretrained Weights: HuggingFace

Project Structure

reFlow/
├── train.py              # Training script (single GPU / DDP)
├── sample.py             # Text generation from trained models
├── experiment.py         # 12-experiment interpretability suite (Chinese)
├── experiment_en.py      # 12-experiment interpretability suite (English)
├── check.py              # Checkpoint parameter inspector
├── bench.py              # Performance benchmarking
├── models/
│   ├── gpt2.py           # Standard GPT-2 baseline
│   ├── gpt2-new.py       # Modernized GPT-2 (RoPE + SwiGLU + RMSNorm)
│   ├── reflow.py         # reFlow base architecture
│   ├── reflow-topk.py    # reFlow with ReLU + Top-K hard sparsity
│   └── reflow-lite.py    # reFlow with GQA + reduced MLP
├── config/               # Training / sampling / eval configurations
├── data/
│   ├── openwebtext/      # OpenWebText dataset preparation
│   └── sft-lima/         # LIMA SFT dataset preparation
└── out/                  # Checkpoints and experiment reports

Installation

Prerequisites

Python 3.10+
CUDA-compatible GPU (tested on 4× Tesla T4)

1. PyTorch (CUDA 12.8)

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128

Adjust the CUDA version in the URL to match your driver. See PyTorch Get Started.

2. Core Dependencies

pip install datasets tiktoken wandb tqdm

3. Experiment Suite Dependencies

The interpretability experiments (experiment.py and experiment_en.py) require additional packages:

pip install numpy matplotlib seaborn scikit-learn scipy adjustText

Quick Install (All-in-One)

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install datasets tiktoken wandb tqdm numpy matplotlib seaborn scikit-learn scipy adjustText

Data Preparation

OpenWebText

python data/openwebtext/prepare.py

This downloads the OpenWebText corpus (~54 GB) and tokenizes it with the GPT-2 BPE tokenizer. Output: data/openwebtext/train.bin (~17 GB, ~9B tokens) and val.bin.
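The reported sizes work out to roughly two bytes per token (9B tokens × 2 bytes ≈ 18 GB), which matches nanoGPT's preparation scripts: they store token IDs as a flat `uint16` array. Assuming this repository keeps that format (not confirmed here), a batch can be drawn from the file without loading it into memory. The function name `get_batch` and its parameters are illustrative.

```python
import os
import tempfile

import numpy as np

def get_batch(path, batch_size, block_size, seed=None):
    """Sample (input, target) windows from a flat uint16 token file."""
    # memmap reads only the slices that are indexed, not the whole file.
    data = np.memmap(path, dtype=np.uint16, mode="r")
    rng = np.random.default_rng(seed)
    starts = rng.integers(0, len(data) - block_size, size=batch_size)
    x = np.stack([data[i : i + block_size] for i in starts]).astype(np.int64)
    # Targets are the inputs shifted one position to the right.
    y = np.stack([data[i + 1 : i + 1 + block_size] for i in starts]).astype(np.int64)
    return x, y


# Demo on a toy file of 1000 consecutive token IDs.
toy_path = os.path.join(tempfile.mkdtemp(), "toy.bin")
np.arange(1000, dtype=np.uint16).tofile(toy_path)
x, y = get_batch(toy_path, batch_size=4, block_size=8, seed=0)
print(x.shape, y.shape)  # (4, 8) (4, 8)
```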

Training

All configurations live in config/. The scripts do not accept command-line overrides, so every hyperparameter must be set in the config file itself.
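Because each config is a Python file passed to the script as a positional argument, one plausible way for a script to consume it is to execute the file and overlay its top-level names on a set of defaults. The sketch below does this with `runpy.run_path` from the standard library. It is an assumption about the mechanism, not a description of train.py, and the keys shown (`n_layer`, `learning_rate`, `batch_size`) are hypothetical.

```python
import os
import runpy
import tempfile

def load_config(path, defaults):
    """Run a Python config file and overlay its values on the defaults."""
    values = runpy.run_path(path)
    config = dict(defaults)
    for key, value in values.items():
        if key.startswith("_"):
            continue  # skip module-level entries such as __name__
        if key not in defaults:
            raise KeyError(f"unknown config key: {key}")
        config[key] = value
    return config


# Demo with a throwaway config file; the keys are hypothetical.
cfg_path = os.path.join(tempfile.mkdtemp(), "toy_config.py")
with open(cfg_path, "w") as f:
    f.write("n_layer = 6\nlearning_rate = 3e-4\n")

config = load_config(cfg_path, {"n_layer": 12, "learning_rate": 1e-3, "batch_size": 8})
print(config)  # {'n_layer': 6, 'learning_rate': 0.0003, 'batch_size': 8}
```

Rejecting unknown keys is a design choice for the sketch: with no command-line overrides, a misspelled name in the config file would otherwise be silently ignored.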

Single GPU

python train.py config/train_reflow_1.py

Multi-GPU (DDP)

torchrun --standalone --nproc_per_node=4 train.py config/train_reflow_1.py
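torchrun starts one process per GPU and sets the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables in each. A script can use them to tell a plain `python` launch from a DDP launch. The sketch below follows the detection pattern used in nanoGPT; whether train.py here does exactly this is an assumption, and `detect_ddp` is an illustrative name.

```python
import os

def detect_ddp(env=None):
    """Return (is_ddp, rank, local_rank, world_size) from torchrun-style variables."""
    env = os.environ if env is None else env
    rank = int(env.get("RANK", -1))
    if rank == -1:
        # Launched with plain `python`: a single process.
        return False, 0, 0, 1
    return True, rank, int(env["LOCAL_RANK"]), int(env["WORLD_SIZE"])


print(detect_ddp({}))  # (False, 0, 0, 1)
print(detect_ddp({"RANK": "2", "LOCAL_RANK": "2", "WORLD_SIZE": "4"}))  # (True, 2, 2, 4)
```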

Available Training Configs

Config	Architecture	Layers	Params	Notes
`train_gpt2.py`	GPT-2	36	505.62M	Standard baseline
`train_gpt2_new.py`	GPT-2-New	36	514.01M	+ RoPE, SwiGLU, RMSNorm
`train_reflow_1.py`	reFlow	32	463.67M	Base reFlow, constant lr
`train_reflow_1_big.py`	reFlow	36	515.06M	lr decay, for interpretability
`train_reflow_1_topk_big.py`	reFlow-TopK	36	515.06M	+ ReLU + Top-64 sparsity
`train_reflow_1_lite.py`	reFlow-Lite	32	413.34M	+ GQA, reduced MLP
`train_reflow_1_small.py`	reFlow	6	46.47M	Small-scale validation

Resume Training

To resume training, use the config whose name has _resume appended (e.g., train_reflow_1_big_resume.py).

Text Generation

python sample.py config/sample_reflow_1.py

Edit the config file to change the prompt, temperature, top-k, etc.
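As a reminder of what the two sampling knobs do, the sketch below applies temperature scaling and top-k filtering to a toy logit vector before sampling. It is a generic, standard-library illustration; the function name `sample_top_k` is an assumption, and sample.py may implement this differently.

```python
import math
import random

def sample_top_k(logits, temperature=1.0, top_k=None, rng=random):
    """Sample an index from logits after temperature scaling and top-k filtering."""
    scaled = [x / temperature for x in logits]
    if top_k is not None and top_k < len(scaled):
        # Keep logits at or above the k-th largest value; mask the rest out.
        cutoff = sorted(scaled, reverse=True)[top_k - 1]
        scaled = [x if x >= cutoff else -math.inf for x in scaled]
    # Softmax weights, shifted by the max for numerical stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(scaled)), weights=weights, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)
draws = [sample_top_k(logits, temperature=0.8, top_k=2, rng=rng) for _ in range(1000)]
print(sorted(set(draws)))  # [0, 1]: only the two highest-scoring tokens can be drawn
```

Lower temperatures sharpen the distribution toward the highest logit, and smaller top-k values cut off the tail of unlikely tokens entirely.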

Interpretability Experiments

The experiment suite runs 12 analyses on a trained reFlow model. Both Chinese and English versions are available:

python experiment_en.py config/train_reflow_1_big.py   # English
python experiment.py config/train_reflow_1_big.py      # Chinese

An interactive menu will appear:

#	Experiment	Group
1	Recipe Atlas — recipe-space nearest neighbors	A. Signal Identity
2	Sparsity Profile — activation sparsity analysis	A. Signal Identity
3	Basis Geometry — singular value & effective rank	A. Signal Identity
4	Semantic Galaxy — PCA clustering visualization	B. Semantic Properties
5	Semantic Algebra — vector arithmetic (king − man + woman = queen)	B. Semantic Properties
6	Typo Resilience — robustness to spelling errors	B. Semantic Properties
7	Layer Evolution — per-layer probability crystallization	C. Mechanistic Analysis
8	Signal Flow — signal activation heatmaps across layers	C. Mechanistic Analysis
9	Causal Ablation — progressive signal knockout curves	C. Mechanistic Analysis
10	Emotion Surgery — sentiment steering via signal injection	D. Control & Steering
11	Concept Inception — binary-search concept implantation	D. Control & Steering
12	Genetic Hijack — global recipe matrix manipulation	D. Control & Steering

Enter all to run all experiments, or specific numbers (e.g., 1 3 5). Reports are saved to out/<model>/audit_reports/.
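The arithmetic in experiment 5 can be pictured as a nearest-neighbor search: form the query vector king − man + woman from the tokens' rows, then rank the remaining tokens by cosine similarity. The sketch below uses a hand-made toy matrix; the vectors, vocabulary, and function name `analogy` are illustrative assumptions and come neither from the trained model nor from experiment_en.py.

```python
import numpy as np

def analogy(vectors, vocab, a, b, c, top_n=1):
    """Rank tokens by cosine similarity to vec(a) - vec(b) + vec(c)."""
    idx = {tok: i for i, tok in enumerate(vocab)}
    query = vectors[idx[a]] - vectors[idx[b]] + vectors[idx[c]]
    # Cosine similarity between the query and every row.
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
    sims = vectors @ query / np.maximum(norms, 1e-12)
    # Exclude the three input tokens, as is conventional for analogy tests.
    for tok in (a, b, c):
        sims[idx[tok]] = -np.inf
    order = np.argsort(-sims)[:top_n]
    return [vocab[i] for i in order]


vocab = ["king", "queen", "man", "woman", "apple"]
# Toy 3-d rows: axis 0 ~ royalty, axis 1 ~ gender, axis 2 ~ fruit.
vectors = np.array([
    [1.0,  1.0, 0.0],   # king
    [1.0, -1.0, 0.0],   # queen
    [0.0,  1.0, 0.0],   # man
    [0.0, -1.0, 0.0],   # woman
    [0.0,  0.0, 1.0],   # apple
])
print(analogy(vectors, vocab, "king", "man", "woman"))  # ['queen']
```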

Checkpoint Inspection

python check.py config/train_reflow_1.py out/reflow-1/ckpt.pt

License

MIT License. Based on nanoGPT by Andrej Karpathy.
