Quadtrix.cpp is a local GPT-style language model project with multiple runtime paths:
- Native C++ inference and training through `Quadtrix.exe` / `main.cpp`
- PyTorch checkpoint inference through `engine/inference.py` and `engine/best_model .pt`
- FastAPI middleware in `backend/`
- React + TypeScript chat UI in `frontend/`
The web interface can chat with both model backends:
- C++: calls the C++ HTTP server on port 8080
- .pt: loads the PyTorch checkpoint directly from `engine/best_model .pt`
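How the middleware chooses between the two paths can be sketched as follows (illustrative names only, not taken from `backend/main.py`, which may be organized differently):

```python
from typing import Callable, Dict

# Illustrative sketch of per-request backend dispatch; the function names
# and structure are assumptions, not the actual backend/main.py code.
def generate_via_cpp(prompt: str) -> str:
    # Would forward the prompt to the C++ HTTP server on port 8080.
    return f"[cpp] {prompt}"

def generate_via_torch(prompt: str) -> str:
    # Would run the checkpoint from engine/best_model .pt in-process.
    return f"[torch] {prompt}"

DISPATCH: Dict[str, Callable[[str], str]] = {
    "cpp": generate_via_cpp,
    "torch": generate_via_torch,
}

def generate(prompt: str, model_backend: str) -> str:
    """Route a prompt to the backend named in the request."""
    if model_backend not in DISPATCH:
        raise ValueError(f"unknown model_backend: {model_backend!r}")
    return DISPATCH[model_backend](prompt)
```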
```
Quadtrix.cpp/
  Quadtrix.exe
  main.cpp
  config/
  include/
  data/
  engine/
    inference.py
    main.py
    fine-tune/main.py
    best_model .pt
    fineweb_30mb.txt
  backend/
    main.py
    inference.py
    requirements.txt
  frontend/
    package.json
    src/
```
- Python 3.10+
- Node.js 18+
- npm
- A C++17 compiler (only needed to rebuild the C++ executable)
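A quick way to confirm the Python requirement (a sketch; run it with the interpreter you plan to use for the venv):

```python
import sys
from typing import Optional, Tuple

REQUIRED = (3, 10)  # minimum version listed above

def meets_requirement(version: Optional[Tuple[int, int]] = None) -> bool:
    """True when the (major, minor) version satisfies the 3.10+ requirement."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version) >= REQUIRED

print(meets_requirement())
```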
From the repo root:
```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
python -m venv .venv
.\.venv\Scripts\python.exe -m pip install --upgrade pip
```

Install backend and PyTorch inference dependencies:

```powershell
cd backend
..\.venv\Scripts\python.exe -m pip install -r requirements.txt
```

Install and build the frontend:

```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp\frontend
npm.cmd install
npm.cmd run build
```

Run the frontend:

```powershell
npm.cmd run dev
```

Frontend URL:
http://localhost:5173
The frontend is configured as an installable PWA. It includes:
- `frontend/manifest.webmanifest`
- `frontend/sw.js`
- `frontend/public/manifest.webmanifest`
- `frontend/public/sw.js`
- service worker registration in `frontend/src/registerServiceWorker.ts`
For the clean installable version, build and preview the frontend:
```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp\frontend
npm.cmd run build
npm.cmd run preview
```

Open the preview URL, usually:
http://localhost:4173
Then install from the browser:
- Chrome / Edge: click the install icon in the address bar
- Or open browser menu -> Apps -> Install this site as an app
The installed app still talks to the backend at:
http://localhost:3001
So keep the FastAPI backend running when chatting.
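One way to check that the backend is actually up before chatting (a standard-library sketch; assumes the default port 3001 from this README):

```python
import urllib.error
import urllib.request

def backend_up(base_url: str = "http://localhost:3001",
               timeout: float = 2.0) -> bool:
    """Return True if GET /api/health answers at all, False if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server answered, even if with an error status
    except OSError:
        return False  # connection refused, timeout, DNS failure, ...
```

If this returns False, start the FastAPI backend before opening the installed app.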
The .pt model does not need a separate model server. The FastAPI backend loads it directly from:
engine/best_model .pt
Start the backend:
```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp\backend
..\.venv\Scripts\python.exe -m uvicorn main:app --host 127.0.0.1 --port 3001
```

Start the frontend in another terminal:

```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp\frontend
npm.cmd run dev
```

Open:
http://localhost:5173
Select .pt in the top bar.
Start the C++ inference server:
```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
.\Quadtrix.exe data\input.txt --server --port 8080
```

Start the backend:

```powershell
cd backend
..\.venv\Scripts\python.exe -m uvicorn main:app --host 127.0.0.1 --port 3001
```

Start the frontend:

```powershell
cd ..\frontend
npm.cmd run dev
```

Open:
http://localhost:5173
Select C++ in the top bar.
Use three terminals.
Terminal 1:
```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
.\Quadtrix.exe data\input.txt --server --port 8080
```

Terminal 2:

```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp\backend
..\.venv\Scripts\python.exe -m uvicorn main:app --host 127.0.0.1 --port 3001
```

Terminal 3:

```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp\frontend
npm.cmd run dev
```

Open:
http://localhost:5173
Switch between C++ and .pt from the model selector.
Base URL:
http://localhost:3001
Routes:
```
GET /api/health
GET /api/stats
POST /api/chat
GET /api/sessions
POST /api/sessions
DELETE /api/sessions/{id}
GET /api/sessions/{id}/messages
POST /api/feedback
```
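The chat route can also be exercised from Python with only the standard library (a sketch; the field names mirror the request bodies shown below, and the backend from this README must be listening on port 3001 for `post_chat` to succeed):

```python
import json
import urllib.request

API_BASE = "http://localhost:3001"  # FastAPI backend from this README

def build_chat_payload(prompt: str, model_backend: str = "torch",
                       max_tokens: int = 100, temperature: float = 1.0) -> dict:
    """Build the JSON body expected by POST /api/chat."""
    if model_backend not in ("torch", "cpp"):
        raise ValueError("model_backend must be 'torch' or 'cpp'")
    return {
        "session_id": None,  # null lets the backend open a new session
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": False,
        "model_backend": model_backend,
    }

def post_chat(prompt: str, model_backend: str = "torch") -> dict:
    """POST /api/chat and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/api/chat",
        data=json.dumps(build_chat_payload(prompt, model_backend)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```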
Example .pt chat request:
```powershell
Invoke-RestMethod `
  -Uri http://localhost:3001/api/chat `
  -Method Post `
  -ContentType "application/json" `
  -Body '{
    "session_id": null,
    "prompt": "Once upon a time",
    "max_tokens": 100,
    "temperature": 1.0,
    "stream": false,
    "model_backend": "torch"
  }'
```

Example C++ chat request:

```powershell
Invoke-RestMethod `
  -Uri http://localhost:3001/api/chat `
  -Method Post `
  -ContentType "application/json" `
  -Body '{
    "session_id": null,
    "prompt": "Once upon a time",
    "max_tokens": 100,
    "temperature": 1.0,
    "stream": false,
    "model_backend": "cpp"
  }'
```

Backend defaults are in backend/.env.example:
```
API_PORT=3001
CORS_ORIGINS=http://localhost:5173
REDIS_URL=
LOG_LEVEL=INFO
MAX_SESSIONS=1000
SESSION_TTL_HOURS=24
CPP_SERVER_URL=http://localhost:8080
TORCH_CHECKPOINT_PATH=../engine/best_model .pt
REQUEST_TIMEOUT_SECONDS=60
```
Create backend/.env if you want overrides.
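The override pattern can be sketched like this (an illustration of .env-style defaults only, not the backend's actual settings loader, which may use pydantic or python-dotenv instead):

```python
import os

# Defaults copied from backend/.env.example; real overrides come from
# backend/.env or the process environment.
DEFAULTS = {
    "API_PORT": "3001",
    "CORS_ORIGINS": "http://localhost:5173",
    "CPP_SERVER_URL": "http://localhost:8080",
    "TORCH_CHECKPOINT_PATH": "../engine/best_model .pt",
    "REQUEST_TIMEOUT_SECONDS": "60",
}

def setting(name: str) -> str:
    """Environment variable if set, otherwise the .env.example default."""
    return os.environ.get(name, DEFAULTS[name])
```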
Frontend defaults are in frontend/.env.example:
```
VITE_API_BASE_URL=http://localhost:3001
```
Interactive chat:
```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
.\.venv\Scripts\python.exe engine\inference.py --checkpoint "engine\best_model .pt"
```

Generate once:

```powershell
.\.venv\Scripts\python.exe engine\inference.py --checkpoint "engine\best_model .pt" --prompt "Hello" --max-new-tokens 100 --temperature 1.0
```

Main training:

```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
.\.venv\Scripts\python.exe engine\main.py
```

Fine-tuning:

```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
.\.venv\Scripts\python.exe engine\fine-tune\main.py
```

Build manually:

```powershell
g++ -std=c++17 -O2 -I. -Iinclude -o Quadtrix.exe main.cpp
```

Train from scratch:

```powershell
.\Quadtrix.exe data\input.txt
```

Terminal chat:

```powershell
.\Quadtrix.exe data\input.txt --chat
```

Raw generation:

```powershell
.\Quadtrix.exe data\input.txt --generate
```

HTTP server:

```powershell
.\Quadtrix.exe data\input.txt --server --port 8080
```

Backend:

```powershell
Invoke-RestMethod http://localhost:3001/api/health
```

C++ server:

```powershell
Invoke-RestMethod http://localhost:8080/health
```

Frontend:
http://localhost:5173
When only .pt is available, backend health should show:
```json
{
  "status": "degraded",
  "api": "ok",
  "cpp_server": "unreachable",
  "torch_model": "ok"
}
```

When both are available, backend health should show:

```json
{
  "status": "ok",
  "api": "ok",
  "cpp_server": "ok",
  "torch_model": "ok"
}
```

Use npm.cmd:
```powershell
npm.cmd run dev
npm.cmd run build
```

Check that this file exists:
engine/best_model .pt
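A quick way to verify the checkpoint from Python (a sketch; note that this README consistently writes the filename with a space before `.pt`):

```python
from pathlib import Path

def checkpoint_status(path: str) -> str:
    """Report whether the checkpoint file exists and how large it is."""
    p = Path(path)
    if not p.is_file():
        return f"missing: {p}"
    size_mb = p.stat().st_size / 1_000_000
    return f"ok: {p} ({size_mb:.1f} MB)"

print(checkpoint_status("engine/best_model .pt"))
```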
Then check Python dependencies:
```powershell
cd backend
..\.venv\Scripts\python.exe -c "import torch, tiktoken; print(torch.__version__)"
```

Install dependencies into the repo venv:

```powershell
cd backend
..\.venv\Scripts\python.exe -m pip install -r requirements.txt
```

Start the C++ server:

```powershell
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
.\Quadtrix.exe data\input.txt --server --port 8080
```

Check:

http://localhost:3001/api/health

Make sure frontend config points to:

```
VITE_API_BASE_URL=http://localhost:3001
```

Check which process holds each port:

```powershell
Get-NetTCPConnection -LocalPort 3001
Get-NetTCPConnection -LocalPort 5173
Get-NetTCPConnection -LocalPort 8080
```

Full restart:

```powershell
# Terminal 1
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp
.\Quadtrix.exe data\input.txt --server --port 8080

# Terminal 2
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp\backend
..\.venv\Scripts\python.exe -m uvicorn main:app --host 127.0.0.1 --port 3001

# Terminal 3
cd C:\Users\Admin\Documents\GitHub\Quadtrix.cpp\frontend
npm.cmd run dev
```

Open:
http://localhost:5173
MIT