LongPack packs the right context for long-horizon tasks and calls gpt-oss-inference via Groq. It includes a packing policy (recency > pinned > salience > semantic similarity > dedup), local memory (SQLite + FAISS-compatible vector store), JSON contracts (Zod), and a minimal API/CLI/playground.
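To make the packing policy concrete, here is a minimal sketch of budgeted selection. It assumes the policy is a weighted score whose weights follow the stated order (recency, then pinned, then salience, then semantic similarity), followed by dedup. The `Candidate` shape, the weight values, and the function names are hypothetical and not taken from the repo.

```typescript
// Hypothetical candidate shape; the real contracts live in src/schemas.ts.
interface Candidate {
  id: string;
  text: string;
  recency: number;    // 0..1, where 1 is the newest item (assumed normalization)
  pinned: boolean;
  salience: number;   // 0..1 (assumed)
  similarity: number; // 0..1 similarity to the query (assumed)
}

interface Weights {
  recency: number;
  pinned: number;
  salience: number;
  semantic: number;
}

// Illustrative weights only, ordered recency > pinned > salience > semantic.
const DEFAULT_WEIGHTS: Weights = { recency: 0.4, pinned: 0.3, salience: 0.2, semantic: 0.1 };

// Rough token estimate: about 4 characters per token.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

function score(c: Candidate, w: Weights): number {
  return (
    w.recency * c.recency +
    w.pinned * (c.pinned ? 1 : 0) +
    w.salience * c.salience +
    w.semantic * c.similarity
  );
}

// Rank by score, drop duplicate texts, and add items while they fit the budget.
function pack(candidates: Candidate[], budget: number, w: Weights = DEFAULT_WEIGHTS) {
  const ranked = [...candidates].sort((a, b) => score(b, w) - score(a, w));
  const seen = new Set<string>();
  const selected: Candidate[] = [];
  let totalTokens = 0;
  for (const c of ranked) {
    const key = c.text.trim().toLowerCase();
    if (seen.has(key)) continue; // dedup on normalized text
    const cost = approxTokens(c.text);
    if (totalTokens + cost > budget) continue; // skip anything that would exceed the budget
    seen.add(key);
    selected.push(c);
    totalTokens += cost;
  }
  return { selected, totalTokens };
}
```

Because items that do not fit are skipped rather than truncated, `totalTokens` never exceeds `budget` in this sketch.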
- Install deps
    npm i
- Configure env
    cp .env.example .env
    # edit .env and set GROQ_API_KEY
    # optionally set MODEL_NAME=openai/gpt-oss-20b (overrides MODEL_PROFILE)
- Run the server
    npm run dev
- Try health and playground
    curl http://localhost:8787/health
    open http://localhost:8787/
- Seed demo data (optional but recommended)
    npm run seed-demo

- POST /pack: Apply packing policy to messages + memory and return selected context.
- POST /answer: Produce an answer using packed context (uses Groq gpt-oss-inference).
- POST /memory/update: Upsert memory items.
- POST /memory/retrieve: Retrieve memory items.
- POST /policy/update: Update policy weights.
See src/schemas.ts for exact Zod contracts.
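As an illustration of calling the API, the sketch below builds a request for POST /pack. The body fields (`messages`, `budget`) are assumptions made for the example; the authoritative shapes are the Zod contracts in src/schemas.ts. `buildPackRequest` and `HttpRequest` are hypothetical helper names.

```typescript
// Hypothetical request body; check src/schemas.ts for the real contract.
interface PackRequestBody {
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  budget: number; // assumed name for the token budget
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) a JSON request for the /pack route.
function buildPackRequest(baseUrl: string, body: PackRequestBody): HttpRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/pack`, // tolerate a trailing slash on the base URL
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}
```

With the dev server running, the result can be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.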
- Default: openai/gpt-oss-20b via Groq (OpenAI-compatible baseURL).
- Override: set MODEL_NAME in .env (takes precedence over MODEL_PROFILE).
- Profiles: use MODEL_PROFILE=dev|demo as a hint; when unspecified, still defaults to gpt-oss.
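The precedence rules above could be expressed roughly as follows. This is a sketch, not the code in src/config.ts; `resolveModel` and the `Env` type are hypothetical names, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
type Env = Record<string, string | undefined>;

const DEFAULT_MODEL = "openai/gpt-oss-20b";

// MODEL_NAME wins when set; otherwise every profile falls back to the default model.
function resolveModel(env: Env): { model: string; profile: "dev" | "demo" | undefined } {
  const rawProfile = env["MODEL_PROFILE"];
  const profile = rawProfile === "dev" || rawProfile === "demo" ? rawProfile : undefined;
  const explicit = (env["MODEL_NAME"] ?? "").trim();
  return { model: explicit !== "" ? explicit : DEFAULT_MODEL, profile };
}
```

In a Node process this would typically be called as `resolveModel(process.env)`.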
Configure via .env:
- GROQ_BASE_URL=https://api.groq.com/openai/v1
- MODEL_PROFILE=dev for iteration, or MODEL_PROFILE=demo for compliance demo.
- MODEL_NAME=openai/gpt-oss-20b to force gpt-oss regardless of profile.
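Since the base URL is OpenAI-compatible, a chat completion is a POST to `<GROQ_BASE_URL>/chat/completions` with a bearer token. The sketch below only builds that request; it is not the repo's src/services/groqClient.ts, and `buildChatRequest` is a hypothetical helper.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build an OpenAI-style chat completion request against the configured base URL.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatHttpRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // value of GROQ_API_KEY
    },
    body: JSON.stringify({ model, messages }),
  };
}
```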
Artifacts emitted by /answer route (and CLI) include: pack.json, prompt.json, answer.json, validation.json.
In demo mode (MODEL_PROFILE=demo), these artifacts are also written to disk under data/artifacts/<timestamp>/.
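One way to picture the on-disk layout is to map the four artifacts to file paths under a timestamped directory. This is an illustrative sketch: `planArtifacts` is a hypothetical helper, and the timestamp format is whatever the server actually uses, which this README does not specify.

```typescript
interface ArtifactFile {
  path: string;
  contents: string;
}

const ARTIFACT_NAMES = ["pack", "prompt", "answer", "validation"] as const;
type ArtifactName = (typeof ARTIFACT_NAMES)[number];

// Map each artifact to data/artifacts/<timestamp>/<name>.json with pretty-printed JSON.
function planArtifacts(
  timestamp: string,
  artifacts: Record<ArtifactName, unknown>,
  baseDir: string = "data/artifacts",
): ArtifactFile[] {
  return ARTIFACT_NAMES.map((name) => ({
    path: `${baseDir}/${timestamp}/${name}.json`,
    contents: JSON.stringify(artifacts[name], null, 2),
  }));
}
```

Each entry can then be written with Node's `fs.mkdirSync(dir, { recursive: true })` and `fs.writeFileSync(path, contents)`.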
- Metadata: SQLite (data/longpack.db).
- Vector index: JSON/FAISS-compatible fallback at data/index.json. If FAISS is available, it will be used; otherwise a portable cosine index is used.
- Embeddings: Xenova/all-MiniLM-L6-v2 via @xenova/transformers (no external calls).
- Optional reranker: BAAI/bge-reranker-base via @xenova/transformers.
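For intuition, a portable cosine index can be as simple as a brute-force scan over stored vectors. The sketch below is an assumption about what such a fallback might look like, not the contents of src/services/vectorStore.ts; `IndexedVector`, `cosine`, and `topK` are hypothetical names.

```typescript
interface IndexedVector {
  id: string;
  vector: number[];
}

// Cosine similarity; returns 0 when either vector has zero norm.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force search: score every entry, sort descending, keep the best k.
function topK(query: number[], index: IndexedVector[], k: number): { id: string; score: number }[] {
  return index
    .map((entry) => ({ id: entry.id, score: cosine(query, entry.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

A linear scan like this is fine for small local memories; a FAISS-backed index matters once the number of vectors grows large.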
- Uses gpt-oss-inference via Groq with OpenAI-compatible client.
- Demo uses gpt-oss-20b exclusively. This is documented and enforced by MODEL_PROFILE=demo.
- License: MIT.
- npm run eval: synthetic evaluation harness (see scripts/synth-eval.ts).
- npm run seed-demo: load data/seeds.json into memory (capabilities, shipping date, glossary).
- npm run demo: run server with MODEL_PROFILE=demo so artifacts are saved to disk.
- npm run ingest-docs -- docs: bulk-ingest local Markdown/Text into memory (chunked) from the docs/ directory.
src/
  index.ts
  config.ts
  schemas.ts
  routes/
    pack.ts
    answer.ts
    memory.ts
    policy.ts
  services/
    groqClient.ts
    embeddings.ts
    reranker.ts
    vectorStore.ts
    sqliteStore.ts
    tokenizer.ts
    packer.ts
public/
  index.html
  main.js
bin/
  longpack.js
scripts/
  synth-eval.ts
  seed-demo.ts
data/
  .gitkeep
  seeds.json
  artifacts/
- First run will download small ONNX models for embeddings/reranker; they are cached locally.
- Tokenization is an approximation (chars/4); adjust to your target model later if needed.
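The chars/4 approximation mentioned above amounts to a one-line estimate. This sketch assumes rounding up; the repo's src/services/tokenizer.ts may round differently, and `estimateTokens` is a hypothetical name.

```typescript
// Approximate token count: about 4 characters per token, rounded up (rounding is an assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```

Swapping in a real tokenizer for the target model later only requires replacing this one function.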
- Edit data/seeds.json and run npm run seed-demo to preload persistent memories used in the demo (capabilities, milestones, etc.).
- You can also POST directly to /memory/update.
- For broader knowledge, run npm run ingest-docs -- docs to chunk and ingest files under docs/.
- Run: npm run eval
- Output: writes a JSON report to data/eval/report-<timestamp>.json with basic metrics:
  - Needle recall
  - Long-dialog retention
  - Faithfulness (refusal when out-of-context)
  - Budget audit (totalTokens <= budget)
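Two of these metrics are simple enough to sketch directly. The definitions below are assumptions for illustration (needle recall as the fraction of expected item ids present in the packed selection); scripts/synth-eval.ts may compute them differently, and both function names are hypothetical.

```typescript
// Fraction of expected ("needle") ids that appear in the packed selection.
function needleRecall(expectedIds: string[], selectedIds: string[]): number {
  if (expectedIds.length === 0) return 1; // nothing to find counts as full recall
  const selected = new Set(selectedIds);
  const found = expectedIds.filter((id) => selected.has(id)).length;
  return found / expectedIds.length;
}

// Budget audit: the packed context must not exceed the token budget.
function budgetAudit(totalTokens: number, budget: number): boolean {
  return totalTokens <= budget;
}
```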
- cp .env.example .env and set GROQ_API_KEY. Optionally set MODEL_NAME=openai/gpt-oss-20b.
- npm i && npm run demo (demo mode saves artifacts to disk).
- npm run seed-demo to load capabilities + shipping date.
- Ask: "What is our shipping plan" → Answer comes from memory; check data/artifacts/<ts>/.
- Ask: "How can you help me?" → Capabilities memory is used.
- Ask: unrelated question → Model refuses (faithfulness). See artifacts for exact context.
Use this section to fill the Devpost form quickly.
- Project title: LongPack — Context & Memory Kernel for gpt-oss
- Short description: A transparent, auditable context+memory kernel that packs the most relevant information into a fixed token budget and calls gpt-oss via Groq with structured outputs and artifacts.
- What it does: LongPack selects messages and persistent memories under a hard token budget using a clear policy (recency > pinned > salience > semantic > dedup + optional BM25 and reranker), then answers strictly from packed context. It validates structured outputs and emits artifacts (pack, prompt, validation, answer) for audit.
- How we built it: TypeScript Fastify + Zod; SQLite + local vector index (Xenova embeddings); optional reranker; Groq OpenAI-compatible client to openai/gpt-oss-20b. JSON Schema (Ajv) + Zod repair loop. Scripts for seeding, ingestion, and evaluation.
- Challenges: Long-horizon recall and budget discipline. Solved with a deterministic packing policy, semantic+lexical fusion, and strict prompting.
- Accomplishments: Deterministic packing, structured outputs with repair, demo-mode artifact saving, evaluation harness, and local-first retrieval.
- What’s next: Optional local-inference mode (Ollama/vLLM), auto-memory extraction, BM25+Reranker tuning.
- Category choice: Best Overall (+ Wildcard). We showcase a reusable kernel for any long-horizon LLM workflow, built for gpt-oss.
Submission checklist:
- Text description: Use the bullets above.
- Video (≤3 min): Show seeding → packing → answering → artifacts → evaluation report.
- Public repo: Link this repo; README includes run instructions and compliance.
- Test access: Provide local run steps (npm i; cp .env.example .env; npm run demo) and example curl commands.
- Evidence of gpt-oss use: /health shows openai/gpt-oss-20b, and README’s Model Usage section explains the setup.
Artifacts for submission:
- Run npm run demo and then trigger one /answer call (playground/CLI) to write data/artifacts/<timestamp>/.
- Optionally: npm run gen-artifacts to trigger a demo call automatically.
Testing instructions (for Devpost form):
git clone <repo-url>
cd longpack
npm i
cp .env.example .env
# add GROQ_API_KEY; optional MODEL_NAME=openai/gpt-oss-20b
npm run demo
# open http://localhost:8787
npm run seed-demo
# Ask: "What is our shipping plan" then inspect data/artifacts/<timestamp>/

Licensed under the MIT License. See LICENSE for details.
Built for the OpenAI DevDay Hackathon — gpt-oss track.