ik-labs/longpack

LongPack — Context & Memory Kernel for gpt-oss (MVP)

LongPack packs the right context for long-horizon tasks and calls gpt-oss-inference via Groq. It includes a packing policy (recency > pinned > salience > semantic similarity > dedup), local memory (SQLite + FAISS-compatible vector store), JSON contracts (Zod), and a minimal API/CLI/playground.
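As a rough illustration of the packing policy described above, the sketch below orders candidates by recency, then pinned status, then salience, then semantic similarity, drops exact duplicates, and greedily fills a token budget. The `Candidate` shape, the `approxTokens` helper, and the greedy fill strategy are assumptions for illustration only; the actual logic lives in src/services/packer.ts and may differ.

```typescript
// Hypothetical candidate shape; the real contracts are defined in src/schemas.ts.
interface Candidate {
  id: string;
  text: string;
  recency: number;    // higher means more recent (assumed scale)
  pinned: boolean;
  salience: number;   // assumed range 0..1
  similarity: number; // assumed cosine similarity to the query
}

// Rough token estimate (chars/4), the same approximation mentioned in Dev notes.
const approxTokens = (text: string): number => Math.ceil(text.length / 4);

// Order by the stated policy: recency, then pinned, then salience, then similarity.
function byPolicy(a: Candidate, b: Candidate): number {
  return (
    b.recency - a.recency ||
    Number(b.pinned) - Number(a.pinned) ||
    b.salience - a.salience ||
    b.similarity - a.similarity
  );
}

// Greedily select candidates in policy order without exceeding the budget.
function pack(
  candidates: Candidate[],
  budget: number
): { selected: Candidate[]; totalTokens: number } {
  const seen = new Set<string>();
  const selected: Candidate[] = [];
  let totalTokens = 0;
  for (const c of [...candidates].sort(byPolicy)) {
    const key = c.text.trim().toLowerCase();
    if (seen.has(key)) continue;               // dedup: drop case-insensitive repeats
    const cost = approxTokens(c.text);
    if (totalTokens + cost > budget) continue; // skip items that do not fit
    seen.add(key);
    selected.push(c);
    totalTokens += cost;
  }
  return { selected, totalTokens };
}
```

Skipping an oversized item rather than stopping at it lets smaller, lower-priority items still use the remaining budget; a stricter policy could stop at the first item that does not fit.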

Quickstart

  1. Install deps
npm i
  2. Configure env
cp .env.example .env
# edit .env and set GROQ_API_KEY
# optionally set MODEL_NAME=openai/gpt-oss-20b (overrides MODEL_PROFILE)
  3. Run the server
npm run dev
  4. Try health and playground
curl http://localhost:8787/health
open http://localhost:8787/
  5. Seed demo data (optional but recommended)
npm run seed-demo

API Endpoints

  • POST /pack — Apply packing policy to messages + memory and return selected context.
  • POST /answer — Produce an answer using packed context (uses Groq gpt-oss-inference).
  • POST /memory/update — Upsert memory items.
  • POST /memory/retrieve — Retrieve memory items.
  • POST /policy/update — Update policy weights.

See src/schemas.ts for exact Zod contracts.
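For orientation, here is a minimal sketch of how a client might build a POST /pack request. The body fields (`messages`, `budget`) and the `buildPackRequest` helper are hypothetical; the authoritative request shape is the Zod contract in src/schemas.ts.

```typescript
// Hypothetical request body; check src/schemas.ts for the real field names.
interface PackRequestBody {
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  budget: number; // token budget (assumed field name)
}

interface HttpRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the URL and fetch options for POST /pack without sending anything.
function buildPackRequest(baseUrl: string, body: PackRequestBody): HttpRequest {
  return {
    url: baseUrl.replace(/\/+$/, "") + "/pack", // tolerate a trailing slash
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage, with the dev server from the Quickstart running:
//   const { url, init } = buildPackRequest("http://localhost:8787", {
//     messages: [{ role: "user", content: "What is our shipping plan" }],
//     budget: 2000,
//   });
//   const res = await fetch(url, init);
//   console.log(await res.json());
```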

Model Usage (gpt-oss)

  • Default: openai/gpt-oss-20b via Groq (OpenAI-compatible baseURL).
  • Override: set MODEL_NAME in .env (takes precedence over MODEL_PROFILE).
  • Profiles: use MODEL_PROFILE=dev|demo as a hint; when it is unset, the model still defaults to openai/gpt-oss-20b.

Configure via .env:

  • GROQ_BASE_URL=https://api.groq.com/openai/v1
  • MODEL_PROFILE=dev for iteration, or MODEL_PROFILE=demo for compliance demo.
  • MODEL_NAME=openai/gpt-oss-20b to force gpt-oss regardless of profile.
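The precedence rules above can be sketched as a small resolver. This is an illustrative reading of the README, not the code in src/config.ts: the `resolveModel` name, the `Env` type, and the `demo` flag are assumptions.

```typescript
// Environment passed in explicitly so the function stays pure and testable.
type Env = Record<string, string | undefined>;

const DEFAULT_MODEL = "openai/gpt-oss-20b";

// MODEL_NAME wins when set; otherwise every profile falls back to the default model.
function resolveModel(env: Env): { model: string; demo: boolean } {
  const name = env.MODEL_NAME?.trim();
  const model = name ? name : DEFAULT_MODEL;
  // Assumption: only MODEL_PROFILE=demo switches on demo behaviour.
  const demo = env.MODEL_PROFILE === "demo";
  return { model, demo };
}
```

In a Node process this would typically be called as `resolveModel(process.env)`.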

Artifacts emitted by /answer route (and CLI) include: pack.json, prompt.json, answer.json, validation.json.

In demo mode (MODEL_PROFILE=demo), these artifacts are also written to disk under data/artifacts/<timestamp>/.

Local-first storage

  • Metadata: SQLite (data/longpack.db).
  • Vector index: JSON/FAISS-compatible fallback at data/index.json. If FAISS is available, it will be used; otherwise a portable cosine index is used.
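To make the "portable cosine index" idea concrete, the sketch below does a brute-force cosine-similarity search over in-memory vectors. The `IndexEntry` shape and the function names are assumptions; the layout of data/index.json and the code in src/services/vectorStore.ts may look different.

```typescript
// Hypothetical index entry: an id plus its embedding vector.
interface IndexEntry {
  id: string;
  vector: number[];
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0; // missing components count as zero
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every entry against the query and return the k best matches.
function search(
  entries: IndexEntry[],
  query: number[],
  k: number
): { id: string; score: number }[] {
  return entries
    .map((e) => ({ id: e.id, score: cosine(e.vector, query) }))
    .sort((p, q) => q.score - p.score)
    .slice(0, k);
}
```

A linear scan like this is usually adequate for small local memories; an approximate index such as FAISS becomes worthwhile as the number of vectors grows.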

Embeddings & Reranker (OSS, CPU)

  • Embeddings: Xenova/all-MiniLM-L6-v2 via @xenova/transformers (no external calls).
  • Optional reranker: BAAI/bge-reranker-base via @xenova/transformers.

Hackathon compliance

  • Uses gpt-oss-inference via Groq with OpenAI-compatible client.
  • Demo uses gpt-oss-20b exclusively. This is documented and enforced by MODEL_PROFILE=demo.
  • License: MIT.

Scripts

  • npm run eval — synthetic evaluation harness (see scripts/synth-eval.ts).
  • npm run seed-demo — load data/seeds.json into memory (capabilities, shipping date, glossary).
  • npm run demo — run server with MODEL_PROFILE=demo so artifacts are saved to disk.
  • npm run ingest-docs -- docs — bulk-ingest local Markdown/Text into memory (chunked) from the docs/ directory.

Repo layout

src/
  index.ts
  config.ts
  schemas.ts
  routes/
    pack.ts
    answer.ts
    memory.ts
    policy.ts
  services/
    groqClient.ts
    embeddings.ts
    reranker.ts
    vectorStore.ts
    sqliteStore.ts
    tokenizer.ts
    packer.ts
public/
  index.html
  main.js
bin/
  longpack.js
scripts/
  synth-eval.ts
  seed-demo.ts
data/
  .gitkeep
  seeds.json
  artifacts/

Dev notes

  • First run will download small ONNX models for embeddings/reranker; they are cached locally.
  • Tokenization is an approximation (chars/4); adjust to your target model later if needed.
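The chars/4 approximation can be written in a couple of lines. This is a sketch of the heuristic as described, not necessarily the code in src/services/tokenizer.ts; in particular, rounding up is an assumption.

```typescript
// Approximate token count as characters divided by four, rounded up.
// Note: string length counts UTF-16 code units, so non-ASCII text is only roughly measured.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Check whether a set of texts fits a token budget under this estimate.
function fitsBudget(texts: string[], budget: number): boolean {
  const total = texts.reduce((sum, t) => sum + estimateTokens(t), 0);
  return total <= budget;
}
```

For tighter budgets, swapping `estimateTokens` for a tokenizer that matches the target model would give more reliable counts.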

Seeding

  • Edit data/seeds.json and run npm run seed-demo to preload persistent memories used in the demo (capabilities, milestones, etc.).
  • You can also POST directly to /memory/update.
  • For broader knowledge, run npm run ingest-docs -- docs to chunk and ingest files under docs/.

Evaluation

  • Run: npm run eval
  • Output: writes a JSON report to data/eval/report-<timestamp>.json with basic metrics:
    • Needle recall
    • Long-dialog retention
    • Faithfulness (refusal when out-of-context)
    • Budget audit (totalTokens <= budget)
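Two of the metrics above are simple enough to sketch directly. The `EvalCase` shape and the substring-based definition of needle recall are assumptions for illustration; scripts/synth-eval.ts defines the actual harness and report format.

```typescript
// Hypothetical per-case record produced by an evaluation run.
interface EvalCase {
  needle: string;        // fact that should survive packing
  packedTexts: string[]; // texts selected into the context
  totalTokens: number;
  budget: number;
}

// Needle recall: fraction of cases whose packed context contains the needle.
function needleRecall(cases: EvalCase[]): number {
  if (cases.length === 0) return 0;
  const found = cases.filter((c) =>
    c.packedTexts.some((t) => t.includes(c.needle))
  ).length;
  return found / cases.length;
}

// Budget audit: every case must satisfy totalTokens <= budget.
function budgetAudit(cases: EvalCase[]): boolean {
  return cases.every((c) => c.totalTokens <= c.budget);
}
```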

3-minute demo script

  1. cp .env.example .env and set GROQ_API_KEY. Optionally set MODEL_NAME=openai/gpt-oss-20b.
  2. npm i && npm run demo (demo mode saves artifacts to disk).
  3. npm run seed-demo to load capabilities + shipping date.
  4. Ask: "What is our shipping plan" → Answer comes from memory; check data/artifacts/<ts>/.
  5. Ask: "How can you help me?" → Capabilities memory is used.
  6. Ask an unrelated question → Model refuses (faithfulness). See artifacts for exact context.

Devpost Submission (copy-paste guidance)

Use this section to fill the Devpost form quickly.

  • Project title: LongPack — Context & Memory Kernel for gpt-oss
  • Short description: A transparent, auditable context+memory kernel that packs the most relevant information into a fixed token budget and calls gpt-oss via Groq with structured outputs and artifacts.
  • What it does: LongPack selects messages and persistent memories under a hard token budget using a clear policy (recency > pinned > salience > semantic > dedup + optional BM25 and reranker), then answers strictly from packed context. It validates structured outputs and emits artifacts (pack, prompt, validation, answer) for audit.
  • How we built it: TypeScript Fastify + Zod; SQLite + local vector index (Xenova embeddings); optional reranker; Groq OpenAI-compatible client to openai/gpt-oss-20b. JSON Schema (Ajv) + Zod repair loop. Scripts for seeding, ingestion, and evaluation.
  • Challenges: Long-horizon recall and budget discipline. Solved with a deterministic packing policy, semantic+lexical fusion, and strict prompting.
  • Accomplishments: Deterministic packing, structured outputs with repair, demo-mode artifact saving, evaluation harness, and local-first retrieval.
  • What’s next: Optional local-inference mode (Ollama/vLLM), auto-memory extraction, BM25+Reranker tuning.
  • Category choice: Best Overall (+ Wildcard). We showcase a reusable kernel for any long-horizon LLM workflow, built for gpt-oss.

Submission checklist:

  • Text description: Use the bullets above.
  • Video (≤3 min): Show seeding → packing → answering → artifacts → evaluation report.
  • Public repo: Link this repo; README includes run instructions and compliance.
  • Test access: Provide local run steps (npm i; cp .env.example .env; npm run demo) and example curl commands.
  • Evidence of gpt-oss use: /health shows openai/gpt-oss-20b, and README’s Model Usage section explains the setup.

Artifacts for submission:

  • Run npm run demo and then trigger one /answer call (playground/CLI) to write data/artifacts/<timestamp>/.
  • Optionally: npm run gen-artifacts to trigger a demo call automatically.

Testing instructions (for Devpost form):

git clone <repo-url>
cd longpack
npm i
cp .env.example .env
# add GROQ_API_KEY; optional MODEL_NAME=openai/gpt-oss-20b
npm run demo
# open http://localhost:8787
npm run seed-demo
# Ask: "What is our shipping plan" then inspect data/artifacts/<timestamp>/

License

Licensed under the MIT License. See LICENSE for details.

Credits

Built for the OpenAI DevDay Hackathon — gpt-oss track.
