devpipe

LLM-powered text compression and spec generation from the command line.

Features

Surprisal-based compression — remove low-information tokens using a local quantized Llama model or via Groq API
Spec generation — produce detailed technical specifications via the Groq API
Generate + compress pipeline — generate a spec then compress the output in a single command

Prerequisites

Rust 1.75+ (rustup recommended)
macOS with Apple Silicon recommended (Metal GPU acceleration, no extra setup)
Internet access on first run (downloads ~700 MB GGUF model from HuggingFace, cached locally)
GROQ_API_KEY for the generate and generate-compress commands (free at console.groq.com)

Installation

From source

git clone https://github.com/Amir-Zecharia/devpipe
cd devpipe
cargo install --path .

From crates.io (planned)

cargo install devpipe

macOS 26+ note: If you hit a cstdint build error, set:
export CXXFLAGS="-isystem $(xcrun --show-sdk-path)/usr/include/c++/v1"

Usage

Compress text

# Compress a file, keeping 70% of tokens (default)
devpipe compress input.txt --stats

# Compress from stdin with a custom ratio
cat input.txt | devpipe compress --keep-ratio 0.5

# Auto-detect optimal ratio via elbow detection
devpipe compress input.txt --auto --stats

# Keep exactly 200 output tokens
devpipe compress input.txt --target-tokens 200

# Use Groq API for compression instead of local model
devpipe compress input.txt --groq --stats

Generate a technical spec

export GROQ_API_KEY=your_key_here

# Generate to stdout
devpipe generate "Payment processing microservice"

# Generate to file
devpipe generate "A real-time collaborative code editor" -o spec.md

# Output as JSON with metadata
devpipe generate "Auth system with OAuth2" --json

# Print generation stats
devpipe generate "Payment service" --stats

Generate then compress

# Generate a spec and compress the output
devpipe generate-compress "Payment processing microservice"

# With custom keep ratio and stats
devpipe generate-compress "Auth system" --keep-ratio 0.5 --stats

# Auto-detect optimal compression ratio
devpipe generate-compress "CLI tool" --auto

# Use Groq API for compression step
devpipe generate-compress "API gateway" --groq --stats

# Output to file as JSON
devpipe generate-compress "Auth system" -o spec.md --json

How It Works

Compression uses surprisal scoring: each token is assigned a score representing how unlikely it was given the preceding context. High-surprisal tokens carry novel information; low-surprisal tokens are predictable filler. devpipe compress runs a quantized Llama model locally via llama.cpp, scores every token, and discards the bottom fraction — keeping only the most informative ones. Alternatively, use --groq to compress via the Groq API.

Spec generation sends a structured prompt to the Groq API (Llama 3.3 70B) and returns a comprehensive, developer-ready technical specification.

Generate-compress chains both steps: generate a spec from a prompt, then compress the output for downstream consumption.

Architecture

Module	Purpose
`main.rs`	CLI entry point (clap derive) and subcommand dispatch
`compress.rs`	Surprisal scoring, token filtering, elbow detection
`generate.rs`	Groq API client, spec generation, and Groq-based compression
`model.rs`	GGUF model downloading and loading via HuggingFace Hub

CLI Reference

`compress` options

Flag	Default	Description
`[input]`	stdin	Input text or file path
`--keep-ratio`	`0.7`	Fraction of tokens to keep (0.0-1.0)
`--stats`	off	Print token counts and compression ratio to stderr
`--auto`	off	Auto-detect keep ratio via elbow detection
`--target-tokens`	-	Keep exactly this many output tokens
`--groq`	off	Use Groq API for compression instead of local model

`generate` options

Flag	Default	Description
`<prompt>`	required	Short description for the spec
`-o, --output`	stdout	Output file path
`--json`	off	Output as JSON with metadata
`--stats`	off	Print generation stats to stderr

`generate-compress` options

Flag	Default	Description
`<prompt>`	required	Short description for the spec
`--keep-ratio`	`0.7`	Fraction of tokens to keep (0.0-1.0)
`--stats`	off	Print stats to stderr
`--auto`	off	Auto-detect keep ratio via elbow detection
`--target-tokens`	-	Keep exactly this many output tokens
`--groq`	off	Use Groq API for compression
`-o, --output`	stdout	Output file path
`--json`	off	Output as JSON with metadata

License

MIT — see LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
.github/workflows		.github/workflows
skills/devpipe		skills/devpipe
src		src
tests		tests
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
SKILL.md		SKILL.md
build.rs		build.rs

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

devpipe

Features

Prerequisites

Installation

From source

From crates.io (planned)

Usage

Compress text

Generate a technical spec

Generate then compress

How It Works

Architecture

CLI Reference

`compress` options

`generate` options

`generate-compress` options

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

devpipe

Features

Prerequisites

Installation

From source

From crates.io (planned)

Usage

Compress text

Generate a technical spec

Generate then compress

How It Works

Architecture

CLI Reference

compress options

generate options

generate-compress options

License

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`compress` options

`generate` options

`generate-compress` options

Packages