Coldbones

Upload an image or PDF. Get intelligent analysis back. Inference runs locally on a desktop RTX 5090 via LM Studio, reachable from AWS via Tailscale Funnel.

Architecture

Browser
  │  (1) POST /api/presign  →  presigned S3 PUT URL
  │  (2) PUT file directly to S3  (no Lambda, full throughput)
  │  (3) POST /api/analyze  →  202 + jobId
  │  (4) GET  /api/status/{jobId}  (poll every 4 s)
  │
  ▼
CloudFront (app.omlahiri.com)
  ├── /api/*  →  API Gateway  →  Lambda (Python)
  │                                    │
  │                     fast path: async Lambda invoke → analyze_orchestrator
  │                    (desktop unreachable: fallback to SQS queue below)
  │
  │                     offline path: SQS queue
  │                          │
  │                     Desktop worker (home RTX 5090)
  │                          │  long-polls SQS
  │                          └→ LM Studio (Qwen3.5-35B, via Tailscale Funnel)
  │                                    │
  │                          DynamoDB job record ← writes result
  │
  └── /*  →  S3 (React SPA)

How it works

Presign — /api/presign returns a signed S3 PUT URL scoped to a single key, content type, and 5-minute expiry.
Upload — browser PUTs the file directly to S3, bypassing Lambda's 10 MB limit.
Analyze — /api/analyze accepts {s3Key, lang, mode} and always returns 202 + jobId immediately.
- fast mode: analyze_router checks if the desktop is alive (pings /v1/models). If yes, it invokes analyze_orchestrator asynchronously and returns. If the desktop is offline, it falls back to the SQS queue.
- offline mode: always enqueues to SQS; the desktop worker picks it up when available.
Status — browser polls /api/status/{jobId} which reads DynamoDB. Terminal states: COMPLETED (with result) or FAILED (with error message).
Desktop worker — worker/worker.py runs on the RTX 5090. It long-polls SQS, downloads uploads from S3, calls LM Studio locally, and writes results back to DynamoDB.

CDK Stacks

Stack	Resources
ColdbonesStorage	S3 (uploads + site), CloudFront, DynamoDB, Route53, ACM
ColdbonesQueue	SQS (main + DLQ), SNS
ColdbonesApi	Lambda × 5, API Gateway, IAM

Prerequisites

AWS CLI configured (aws configure) for us-east-1
Node.js 20+ and Python 3.12+ installed
CDK dependencies installed:
```
cd infrastructure && npm install
```
Desktop worker set up — see worker/SETUP.md

Deploy to AWS

# From the repo root — deploys Storage → Queue → Api in order:
./scripts/deploy.sh

# Or deploy individual stacks:
./scripts/deploy.sh storage
./scripts/deploy.sh queue
./scripts/deploy.sh api

First-time deploy order:

1. deploy.sh storage   → creates S3, CloudFront, DynamoDB
2. deploy.sh queue     → creates SQS queue (needed by Api lambdas)
3. deploy.sh api       → creates Lambdas + API Gateway
4. Set cdk.json:       coldbones.apiGatewayDomain = <domain from cdk-outputs.json>
5. deploy.sh storage   → adds CloudFront /api/* behavior pointing at API Gateway
6. deploy-frontend.sh  → builds React app and syncs to S3

After deploying, the app is live at https://app.omlahiri.com.

Deploy the Frontend

./scripts/deploy-frontend.sh

Builds the Vite app, syncs it to S3, and invalidates the CloudFront cache. Reads bucket name and distribution ID from scripts/cdk-outputs.json.

Local Development

Backend (FastAPI — local dev only)

The backend/ server is a local dev shim. It accepts the same API contract as the Lambda functions but runs entirely in-process: uploads are stored in memory, and inference goes directly to LM Studio via Tailscale.

cd backend
pip install -r requirements.txt
LM_STUDIO_URL=https://seratonin.tail40ae2c.ts.net uvicorn main:app --reload --port 8000

Frontend

cd frontend
npm install
npm run dev          # http://localhost:5173

Set VITE_API_BASE_URL=http://localhost:8000 (or leave empty to use CloudFront in production).

Desktop Worker

The worker runs on the home RTX 5090. See worker/SETUP.md for full setup instructions including Tailscale Funnel config, LM Studio, and AWS SSM parameters.

Quick start once the desktop is configured:

cd worker
pip install -r requirements.txt
cp .env.example .env   # fill in ANALYZE_QUEUE_URL, UPLOAD_BUCKET, JOBS_TABLE
python worker.py

The worker long-polls SQS, downloads each uploaded file from S3, converts images/PDFs to base64 PNGs, calls LM Studio, and writes the result back to DynamoDB.

Testing

Backend (pytest)

cd backend
pip install -r requirements.txt
pytest --cov=. --cov-report=term-missing

228 tests, 95.19% coverage. Uses moto for AWS mocks (S3, DynamoDB, SQS, SSM).

Frontend (Vitest)

cd frontend
npm install
npm test

352 tests, all passing. Coverage thresholds enforced:

Metric	Threshold	Actual
Statements	97%	97.65%
Branches	90%	91.10%
Functions	97%	97.47%
Lines	97%	98.80%

Infrastructure (Jest)

cd infrastructure
npm install
npm test

CDK snapshot tests for all three stacks.

Project Structure

coldbones/
├── backend/               FastAPI local-dev server (not deployed to AWS)
├── frontend/              React + Vite + TypeScript SPA
│   └── src/
│       ├── components/    AnalysisPanel, FilePreview, UploadZone, …
│       ├── hooks/         useUpload, useAnalysis
│       ├── i18n/          EN / HI / ES / BN translations
│       └── types/
├── infrastructure/        AWS CDK (TypeScript)
│   └── lib/               storage-stack, queue-stack, api-stack
├── lambdas/               Python Lambda handlers
│   ├── analyze_orchestrator/   Downloads S3 file → LM Studio → DynamoDB
│   ├── analyze_router/         Routes fast (async invoke) vs offline (SQS)
│   ├── batch_processor/        Tombstone — messages handled by desktop worker
│   ├── get_presigned_url/      S3 presigned PUT URL generation
│   ├── job_status/             DynamoDB job state polling
│   └── desktop_client.py       Shared: SSM-cached LM Studio OpenAI client
├── worker/                Desktop SQS worker (runs on RTX 5090)
│   ├── worker.py
│   ├── SETUP.md
│   └── requirements.txt
└── scripts/
    ├── deploy.sh          CDK deploy (Storage / Queue / Api)
    ├── deploy-frontend.sh S3 sync + CloudFront invalidation
    └── validate.sh        End-to-end API smoke test

Documentation

Comprehensive documentation is in the docs/ directory:

Document	Description
ARCHITECTURE.md	System architecture diagrams, component hierarchy, 7-layer overview
HAPPY_PATHS.md	Complete data flow walkthroughs for all user workflows
DEVELOPMENT_HISTORY.md	Git commit timeline and development narrative
FRONTEND.md	React component tree, hooks, contexts, state management
BACKEND.md	Lambda functions, inference clients, FastAPI dev server, worker
DATABASE.md	DynamoDB schema, status lifecycle, access patterns
CLOUD_INFRASTRUCTURE.md	CDK stacks, S3, CloudFront, WAF, SQS, Route53, cost estimates
UI_UX.md	Material Design 3 theme, accessibility, animations, error handling
BEDROCK_COST_ANALYSIS.md	Bedrock pricing deep dive, On-Demand vs EC2 vs physical hardware

Environment Variables

Backend (`backend/.env`)

Variable	Default	Description
`LM_STUDIO_URL`	`http://localhost:1234`	LM Studio base URL (Tailscale Funnel for remote)
`LM_STUDIO_API_KEY`	`lm-studio`	API key (LM Studio ignores value, must be non-empty)
`MODEL_NAME`	`qwen/qwen3.5-35b-a3b`	Model identifier
`MAX_INFERENCE_TOKENS`	`8192`	Max tokens per response
`MAX_PDF_PAGES`	`20`	Max PDF pages to render and send

Worker (`worker/.env`)

Variable	Required	Description
`ANALYZE_QUEUE_URL`	✓	SQS queue URL (from `cdk-outputs.json`)
`UPLOAD_BUCKET`	✓	S3 bucket name for uploads
`JOBS_TABLE`	✓	DynamoDB table name
`LM_STUDIO_URL`	✓	LM Studio base URL (usually `http://localhost:1234`)
`MODEL_NAME`		Defaults to `Qwen/Qwen3.5-35B-A3B-AWQ`

Lambda environment (set via CDK / cdk.json)

Variable	Description
`UPLOAD_BUCKET`	S3 uploads bucket
`JOBS_TABLE`	DynamoDB jobs table
`ORCHESTRATOR_FUNCTION_ARN`	ARN of `analyze_orchestrator` Lambda
`ANALYZE_QUEUE_URL`	SQS queue URL
`/coldbones/desktop-url` (SSM)	Tailscale Funnel URL for LM Studio
`/coldbones/desktop-port` (SSM)	LM Studio port (443 via Funnel)

Supported Languages

English (en)
Hindi (hi)
Spanish (es)
Bengali (bn)

Name		Name	Last commit message	Last commit date
Latest commit History 35 Commits
.github/workflows		.github/workflows
backend		backend
docs		docs
frontend		frontend
infrastructure		infrastructure
lambdas		lambdas
scripts		scripts
worker		worker
.gitignore		.gitignore
AWS-Certified-AI-Practitioner_Exam-Guide.pdf		AWS-Certified-AI-Practitioner_Exam-Guide.pdf
README.md		README.md
package.json		package.json
pdftest.pdf		pdftest.pdf
ruff.toml		ruff.toml
test1.jpeg		test1.jpeg
testvideo.mov		testvideo.mov

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Coldbones

Table of Contents

Architecture

How it works

CDK Stacks

Prerequisites

Deploy to AWS

Deploy the Frontend

Local Development

Backend (FastAPI — local dev only)

Frontend

Desktop Worker

Testing

Backend (pytest)

Frontend (Vitest)

Infrastructure (Jest)

Project Structure

Documentation

Environment Variables

Backend (`backend/.env`)

Worker (`worker/.env`)

Lambda environment (set via CDK / cdk.json)

Supported Languages

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Coldbones

Table of Contents

Architecture

How it works

CDK Stacks

Prerequisites

Deploy to AWS

Deploy the Frontend

Local Development

Backend (FastAPI — local dev only)

Frontend

Desktop Worker

Testing

Backend (pytest)

Frontend (Vitest)

Infrastructure (Jest)

Project Structure

Documentation

Environment Variables

Backend (backend/.env)

Worker (worker/.env)

Lambda environment (set via CDK / cdk.json)

Supported Languages

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Backend (`backend/.env`)

Worker (`worker/.env`)

Packages