22 changes: 18 additions & 4 deletions .gitignore
@@ -10,10 +10,24 @@
.env*
!.env*.example

# # Old expo
# .expo/
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
.venv/

# uv
.uv/
# Agent environment variables
agents/.env

# SSL certificates
*.pem

# Database
*.db
*.sqlite

# Logs
*.log
224 changes: 99 additions & 125 deletions README.md
@@ -1,151 +1,125 @@
# Blindsighted (Sample App)
# Julie

**A hackathon-ready template for building AI-powered experiences with Ray-Ban Meta smart glasses.**
**AI-powered shopping assistant for visually impaired users.**

Blindsighted is a **sample app** that connects Ray-Ban Meta smart glasses to AI agents via LiveKit. The context is for a visual assistance app for blind/visually impaired users, but the architecture works for any AI-powered glasses experience.
## The Problem

The integration setup with Meta's wearables SDK and LiveKit streaming was finicky to get right. This template gives you a working foundation so you can skip that part and jump straight to the interesting bits.
Grocery shopping is a significant challenge for blind and visually impaired individuals. Identifying products on shelves, reading labels, and locating specific items typically requires assistance from others—limiting independence and privacy.

## Architecture Overview
## The Solution

```
iOS App (Swift) → LiveKit Cloud (WebRTC) → AI Agents (Python)
↓ ↑
└──────→ FastAPI Backend (optional) ───────┘
(sessions, storage, etc.)
```

**Three independent components:**

- **`ios/`** - Native iOS app using Meta Wearables DAT SDK

- Streams video/audio from Ray-Ban Meta glasses to LiveKit
- Receives audio/data responses from agents
- Works standalone if you just want to test the glasses SDK
Julie combines **Ray-Ban Meta smart glasses** with **AI vision and voice** to give users full autonomy while shopping, with enough information to make their own qualitative, subjective choices about product selection. No screen interaction is required; everything works through natural voice and audio feedback.

- **`agents/`** - LiveKit agents (Python)
## How It Works

- Join LiveKit rooms as peers
- Process live video/audio streams with AI models
- Send responses back via audio/video/data channels
- **This is where the magic happens** - build your AI features here
1. **Point** — The user faces a shelf while wearing the glasses
2. **Scan** — Gemini (via ElevenLabs TTS) guides positioning until the full shelf is visible
3. **Identify** — Gemini identifies every product on the shelf
4. **Discuss** — The user has a back-and-forth conversation with the ElevenLabs agent to pick an item
5. **Reach** — AI guides their hand directly to the product using real-time camera feedback

- **`api/`** - FastAPI backend (Python)
- Session management and room creation
- R2 storage for life logs and replays
- Optional but useful for anything ad hoc you need a backend for
The entire experience is **eyes-free**.
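The five steps above behave like a small state machine that advances one mode at a time. A minimal sketch of that flow (the mode and event names here are illustrative, not the app's actual identifiers):

```python
from enum import Enum, auto

class Mode(Enum):
    POSITIONING = auto()   # LOW-res photos: guide the camera until the shelf fills the frame
    IDENTIFYING = auto()   # one HIGH-res photo: Gemini lists every product
    DISCUSSING = auto()    # ElevenLabs agent converses until the user picks an item
    GUIDING = auto()       # LOW-res photos: steer the hand to the chosen product
    DONE = auto()

def next_mode(mode: Mode, event: str) -> Mode:
    """Advance the session when a completion event fires; otherwise stay put."""
    transitions = {
        (Mode.POSITIONING, "shelf_in_frame"): Mode.IDENTIFYING,
        (Mode.IDENTIFYING, "products_listed"): Mode.DISCUSSING,
        (Mode.DISCUSSING, "item_chosen"): Mode.GUIDING,
        (Mode.GUIDING, "item_reached"): Mode.DONE,
    }
    return transitions.get((mode, event), mode)
```

Unrecognized events leave the mode unchanged, which matches the "graceful re-prompting" behavior described under accessibility.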

**You can work on just one part.** Want to build a cool agent but not touch iOS? Great. Want to experiment with the glasses SDK without running agents? Also fine. Want to add interesting storage/indexing features? The backend's there for you.
## Key Features
- **Voice-first interaction** — No buttons, no screens, just conversation
- **Real-time guidance** — Continuous audio feedback using clock positions ("move to 2 o'clock")
- **Product identification** — Recognizes items, brands, prices, and shelf locations
- **Hand guidance** — Guides user's hand to the exact product location
- **Works with existing hardware** — Ray-Ban Meta glasses + iPhone
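The clock-position cues in the feature list can be derived from a 2-D offset between the hand and the target product. A minimal sketch, assuming image-space offsets (`dx` positive right, `dy` positive up); the real guidance logic lives in the Gemini prompting, so this is only an illustration:

```python
import math

def clock_position(dx: float, dy: float) -> str:
    """Map a target offset to a spoken clock direction.

    0 degrees is straight up (12 o'clock), angles grow clockwise.
    """
    angle = math.degrees(math.atan2(dx, dy)) % 360
    hour = round(angle / 30) % 12 or 12  # 30 degrees per hour mark
    return f"{hour} o'clock"
```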

## Quick Start

### iOS App
## System Architecture

```bash
cd ios
open Blindsighted.xcodeproj
# Build and run in Xcode (⌘R)
```

**Requirements**: Xcode 26.2+, iOS 17.0+, Swift 6.2+

See [ios/README.md](ios/README.md) for detailed setup.

### Agents

```bash
cd agents
uv sync
uv run example_agent.py dev
👓 RAY-BAN META GLASSES
│ photos
┌─────────────┐
│ iOS App │
└──────┬──────┘
┌───────────────────────┼───────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ LOW photos │ │ HIGH photo │ │ LOW photos │
│ (position) │ │ (identify) │ │ (guidance) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
▼ ▼ ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│ GEMINI VISION AI │
│ │
│ ① Navigation Mode ② Identification Mode ③ Hand Guidance Mode │
│ "Move camera right" "Found 12 products" "Move hand to 2 o'clock" │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ 🔊 TTS Audio CSV Product List 🔊 TTS Audio │
└─────────┬───────────────────────┬───────────────────────────┬────────────────┘
│ │ │
│ ▼ │
│ ┌─────────────────────────────────┐ │
│ │ FASTAPI BACKEND │ │
│ │ │ │
│ │ POST /csv/upload ←── Gemini │ │
│ │ GET /csv/get-summary ──→ 11L │ │
│ │ POST /user-choice ←── 11L │◄──────────┘
│ │ GET /user-choice/latest ──→ Gemini │
│ │ │ │
│ └────────────────┬────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────┐ │
│ │ ELEVENLABS CONVERSATIONAL AI │ │
│ │ │ │
│ │ 🎤 User: "What's available?" │ │
│ │ 📋 Agent: Reads product list │ │
│ │ 🎤 User: "I want the Coca Cola"│ │
│ │ ✅ Agent: Posts choice to API ─┼───────────┘
│ │ │ triggers ③
│ └─────────────────────────────────┘
🔊 AUDIO OUTPUT (via glasses speakers)
```

**Test without hardware**: Use the [LiveKit Agents Playground](https://agents-playground.livekit.io/) to test agents with your webcam/microphone instead of glasses.
**Flow Summary:**
1. **LOW photos** → Gemini guides camera positioning → Audio feedback
2. **HIGH photo** → Gemini identifies products → CSV uploaded to API
3. **ElevenLabs Agent** reads products, user selects via voice → Choice posted to API
4. **LOW photos** → Gemini reads user choice from API → Hand guidance mode → Audio feedback
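Step 2 hands a CSV product list from Gemini to the API. A defensive parser for such a list might look like the sketch below; note the column layout is an assumption for illustration, not the project's actual schema:

```python
import csv
import io

# Hypothetical column layout; the real schema is defined by the agent's prompt.
FIELDS = ["name", "brand", "price", "shelf_position"]

def parse_product_csv(text: str) -> list[dict[str, str]]:
    """Turn a model-generated CSV block into row dicts, skipping malformed
    lines rather than failing the whole scan on one bad row."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) == len(FIELDS):
            rows.append(dict(zip(FIELDS, (cell.strip() for cell in row))))
    return rows
```

Skipping bad rows matters here because LLM-generated CSV occasionally contains stray commentary lines.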

See [agents/README.md](agents/README.md) for agent development.
| Component | Purpose |
|-----------|---------|
| `ios/` | Captures photos from Ray-Ban Meta glasses |
| `agents/` | Gemini AI for vision analysis + ElevenLabs TTS for audio output |
| `api/` | Backend storing product data and user selections |

### API Backend (Optional)
## Quick Start

```bash
cd api
uv sync
uv run main.py
```

API docs at `http://localhost:8000/docs`

## What's Included

### iOS App Features

- Live video streaming from Ray-Ban Meta glasses
- Audio routing to/from glasses (left/right channel testing)
- Photo capture during streaming
- Video recording and local storage
- Video gallery with playback
- LiveKit integration with WebRTC
- Share videos/photos
# API
cd api && uv sync && uv run main.py

### Agent Template
# Agent
cd agents && uv sync && uv run shelf_assistant.py

- LiveKit room auto-join based on session
- Audio/video stream processing
- AI model integration examples (vision, TTS)
- Bidirectional communication (receive video, send audio)

### Backend API

- Session management endpoints
- LiveKit room creation with tokens
- R2 storage integration for life logs
- FastAPI with dependency injection patterns

## Use It Your Way

**Feel free to:**

- Rip out everything you don't need
- Replace the AI models with your own
- Change the entire agent architecture
- Use a different backend (or no backend)
- Build something completely different on top of the glasses SDK

**This is over-engineered for a hackathon.** The three-component architecture exists because I found the initial setup painful and wanted to provide options. If you have a better approach or this feels too complicated, throw it away! The point is to give you working examples to learn from, not to force an architecture on you.

## Environment Variables & API Keys

The app needs a few API keys to work:

- **LiveKit**: Server URL, API key, API secret (for WebRTC streaming)
- **OpenRouter API Key** (optional, for AI models)
- **ElevenLabs API Key** (optional, for TTS)

**Having trouble getting something running?** Reach out and I'll unblock you.
# iOS
cd ios && open Blindsighted.xcodeproj
```

See `ios/Config.xcconfig.example` and `api/.env.example` for configuration details.
**Required API keys** (in `.env` files):
- `GOOGLE_API_KEY` — Gemini vision AI
- `ELEVENLABS_API_KEY` — Voice synthesis
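A minimal `agents/.env` might look like the fragment below (placeholder values; variable names follow the fields in `agents/config.py`, which pydantic-settings matches case-insensitively). The file is gitignored, so real keys never land in the repo:

```shell
# agents/.env — gitignored; never commit real keys
GOOGLE_API_KEY=your-gemini-key-here
ELEVENLABS_API_KEY=your-elevenlabs-key-here
```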

## Documentation
## Accessibility by Design

- **CLAUDE.md** - Full development guide with architecture details, code patterns, troubleshooting
- **ios/README.md** - iOS-specific setup and configuration
- **agents/README.md** - Agent development guide
- **api/** - Backend API with OpenAPI docs at `/docs`
- **No visual interface required** — All feedback is audio
- **Natural language** — "I want the orange juice" not menu navigation
- **Spatial audio cues** — Clock positions for intuitive direction
- **Confirmation feedback** — "Got it!" when item is reached
- **Error recovery** — Graceful re-prompting if something goes wrong

## License

**In short:** Keep it open source; it's fine to make money with it. I'd love to see what you build with it.

**Exception**: The iOS app incorporates sample code from Meta's [meta-wearables-dat-ios](https://github.com/facebook/meta-wearables-dat-ios) repository, which has its own license terms. Check that repo for Meta's SDK license.

## Why Does This Exist?

I built this because:

1. Getting Meta's wearables SDK working took a fair amount of time without being much fun.
2. I originally had custom WebRTC streaming (which took a lot of time); Pentaform showed me LiveKit, which seems much better suited to a hackathon use case, so I swapped over for this project—though it has its own pain points.
3. Unlike typical hackathons, which are one-and-done, it'd be great to have something people can iterate on.

If this helps you build something cool, that's awesome. If you find a better way to do any of this, even better.

## Contributing

Found a bug? Have a better pattern? PRs welcome. This is meant to help people, so improvements that make it easier to use or understand are great.
MIT License — See [LICENSE](LICENSE)
Binary file added Screenshot 2026-01-17 at 1.46.20 PM.png
6 changes: 3 additions & 3 deletions agents/__init__.py
@@ -1,5 +1,5 @@
"""LiveKit Agents for Blindsighted - Vision-based AI assistance."""
"""Julie Agents - Gemini-powered vision assistance for supermarket shopping."""

from agents.vision_agent import VisionAssistant, vision_agent, server
from shelf_assistant import ShelfAssistant, LocalPhotoManager

__all__ = ["VisionAssistant", "vision_agent", "server"]
__all__ = ["ShelfAssistant", "LocalPhotoManager"]
19 changes: 6 additions & 13 deletions agents/config.py
@@ -10,22 +10,15 @@ class Settings(BaseSettings):
extra="ignore",
)

# LiveKit Agent Configuration
livekit_agent_name: str = "vision-agent"
livekit_url: str = ""
livekit_api_key: str = ""
livekit_api_secret: str = ""
# Google AI API (for Gemini)
google_api_key: str = ""

# OpenRouter API
openrouter_api_key: str = ""
openrouter_base_url: str = "https://openrouter.ai/api/v1"
# API Backend URL
api_base_url: str = "https://localhost:8000"

# ElevenLabs API
# ElevenLabs Conversational AI (for reference)
elevenlabs_api_key: str = ""
elevenlabs_voice_id: str = "21m00Tcm4TlvDq8ikWAM" # Rachel voice

# Deepgram API
deepgram_api_key: str = ""
elevenlabs_agent_id: str = "agent_0701kf5rm5s6f7jtnh7swk9nkx0a"


settings = Settings()