NoteV

Every AI note-taker can hear. Ours can see.

AI classroom assistant for Meta Ray-Ban smart glasses.
Captures audio + visual content during lectures → generates structured multimodal notes.

Quick Start • Features • Architecture • Roadmap

🎒 No smart glasses? No problem.
NoteV works with just your iPhone camera — same full pipeline, no extra hardware needed.
Point your phone at the whiteboard and go.

Screenshots: home screen with Ray-Ban connected · polished transcript timeline · AI chat with course setup

Get Started in 5 Minutes

What you need:

Xcode 16+ on macOS (the Swift 6 toolchain listed under Tech Stack ships with Xcode 16)
A free Deepgram API key (for speech-to-text)
Any one LLM API key: Gemini (free tier) / OpenAI / Anthropic
An iPhone or iOS Simulator — Meta Ray-Ban glasses are 100% optional

git clone https://github.com/DearBobby9/NoteV_Glasses.git
cd NoteV_Glasses

# Set up API keys (gitignored — your keys stay local)
cp NoteV/Config/Secrets.xcconfig.example NoteV/Config/Secrets.xcconfig
# Edit Secrets.xcconfig — paste your Deepgram key

open NoteV.xcodeproj
# Select your device/simulator → Cmd+R
# On first launch: tap ⚙️ → select LLM provider → enter API key

Features

Three-Layer Output Pipeline

Layer	What it does
Polished Timeline	AI-cleaned transcript with inline images, bookmark highlights, and sticky section headers
AI Notes	Structured notes organized by slide transitions with timestamps and table of contents
Action Items	Extracted TODOs with categories, priorities, due dates → batch export to iOS Reminders & Calendar

Smart Recording

Dual capture — Meta Ray-Ban glasses via DAT SDK, or iPhone back camera (automatic fallback)
Real-time STT — Deepgram WebSocket streaming with KeepAlive and graceful shutdown
Smart bookmarks — Auto-detects key moments ("this will be on the exam", "important") via 4-tier keyword taxonomy with confidence scoring
Frame intelligence — 5s periodic sampling + SSIM change detection + pHash slide deduplication
Slide analysis — LLM vision reads slide content (titles, bullets, formulas, diagrams) — not OCR
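
The smart-bookmark idea above can be sketched as tiered keyword matching. This is a minimal illustration, not NoteV's actual detector: the README only says the taxonomy has four tiers, so the tier names, phrases, weights, and the `threshold` parameter below are all assumptions.

```swift
import Foundation

/// Hypothetical tiers: the README states there are four, not what they contain.
enum BookmarkTier: Int, CaseIterable {
    case explicitExam = 1, emphasis, assignment, recap

    /// Illustrative phrases only, not NoteV's real taxonomy.
    var phrases: [String] {
        switch self {
        case .explicitExam: return ["this will be on the exam", "on the test"]
        case .emphasis:     return ["important", "remember this", "key point"]
        case .assignment:   return ["homework", "due next", "deadline"]
        case .recap:        return ["to summarize", "in summary"]
        }
    }

    /// Assumed confidence assigned to any match in this tier.
    var weight: Double {
        switch self {
        case .explicitExam: return 0.9
        case .emphasis:     return 0.6
        case .assignment:   return 0.5
        case .recap:        return 0.3
        }
    }
}

struct BookmarkCandidate: Equatable {
    let tier: BookmarkTier
    let phrase: String
    let confidence: Double
}

/// Returns the highest-confidence match in a transcript segment,
/// or nil when nothing reaches `threshold`.
func detectBookmark(in segment: String, threshold: Double = 0.4) -> BookmarkCandidate? {
    let text = segment.lowercased()
    var best: BookmarkCandidate?
    for tier in BookmarkTier.allCases {
        for phrase in tier.phrases where text.contains(phrase) {
            // Keep the existing candidate if it already scores at least as high.
            if let current = best, current.confidence >= tier.weight { continue }
            best = BookmarkCandidate(tier: tier, phrase: phrase, confidence: tier.weight)
        }
    }
    guard let hit = best, hit.confidence >= threshold else { return nil }
    return hit
}
```

Scoring by tier rather than by individual phrase keeps the threshold easy to tune: raising it silences the weaker tiers without touching the keyword lists.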

AI Chat System

Unified chat — Ask questions about your notes, set up courses, configure settings, create reminders — all through natural language
Voice input — Deepgram-powered dictation
Action cards — Confirmable UI buttons for adding courses, changing settings, creating reminders
Session context — Chat has access to your transcript, notes, slides, and todos

Course Management

Conversational setup — Tell the AI your schedule naturally ("I have CS229 MWF 10am") — no forms
Auto-detection — Detects which course you're in when recording starts
Weekly calendar — iOS Calendar-style grid view with course blocks and live time indicator
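
Auto-detection at recording start can be thought of as a schedule lookup. The sketch below assumes a simplified `CourseMeeting` model and a `graceMinutes` window for early arrivals; both are hypothetical, and NoteV's real `Course` type and matching rules may differ.

```swift
import Foundation

/// Hypothetical model: NoteV's real `Course` type may look different.
struct CourseMeeting {
    let name: String
    let weekdays: Set<Int>   // Calendar weekday numbers: 1 = Sunday ... 7 = Saturday
    let startMinute: Int     // minutes after midnight
    let endMinute: Int
}

/// Returns the first course whose meeting window contains `date`,
/// allowing recording to start up to `graceMinutes` early.
func detectCourse(at date: Date,
                  in courses: [CourseMeeting],
                  calendar: Calendar = .current,
                  graceMinutes: Int = 10) -> CourseMeeting? {
    let parts = calendar.dateComponents([.weekday, .hour, .minute], from: date)
    guard let weekday = parts.weekday,
          let hour = parts.hour,
          let minute = parts.minute else { return nil }
    let now = hour * 60 + minute
    return courses.first { course in
        course.weekdays.contains(weekday)
            && now >= course.startMinute - graceMinutes
            && now < course.endMinute
    }
}
```

With "CS229 MWF 10am" stored as weekdays `[2, 4, 6]` and `startMinute: 600`, a recording started at 10:05 on a Monday resolves to CS229, while the same time on a Tuesday returns `nil`.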

Export

Export Preview — Review and edit action items before batch export (toggle Reminder vs Calendar Event, edit titles/dates/priorities)
PDF generation — Formatted PDF with inline images and section timestamps
iOS Reminders — Batch export to dedicated "NoteV Tasks" list with deep links back to session timestamps

Architecture

graph TD
    subgraph Capture
        A[CaptureProvider Protocol] --> B[Meta Ray-Ban<br/>DAT SDK]
        A --> C[iPhone Camera<br/>AVCaptureSession]
    end

    subgraph Processing
        D[AudioPipeline<br/>Deepgram / Apple Speech]
        E[FramePipeline<br/>SSIM + pHash dedup]
        F[SmartBookmarkDetector<br/>4-tier keyword scoring]
        G[SessionRecorder<br/>orchestrator]
    end

    subgraph Generation
        H[TranscriptPolisher<br/>chunked LLM cleanup]
        I[SlideAnalyzer<br/>LLM vision extraction]
        J[NoteGenerator<br/>multimodal LLM]
        K[TodoExtractor<br/>action item extraction]
    end

    subgraph Presentation
        L["Layer 1: Polished Timeline"]
        M["Layer 2: AI Notes"]
        N["Layer 3: Action Items"]
        O[AI Chat + Course Management]
    end

    A -- "frames + audio" --> D & E
    D & E & F --> G
    G --> H & I
    H & I --> J
    J --> K
    H --> L
    J --> M
    K --> N
    G --> O
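
To make the FramePipeline deduplication step concrete, here is a small sketch of hash-based slide deduplication. It swaps in a difference hash (dHash) as a simpler stand-in for the DCT-based pHash named in the diagram, and it leaves out the SSIM change detection entirely. The 8x9 thumbnail input and the `maxDistance` threshold of 5 bits are assumptions for illustration.

```swift
/// Difference hash over an 8-row x 9-column grayscale thumbnail (values 0-255).
/// Each bit records whether a pixel is brighter than its right-hand neighbour.
func differenceHash(_ gray: [[UInt8]]) -> UInt64 {
    precondition(gray.count == 8 && gray.allSatisfy { $0.count == 9 },
                 "expects 8 rows of 9 pixels")
    var hash: UInt64 = 0
    for row in gray {
        for x in 0..<8 {
            let bit: UInt64 = row[x] > row[x + 1] ? 1 : 0
            hash = (hash << 1) | bit
        }
    }
    return hash
}

/// Number of differing bits between two hashes.
func hammingDistance(_ a: UInt64, _ b: UInt64) -> Int {
    (a ^ b).nonzeroBitCount
}

/// Returns the indices of frames to keep: a frame is kept only if its hash
/// differs from every already-kept frame by more than `maxDistance` bits.
func deduplicate(hashes: [UInt64], maxDistance: Int = 5) -> [Int] {
    var keptIndices: [Int] = []
    for (index, hash) in hashes.enumerated() {
        let isDuplicate = keptIndices.contains {
            hammingDistance(hashes[$0], hash) <= maxDistance
        }
        if !isDuplicate { keptIndices.append(index) }
    }
    return keptIndices
}
```

Comparing against every kept frame, rather than only the previous one, also catches a lecturer flipping back to an earlier slide.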

Tech Stack

Component	Technology
Platform	iOS 17+ / Swift 6 / SwiftUI
Smart Glasses	Meta Ray-Ban Gen-2 (DAT SDK v0.4.0)
Speech-to-Text	Deepgram nova-3 via native WebSocket (primary) / Apple Speech (fallback)
LLM	OpenAI GPT-4o / Anthropic Claude / Google Gemini (configurable)
Native Frameworks	EventKit, PDFKit, Speech, AVFoundation
Storage	FileManager + JSON + JPEG (no CoreData)
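
The FileManager + JSON approach in the Storage row might look roughly like the sketch below. `SessionSnapshot`, the `session.json` file name, and the per-session directory layout are assumptions made for illustration; the real `SessionData` model and `SessionStore` are not shown in this README.

```swift
import Foundation

/// Hypothetical, trimmed-down session model; the real `SessionData` likely has more fields.
struct SessionSnapshot: Codable, Equatable {
    let id: UUID
    let title: String
    let startedAt: Date
    let frameFiles: [String]   // JPEG file names stored next to the JSON
}

/// Writes `session.json` into a per-session directory under `root` and returns its URL.
func save(_ session: SessionSnapshot, under root: URL) throws -> URL {
    let dir = root.appendingPathComponent(session.id.uuidString, isDirectory: true)
    try FileManager.default.createDirectory(at: dir, withIntermediateDirectories: true)

    let encoder = JSONEncoder()
    encoder.dateEncodingStrategy = .iso8601
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]

    let file = dir.appendingPathComponent("session.json")
    // Atomic write avoids leaving a half-written file if the app is killed mid-save.
    try encoder.encode(session).write(to: file, options: .atomic)
    return file
}

/// Reads a session back from the JSON file written by `save(_:under:)`.
func loadSession(from file: URL) throws -> SessionSnapshot {
    let decoder = JSONDecoder()
    decoder.dateDecodingStrategy = .iso8601
    return try decoder.decode(SessionSnapshot.self, from: Data(contentsOf: file))
}
```

Note that the `.iso8601` strategy drops fractional seconds, so a round trip is exact only for whole-second timestamps.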

Project Structure

NoteV/
├── App/              # Entry point, AppState (session lifecycle)
├── Capture/          # CaptureProvider protocol, Glasses + Phone providers
├── Processing/       # AudioPipeline, FramePipeline, SessionRecorder, SmartBookmarkDetector
├── NoteGeneration/   # NoteGenerator, TranscriptPolisher, TodoExtractor, SlideAnalyzer, PDFGenerator
├── Services/         # DeepgramService, LLMService, ReminderSyncService
├── Models/           # SessionData, TranscriptSegment, Bookmark, TodoItem, Course
├── Storage/          # SessionStore, CourseStore, ChatStore, ImageStore
├── Views/            # SwiftUI screens + reusable components
└── Config/           # NoteVConfig (all tunable params), xcconfig files
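
As a rough picture of how the `Capture/` layer could support automatic fallback, the sketch below defines a protocol with two conforming providers and picks the first available one. The protocol requirements, type names, and `selectProvider` helper are hypothetical; the real `CaptureProvider` protocol is not reproduced in this README.

```swift
/// Hypothetical shape of the capture abstraction.
protocol CaptureProvider {
    var name: String { get }
    var isAvailable: Bool { get }
    func start() throws
}

struct GlassesCaptureProvider: CaptureProvider {
    let name = "Meta Ray-Ban"
    let isAvailable: Bool   // e.g. glasses paired and connected
    func start() throws { /* would begin streaming through the DAT SDK */ }
}

struct PhoneCaptureProvider: CaptureProvider {
    let name = "iPhone Camera"
    let isAvailable = true  // assumed always present on a phone
    func start() throws { /* would configure an AVCaptureSession */ }
}

/// Returns the first available provider in priority order,
/// so listing the phone last makes it the fallback.
func selectProvider(from candidates: [any CaptureProvider]) -> (any CaptureProvider)? {
    candidates.first { $0.isAvailable }
}
```

Because the rest of the pipeline would only see `any CaptureProvider`, the processing code does not need to know which device produced the frames and audio.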

Roadmap

Visual Q&A — Tap any image in the timeline → ask the AI about that specific slide with surrounding transcript context
Smart Review Reminders — Auto-generate review summaries before exams from cross-session notes, spaced repetition nudges
Cross-Session Course Intelligence — Link sessions to courses, build knowledge graph across weeks
Multi-Language STT — Support non-English lectures via Deepgram language params
Android Companion — Extend beyond iOS for broader smart glasses adoption
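
For the Multi-Language STT item, one plausible starting point is to pass a language code when building the Deepgram streaming URL. The sketch below relies on Deepgram's `model` and `language` query parameters; the function name and the default values are assumptions, and NoteV's `DeepgramService` may build its request differently.

```swift
import Foundation

/// Builds a Deepgram live-transcription WebSocket URL.
/// The defaults ("nova-3", "en") are assumptions for this sketch.
func deepgramListenURL(model: String = "nova-3", language: String = "en") -> URL? {
    var components = URLComponents()
    components.scheme = "wss"
    components.host = "api.deepgram.com"
    components.path = "/v1/listen"
    components.queryItems = [
        URLQueryItem(name: "model", value: model),
        URLQueryItem(name: "language", value: language),
    ]
    return components.url
}
```

Calling `deepgramListenURL(language: "es")` would request Spanish transcription. Which language codes a given model supports should be checked against Deepgram's documentation.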

Have an idea or want to contribute? Open an issue — feature requests and PRs are welcome.

Known Limitations

Glasses features require a physical iPhone (DAT SDK doesn't run in simulator)
Apple Speech fallback limits each recognition request to about 1 minute (recognition restarts automatically)
No permission pre-ask flow — iOS prompts on first camera/mic/speech use

License

MIT License. See LICENSE for details.
