⚒️ ScribeForge AI

AI-Powered Intelligent Notes Workspace

🧠 Schema-Guarded Full-Stack Knowledge Engine

Built with FastAPI • Supabase • Gemini 2.5 Flash

A production-grade AI workspace that transforms raw human notes into structured, searchable, and intelligence-ready knowledge assets.

🧠 Research & Engineering Context

ScribeForge AI was designed as a production-oriented AI infrastructure system focused on:

Structured AI generation
Reliable schema-constrained inference
Intelligent note lifecycle management
Scalable backend architecture
Human-AI collaborative workflows

Unlike typical AI note applications, ScribeForge emphasizes:

✅ Schema-validated, predictably shaped AI outputs
✅ Backend reliability
✅ Structured knowledge extraction
✅ Searchable information systems
✅ Production-grade API architecture

🚀 Executive Overview

Modern note-taking systems suffer from a fundamental problem:

Humans generate unstructured information faster than they can organize it.

ScribeForge AI solves this by converting fragmented raw notes into:

Intelligent summaries
Extracted action items
Searchable tagged entities
AI-generated titles
Structured knowledge records

The platform acts as an AI-powered knowledge refinement engine rather than a simple note editor.

🎯 Problem Statement

Traditional note systems face several limitations:

Notes become unsearchable over time
Raw text lacks structure
Important action items are buried
AI outputs often break JSON parsing pipelines
Scaling AI-assisted systems introduces instability

Most AI-integrated note apps rely on:

❌ Free-form AI responses
❌ Weak validation
❌ Fragile parsing logic
❌ Monolithic backend systems

The result is unreliable production behavior.

💡 Solution: Schema-Guarded AI Architecture

ScribeForge introduces a strictly validated AI orchestration pipeline.

Instead of trusting raw LLM text, the system forces AI outputs into validated application schemas.
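
A minimal sketch of what schema-guarding could look like with Pydantic v2. The model name NoteAnalysis, its fields, and the helper parse_model_output are assumptions made for illustration; the project's real contracts live in schemas.py and may differ.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class NoteAnalysis(BaseModel):
    """Hypothetical contract for one analyzed note (field names are assumed)."""

    # Reject keys the schema does not declare instead of silently dropping them.
    model_config = ConfigDict(extra="forbid")

    title: str = Field(min_length=1, max_length=120)
    summary: str = Field(min_length=1)
    action_items: list[str] = Field(default_factory=list)
    tags: list[str] = Field(default_factory=list)


def parse_model_output(raw_json: str) -> Optional[NoteAnalysis]:
    """Return a validated object, or None when the payload breaks the contract."""
    try:
        return NoteAnalysis.model_validate_json(raw_json)
    except ValidationError:
        return None


good = parse_model_output(
    '{"title": "Sprint sync", "summary": "Agreed on scope.", '
    '"action_items": ["Email the draft"], "tags": ["sprint"]}'
)
bad = parse_model_output('{"title": "", "summary": 42}')       # wrong types and lengths
truncated = parse_model_output('{"title": "Sprint sync", "summ')  # cut-off JSON
```

Here good is a typed NoteAnalysis instance, while bad and truncated both come back as None, so nothing unvalidated reaches the rest of the application.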

⚙️ Core Features

🧠 Structured AI Summarization

Transforms unorganized text into:

High-level summaries
Key insights
Structured metadata

✅ Action Item Extraction

Automatically detects:

Tasks
Decisions
Follow-ups
Priority actions

🏷️ Intelligent Tagging & Search

Supports:

Real-time fuzzy search
Tag containment filtering
Server-side querying
Indexed retrieval pipelines
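
The semantics of fuzzy matching combined with tag containment can be sketched in memory with the standard library. This is only an illustration: in ScribeForge the filtering is described as server-side, and the note shape, the search_notes helper, and the 0.75 threshold are assumptions.

```python
from difflib import SequenceMatcher


def fuzzy_score(query: str, text: str) -> float:
    """Similarity in [0, 1]; a substring hit counts as a perfect match."""
    q, t = query.lower(), text.lower()
    if q in t:
        return 1.0
    # Compare the query against each word and keep the best ratio.
    return max((SequenceMatcher(None, q, w).ratio() for w in t.split()), default=0.0)


def search_notes(notes, query="", required_tags=(), threshold=0.75):
    """Keep notes that carry every required tag and whose title fuzzily matches."""
    wanted = {tag.lower() for tag in required_tags}
    hits = []
    for note in notes:
        note_tags = {tag.lower() for tag in note["tags"]}
        if not wanted <= note_tags:  # tag containment: all required tags present
            continue
        score = fuzzy_score(query, note["title"]) if query else 1.0
        if score >= threshold:
            hits.append((score, note))
    hits.sort(key=lambda pair: pair[0], reverse=True)  # best matches first
    return [note for _, note in hits]


notes = [
    {"title": "Quarterly budget review", "tags": ["finance", "q3"]},
    {"title": "Onboarding checklist", "tags": ["hr"]},
    {"title": "Budget follow-ups", "tags": ["finance"]},
]
results = search_notes(notes, query="budgt", required_tags=["finance"])
```

The misspelled query "budgt" still matches both finance notes, while the hr note is excluded by the tag filter before any scoring happens.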

🔒 Schema-Guarded AI Responses

Uses:

Gemini structured output mode
Pydantic v2 contracts
Strict response validation

Ensuring:

✅ Malformed AI payloads rejected at the validation boundary
✅ Reliable downstream processing
✅ Deterministic backend behavior

🗂️ Soft-Deletion Archive Layer

Instead of destructive deletion:

Notes are archived using is_archived
Historical indexing remains intact
Database fragmentation is minimized
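
A small sketch of the soft-deletion pattern. It uses an in-memory SQLite table so it runs anywhere; the project itself persists to Supabase (PostgreSQL), and apart from is_archived the table and column names here are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " is_archived INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO notes (title) VALUES (?)", [("Roadmap",), ("Retro",)])


def archive_note(db: sqlite3.Connection, note_id: int) -> bool:
    """Flag the row instead of deleting it; True if a live row was archived."""
    cur = db.execute(
        "UPDATE notes SET is_archived = 1 WHERE id = ? AND is_archived = 0",
        (note_id,),
    )
    return cur.rowcount == 1


def list_active(db: sqlite3.Connection) -> list[str]:
    """Default listing hides archived rows."""
    rows = db.execute("SELECT title FROM notes WHERE is_archived = 0 ORDER BY id")
    return [title for (title,) in rows]


archived = archive_note(conn, 1)
active_titles = list_active(conn)
total_rows = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
```

After archiving, the note disappears from the default listing but the row is still in the table, so it can be restored or audited later.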

🌐 Anonymous Public Sharing

Implements isolated public-read endpoints for:

Shared notes
Public references
Knowledge distribution

Without exposing protected infrastructure.

🏗️ System Architecture

Raw User Notes
        ↓
FastAPI Backend
        ↓
AI Orchestration Layer
        ↓
Gemini 2.5 Flash
        ↓
Pydantic Schema Validation
        ↓
Structured Knowledge Objects
        ↓
Supabase Persistence Layer
        ↓
Search + Retrieval APIs
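
The flow above can be sketched as a single function that refuses to persist anything that fails validation. Everything here is a stand-in under stated assumptions: fake_model replaces the Gemini call, validate replaces the Pydantic stage, a dict replaces Supabase, and KnowledgeRecord is a hypothetical record shape.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class KnowledgeRecord:
    """Assumed shape of a structured knowledge object."""
    title: str
    summary: str
    tags: list[str] = field(default_factory=list)


def validate(payload: str) -> KnowledgeRecord:
    """Stand-in for the validation stage; raises ValueError on bad input."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model output must be a JSON object")
    title, summary, tags = data.get("title"), data.get("summary"), data.get("tags", [])
    if not isinstance(title, str) or not isinstance(summary, str):
        raise ValueError("title and summary must be strings")
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("tags must be a list of strings")
    return KnowledgeRecord(title=title, summary=summary, tags=tags)


def process_note(
    raw_text: str,
    generate: Callable[[str], str],
    store: dict[int, KnowledgeRecord],
) -> int:
    """Run one note through generate -> validate -> persist and return its id."""
    record = validate(generate(raw_text))  # nothing is stored unless this passes
    note_id = len(store) + 1
    store[note_id] = record
    return note_id


def fake_model(raw_text: str) -> str:
    """Stub for the model call so the sketch runs offline."""
    return json.dumps({"title": raw_text[:20], "summary": raw_text, "tags": ["demo"]})


store: dict[int, KnowledgeRecord] = {}
note_id = process_note("Ship the beta on Friday", fake_model, store)
```

Passing the generator in as a callable keeps the orchestration step testable without network access, which is one reason to keep the AI call behind its own module such as ai_service.py.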

🛠️ Tech Stack

Layer	Technology
Backend Framework	FastAPI
Language	Python 3.12
Database	Supabase (PostgreSQL)
AI Engine	Gemini 2.5 Flash
Validation	Pydantic v2
Authentication	JWT + Passlib
API Standard	REST
Runtime Model	Asynchronous Python

📂 Project Structure

backend/
├── app/
│   ├── routes/
│   │   ├── auth.py
│   │   └── notes.py
│   │
│   ├── ai_service.py
│   ├── auth_utils.py
│   ├── main.py
│   └── schemas.py
│
├── .env.example
└── requirements.txt

🚀 Local Setup

1️⃣ Clone Repository

git clone https://github.com/ashhuxt/scribeforge-ai.git
cd scribeforge-ai

2️⃣ Create Virtual Environment

python -m venv venv

Activate:

Windows

.\venv\Scripts\activate

Linux / macOS

source venv/bin/activate

3️⃣ Install Dependencies

pip install -r requirements.txt

4️⃣ Configure Environment Variables

Create a .env file in the project root, using .env.example as a template:

GOOGLE_API_KEY=your_google_api_key
GEMINI_API_KEY=your_gemini_api_key
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your_supabase_key
JWT_SECRET=your_secret_key
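
One way to fail fast when a variable is missing is to check all of them at startup. This is a sketch, not the project's actual configuration code: load_settings is a hypothetical helper, the list simply mirrors the .env keys above, and it assumes the variables are already in the process environment (for example, loaded from .env by whatever loader the app uses).

```python
import os

REQUIRED_VARS = (
    "GOOGLE_API_KEY",
    "GEMINI_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "JWT_SECRET",
)


def load_settings(environ=None) -> dict[str, str]:
    """Return the required settings, or raise naming every missing variable."""
    env = os.environ if environ is None else environ
    # Treat empty strings as missing so a blank template line is caught too.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling load_settings() once during startup surfaces a misconfigured environment immediately instead of at the first request that needs the key.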

🏃 Running the Application

python -m uvicorn app.main:app --reload

Swagger Documentation:

http://127.0.0.1:8000/docs

🔬 Engineering Decisions & Research Thinking

✅ Schema-Driven AI Reliability

Many LLM-backed pipelines break because the model returns output whose shape varies from call to call.

Solution

Gemini responses are bound directly to:

Pydantic schemas
Typed validation contracts
Structured response enforcement

This removes:

❌ JSON corruption
❌ Parsing instability
❌ Invalid payload crashes

✅ Fully Decoupled Backend Design

Feature-isolated routers:

auth.py
notes.py

Prevent:

Circular imports
Tight coupling
Scaling bottlenecks

✅ Defensive Data Layer Engineering

All retrieval systems include:

Explicit index checks
Empty-state guards
Safe array handling

Protecting the backend from:

Runtime failures
Indexing crashes
Null reference errors
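
The guards listed above might look like the following helpers. They are illustrative only: first_or_none and safe_tags are hypothetical names, and rows are modeled as plain lists of dicts, which is the usual shape of a query result.

```python
from typing import Any, Optional, Sequence


def first_or_none(rows: Optional[Sequence[dict[str, Any]]]) -> Optional[dict[str, Any]]:
    """Return the first row, or None when the result is missing or empty."""
    if not rows:  # covers both None and an empty list
        return None
    return rows[0]


def safe_tags(row: Optional[dict[str, Any]]) -> list[str]:
    """Return the row's tags as a list of strings, tolerating None and bad types."""
    if row is None:
        return []
    tags = row.get("tags")
    if not isinstance(tags, list):
        return []
    return [tag for tag in tags if isinstance(tag, str)]


note = first_or_none([])        # empty result set: no IndexError
tags = safe_tags(note)          # None row: no AttributeError
```

A route handler can then turn a None result into a 404 response rather than letting an IndexError surface as a 500.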

✅ Storage Layer Optimization

Notes carry an is_archived flag and are archived rather than destructively deleted.

Benefits:

Historical preservation
Efficient indexing
Lower fragmentation
Better auditability

⚠️ Research Gap & Limitations

While ScribeForge AI achieves strong reliability and structured AI orchestration, several important challenges remain:

AI-generated summaries may still miss contextual nuance
Current retrieval pipeline is keyword-centric, not semantic
No vector embedding search layer currently implemented
Multi-document reasoning is limited
Long-term memory and knowledge graph relationships are not modeled

These limitations highlight the transition required from structured generation systems toward semantic knowledge reasoning systems.

🚀 Research Direction

ScribeForge AI serves as a foundational system for exploring the future of:

AI-assisted productivity systems
Structured knowledge engineering
Semantic retrieval architectures
Human-AI collaborative workflows

Proposed Future Directions

🧠 Semantic Retrieval Systems

Vector embeddings
Hybrid retrieval pipelines
Context-aware ranking

🤖 Agentic AI Workflows

Autonomous knowledge agents
Multi-step reasoning systems
AI task delegation

🕸️ Knowledge Graph Integration

Relationship extraction
Entity linking
Long-term memory systems

⚡ Distributed AI Infrastructure

Queue-based orchestration
Async distributed workers
Horizontal scalability

Key Research Questions

How can AI systems reliably structure human knowledge?
How can semantic retrieval outperform keyword-based systems?
Can AI-generated knowledge systems maintain long-term consistency?
How should AI agents collaborate with human productivity workflows?

This positions ScribeForge AI at the intersection of:

Knowledge Engineering
Information Retrieval
AI Systems Design
Human-Centered AI

📈 Why This Project Stands Out

Unlike basic CRUD note applications, ScribeForge demonstrates:

✅ Production-grade backend engineering
✅ Structured AI integration
✅ Typed schema enforcement
✅ Async API orchestration
✅ Real-world system reliability
✅ Research-oriented architectural thinking

👨‍💻 Developer

Ashish Patel

Focused on building:

AI Infrastructure Systems
Intelligent Backend Architectures
Knowledge Engineering Platforms
Scalable Production APIs

🌟 Final Statement

ScribeForge AI is not just a notes application. It is a production-oriented foundation for intelligent knowledge systems.

⚒️ ScribeForge Turns Raw Information Into Structured Intelligence.
