Automatically syncs NotebookLM notebooks into structured Obsidian notes via a multi-agent LLM pipeline.
I use NotebookLM to organize study material — slides, PDFs, articles — grouped by subject. After each session, I'd manually copy the content into Obsidian to have proper, searchable notes. That got old fast.
nb2ob automates the entire thing: it pulls your notebooks, processes the sources through an agent pipeline, and writes structured markdown notes directly into your vault.
flowchart TD
A["notebooklm-py\npulls notebooks + sources"] --> B["Content Cleaning\nremoves noise via regex"]
B --> C["NotebookLM chat.ask\nsummarizes all sources per notebook"]
C --> D["Clusterizer\ngroups sources by topic"]
D --> E["Orchestrator\ndecides which clusters become files"]
E --> F["Formatter\nwrites structured markdown per file"]
F --> G["Obsidian Local REST API\nsaves notes to vault"]
G --> H["NotebookLM/{notebook}/{topic}.md ✓"]
Each notebook becomes a folder in your vault. Each topic cluster becomes a .md file inside it.
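The folder-per-notebook, file-per-topic layout can be sketched as a small path helper. This is a minimal illustration, not nb2ob's own code: `safe_name`, `note_path`, and the set of replaced characters are all hypothetical.

```python
import re
from pathlib import PurePosixPath

# Characters that are awkward in file names on common platforms.
# This set is an assumption, not nb2ob's actual rule.
_UNSAFE = re.compile(r'[\\/:*?"<>|]+')


def safe_name(name: str) -> str:
    """Replace runs of unsafe characters with a dash and trim whitespace."""
    return _UNSAFE.sub("-", name).strip()


def note_path(root: str, notebook: str, topic: str) -> str:
    """Build the vault-relative path {root}/{notebook}/{topic}.md."""
    return str(PurePosixPath(root) / safe_name(notebook) / f"{safe_name(topic)}.md")


print(note_path("NotebookLM", "Biology 101", "Cell Division"))
```

Sanitizing the names keeps a notebook or topic title containing a slash from silently creating an extra folder level.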
The current pipeline has 3 LLM agents. It evolved from an earlier design that had 5, listed below.
| Agent | Role |
|---|---|
| Cleaner | Received raw source content from the API and removed noise — broken URLs, image tokens, floating newlines — before passing it downstream |
| Summarizer | Condensed each cleaned source into a short summary so the Clusterizer could group them without processing full content |
| Clusterizer | Received all summaries and decided which sources belonged together — since multiple sources can cover the same subject |
| Orchestrator | Received the clusters and their raw content, then called the Formatter once per group |
| Formatter | Turned raw content into a readable, study-ready markdown note |
This design worked conceptually but hit a hard wall in practice: free-tier token limits. Sending full source content through a Cleaner LLM and then a Summarizer LLM exhausted Groq's 500k-token daily quota in a single run, before the pipeline even reached the Clusterizer.
The two most token-heavy agents were replaced with cheaper alternatives:
- Cleaner → regex. URL removal, image token stripping, and newline normalization don't need an LLM.
- Summarizer → chat.ask. NotebookLM already understands the sources. One API call per notebook summarizes all sources at once, at zero LLM token cost.
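A regex-only cleaner along these lines could look like the sketch below. The patterns are illustrative assumptions (for instance, that image tokens are markdown images), not the ones in `api/_cleaning.py`.

```python
import re

# Illustrative patterns only; the real rules in api/_cleaning.py may differ.
IMAGE_TOKEN = re.compile(r"!\[[^\]]*\]\([^)]*\)")  # markdown images: ![alt](src)
URL = re.compile(r"https?://\S+")
MULTI_SPACE = re.compile(r"[ \t]{2,}")
TRAILING_SPACE = re.compile(r"[ \t]+\n")
EXTRA_NEWLINES = re.compile(r"\n{3,}")


def clean(text: str) -> str:
    """Strip image tokens and URLs, then normalize spaces and blank lines."""
    text = IMAGE_TOKEN.sub("", text)       # images first, so their URLs go with them
    text = URL.sub("", text)
    text = MULTI_SPACE.sub(" ", text)      # gaps left behind by removed URLs
    text = TRAILING_SPACE.sub("\n", text)
    text = EXTRA_NEWLINES.sub("\n\n", text)  # at most one blank line in a row
    return text.strip()


raw = "Intro ![fig](http://x.io/a.png)\n\n\n\nSee https://example.com/page for more.\n"
print(clean(raw))
```

Because every step is a plain substitution, the cleaner runs locally and costs no tokens.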
The remaining 3 agents — Clusterizer, Orchestrator, and Formatter — still run through an LLM pipeline with a Groq → Cerebras → OpenRouter fallback chain.
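The fallback idea can be sketched in plain Python: try each provider in order and return the first answer that succeeds. The names `ask_with_fallback`, `groq`, and `cerebras` are stand-ins for illustration; the actual chain lives in `agent/llms.py` and may be built differently, for example with LangChain's `with_fallbacks` on runnables.

```python
from typing import Callable, Sequence

# A provider is a (name, callable) pair; the callable takes a prompt and returns text.
Provider = tuple[str, Callable[[str], str]]


def ask_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider_name, answer) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # e.g. rate limit hit or daily quota exhausted
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers: the first simulates an exhausted quota.
def groq(prompt: str) -> str:
    raise RuntimeError("daily quota exhausted")


def cerebras(prompt: str) -> str:
    return f"[cerebras] {prompt}"


print(ask_with_fallback("cluster these summaries", [("groq", groq), ("cerebras", cerebras)]))
```

Collecting the error from each failed provider makes the final exception useful when every key in the chain is rate-limited at once.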
## 📋 Resumo
## 📚 Conteúdo
### 🔹 [Topic]
...
nb2ob/
├── main.py # CLI entry point (typer)
│
├── agent/
│ ├── graph.py # builds and compiles the LangGraph pipeline
│ ├── llms.py # LLM instances and fallback chain
│ ├── state.py # PipelineState and TypedDicts
│ ├── nodes/ # one file per agent (clusterizer, orchestrator, formatter)
│ └── prompts/ # one file per agent + base.py
│
├── api/
│ ├── __init__.py
│ ├── notebooklm.py # NotebookLM unofficial API wrapper
│ ├── obsidian.py # Obsidian Local REST API wrapper
│ ├── _types.py # shared TypedDicts (Sources, Notebook, SummarizedSource)
│ ├── _cleaning.py # regex-based content cleaning
│ └── _summarizer.py # chat.ask prompt and response parsing
│
├── config/
│ ├── settings.py # Config dataclass with env var validation
│ └── models.py # Model enum and provider maps
│
├── infrastructure/
│ ├── config.py # logger setup
│ ├── decorators.py # log_call decorator
│ └── display.py # banner and spinner
│
└── docs/ # images used in this README
- Python 3.11+
- uv
- Obsidian with the Local REST API plugin installed and active
- A Google account with access to NotebookLM
- API key from at least one supported provider (see table below)
git clone https://github.com/DaviAlcanfor/nb2ob.git
cd nb2ob
uv sync
playwright install chromium
Open Obsidian and go to Settings → Community Plugins.
Click Browse and search for Local REST API with MCP.
Install and enable it. Then go to Settings → Local REST API & MCP Server to find your bearer token.
Copy the token — you'll need it in the next step.
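To show where the bearer token ends up, here is a rough sketch of the request that saves one note. It assumes the plugin exposes the `PUT /vault/{path}` endpoint of the original Local REST API plugin. It uses only the standard library so that it runs without dependencies, whereas nb2ob itself uses `requests`. `build_put_note` is a hypothetical helper, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request


def build_put_note(host: str, port: int, token: str, path: str, markdown: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that writes one note into the vault."""
    # Assumed endpoint shape: {host}:{port}/vault/{url-encoded vault path}
    url = f"{host}:{port}/vault/{urllib.parse.quote(path)}"
    return urllib.request.Request(
        url,
        data=markdown.encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",  # the token copied from the plugin settings
            "Content-Type": "text/markdown",
        },
    )


req = build_put_note("https://127.0.0.1", 27124, "secret", "NotebookLM/Bio/Cell Division.md", "# Cell Division")
print(req.get_method(), req.full_url)
```

Sending the request for real would also require trusting the plugin's certificate, which is typically self-signed.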
notebooklm login
This opens a browser for you to sign in with your Google account. Credentials are stored locally and reused on subsequent runs. If the session expires, run the command again.
cp .env.example .env
# Obsidian Local REST API
OBSIDIAN_TOKEN=<your_bearer_token> # from Settings > Local REST API & MCP Server
OBSIDIAN_HOST=https://127.0.0.1 # change only if you modified the plugin settings
OBSIDIAN_PORT=27124 # change only if you modified the plugin settings
OBSIDIAN_FOLDER=NotebookLM # root folder in your vault where notes will be saved
# LLM Providers (at least one required)
GROQ_API_KEY=<your_groq_api_key> # https://console.groq.com/keys
GEMINI_API_KEY=<your_gemini_api_key> # https://aistudio.google.com/apikey
ANTHROPIC_API_KEY=<your_anthropic_key> # https://console.anthropic.com/keys
CEREBRAS_API_KEY=<your_cerebras_key> # https://cloud.cerebras.ai
OPEN_ROUTER_KEY=<your_openrouter_key> # https://openrouter.ai/settings/keys
# OpenRouter (optional — default already set in code)
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
uv run main.py
At least one API key is required. The pipeline uses Groq as primary with Cerebras and OpenRouter as fallbacks.
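The "at least one key" rule can be checked up front with a few lines of Python. This is a sketch under assumptions: `available_providers` is a hypothetical helper, and the real validation in `config/settings.py` may work differently. Only the variable names come from the `.env` example.

```python
import os
from collections.abc import Mapping

# Variable names taken from the .env example; the check itself is illustrative.
PROVIDER_KEYS = (
    "GROQ_API_KEY",
    "CEREBRAS_API_KEY",
    "OPEN_ROUTER_KEY",
    "GEMINI_API_KEY",
    "ANTHROPIC_API_KEY",
)


def available_providers(env: Mapping[str, str]) -> list[str]:
    """Return the provider key names that are set and non-blank; fail if none are."""
    found = [key for key in PROVIDER_KEYS if env.get(key, "").strip()]
    if not found:
        raise ValueError("set at least one of: " + ", ".join(PROVIDER_KEYS))
    return found


# Passing a mapping keeps the check testable; in practice you would pass os.environ.
print(available_providers({**os.environ, "GROQ_API_KEY": "example-key"}))
```

Failing at startup is cheaper than discovering a missing key after the notebooks have already been pulled and summarized.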
| Provider | Model | Free tier | Get key |
|---|---|---|---|
| Groq | llama-3.3-70b-versatile | 500k tokens/day | console.groq.com |
| Cerebras | llama-3.3-70b | 1M tokens/day | cloud.cerebras.ai |
| OpenRouter | deepseek/deepseek-chat-v3-0324:free | 50 req/day | openrouter.ai |
| Gemini | gemini-2.5-flash | 1500 req/day | aistudio.google.com |
| Anthropic | claude-sonnet-4-6 | paid only | console.anthropic.com |
- notebooklm-py — unofficial NotebookLM Python API
- LangChain + LangGraph — agent pipeline orchestration
- langchain-groq — Groq LLM integration
- langchain-cerebras — Cerebras LLM integration
- langchain-openai — OpenRouter via OpenAI-compatible API
- typer — CLI
- pyfiglet — terminal banner
- yaspin — terminal spinner
- requests — Obsidian API communication
- python-dotenv — environment variables
notebooklm-py is an unofficial library that uses undocumented Google APIs. It is not affiliated with Google and may break without notice. Use at your own risk.
MIT



