Skip to content

Haozhe-Xing/agent_learning

Repository files navigation

Agent Learning Roadmap

🤖 Agent Learning: Learn Agent Development from Scratch

A systematic, comprehensive, and practice-oriented AI Agent development guide

Agent Learning (agent_learning) is an open-source AI Agent learning roadmap and hands-on tutorial covering LLM agents, AI agents, agentic workflows, multi-agent systems, RAG, tool use, memory, LangChain, LangGraph, MCP, and agentic RL.

Daily auto-tracking of arXiv frontier papers — content stays cutting-edge, always.


License: MIT Stars PRs Welcome mdBook Daily arXiv


Read Online Chinese   Read Online English


🐛 Report Issues · 💬 Discussions · 🇨🇳 中文版 README


🚀 Auto-Tracking Frontier: Daily arXiv Paper Updates

🤖 This repository automatically searches arXiv for the latest AI Agent-related papers every day and updates the content accordingly — ensuring you always stay at the cutting edge of research!

  • 📡 Daily Automated Search: A scheduled pipeline scans arXiv daily for new papers on Agent architectures, tool use, memory systems, multi-agent collaboration, reinforcement learning for agents, and more.
  • 📝 Auto-Updated Content: Relevant findings are automatically integrated into the corresponding chapters, keeping the book's frontier sections fresh and up-to-date.
  • 🔔 Never Miss a Breakthrough: No need to manually track dozens of research feeds — this repo does it for you, so you can focus on learning and building.

💡 This means the content you read here is not static — it evolves continuously with the latest advances in the AI Agent field.


✨ Key Features

  • 🎯 Step by Step: From LLM fundamentals to multi-Agent systems, each chapter has a clear knowledge progression
  • 💻 Code First: Every core concept comes with runnable Python code examples
  • 🎨 Rich Illustrations: 120+ hand-drawn SVG architecture diagrams / flowcharts / sequence diagrams for intuitive understanding
  • 🎬 Interactive Animations: 5 built-in interactive HTML animations (Perceive-Think-Act cycle, ReAct reasoning, Function Calling, RAG flow, GRPO sampling)
  • 🔬 Paper Reviews: Key chapters include frontier paper deep-dives (ReAct, Reflexion, MemGPT, GRPO, etc.)
  • 🏗️ Complete Projects: 3 comprehensive hands-on projects (AI Coding Assistant, Intelligent Data Analysis Agent, Multimodal Agent)
  • 🛡️ Production Ready: Covers security, evaluation, deployment, and other production essentials
  • 🧪 Cutting Edge: Covers Context Engineering, Agentic-RL (GRPO/DPO/PPO), MCP/A2A/ANP, and other 2025–2026 latest advances
  • 📐 Formula Support: KaTeX-rendered math formulas for clear reading of policy gradient, KL divergence derivations in RL chapters
  • 🔄 Continuously Updated: Tracking the latest changes in LangChain, LangGraph, MCP, and other frameworks

📸 Selected Content Preview

Below are selected showcases from the book's 120+ hand-drawn SVG illustrations, all original to this book.

🧠 Agent Core Architecture

Perceive-Think-Act Loop (Chapter 1)

Perceive-Think-Act Loop

Agent's core mechanism: Perceive environment → LLM reasoning → Execute action → Loop until goal achieved

ReAct Reasoning Framework (Chapter 6)

ReAct Reasoning Framework

Thought → Action → Observation alternating loop, enabling Agents to think while acting

🛠️ Tool Calling & RAG

Function Calling Complete Flow (Chapter 4)

Function Calling Flow

6-step complete flow from user input to tool invocation to final response, with message structure illustration

RAG Retrieval-Augmented Generation (Chapter 7)

RAG Workflow

Offline indexing + Online retrieval dual-phase architecture, making LLM answers evidence-based

💾 Memory System & Context Engineering

Three-Layer Memory Architecture (Chapter 5)

Three-Layer Memory Architecture

Working memory → Short-term memory → Long-term memory, with important info sinking down and semantic retrieval pulling up

Prompt Engineering vs Context Engineering (Chapter 8)

Prompt Engineering vs Context Engineering

From "how to say it" to "what the LLM sees" — the paradigm shift of the Agent era

🤝 Multi-Agent & Communication Protocols

Three Multi-Agent Communication Patterns (Chapter 14)

Multi-Agent Communication Patterns

Message Queue (async decoupling) / Shared Blackboard (data sharing) / Direct Call (real-time collaboration)

MCP / A2A / ANP Protocol Comparison (Chapter 15)

Three Protocol Comparison

Three-layer protocol stack: ANP for discovery → A2A for task collaboration → MCP for tool invocation

🧪 Reinforcement Learning & Frameworks

GRPO Training Architecture (Chapter 10)

GRPO Training Architecture

No Critic model needed, computes advantage via intra-group normalization, only 1.5× model size in VRAM

LangGraph Three Core Concepts (Chapter 12)

LangGraph Core Concepts

State (shared state) · Node (processing unit) · Edge (execution flow control)

📖 The above is just a selected preview — For the full 120+ architecture diagrams + 5 interactive animations, please read online


🎬 Interactive Animations

This book includes 5 interactive HTML animations to help you intuitively understand the dynamic processes of core concepts:

Animation Chapter Description
🔄 Perceive-Think-Act Cycle Chapter 1 Dynamic demonstration of Agent's core loop
💡 ReAct Reasoning Process Chapter 6 Shows the alternating Thought → Action → Observation process
🔧 Function Calling Chapter 4 Complete tool invocation flow animation
📚 RAG Retrieval Flow Chapter 7 From document chunking to vector retrieval to answer generation
🎯 GRPO Sampling Process Chapter 10 Visualization of intra-group multi-output sampling and reward normalization

💡 Interactive animations are only available in the online e-book. Local builds can also preview them.


🔥 Core Topics at a Glance

🧠 Agent Core Architecture

  • Perceive → Think → Act Loop
  • ReAct Reasoning Framework
  • Task Decomposition & Planning
  • Reflection & Self-Correction

🛠️ Tools & Skills

  • Function Calling Mechanism
  • Custom Tool Design
  • Skill System Construction
  • Tool Description Best Practices

🧪 Reinforcement Learning Training

  • SFT + LoRA Basic Training
  • PPO / DPO / GRPO Algorithm Deep-Dive
  • Complete Training Pipeline Hands-on
  • 2025–2026 Latest Research Advances

💾 Memory, Knowledge & Context

  • Short-term / Long-term / Working Memory
  • Vector Databases (Chroma / FAISS)
  • RAG Retrieval-Augmented Generation
  • Context Engineering & Attention Budget

🤝 Multi-Agent Collaboration & Communication

  • MCP / A2A / ANP Protocol Stack
  • Supervisor vs Decentralized Patterns
  • CrewAI / AutoGen Frameworks
  • LangGraph Stateful Agents

🛡️ Production Full Pipeline

  • Evaluation Benchmarks (GAIA / SWE-bench)
  • Security Defense & Sandbox Isolation
  • Containerized Deployment & Streaming
  • Observability & Cost Optimization

🚀 Quick Start

Local Build

# Install mdBook (choose one)
cargo install mdbook
# Or macOS: brew install mdbook

# Install mdbook-katex plugin (for math formula rendering)
cargo install mdbook-katex

# Clone the repository
git clone https://github.com/Haozhe-Xing/agent_learning.git
cd agent_learning

# Build both Chinese and English versions and start unified server (default port 3000)
./serve.sh

After starting, visit:

  • 🌐 Language Selection Home: http://localhost:3000
  • 🇨🇳 Chinese Version: http://localhost:3000/zh/
  • 🇺🇸 English Version: http://localhost:3000/en/

Environment Setup (For Code Practice)

# Python 3.11+
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install core dependencies
pip install langchain langchain-openai langgraph openai anthropic

# Configure API Key
export OPENAI_API_KEY="your-key-here"

📊 Technology Stack

Python LangChain LangGraph OpenAI Anthropic FastAPI Docker Chroma FAISS mdBook KaTeX


🤝 Contributing

All forms of contribution are welcome!

  • 🐛 Found a bug: Submit an Issue
  • 💡 Content suggestions: Start a Discussion
  • 📝 Improve content: Fork → Edit → Submit PR
  • Support the project: Give this repo a Star!

Contributing Guide

# Fork and clone
git clone https://github.com/YOUR_USERNAME/agent_learning.git

# Create a feature branch
git checkout -b feature/improve-chapter-4

# Local preview
./serve.sh

# Commit and push
git commit -m "feat: improve Chapter 4 tool calling code examples"
git push origin feature/improve-chapter-4

Content Organization Conventions

  • Each chapter is placed in a separate directory src/zh/chapter_xxx/ (Chinese) or src/en/chapter_xxx/ (English)
  • Chapter overview goes in README.md, sections are numbered as 01_xxx.md, 02_xxx.md
  • Chinese SVG illustrations go in src/zh/svg/, English versions in src/en/svg/, naming format: chapter_xxx_description.svg
  • Chinese interactive animations go in src/zh/animations/, English versions in src/en/animations/

📄 License

This project is open-sourced under the MIT License.


⭐ Star History

If this project helps you, please give it a Star ⭐ — it's the greatest encouragement for the author!


Built with ❤️, so that every developer can master AI Agent development

⬆ Back to Top

About

agent learning|从零开始学 AI Agent 开发 | 系统、全面、实战导向的 Agent 开发教程 | 每日自动追踪 arXiv 最新论文 | Learn AI Agent Development from Scratch

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors