By the end of this module, you will:
- ✅ Have a working Codespaces environment
- ✅ Understand the project structure and architecture
- ✅ Configure your OpenAI API key
- ✅ Verify DocumentDB connection
- ✅ Understand the dataset you'll be working with
Before starting, ensure you have:
- A GitHub account
- An OpenAI API key
- A Codespace created from this repository
This workshop is designed to run entirely in GitHub Codespaces, providing a consistent, pre-configured development environment for all participants.
💡 Why Codespaces? No local setup required, consistent environment for everyone, and automatic dependency installation.
Navigate to the repository:
- Go to: https://github.com/documentdb/booking-agents-sample
Open in GitHub Codespaces:
- Click the green "Code" button
- Select the "Codespaces" tab
- Click "Create codespace on workshop"
Wait for the environment to build (first launch takes 2-3 minutes):
- Python 3.11 environment
- Node.js 20
- Docker-in-Docker
- VS Code extensions (DocumentDB, Python, Docker)
- All dependencies automatically installed
Verify Codespace is ready:
- You should see VS Code in your browser
- Extensions should be installed (check the sidebar)
- Terminal should be available at the bottom
Open a terminal (Terminal → New Terminal) and proceed to Activity 2
Now that your environment is ready, let's deploy DocumentDB locally using Docker.
Pull the DocumentDB Docker image:
docker pull ghcr.io/documentdb/documentdb/documentdb-local:latest
Tag the image for convenience:
docker tag ghcr.io/documentdb/documentdb/documentdb-local:latest documentdb
Run the DocumentDB container:
docker run -dt -p 10260:10260 --name documentdb-container documentdb --username admin --password password123
The forwarded ports in Codespaces default to Private, which can block connections between services. You need to make them Public so the frontend, backend, and DocumentDB can communicate.
Open the Ports panel:
- In the terminal area at the bottom of VS Code, click the "Ports" tab (next to Terminal, Output, etc.)
Update port visibility:
- You should see port 10260 (DocumentDB) listed
- Right-click on the port row
- Select "Port Visibility" → "Public"
- Repeat for ports 3000 (frontend) and 8000 (backend) if they are listed
💡 Why Public? In Codespaces, private ports require authentication tokens that automated service-to-service connections don't provide. Setting ports to Public allows the services to reach each other.
Verify the container is running:
docker ps
You should see documentdb-container running on port 10260.
Expected output:
CONTAINER ID   IMAGE        COMMAND                  CREATED          STATUS         PORTS                      NAMES
abc123def456   documentdb   "./entrypoint.sh --u…"   10 seconds ago   Up 9 seconds   0.0.0.0:10260->10260/tcp   documentdb-container
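Beyond docker ps, a quick way to confirm that something is actually listening on the published port is a plain TCP connection attempt. The sketch below uses only the Python standard library; the helper name is_port_open is hypothetical, and a successful connection shows only that the port accepts connections, not that authentication works.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 10260 is the port published by the docker run command above.
    status = "reachable" if is_port_open("localhost", 10260) else "not reachable"
    print(f"DocumentDB port 10260 is {status}")
```

If the port is not reachable, check that the container is still listed by docker ps before changing anything else.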
If the 'DocumentDB for VS Code' extension is not already installed in your Codespace, install it from the VS Code Marketplace. Then follow these steps to connect the extension to your DocumentDB container:
Open the DocumentDB extension:
- Click the DocumentDB icon in the left sidebar (database icon)
- Or press Ctrl+Shift+P and type "DocumentDB"
Add a new connection:
- Click the DocumentDB icon in the VS Code sidebar
- Click "Add New Connection"
- Select "Connection String"
- Paste the connection string:
mongodb://admin:password123@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true&authMechanism=SCRAM-SHA-256
- Confirm the username and password. Both should already be prefilled from the connection string.
Verify the connection:
- You should see your connection in the DocumentDB explorer
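To see what each part of the connection string does, and to test the same connection outside the extension, a minimal sketch follows. The parsing uses only the standard library. The ping helper assumes the pymongo package is available in the Codespace, which this module does not state; both function names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Connection string from the step above (local workshop credentials only).
URI = (
    "mongodb://admin:password123@localhost:10260/"
    "?tls=true&tlsAllowInvalidCertificates=true&authMechanism=SCRAM-SHA-256"
)


def describe_connection_string(uri: str) -> dict:
    """Split a MongoDB-style URI into its parts, masking the password."""
    parts = urlsplit(uri)
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "username": parts.username,
        "password": "*" * len(parts.password or ""),
        "host": parts.hostname,
        "port": parts.port,
        "options": options,
    }


def ping(uri: str, timeout_ms: int = 5000) -> bool:
    """Send a ping command. Assumes pymongo is installed and the container is up.

    Raises a pymongo error if no server can be reached within timeout_ms.
    """
    from pymongo import MongoClient  # imported lazily so parsing works without it

    client = MongoClient(uri, serverSelectionTimeoutMS=timeout_ms)
    try:
        return client.admin.command("ping").get("ok") == 1.0
    finally:
        client.close()
```

Calling describe_connection_string(URI) shows that tls=true enables TLS, tlsAllowInvalidCertificates=true accepts the container's self-signed certificate, and authMechanism selects SCRAM-SHA-256. Calling ping(URI) then exercises the same settings end to end.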
Now that DocumentDB is running and connected, let's load sample data to work with throughout the workshop. You'll use the DocumentDB VS Code extension to import JSON files directly into your database.
The workshop includes a JSON file with sample data that already contains vector embeddings:
- data/embedded_data.json: Combined Airbnb listings with pre-generated embeddings
Open the DocumentDB extension:
- Click the DocumentDB icon in the left sidebar
- Expand your connection to see databases
- Note: Feel free to delete the database "sampledb" from the extension if you see it.
Create the database:
- Right-click on your connection
- Select "Create Database"
- Enter database name: db
- Press Enter
Create the listings collection:
- Expand the db database
- Right-click on the database
- Select "Create Collection"
- Enter collection name: listings
- Press Enter
Import the listings data:
- Right-click on the listings collection
- Select "Import Documents"
- Navigate to: data/embedded_data.json
- Click "Open"
- Wait for the import confirmation message
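To cross-check the import, it can help to know how many documents the file holds and which fields look like embeddings. The sketch below assumes data/embedded_data.json is a single top-level JSON array of objects, which this module does not confirm. The helper name is hypothetical, and "vector field" here simply means any non-empty list of numbers.

```python
import json
from pathlib import Path


def summarize_json_file(path: str) -> dict:
    """Count documents and report list-of-number fields (likely embeddings)."""
    documents = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(documents, list):
        raise ValueError("expected a top-level JSON array of documents")

    vector_fields: dict[str, int] = {}
    for doc in documents:
        for key, value in doc.items():
            is_vector = (
                isinstance(value, list)
                and len(value) > 0
                and all(
                    isinstance(x, (int, float)) and not isinstance(x, bool)
                    for x in value
                )
            )
            if is_vector:
                # Record the vector length the first time a field is seen.
                vector_fields.setdefault(key, len(value))
    return {"count": len(documents), "vector_fields": vector_fields}
```

Running summarize_json_file("data/embedded_data.json") from the project root gives a document count to compare with what the extension shows for the listings collection.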
You need an OpenAI API key to generate embeddings and use chat completions.
Create a .env file in the project root if it doesn't already exist:
cp .env.example .env
Edit .env and add your key:
OPENAI_API_KEY=sk-your-actual-key-here
Save the file
Run this Python snippet to test:
python -c "import os; from dotenv import load_dotenv; load_dotenv(); print('✅ API key configured' if os.getenv('OPENAI_API_KEY') else '❌ API key missing')"
Before moving to Module 1, verify:
- ✅ Codespace is running without errors
- ✅ OpenAI API key is configured
- ✅ DocumentDB connection works
- ✅ You understand the project structure
- ✅ You've explored the dataset
- ✅ You understand the architecture
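For the API key item in the checklist above, a slightly stricter check than "is the variable set" is sketched below. It also catches the placeholder value copied from the example line and never prints the full key. The function name api_key_status is hypothetical, and the check does not contact OpenAI, so it cannot tell whether the key is valid.

```python
import os
from typing import Mapping, Optional


def api_key_status(env: Optional[Mapping[str, str]] = None) -> str:
    """Describe whether OPENAI_API_KEY is set, without printing the full key."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        return "missing"
    if key == "sk-your-actual-key-here":
        # The example value was never replaced with a real key.
        return "placeholder"
    # Show only the last four characters so the key never lands in logs.
    return f"configured (ends with ...{key[-4:]})"
```

After loading .env (for example with load_dotenv() as in the one-liner), print(api_key_status()) reports one of the three states.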
- Converts text to numerical vectors (embeddings)
- Finds similar items by comparing vector distances
- Enables semantic search ("find me something cozy" vs exact keyword match)
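The vector search points above (text becomes vectors, similar items are found by comparing vectors) can be illustrated with a toy example. The three-dimensional vectors below are made up by hand purely for illustration. Real embeddings come from an embedding model and are much longer.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made stand-ins for embeddings of three listing descriptions.
listings = {
    "cozy cabin with fireplace": [0.9, 0.1, 0.2],
    "warm cottage in the woods": [0.8, 0.2, 0.3],
    "downtown loft near nightlife": [0.1, 0.9, 0.4],
}
query = [0.9, 0.1, 0.1]  # stands in for the embedding of "something cozy"

# Rank listings from most to least similar to the query vector.
ranked = sorted(
    listings,
    key=lambda name: cosine_similarity(query, listings[name]),
    reverse=True,
)
print(ranked)
```

Neither of the top two descriptions needs to share a keyword with the query. They rank first only because their vectors point in a similar direction.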
- Native vector search operator
- Supports IVF (Inverted File Index) and HNSW algorithms
- Allows combining vector similarity with other filters
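As a rough sketch of how a vector index and a vector query might be expressed, the helpers below build the command documents as plain Python dictionaries. They are modeled on the cosmosSearch syntax used in DocumentDB's public examples, so verify the exact option names against the version you run. The field name embedding, the dimension count, and both function names are assumptions, not values taken from this workshop.

```python
def vector_index_command(
    collection: str, path: str, dimensions: int, kind: str = "vector-hnsw"
) -> dict:
    """Build a createIndexes command for a vector index (assumed syntax).

    kind "vector-hnsw" selects HNSW; "vector-ivf" is the IVF counterpart in
    this syntax and may need additional options not shown here.
    """
    return {
        "createIndexes": collection,
        "indexes": [
            {
                "name": f"{path}_vector_index",
                "key": {path: "cosmosSearch"},
                "cosmosSearchOptions": {
                    "kind": kind,
                    "similarity": "COS",  # cosine similarity
                    "dimensions": dimensions,
                },
            }
        ],
    }


def vector_search_pipeline(query_vector: list[float], path: str, k: int) -> list[dict]:
    """Build an aggregation pipeline returning the k nearest documents (assumed syntax)."""
    return [
        {"$search": {"cosmosSearch": {"vector": query_vector, "path": path, "k": k}}}
    ]
```

With a pymongo database handle, these would typically be passed as db.command(vector_index_command("listings", "embedding", 1536)) and db["listings"].aggregate(vector_search_pipeline(query_vector, "embedding", 5)). The dimensions value must match the length of the stored vectors.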
- Retrieves relevant documents from a database
- Augments LLM prompts with retrieved context
- Generates accurate, grounded responses
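The retrieve, augment, generate steps above can be sketched without any external service. In this toy version, retrieval ranks an in-memory list instead of querying a database, the vectors are hand-made, and the generation step is left out. All names are hypothetical.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query_vector, documents, k=2):
    """Step 1: return the k documents whose vectors are closest to the query."""
    ranked = sorted(
        documents, key=lambda d: cosine(query_vector, d["vector"]), reverse=True
    )
    return ranked[:k]


def build_prompt(question, context_docs):
    """Step 2: augment the prompt with the retrieved text."""
    context = "\n".join(f"- {d['text']}" for d in context_docs)
    return (
        "Answer using only the listings below.\n"
        f"Listings:\n{context}\n"
        f"Question: {question}"
    )


documents = [
    {"text": "Cabin with a fireplace, sleeps 4", "vector": [0.9, 0.1, 0.2]},
    {"text": "Cottage in the woods, sleeps 2", "vector": [0.8, 0.2, 0.3]},
    {"text": "Downtown loft near nightlife", "vector": [0.1, 0.9, 0.4]},
]
prompt = build_prompt(
    "Where can I stay somewhere cozy?", retrieve([0.9, 0.1, 0.1], documents)
)
print(prompt)  # Step 3 would send this prompt to a chat model.
```

Because the model only sees the retrieved listings, its answer stays grounded in data that actually exists in the collection.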
- Multiple specialized AI agents working together
- Each agent has a specific role/expertise
- Agents coordinate to solve complex tasks
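The idea of specialized agents coordinated on one request can be shown with a deliberately simple sketch. Here each "agent" is just a function with a keyword-based trigger, and the coordinator is a loop. Real agents would call an LLM and tools, and every name below is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    name: str
    keywords: tuple[str, ...]
    respond: Callable[[str], str]

    def can_handle(self, request: str) -> bool:
        """An agent takes a request if it mentions one of its keywords."""
        return any(word in request.lower() for word in self.keywords)


def coordinate(request: str, agents: list[Agent]) -> list[str]:
    """Ask every agent whose specialty matches, and collect the replies in order."""
    return [f"{a.name}: {a.respond(request)}" for a in agents if a.can_handle(request)]


agents = [
    Agent("search", ("find", "cozy"), lambda r: "looking up matching listings"),
    Agent("booking", ("book", "reserve"), lambda r: "checking availability"),
]
replies = coordinate("Find something cozy and book it for Friday", agents)
print(replies)
```

A request that touches both specialties is handled by both agents, while one that mentions only booking reaches only the booking agent.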
- Wait a few minutes (initial build takes 2-3 min)
- Check GitHub status page
- Try rebuilding: Codespaces menu → Rebuild Container
# Check if DocumentDB container is running
docker ps | grep documentdb
# Check logs (the container is named documentdb-container)
docker logs documentdb-container
- Verify your API key is correct
- Check you have credits in your OpenAI account
- Make sure the key has permission to use embeddings and chat APIs
# Reinstall dependencies
pip install -r requirements.txt
You're all set! Time to build your first vector search implementation.
Continue to: Module 1: Vector Search Fundamentals
Questions? Ask your instructor or check the troubleshooting guide.