This guide covers the prerequisites, installation, and configuration of CodeRAG.
- Node.js: Version 18 or higher
- Neo4J Database: Version 5.11 or higher (required for vector indexes and semantic search)
- Git: For cloning the repository
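The Node.js requirement above can be checked before installing. The following is a minimal sketch, assuming `node --version` prints a string of the form `v18.17.0`; the helper names `node_major` and `node_version_ok` are illustrative and not part of CodeRAG.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CodeRAG) for checking the Node.js prerequisite.

# Extract the major version from a string such as "v18.17.0".
node_major() {
  printf '%s\n' "$1" | sed -e 's/^v//' -e 's/\..*$//'
}

# Succeed when the version string meets the minimum major version (default 18).
node_version_ok() {
  [ "$(node_major "$1")" -ge "${2:-18}" ]
}

if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node --version)" 18; then
    echo "Node.js $(node --version) is new enough"
  else
    echo "Node.js $(node --version) is too old; 18 or higher is required" >&2
  fi
else
  echo "node not found on PATH" >&2
fi
```

Only the major version is compared, which is enough for an "18 or higher" requirement.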
# Clone the repository
git clone https://github.com/JonnoC/CodeRAG.git
cd CodeRAG
# Install dependencies
npm install
# Build the project
npm run build
Create a .env file in the project root:
# Neo4j Database Configuration
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_neo4j_password
# Project Configuration
PROJECT_ISOLATION_STRATEGY=shared_db
DEFAULT_PROJECT_ID=default
CROSS_PROJECT_ANALYSIS=false
MAX_PROJECTS_SHARED_DB=100
# Semantic Search Configuration (Optional)
# Options: openai, local, disabled
SEMANTIC_SEARCH_PROVIDER=disabled
# OpenAI Configuration (required if SEMANTIC_SEARCH_PROVIDER=openai)
# OPENAI_API_KEY=sk-your-openai-api-key-here
# Embedding Model Configuration
# EMBEDDING_MODEL=text-embedding-3-small
# EMBEDDING_MAX_TOKENS=8000
# EMBEDDING_BATCH_SIZE=100
# SIMILARITY_THRESHOLD=0.7
# Remote Repository Authentication (Optional)
# GitHub
# GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxx
# GitLab
# GITLAB_TOKEN=glpat-xxxxxxxxxxxxxxxx
# GITLAB_HOST=gitlab.company.com # Optional, for self-hosted GitLab
# Bitbucket
# BITBUCKET_USERNAME=your_username
# BITBUCKET_APP_PASSWORD=your_app_password
# Optional: Server configuration
SERVER_PORT=3000
LOG_LEVEL=info
To enable semantic code search with natural language queries:
- Get an OpenAI API key from OpenAI Platform
- Update your .env file:
  SEMANTIC_SEARCH_PROVIDER=openai
  OPENAI_API_KEY=sk-your-openai-api-key-here
  EMBEDDING_MODEL=text-embedding-3-small
- Ensure Neo4j 5.11+ for vector index support (required for semantic search)
- Initialize semantic search after first project scan:
  npm run build
  node build/index.js --tool initialize_semantic_search
For detailed semantic search configuration, see the Semantic Search Guide.
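A common misconfiguration is selecting the openai provider while leaving OPENAI_API_KEY commented out. The sketch below checks a dotenv-style file for that case. It assumes simple `KEY=value` lines without quoting or trailing comments, and it treats a missing SEMANTIC_SEARCH_PROVIDER as `disabled`, which is an assumption rather than documented CodeRAG behavior. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CodeRAG) for sanity-checking a .env file.

# Print the value of a key from a dotenv-style file; commented lines are ignored.
# $1 = file, $2 = key. Prints nothing when the key is absent.
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Return 0 when the semantic search settings in the file are consistent.
check_semantic_config() {
  provider=$(env_value "$1" SEMANTIC_SEARCH_PROVIDER)
  # Assumption: an unset provider behaves like "disabled".
  case "${provider:-disabled}" in
    disabled|local)
      return 0 ;;
    openai)
      if [ -z "$(env_value "$1" OPENAI_API_KEY)" ]; then
        echo "SEMANTIC_SEARCH_PROVIDER=openai but OPENAI_API_KEY is not set" >&2
        return 1
      fi
      return 0 ;;
    *)
      echo "Unknown SEMANTIC_SEARCH_PROVIDER: $provider" >&2
      return 1 ;;
  esac
}

if [ -f .env ]; then
  if check_semantic_config .env; then
    echo "Semantic search settings look consistent"
  fi
fi
```

Running the script from the project root prints a warning to stderr when the provider and key disagree.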
CodeRAG can directly analyze remote repositories from GitHub, GitLab, and Bitbucket without requiring local cloning. To access private repositories, configure authentication tokens:
- Create a GitHub Personal Access Token:
  - Go to GitHub Settings > Developer settings > Personal access tokens
  - Click "Generate new token (classic)"
  - Select scopes: repo (for private repos) or public_repo (for public repos only)
  - Copy the generated token
- Configure in .env:
  GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxx
- Create a GitLab Personal Access Token:
  - Go to GitLab Settings > Access Tokens
  - Create token with read_repository scope
  - Copy the generated token
- Configure in .env:
  GITLAB_TOKEN=glpat-xxxxxxxxxxxxxxxx
  # For self-hosted GitLab:
  GITLAB_HOST=gitlab.company.com
- Create a Bitbucket App Password:
  - Go to Bitbucket Settings > Personal Bitbucket settings > App passwords
  - Create password with Repositories: Read permission
  - Copy the generated password
- Configure in .env:
  BITBUCKET_USERNAME=your_username
  BITBUCKET_APP_PASSWORD=your_app_password
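Before scanning a private repository, it can help to confirm that the matching credentials are present in the environment. The sketch below maps a repository URL to the variable names listed above and reports any that are unset. The host matching is an assumption for illustration: it covers only the three public hosts, does not handle a self-hosted GitLab configured through GITLAB_HOST, and may differ from how CodeRAG itself selects credentials. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CodeRAG) for checking remote credentials.

# Print the .env variable names relevant to a repository URL.
required_auth_vars() {
  case "$1" in
    https://github.com/*)    echo "GITHUB_TOKEN" ;;
    https://gitlab.com/*)    echo "GITLAB_TOKEN" ;;
    https://bitbucket.org/*) echo "BITBUCKET_USERNAME BITBUCKET_APP_PASSWORD" ;;
    *)                       echo "" ;;
  esac
}

# Print the subset of those variables that are unset or empty in this shell.
missing_auth_vars() {
  missing=""
  for name in $(required_auth_vars "$1"); do
    # The names come from the fixed list above, so eval is safe here.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  echo "${missing# }"
}
```

For example, `missing_auth_vars https://bitbucket.org/private/repo.git` prints the Bitbucket variables that still need to be exported, or an empty line when both are set.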
Once authentication is configured, you can scan remote repositories directly:
# Public repositories (no authentication needed)
npm run scan https://github.com/owner/repo.git
# Private repositories (uses configured tokens)
npm run scan https://github.com/private/repo.git
npm run scan https://gitlab.com/private/repo.git
npm run scan https://bitbucket.org/private/repo.git
# Specific branches
npm run scan https://github.com/owner/repo.git -- --branch develop
CodeRAG requires a running Neo4J instance. Here are several setup options:
- Download Neo4J Desktop
- Install and create a new project
- Create a new database with:
  - Name: coderag (or your preference)
  - Password: Set a secure password
  - Version: 5.11+
- Start the database
- Note the connection details (usually bolt://localhost:7687)
# Run Neo4J in Docker
docker run \
--name neo4j-coderag \
-p 7474:7474 -p 7687:7687 \
-d \
-v $HOME/neo4j/data:/data \
-v $HOME/neo4j/logs:/logs \
-v $HOME/neo4j/import:/var/lib/neo4j/import \
-v $HOME/neo4j/plugins:/plugins \
--env NEO4J_AUTH=neo4j/your_password \
neo4j:5.12
- Sign up at Neo4J Aura
- Create a free instance
- Note the connection URI and credentials
- Update your .env file with the cloud connection details
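Whichever option is used, the value of NEO4J_URI has to point at a reachable Bolt endpoint. The sketch below splits a URI into host and port, falling back to 7687 (the default Bolt port) when none is given, and optionally probes it with `nc`. It assumes an `nc` variant that supports `-z` and `-w`, and it does not handle IPv6 literals. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CodeRAG) for inspecting NEO4J_URI.

# Split a URI such as bolt://localhost:7687 into "host port".
neo4j_host_port() {
  rest=${1#*://}      # drop the scheme
  rest=${rest%%/*}    # drop any path component
  host=${rest%%:*}
  case "$rest" in
    *:*) port=${rest##*:} ;;
    *)   port=7687 ;;   # default Bolt port
  esac
  echo "$host $port"
}

# Probe the endpoint with nc. Returns 2 when nc is not installed.
neo4j_reachable() {
  set -- $(neo4j_host_port "$1")
  if ! command -v nc >/dev/null 2>&1; then
    echo "nc not installed; skipping probe" >&2
    return 2
  fi
  nc -z -w 3 "$1" "$2"
}
```

A typical call is `neo4j_reachable "$NEO4J_URI" && echo "Neo4j port is open"`. An open port shows only that something is listening; it does not validate the credentials.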
CodeRAG supports two modes of operation:
STDIO mode is used when integrating with MCP-compatible AI tools:
# Start in STDIO mode
npm start
# Or use the built version
npm run start:built
HTTP mode provides an HTTP endpoint for web-based access:
# Start HTTP server on port 3000
npm start -- --sse --port 3000
# Or specify a different port
npm start -- --sse --port 8080
For development with auto-reload:
npm run dev
To verify your installation:
- Test Neo4J Connection: Ensure your Neo4J instance is running and accessible
- Build Success: Run npm run build without errors
- STDIO Mode: Run npm start and verify it starts without connection errors
- HTTP Mode: Run npm start -- --sse --port 3000 and visit http://localhost:3000/health
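The HTTP check above can be scripted so that it tolerates a server that is still starting. This is a minimal sketch, assuming `curl` is installed and that the `/health` endpoint returns a successful HTTP status once the server is ready; the response body is not inspected. The `retry` and `check_health` names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CodeRAG) for polling the health endpoint.

# Run a command up to $1 times, sleeping $2 seconds between attempts.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Succeed when the health endpoint answers with a non-error HTTP status.
# $1 = port (defaults to 3000).
check_health() {
  curl -fsS "http://localhost:${1:-3000}/health" >/dev/null
}
```

With the server started in HTTP mode, `retry 10 1 check_health 3000 && echo "CodeRAG is up"` polls for up to about ten seconds before giving up.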
- Integration with AI Tools - Connect CodeRAG to your preferred AI assistant
- Scanner Usage - Learn how to scan your codebase
- Multi-Project Management - Manage multiple codebases