A VSCode extension that replicates GitHub Copilot Agent behavior using local LLMs via Ollama. Fully private: no data leaves your machine.
Key Features:
- Agent Mode: Autonomous coding assistant that reads files, edits code, and runs commands
- Chat Mode: Conversational Q&A about your codebase
- Inline Completions: Ghost text suggestions as you type
- Fully Local: All inference runs on your machine via Ollama
- Deep Code Analysis: Traces through nested service/repository chains
- Requirements
- Installation
- Quick Start
- Features
- Configuration
- Keyboard Shortcuts
- Troubleshooting
- Security
Install Ollama from ollama.com or via terminal:
# macOS/Linux
curl -fsSL https://ollama.com/install.sh | sh
# Windows - Download from https://ollama.com/download

Important: Ollama must be running for Ciper Agent to work. It listens on http://localhost:11434 by default.
# Start Ollama (usually runs automatically after install)
ollama serve
# Verify it's running
curl http://localhost:11434/api/tags

# Recommended for coding tasks
ollama pull qwen2.5-coder:7b
# For more complex tasks (needs more VRAM)
ollama pull qwen2.5-coder:14b
# Smaller models for inline completions only
ollama pull qwen2.5-coder:1.5b

VSCode 1.90 or later.
Node.js v18+ for building the extension.
# 1. Clone the repository
git clone <repo-url>
cd ciper-agent
# 2. Install dependencies
npm install
# 3. Build the extension
npm run build
# 4. Open in VSCode
code .
# 5. Press F5 to launch Extension Development Host

This opens a new VSCode window with Ciper Agent loaded. Changes to source code will rebuild automatically in watch mode.
# 1. Build the .vsix package
npm run package
# 2. Install it
code --install-extension backend/ciper-agent-*.vsix
# 3. Restart VSCode

Or in VSCode: Extensions panel → ⋯ menu → Install from VSIX...
| Method | How |
|---|---|
| Click the Ciper Agent icon | Left sidebar activity bar |
| Keyboard | Ctrl+Alt+I (Mac: Cmd+Alt+I) |
| Command Palette | Ctrl+Shift+P → Ciper: Open Chat |
- Open the chat panel
- Select Agent mode (not Chat) for code tasks
- Type your request, e.g.:
- "Explain the main.go file"
- "Find all TODO comments"
- "Add error handling to the login function"
- Press Enter to send
| Feature | Chat Mode | Agent Mode |
|---|---|---|
| File access | Read only | Full read/edit |
| Tool execution | None | Read, write, search, run commands |
| Use case | Q&A, explanations | Code changes, refactoring |
| Best for | Understanding code | Implementing features |
You: "Extract the raw SQL from GetUserById"
Agent:
1. Reads the main file → finds userService
2. Reads userService → finds UserRepository
3. Reads UserRepository → finds the SQL query
4. Outputs: `SELECT id, name, email FROM users WHERE id = ?`
The agent traces through nested dependencies automatically. No need to specify exact file paths.
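The tracing behavior described above can be pictured as a bounded loop: the model picks a tool, sees the result, and repeats until it has an answer or hits the iteration cap. The sketch below is illustrative only and is not taken from Ciper Agent's source. `runAgent`, `Action`, `Decide`, the `readFile` tool, and the file `userService.go` are hypothetical names, and the model is replaced by a scripted stand-in so the example runs without Ollama.

```typescript
// Hypothetical types: none of these names come from Ciper Agent's source.
type Action =
  | { kind: "tool"; tool: string; input: string }
  | { kind: "final"; answer: string };

type Tool = (input: string) => string;
type Decide = (history: string[]) => Action;

// Runs a tool loop until the model returns a final answer or the cap is hit.
function runAgent(
  request: string,
  decide: Decide,
  tools: Record<string, Tool>,
  maxIterations = 20, // mirrors the documented maxAgentIterations default
): string {
  const history: string[] = [`user: ${request}`];
  for (let i = 0; i < maxIterations; i++) {
    const action = decide(history);
    if (action.kind === "final") return action.answer;
    const tool = tools[action.tool];
    const result = tool ? tool(action.input) : `unknown tool: ${action.tool}`;
    history.push(`${action.tool}(${action.input}) -> ${result}`);
  }
  return "stopped: iteration limit reached";
}

// Scripted stand-in for the model: read three files in order, then answer.
const files: Record<string, string> = {
  "main.go": "calls userService.GetUserById",
  "userService.go": "calls UserRepository.FindById",
  "userRepository.go": "SELECT id, name, email FROM users WHERE id = ?",
};
const script = ["main.go", "userService.go", "userRepository.go"];

const decide: Decide = (history) => {
  const step = history.length - 1; // number of tool results so far
  if (step < script.length) {
    return { kind: "tool", tool: "readFile", input: script[step] };
  }
  // The last history entry holds the content of userRepository.go.
  return { kind: "final", answer: history[history.length - 1].split(" -> ")[1] };
};

const answer = runAgent("Extract the raw SQL from GetUserById", decide, {
  readFile: (p) => files[p] ?? "not found",
});
console.log(answer); // prints the SQL string stored under userRepository.go
```

A real loop would be asynchronous and would parse the model's output into actions; the cap is what keeps a confused model from running indefinitely.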
As you type, Ciper suggests code in faded grey text:
function calculateTotal(items) {
  return items.reduce((sum, item) => {
Press Tab to accept the suggestion. Completion requests are debounced by 300 ms and time out after 2 seconds.
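To make the debounce and timeout behavior concrete, here is a minimal sketch of both ideas. It is not Ciper Agent's implementation. `createDebouncer`, `withTimeout`, and the `Timers` interface are hypothetical names; the timer functions are injectable only so the logic can be exercised without waiting.

```typescript
// Hypothetical helper types; injectable so the debounce logic is easy to test.
interface Timers {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Returns a function that runs only the most recent action, and only after
// delayMs have passed without another call.
function createDebouncer(delayMs: number, timers: Timers = realTimers) {
  let pending: unknown;
  let hasPending = false;
  return (action: () => void): void => {
    if (hasPending) timers.clear(pending); // a newer keystroke cancels the older request
    hasPending = true;
    pending = timers.set(() => {
      hasPending = false;
      action();
    }, delayMs);
  };
}

// Resolves to undefined if the work has not finished within ms milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T | undefined> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve(undefined), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Three quick keystrokes produce a single request, for the latest text.
const requestCompletion = createDebouncer(300);
requestCompletion(() => console.log("request for 'f'"));
requestCompletion(() => console.log("request for 'fu'"));
requestCompletion(() => console.log("request for 'fun'")); // only this one runs
```

With the documented defaults, a request would be wrapped as `withTimeout(fetchCompletion(), 2000)`, where `fetchCompletion` is a placeholder for whatever call produces the suggestion.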
When the agent wants to modify files:
- A diff preview appears in the chat panel
- Review the changes
- Click Apply Changes to confirm
- Or click Discard to cancel
To auto-approve all changes (no preview):
{
"ciperAgent.requireApprovalForEdits": false
}

Select code in the editor, then right-click:
- Ask Ciper: Send the selection and your question to the agent
- Fix with Ciper: Ask the agent to fix issues in the selection
Keyboard shortcut: Ctrl+Shift+I
Open VSCode settings (Ctrl+,) and search for Ciper Agent:
| Setting | Default | Description |
|---|---|---|
| `ollamaEndpoint` | `http://localhost:11434` | Ollama API URL |
| `model` | `qwen2.5-coder:7b` | Primary model for chat & agent |
| `contextTokenBudget` | `8192` | Max tokens for workspace context |
| Setting | Default | Description |
|---|---|---|
| `maxAgentIterations` | `20` | Max iterations before auto-stop |
| `requireApprovalForEdits` | `true` | Show diff preview before file changes |
| Setting | Default | Description |
|---|---|---|
| `enableInlineCompletions` | `true` | Enable ghost text suggestions |
| `completionDebounceMs` | `300` | Delay before triggering completion |
| `completionModel` | (uses main model) | Separate model for completions |
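The defaults in these tables, including the fallback of `completionModel` to the main model, can be summarized in a small sketch. This is an illustration of the documented behavior, not the extension's code. `CiperSettings`, `DEFAULTS`, and `resolveSettings` are hypothetical names.

```typescript
// Hypothetical shape mirroring the documented ciperAgent.* settings.
interface CiperSettings {
  ollamaEndpoint: string;
  model: string;
  contextTokenBudget: number;
  maxAgentIterations: number;
  requireApprovalForEdits: boolean;
  enableInlineCompletions: boolean;
  completionDebounceMs: number;
  completionModel: string;
}

// Defaults as listed in the settings tables. completionModel has no fixed
// default; it falls back to the main model.
const DEFAULTS: Omit<CiperSettings, "completionModel"> = {
  ollamaEndpoint: "http://localhost:11434",
  model: "qwen2.5-coder:7b",
  contextTokenBudget: 8192,
  maxAgentIterations: 20,
  requireApprovalForEdits: true,
  enableInlineCompletions: true,
  completionDebounceMs: 300,
};

// Merges user overrides over the defaults and applies the completion fallback.
function resolveSettings(user: Partial<CiperSettings> = {}): CiperSettings {
  const merged = { ...DEFAULTS, ...user };
  return { ...merged, completionModel: user.completionModel || merged.model };
}

const settings = resolveSettings({ model: "codellama:13b" });
console.log(settings.completionModel); // "codellama:13b": follows the main model
```

Inside an extension, the overrides would typically come from `vscode.workspace.getConfiguration("ciperAgent")` rather than a literal object.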
| Use Case | Chat/Agent Model | Completion Model |
|---|---|---|
| Fast dev (8GB VRAM) | `qwen2.5-coder:7b` | `qwen2.5-coder:1.5b` |
| Balanced (12GB VRAM) | `qwen2.5-coder:14b` | `qwen2.5-coder:3b` |
| Max quality (24GB VRAM) | `qwen2.5-coder:32b` | `qwen2.5-coder:7b` |
{
// Point to remote Ollama if not on localhost
"ciperAgent.ollamaEndpoint": "http://192.168.1.100:11434",
// Use a different default model
"ciperAgent.model": "codellama:13b",
// Faster completions with smaller model
"ciperAgent.completionModel": "qwen2.5-coder:1.5b",
// Allow auto-apply without preview
"ciperAgent.requireApprovalForEdits": false,
// More context for complex projects
"ciperAgent.contextTokenBudget": 16384
}

| Shortcut | Action |
|---|---|
| Ctrl+Alt+I | Open Ciper chat panel |
| Ctrl+Shift+I | Ask Ciper about selection |
| Tab | Accept inline completion |
| Esc | Cancel a running agent |
All shortcuts can be customized in VSCode: Preferences → Keyboard Shortcuts
- Check that Ollama is running: `curl http://localhost:11434/api/tags`
- Pull a model: `ollama pull qwen2.5-coder:7b`
- Verify that the endpoint setting matches your Ollama install
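The three checks above can be combined into one diagnostic. The sketch below assumes the `/api/tags` response has the shape `{ "models": [{ "name": "..." }] }`. `tagsUrl`, `hasModel`, and `diagnose` are hypothetical names, and the HTTP call is passed in as a function so the logic stays independent of any particular client.

```typescript
// Assumed shape of the /api/tags response body (only the field used here).
interface TagsResponse {
  models: { name: string }[];
}

type FetchJson = (url: string) => Promise<unknown>;

// Builds the tags URL, tolerating a trailing slash in the configured endpoint.
function tagsUrl(endpoint: string): string {
  return `${endpoint.replace(/\/+$/, "")}/api/tags`;
}

// Model names include the tag, e.g. "qwen2.5-coder:7b".
function hasModel(tags: TagsResponse, model: string): boolean {
  return tags.models.some((m) => m.name === model);
}

// Returns a human-readable diagnosis for the configured endpoint and model.
async function diagnose(endpoint: string, model: string, fetchJson: FetchJson): Promise<string> {
  let body: unknown;
  try {
    body = await fetchJson(tagsUrl(endpoint));
  } catch {
    return `Ollama is not reachable at ${endpoint}; try "ollama serve"`;
  }
  const tags = body as Partial<TagsResponse>;
  if (!Array.isArray(tags.models)) {
    return `Unexpected response from ${endpoint}; check the endpoint setting`;
  }
  if (!hasModel({ models: tags.models }, model)) {
    return `Model ${model} is not installed; try "ollama pull ${model}"`;
  }
  return "ok";
}

// Example wiring on Node.js 18+, which provides a global fetch:
//   diagnose("http://localhost:11434", "qwen2.5-coder:7b",
//            (url) => fetch(url).then((r) => r.json())).then(console.log);
```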
- Make sure you have a workspace folder open (File → Open Folder)
- Check the file path in the chat; relative paths are resolved from the workspace root
- For files outside the workspace, use absolute paths
The agent should automatically read files before analyzing them. If it doesn't:
- Use Agent mode (not Chat mode)
- Be specific: "Read the userRepository.go file and extract the SQL query"
- The agent traces nested dependencies automatically
- The agent uses unified diff format, and some models produce imperfect diffs
- Try rephrasing: "rewrite the entire function X" instead of "add a line"
- Check the file hasn't been modified externally
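One common way a model-generated unified diff goes wrong is a hunk header whose line counts do not match the hunk body. The following sketch checks that consistency for a single hunk. It illustrates the diff format, not how Ciper Agent validates diffs. `checkHunk` and `HunkCheck` are hypothetical names.

```typescript
interface HunkCheck {
  ok: boolean;
  reason?: string;
}

// Validates one hunk: lines[0] is the "@@ -a,b +c,d @@" header, the rest is the body.
function checkHunk(lines: string[]): HunkCheck {
  const header = /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@/.exec(lines[0] ?? "");
  if (!header) return { ok: false, reason: "missing or malformed @@ header" };

  // An omitted count means 1 in unified diff format.
  const expectedOld = header[2] === undefined ? 1 : Number(header[2]);
  const expectedNew = header[4] === undefined ? 1 : Number(header[4]);

  let oldCount = 0;
  let newCount = 0;
  for (const line of lines.slice(1)) {
    if (line.startsWith("\\")) continue; // "\ No newline at end of file"
    const marker = line[0];
    if (marker === " " || line === "") {
      oldCount++; // context lines count on both sides
      newCount++;
    } else if (marker === "-") {
      oldCount++;
    } else if (marker === "+") {
      newCount++;
    } else {
      return { ok: false, reason: `unexpected line: ${JSON.stringify(line)}` };
    }
  }

  if (oldCount !== expectedOld || newCount !== expectedNew) {
    return {
      ok: false,
      reason: `header says -${expectedOld}/+${expectedNew}, body has -${oldCount}/+${newCount}`,
    };
  }
  return { ok: true };
}

const hunk = [
  "@@ -1,3 +1,4 @@",
  " function login(user) {",
  "-  return auth(user);",
  "+  if (!user) throw new Error('no user');",
  "+  return auth(user);",
  " }",
];
console.log(checkHunk(hunk).ok); // true: 3 old lines, 4 new lines
```

When a check like this fails, asking for a full rewrite of the function sidesteps the line counting entirely, which is why the rephrasing tip above tends to help.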
- Completions are debounced; pause briefly after typing
- Check that `ciperAgent.enableInlineCompletions` is `true`
- Try a faster completion model: `"ciperAgent.completionModel": "qwen2.5-coder:1.5b"`
- Increase the iteration limit: `"ciperAgent.maxAgentIterations": 50`
- Use a more capable model
- Break complex tasks into smaller steps
- Check chat panel for parse error messages
- Make sure you ran `npm run build` before pressing F5
- Check the Extension Development Host console: Help → Toggle Developer Tools
- Try reloading: Ctrl+Shift+P → Developer: Reload Window
For slower models:
- Reduce the context budget: `"ciperAgent.contextTokenBudget": 4096`
- Use a dedicated fast model for completions
- Close other applications using GPU
| Protection | How it works |
|---|---|
| File writes require approval | Default: the user must explicitly approve each file change |
| Dangerous commands blocked | `rm -rf`, `sudo`, `curl \| bash`, etc. are prevented |
| Path traversal prevention | Cannot read/write files outside workspace folder |
| No network calls | All inference runs locally via Ollama |
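
Path traversal prevention usually comes down to resolving the requested path and confirming that the result is still under the workspace root. The sketch below shows one way to do that with Node's `path` module. It is an assumption about the general technique, not Ciper Agent's actual check, and `resolveInsideWorkspace` is a hypothetical name. It does not follow symlinks, which a stricter check would resolve first.

```typescript
import * as path from "node:path";

// Returns the absolute path if it stays inside workspaceRoot, otherwise undefined.
function resolveInsideWorkspace(workspaceRoot: string, requested: string): string | undefined {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, requested); // an absolute `requested` replaces root here
  const rel = path.relative(root, target);
  const escapes =
    rel === ".." ||
    rel.startsWith(".." + path.sep) || // climbs above the root
    path.isAbsolute(rel);              // e.g. a different drive on Windows
  return escapes ? undefined : target;
}

console.log(resolveInsideWorkspace("/ws", "src/main.go"));    // "/ws/src/main.go" on POSIX
console.log(resolveInsideWorkspace("/ws", "../etc/passwd"));  // undefined
```

Comparing with `path.relative` rather than a plain string prefix avoids treating `/ws-other` as if it were inside `/ws`.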
The following patterns are blocked:
- `rm -rf` (recursive delete)
- `sudo` (privilege escalation)
- `curl | bash` / `wget | sh` (pipe to shell)
- `dd if=` (direct disk write)
- Fork bombs (`:(){:|:&};:`)
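
A blocklist like this can be sketched as a set of regular expressions checked before a command runs. The patterns below are illustrative guesses at how the listed categories might be matched. They are not the extension's actual rules, and `BLOCKED` and `findBlockedReason` are hypothetical names.

```typescript
// Hypothetical patterns for the categories listed above.
const BLOCKED: { pattern: RegExp; reason: string }[] = [
  { pattern: /\brm\s+-\w*r\w*f|\brm\s+-\w*f\w*r/i, reason: "recursive delete" },
  { pattern: /(^|[\s;&|])sudo\b/, reason: "privilege escalation" },
  { pattern: /\b(curl|wget)\b[^|]*\|\s*(ba|z|da)?sh\b/, reason: "pipe to shell" },
  { pattern: /\bdd\b[^|;&]*\bif=/, reason: "direct disk write" },
  { pattern: /:\s*\(\s*\)\s*\{\s*:\s*\|\s*:\s*&\s*\}\s*;\s*:/, reason: "fork bomb" },
];

// Returns the reason a command is blocked, or undefined if no pattern matches.
function findBlockedReason(command: string): string | undefined {
  return BLOCKED.find((rule) => rule.pattern.test(command))?.reason;
}

console.log(findBlockedReason("rm -rf /"));  // "recursive delete"
console.log(findBlockedReason("npm test"));  // undefined
```

Pattern matching of this kind is a safety net rather than a sandbox: variants such as `rm -r -f` slip past these particular expressions, so reviewing what the agent proposes to run still matters.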
- Review diffs before approving: don't auto-approve blindly
- Use Agent mode for changes: Chat mode can't modify files
- Start with read-only queries: "Explain X" before "Fix X"
- Keep Ollama updated: re-run the install script on Linux, or let the desktop app update itself on macOS and Windows
# Remove extension
code --uninstall-extension ciper-agent
# Or in VSCode: Extensions → Ciper Agent → Uninstall

License: MIT
See docs/ for architecture documentation.
# Watch mode - auto-rebuild on changes
npm run watch
# Build production bundle
npm run build
# Run tests
npm test
# Package as .vsix
npm run package