IMPORTANT WARNING: Using this software may violate Google's Terms of Service. This project is provided for educational and research purposes only. I take no responsibility for any consequences that may arise from using this software, including but not limited to account suspension, service termination, or legal action by Google. Use at your own risk.
- Overview
- Project History
- System Architecture
- Features
- Setup
- Usage
- Deployment
- Security
- Contributing
- Monitoring
- License
- Support
The Gemini-CLI-Proxy is a high-availability, OpenAI-compatible API endpoint and credential rotator designed to provide seamless access to Google's Gemini AI models through both Gemini CLI OAuth credentials and AI Studio API keys. Built as a Node.js/TypeScript application deployed on Cloudflare Workers, it implements a robust architecture with load balancing, automatic credential rotation, and failover mechanisms to ensure maximum reliability and performance.
This application is a partial rewrite of my previous application, gemini-loadbalance-rotator-worker, which was a fork of Kevin Yuan's Gemini-loadbalance-worker. Kevin Yuan's application itself was a fork of Jaap Gewoon's gemini-cli-openai.
This application is forked from gemini-cli-openai and implements Gemini OAuth credential rotation as well as AI Studio API key rotation. It features load balancing inspired by gemini-loadbalance-worker and rotation logic from gemini-loadbalance-rotator-worker, which itself was based on a personal private project of mine, gemini-key-rotator (a Python AI Studio API key rotator).
We extend our sincere gratitude to the original authors—Jaap Gewoon, Kevin Yuan, and the broader open-source community—for their foundational work that made this project possible. Their contributions have been instrumental in advancing accessible and reliable Gemini API integration.
The application's architecture is designed for high availability and resilience.
```mermaid
graph TD
    subgraph "User Request"
        A[OpenAI-compatible Request]
    end
    subgraph "Cloudflare Worker"
        B[Core Proxy Engine]
        C[Central Orchestrator]
        D[Primary Load Balancer]
        E[Fallback Rotator]
        F[Credential Provider]
    end
    subgraph "Credential Sources"
        G[Gemini OAuth Credentials]
        H[AI Studio API Keys]
    end
    subgraph "External Services"
        I[Google Gemini API]
        J[Cloudflare KV Store]
    end

    A --> B;
    B --> C;
    C --> D;
    C --> E;
    D --> F;
    E --> F;
    F -- Reads --> G;
    F -- Reads --> H;
    F -- Manages State --> J;
    B -- Translates & Forwards --> I;
```
The application consists of several key components:
- Core Proxy Engine: Handles OpenAI-compatible API requests and translates them to Gemini API calls.
- Central Orchestrator: Coordinates request routing between the primary and fallback credential systems.
- Primary Load Balancer: Manages a pool of API credentials with round-robin selection and stateful error handling.
- Fallback Rotator: Provides backup credential management when primary credentials are unavailable.
- Unified Credential Provider: Loads and manages credentials from both the `oauth creds/` directory and AI Studio key files (see the sketch after this list).
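A sketch of how these pieces could fit together is shown below. It is a hypothetical simplification, not the project's actual source; the names (`Orchestrator`, `selectCredential`) are invented for illustration.

```typescript
// Hypothetical sketch of how the orchestrator, primary load balancer, and
// fallback rotator might interact; illustrative only, not the real source.
type CredentialState = "AVAILABLE" | "RATE_LIMITED" | "INVALID";

interface Credential {
  id: string;
  source: "oauth" | "ai_studio";
  state: CredentialState;
}

class Orchestrator {
  private cursor = 0;

  constructor(
    private primary: Credential[], // pool used by the primary load balancer
    private fallback: Credential[], // pool used by the fallback rotator
  ) {}

  // Round-robin over healthy primary credentials; if none are AVAILABLE,
  // fall through to the fallback rotator's pool.
  selectCredential(): Credential | undefined {
    const healthy = this.primary.filter((c) => c.state === "AVAILABLE");
    if (healthy.length > 0) {
      return healthy[this.cursor++ % healthy.length];
    }
    return this.fallback.find((c) => c.state === "AVAILABLE");
  }
}
```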
The project successfully implements the core API proxy functionality as specified in the technical documentation. The application is confirmed working (verified via `npm run dev`) and provides:
- OpenAI-compatible API endpoint
- Multi-credential load balancing
- Automatic failover mechanisms
- Seamless credential onboarding
- Cloudflare Workers deployment
- OpenAI-Compatible API: Drop-in replacement for OpenAI API, compatible with existing tools and libraries
- Multi-Credential Load Balancing: Distributes requests across API credential pools to avoid rate limits
- Automatic Failover: Built-in fallback mechanism ensures service availability during credential failures
- Seamless Onboarding: Intuitive setup with OAuth credential handling via the `oauth creds/` directory
- AI Studio API Key Support: Enhanced authentication using AI Studio keys for improved access and additional features
- Cloudflare Workers Deployment: Optimized for edge deployment with global low-latency responses
- Interactive Setup Wizard: A command-line tool (`npm run setup-wizard`) that guides you through the entire project configuration in minutes.
- Stateful Credential Management: Tracks credential health (AVAILABLE, RATE_LIMITED, INVALID); see the sketch after this list
- Error-Based State Changes: Automatic handling of 429 (rate-limited) and 401/403 (invalid) errors
- Round-Robin Selection: Efficient distribution of requests across healthy credentials
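As a hedged illustration of the error-based state changes described above (not the project's actual code; `stateAfterResponse` is an invented name):

```typescript
// Illustrative mapping from upstream HTTP status codes to credential states.
type CredentialState = "AVAILABLE" | "RATE_LIMITED" | "INVALID";

function stateAfterResponse(status: number): CredentialState {
  if (status === 429) return "RATE_LIMITED"; // cooled down and retried later
  if (status === 401 || status === 403) return "INVALID"; // removed from rotation
  return "AVAILABLE"; // healthy; stays in the round-robin pool
}
```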
For the fastest and easiest setup, use the interactive CLI wizard. This tool will guide you through the entire configuration process, including credential import, load balancing, and local development settings. For a detailed walkthrough, see the USAGE.md guide.
To run the wizard, execute the following command:
```
npm run setup-wizard
```

The wizard will automatically detect your environment, ask a series of questions, and generate the necessary `config.yaml` and `wrangler.toml` files for you.
Note: The setup wizard supports both English and Spanish. Spanish translations are machine-generated and may contain inaccuracies.
For advanced users or those who prefer manual configuration, follow the steps below.
- Node.js (v18+ recommended)
- npm or yarn
- Wrangler CLI (`npm i -g wrangler`)
- Cloudflare account and API token
- Gemini OAuth credentials (see OAUTH_CREDENTIALS_GUIDE.md)
```
git clone gemini-cli-proxy
cd gemini-cli-proxy
npm install
```
- Add Credentials
  - Place your Gemini OAuth credential files (`.json` format) in the `oauth creds/` directory.
  - For AI Studio keys, create a file named `ai_studio_keys.txt` or `keys.txt` in the `oauth creds/` directory and add your API keys (one per line).
  - Note: The setup wizard can handle credential import for you automatically.
- Configuration
  - Manually create or edit `config.yaml` to configure load balancing, logging, and AI Studio integration. See the Configuration section for details.
  - Copy `wrangler.toml.template` to `wrangler.toml` and fill in your actual Cloudflare KV namespace ID and other configuration details. Note: The actual `wrangler.toml` file is gitignored to prevent committing private information.
- AI Studio Keys (Optional)
  - Create either `oauth creds/ai_studio_keys.txt` or `oauth creds/keys.txt` with your AI Studio API keys (one per line)
  - Alternatively, create `oauth creds/keys.json` with your keys in JSON format
  - The system will automatically detect and use whichever file exists
- Run Development Server

  ```
  npm run dev
  ```

  The `npm run predev` script (which `npm run dev` calls) will process your credential files and create a `.dev.vars` file for local development.
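As context for what this step produces, here is a hedged sketch of what a predev-style script could do. It is illustrative only; the project's actual script may differ.

```typescript
// Illustrative sketch only: read credential files from "oauth creds/" and
// emit a .dev.vars file with one GEMINI_API_KEY_<n> entry per credential.
import { readdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

const CREDS_DIR = "oauth creds";

const files = readdirSync(CREDS_DIR).filter((f) => f.endsWith(".json"));

const lines = files.map((file, i) => {
  const raw = readFileSync(join(CREDS_DIR, file), "utf8");
  // Compact each credential JSON onto a single line for dotenv-style storage.
  return `GEMINI_API_KEY_${i + 1}=${JSON.stringify(JSON.parse(raw))}`;
});

writeFileSync(".dev.vars", lines.join("\n") + "\n");
```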
The service provides OpenAI-compatible endpoints. For a complete list of endpoints, parameters, and example responses, please refer to the API_REFERENCE.md.
- `POST /v1/chat/completions` - Chat completions
- `GET /v1/models` - List available models
- `GET /health` - Health check
```
curl -X POST https://your-worker-url/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
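Because the endpoint is OpenAI-compatible, existing clients such as the official `openai` npm package should work by pointing `baseURL` at the worker. The URL and key below are placeholders:

```typescript
import OpenAI from "openai";

// Placeholders: substitute your deployed worker URL and proxy API key.
const client = new OpenAI({
  apiKey: "your-api-key",
  baseURL: "https://your-worker-url/v1",
});

const completion = await client.chat.completions.create({
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
```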
Create `config.yaml`:

```yaml
primary_keys:
  - key: "GEMINI_API_KEY_1"
  - key: "GEMINI_API_KEY_2"

fallback_keys:
  - key: "FALLBACK_GEMINI_API_KEY_1"

# AI Studio Configuration
ai_studio:
  enabled: true
  keys_file: "./oauth creds/ai_studio_keys.txt" # Auto-detects ai_studio_keys.txt or keys.txt (can also be a .json file)
  cooldown_seconds: 60
  fallback_mode: "fallback" # Options: fallback, combined, disabled
  prefer_ai_studio_for_pro: false # Prefer AI Studio for Pro models when CLI is rate limited

rate_limit_settings:
  cooldown_period_ms: 60000
  failover_thresholds:
    consecutive_failures: 3
    recheck_interval_ms: 300000
```

- `enabled`: Enable/disable AI Studio key support (default: `true`)
- `keys_file`: Path to the AI Studio keys file (default: `"oauth creds/ai_studio_keys.txt"`); can also be a `.json` file
- `cooldown_seconds`: Cooldown period for rate-limited AI Studio keys (default: `60`)
- `fallback_mode`: Credential selection mode (see the sketch after this list):
  - `"fallback"`: Priority-based (OAuth first, then AI Studio)
  - `"combined"`: Shared pool (all credentials mixed randomly)
  - `"disabled"`: AI Studio keys disabled
- `prefer_ai_studio_for_pro`: Prefer AI Studio keys for Pro models when CLI is rate limited
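To make the three `fallback_mode` values concrete, here is a hedged sketch of how pool construction could differ per mode. The logic and the `buildPool` name are invented for illustration:

```typescript
// Illustrative only: how the credential pool might be assembled per mode.
type FallbackMode = "fallback" | "combined" | "disabled";

function buildPool(oauth: string[], aiStudio: string[], mode: FallbackMode): string[] {
  switch (mode) {
    case "disabled":
      // AI Studio keys are ignored entirely.
      return [...oauth];
    case "combined":
      // One shared pool with all credentials mixed (naive shuffle for illustration).
      return [...oauth, ...aiStudio].sort(() => Math.random() - 0.5);
    case "fallback":
    default:
      // Priority order: OAuth credentials first, AI Studio keys as backup.
      return [...oauth, ...aiStudio];
  }
}
```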
The application uses the following environment variables for configuration. These can be set in a `.dev.vars` file for local development or as secrets in your Cloudflare Workers environment.
- `GEMINI_API_KEY_1`, `GEMINI_API_KEY_2`, etc.:
  - Description: These variables hold your Gemini OAuth credentials or AI Studio API keys. The numeric suffix allows you to define multiple keys for load balancing and rotation.
  - Usage: In local development, the `npm run predev` script automatically populates these in `.dev.vars` from the files in the `oauth creds/` directory. In production, you should set these as encrypted secrets in your Cloudflare environment.
  - Example: `npx wrangler secret put GEMINI_API_KEY_1 < "oauth creds/credential-1.json"`
- `WRANGLER_KV_NAMESPACE_ID`:
  - Description: The ID of the Cloudflare KV namespace used for storing credential state and other application data.
  - Usage: This is required for the application to persist data across requests and manage credential status. You must create a KV namespace in your Cloudflare dashboard and add its ID to your `wrangler.toml` file and as an environment variable.
  - Example: In `wrangler.toml`: `kv_namespaces = [{ binding = "GEMINI_CLI_LOADBALANCE", id = "your-kv-namespace-id" }]`
- `LOG_LEVEL`:
  - Description: Controls the verbosity of the application's logging.
  - Usage: Set this to one of the following values: `error`, `warn`, `info`, `debug`, `trace`. The default is `info`.
  - Example: `LOG_LEVEL="debug"`
- `CONFIG_YAML`:
  - Description: The raw YAML configuration content.
  - Usage: Instead of reading from a `config.yaml` file, the application can be configured directly via this environment variable. This is particularly useful in environments where file system access is limited (see the sketch below).
  - Example: `CONFIG_YAML="ai_studio:\n enabled: true"`
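As a hedged sketch of how such a variable might be consumed (this assumes the `yaml` npm package and an invented `loadConfig` helper; the project's actual loader may differ):

```typescript
import { parse } from "yaml";

// Prefer CONFIG_YAML when present; otherwise fall back to defaults
// (or a config.yaml file, where the runtime allows file access).
function loadConfig(env: { CONFIG_YAML?: string }): unknown {
  if (env.CONFIG_YAML) {
    return parse(env.CONFIG_YAML);
  }
  return {};
}

// Example: loadConfig({ CONFIG_YAML: "ai_studio:\n  enabled: true" })
```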
- Upload Credentials as Secrets

  ```
  npx wrangler secret put GEMINI_API_KEY_1 < "oauth creds/credential-1.json"
  ```

- Configure KV Namespace

  - Create a KV namespace in the Cloudflare dashboard
  - Add it to `wrangler.toml`:

  ```
  kv_namespaces = [ { binding = "GEMINI_CLI_LOADBALANCE", id = "your-kv-namespace-id" } ]
  ```

- Deploy

  ```
  npm run deploy
  ```
For users who prefer to run the application in a containerized environment, here is a sample Docker configuration.
View Sample `Dockerfile`
Create a `Dockerfile` in the root of your project with the following content:

```dockerfile
# Use an official Node.js runtime as a parent image
FROM node:18-alpine

# Set the working directory in the container
WORKDIR /usr/src/app

# Copy package.json and package-lock.json
COPY package*.json ./

# Install app dependencies
RUN npm install

# Bundle app source
COPY . .

# Run the pre-development script to generate .dev.vars
RUN npm run predev

# Make port 8787 available to the world outside this container
EXPOSE 8787

# Define the command to run your app
CMD ["npm", "run", "dev"]
```

View Sample `docker-compose.yml`
Create a `docker-compose.yml` file to easily manage your container:

```yaml
version: '3.8'
services:
  gemini-proxy:
    build: .
    ports:
      - "8787:8787"
    volumes:
      - ./oauth_creds:/usr/src/app/oauth_creds
      - ./config.yaml:/usr/src/app/config.yaml
    environment:
      - LOG_LEVEL=debug
```
- Build the Docker image:

  ```
  docker build -t gemini-cli-proxy .
  ```

- Run the container using Docker Compose:

  ```
  docker-compose up
  ```
The service will be available at http://localhost:8787.
Test deployment with:

```
curl https://your-worker-url/health
curl https://your-worker-url/v1/models
```

- Store credentials securely using Cloudflare secrets
- Regularly rotate API keys
- Implement proper access controls
- Monitor for unusual activity
- Keep dependencies updated
- Credential encryption in production
- Secure key storage via Cloudflare KV
- Input validation (recommended enhancement)
- Rate limiting (recommended enhancement)
- Fork the repository
- Create feature branch
- Make changes with tests
- Submit pull request
- TypeScript strict mode
- ESLint configuration
- Prettier formatting
- Comprehensive test coverage
```
npm test              # Run test suite
npm run test:watch    # Watch mode
npm run test:coverage # Coverage report
```

- Basic console logging
- Cloudflare Workers analytics
- Request/response logging
- Structured JSON logging (see the sketch after this list)
- Metrics collection (Prometheus)
- Alerting (Grafana)
- Performance monitoring
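If the structured JSON logging listed above lands, a minimal sketch that honors the `LOG_LEVEL` variable might look like this (illustrative only, not the project's implementation):

```typescript
// Minimal structured JSON logger sketch; names are illustrative.
const LEVELS = ["error", "warn", "info", "debug", "trace"] as const;
type Level = (typeof LEVELS)[number];

function makeLogger(minLevel: Level = "info") {
  const threshold = LEVELS.indexOf(minLevel);
  return (level: Level, message: string, fields: Record<string, unknown> = {}) => {
    // Lower index means higher severity; log anything at or above the threshold.
    if (LEVELS.indexOf(level) <= threshold) {
      console.log(
        JSON.stringify({ ts: new Date().toISOString(), level, message, ...fields }),
      );
    }
  };
}

const log = makeLogger("debug"); // e.g. seeded from the LOG_LEVEL variable
log("info", "request completed", { status: 200, credential: "GEMINI_API_KEY_1" });
```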
This project is licensed under the MIT License.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
For issues and questions:
- GitHub Issues
- Documentation
- Community discussions
Last Updated: 2025-09-11 Version: 1.0.0