Gemma 4 integration investigation: Bedrock Marketplace vs local Ollama

## Context

Investigated options for integrating Gemma 4 into the current NadirClaw + Goose stack.

## Findings

### Gemma 4 availability
- Gemma 4 released **April 2, 2026** — not yet available as a native Bedrock model
- Currently available on **Google Cloud** natively
- Reachable on AWS only via **Bedrock Marketplace** or **SageMaker JumpStart**
- Native Bedrock currently tops out at Gemma 3 (4B / 12B / 27B)

### What is Bedrock Marketplace?
- Catalog of 100+ third-party/open-weight models deployed to **SageMaker-managed endpoints**
- Subscribe → deploy to endpoint → call via `InvokeModel` / `Converse`
- Unlike native Bedrock (serverless, per-token), Marketplace uses **provisioned endpoints billed per instance-hour** regardless of usage

### Cost estimate (Gemma 4 / 27B-class via Marketplace)
| Mode | Billing | Estimate |
|---|---|---|
| Native Bedrock (Gemma 3 27B) | Per token | $0.23/$0.38 per 1M in/out |
| Marketplace endpoint (e.g. `ml.g5.2xlarge`) | Per hour | ~$1–2/hr |
| 8hrs/day on Marketplace | Daily | ~$10/day |

This exceeds our current `NADIRCLAW_DAILY_BUDGET=5.00`.

### Integration blocker with current NadirClaw config
NadirClaw routes exclusively through **Bedrock Mantle** (`bedrock-mantle.eu-west-2.api.aws/v1`), which only exposes **native** Bedrock models via an OpenAI-compatible endpoint. Marketplace/SageMaker endpoints are **not reachable** through Mantle.

To add a Marketplace model we would need to:
- Add a second LiteLLM provider in NadirClaw pointing to the SageMaker endpoint (`sagemaker/...`)
- Or proxy it separately and define a new routing tier

### Better alternative: local Ollama (Gemma 4 via srfm-lab guide)
Ref: https://github.com/Mattbusel/srfm-lab/blob/c55a82754b31246664b1def452f3d261b0a5fa77/docs/guides/local_rag_setup.md

Runs Gemma 4 (26B) locally via Ollama with:
- Codebase indexing into **ChromaDB** (function/class-boundary chunking)
- HTTP API: `/v1/query`, `/v1/retrieve`, `/v1/review`, `/v1/document`, `/v1/ideate`
- **Claude Code MCP integration** (fits Goose workflow)
- Git pre-commit hooks for automated code review
- VS Code integration

Requirements: NVIDIA GPU 8GB+ VRAM, 16GB+ RAM, 50GB+ disk.

**Advantages over Marketplace:**
- Zero AWS cost (no per-token or per-hour charges)
- Code never leaves the machine
- Direct Claude Code MCP integration
- Gemma 4 already available on Ollama

## Next steps

- [ ] Assess whether local GPU is available with sufficient VRAM (8GB+ required)
- [ ] If yes: prototype local Ollama + ChromaDB RAG stack from srfm-lab guide
- [ ] If no: evaluate Marketplace cost vs benefit with a time-limited endpoint test
- [ ] Revisit when Gemma 4 lands as a native Bedrock model (likely later 2026)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Gemma 4 integration investigation: Bedrock Marketplace vs local Ollama #3

Context

Findings

Gemma 4 availability

What is Bedrock Marketplace?

Cost estimate (Gemma 4 / 27B-class via Marketplace)

Integration blocker with current NadirClaw config

Better alternative: local Ollama (Gemma 4 via srfm-lab guide)

Next steps

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Mode	Billing	Estimate
Native Bedrock (Gemma 3 27B)	Per token	$0.23/$0.38 per 1M in/out
Marketplace endpoint (e.g. `ml.g5.2xlarge`)	Per hour	~$1–2/hr
8hrs/day on Marketplace	Daily	~$10/day

Gemma 4 integration investigation: Bedrock Marketplace vs local Ollama #3

Description

Context

Findings

Gemma 4 availability

What is Bedrock Marketplace?

Cost estimate (Gemma 4 / 27B-class via Marketplace)

Integration blocker with current NadirClaw config

Better alternative: local Ollama (Gemma 4 via srfm-lab guide)

Next steps

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions