Skip to content

Gemma 4 integration investigation: Bedrock Marketplace vs local Ollama #3

@eugenio

Description

@eugenio

Context

Investigated options for integrating Gemma 4 into the current NadirClaw + Goose stack.

Findings

Gemma 4 availability

  • Gemma 4 released April 2, 2026 — not yet available as a native Bedrock model
  • Currently available on Google Cloud natively
  • Reachable on AWS only via Bedrock Marketplace or SageMaker JumpStart
  • Native Bedrock currently tops out at Gemma 3 (4B / 12B / 27B)

What is Bedrock Marketplace?

  • Catalog of 100+ third-party/open-weight models deployed to SageMaker-managed endpoints
  • Subscribe → deploy to endpoint → call via InvokeModel / Converse
  • Unlike native Bedrock (serverless, per-token), Marketplace uses provisioned endpoints billed per instance-hour regardless of usage

Cost estimate (Gemma 4 / 27B-class via Marketplace)

Mode Billing Estimate
Native Bedrock (Gemma 3 27B) Per token $0.23/$0.38 per 1M in/out
Marketplace endpoint (e.g. ml.g5.2xlarge) Per hour ~$1–2/hr
8hrs/day on Marketplace Daily ~$10/day

This exceeds our current NADIRCLAW_DAILY_BUDGET=5.00.

Integration blocker with current NadirClaw config

NadirClaw routes exclusively through Bedrock Mantle (bedrock-mantle.eu-west-2.api.aws/v1), which only exposes native Bedrock models via an OpenAI-compatible endpoint. Marketplace/SageMaker endpoints are not reachable through Mantle.

To add a Marketplace model we would need to:

  • Add a second LiteLLM provider in NadirClaw pointing to the SageMaker endpoint (sagemaker/...)
  • Or proxy it separately and define a new routing tier

Better alternative: local Ollama (Gemma 4 via srfm-lab guide)

Ref: https://github.com/Mattbusel/srfm-lab/blob/c55a82754b31246664b1def452f3d261b0a5fa77/docs/guides/local_rag_setup.md

Runs Gemma 4 (26B) locally via Ollama with:

  • Codebase indexing into ChromaDB (function/class-boundary chunking)
  • HTTP API: /v1/query, /v1/retrieve, /v1/review, /v1/document, /v1/ideate
  • Claude Code MCP integration (fits Goose workflow)
  • Git pre-commit hooks for automated code review
  • VS Code integration

Requirements: NVIDIA GPU 8GB+ VRAM, 16GB+ RAM, 50GB+ disk.

Advantages over Marketplace:

  • Zero AWS cost (no per-token or per-hour charges)
  • Code never leaves the machine
  • Direct Claude Code MCP integration
  • Gemma 4 already available on Ollama

Next steps

  • Assess whether local GPU is available with sufficient VRAM (8GB+ required)
  • If yes: prototype local Ollama + ChromaDB RAG stack from srfm-lab guide
  • If no: evaluate Marketplace cost vs benefit with a time-limited endpoint test
  • Revisit when Gemma 4 lands as a native Bedrock model (likely later 2026)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions