Skip to content

peterb154/openclaw-bedrock-cache-plugin

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

openclaw-bedrock-cache-plugin

OpenClaw plugin that injects Bedrock Converse API cachePoint blocks into API payloads, enabling prompt caching for Claude models on Amazon Bedrock.

Also includes a deterministic cost reporter that tracks Bedrock marketplace costs and cache efficiency metrics.

How caching works

The Bedrock Converse API supports prompt caching via cachePoint marker blocks placed after static content. Cached prefixes are reused across requests within a 5-minute TTL window, charged at 10% of normal input token cost.

Cache points are injected after:

  1. System prompt — end of system[] array
  2. Tool definitions — end of toolConfig.tools[] array

All other built-in provider behavior is preserved: region resolution, error classification, replay hooks, guardrail support.

Requirements

  • OpenClaw v2026.4.x or later
  • Amazon Bedrock access with Claude models enabled
  • AWS IAM user or role with Bedrock invoke permissions
  • Python 3.10+ and uv (for cost reporter)

Installation

Plugin

The plugin must be installed in OpenClaw's extensions directory:

mkdir -p ~/.openclaw/extensions/bedrock-cache
cp index.js package.json openclaw.plugin.json ~/.openclaw/extensions/bedrock-cache/

Disable the built-in amazon-bedrock plugin and enable bedrock-cache in ~/.openclaw/openclaw.json:

{
  "plugins": {
    "entries": {
      "amazon-bedrock": { "enabled": false },
      "bedrock-cache": { "enabled": true }
    }
  }
}

Restart the gateway after installing.

Cost reporter

cd cost-reporter
uv sync
uv run python daily_report.py

Deploy script

deploy.sh deploys both the plugin and cost reporter to CT 106 via Proxmox:

./deploy.sh

Cost reporter

The cost reporter produces a deterministic report with:

  • Yesterday's cost broken down by marketplace model (Claude Sonnet 4.6, 4.5, Haiku 4.5, etc.)
  • Caching cost split — cache write vs cache read vs output vs uncached input, with cost-basis hit rate
  • CloudWatch token metrics — per-model invocation count, input/output/cache write/cache read token counts, token-basis hit rate
  • Month-to-date totals with the same breakdowns
  • Projected monthly total based on daily average

Example output

**Daily AWS Bedrock Report** (2026-04-14)

**Yesterday** (Apr 13): **$47.59**
  Claude Sonnet 4.5: $47.57
  Claude Haiku 4.5: $0.02

**Caching** (yesterday)
  Write: $42.07 | Read: $4.20 | Output: $1.30
  Cache hit rate (cost): 9%

**Token Metrics** (yesterday)
  claude-sonnet-4-5-20250929-v1:0: 315 calls | in=1.3K cw=11.03M cr=12.73M out=78.1K | hit=54%
  titan-embed-text-v2:0: 266 calls | in=38.2K cw=0 cr=0 out=0 | hit=0%

**Month-to-Date**: **$419.23**
  Claude Sonnet 4.6: $348.07
  Claude Sonnet 4.5: $70.83
  Claude Haiku 4.5: $0.32

**MTD Caching**
  Write: $354.89 | Read: $57.31 | Output: $6.88
  Cache hit rate (cost): 14%

**Projected Month**: $935
  Avg/day: $32.25 | Days left: 16

OpenClaw cron setup

To run the report daily via OpenClaw cron, the job payload should be:

Run this command and post the EXACT output to Discord channel #openclaw
(channel ID: <your-channel-id>). Do not add commentary or modify the output.
Command: cd ~/scripts/aws-cost-reporter && uv run python scripts/daily_report.py

Use message tool: action=send, channel=discord, target=channel:<your-channel-id>

OpenClaw skill

Copy skills/aws-cost-report.md to your OpenClaw agent directory (e.g., ~/.openclaw/agents/main/agent/skills/) to make the cost report available as the /cost command in Discord.

OpenClaw configuration

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "amazon-bedrock/us.anthropic.claude-haiku-4-5-20251001-v1:0",
        "fallbacks": [
          "amazon-bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0",
          "ollama/qwen3:8b"
        ]
      },
      "params": {
        "cacheRetention": "short"
      }
    }
  },
  "plugins": {
    "entries": {
      "amazon-bedrock": {
        "enabled": false,
        "config": {
          "discovery": {
            "enabled": true,
            "region": "us-east-1",
            "providerFilter": ["anthropic"],
            "refreshInterval": 0
          }
        }
      },
      "bedrock-cache": {
        "enabled": true
      }
    }
  }
}

Monitoring cache performance

Enable Bedrock model invocation logging to see per-request cache metrics in CloudWatch.

Query cache metrics with CloudWatch Logs Insights:

fields @timestamp,
  output.outputBodyJson.usage.inputTokens as input,
  output.outputBodyJson.usage.cacheWriteInputTokens as cacheWrite,
  output.outputBodyJson.usage.cacheReadInputTokens as cacheRead,
  output.outputBodyJson.usage.outputTokens as output
| filter operation = "ConverseStream"
| sort @timestamp desc
| limit 50

After OpenClaw upgrades

OpenClaw upgrades may replace the extensions directory. Re-run deploy.sh and restart the gateway after upgrading.

License

MIT

About

OpenClaw plugin that injects Bedrock Converse API cachePoint blocks for prompt caching on Claude models (~90% input cost savings)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors