A trust protocol for AI agents. One hook. Any agent. Every decision verified.
Works with a single assistant or an entire multi-agent pipeline. Adds document-aware review, structured challenge, auditability, and governance scoring.
```mermaid
flowchart LR
    A["🔍 Analyst<br/>scores idea<br/><b>72/100</b>"] -->|"score ≥ 70"| B["⚔️ Critic<br/>3 challenges"]
    B --> C["🛡️ Defender<br/>3 rebuttals"]
    C --> D["⚖️ Judge<br/>verdict + delta"]
    D --> E["📊 Governance<br/>Score 7.8/10"]
    E -.->|"30 days later"| F["🔄 Retro<br/>was judge right?<br/>credibility ±0.05"]

    style A fill:#2563eb,color:#fff,stroke:none
    style B fill:#dc2626,color:#fff,stroke:none
    style C fill:#2563eb,color:#fff,stroke:none
    style D fill:#7c3aed,color:#fff,stroke:none
    style E fill:#059669,color:#fff,stroke:none
    style F fill:#d97706,color:#fff,stroke:none
```
Agent Constitution is a trust protocol you can place around any AI agent — a single assistant, a planner, a reviewer, or a full multi-agent pipeline.
It is meant for moments where the main problem is not producing an answer, but deciding whether that answer should drive action.
The important part is that the judgment can be grounded in attached documents.
Release checklists, rollback runbooks, deploy briefs, pricing memos, ownership maps, and similar files are not treated as decoration. They are treated as evidence that should change the recommendation.
That usually shows up in questions like:
- should we ship this change now
- is this recommendation strong enough to act on
- what should we do first
- what risks are still unresolved
- when should confidence be treated as provisional
The idea behind the project is straightforward:
AI systems are getting better at generating output, but judgment is still where a lot of value lives.
If a decision matters, it often helps to make the review process more explicit. Agent Constitution does that by helping systems:
- encode epistemic rules in markdown instead of burying them in prompt strings
- treat attached documents as evidence instead of ignoring them after retrieval
- route high-stakes outputs through structured challenge and judgment
- preserve audit history so important decisions can be revisited later
You can use it to harden:
- a single assistant that needs review before acting
- a planner or deploy bot that should trigger governance only on high-stakes outputs
- a multi-agent workflow that needs explicit arbitration instead of free-form consensus
In practice, the goal is simple: make important AI decisions easier to challenge, easier to inspect, and easier to revisit later.
Agent Constitution is a good fit when:
- a recommendation may lead to a real action, not just a draft
- confidence should be challenged before a decision is accepted
- you want a visible review step instead of implicit trust in one answer
- audit history matters because the decision may need to be revisited later
Common examples include deploys, pricing exceptions, architecture changes, compliance-sensitive choices, and memory or policy updates that are hard to unwind.
Concrete examples:
- deploy review using a release checklist, rollback runbook, and deploy brief
- pricing exception review using a finance memo, precedent guardrails, and account context
- organization change review using an ownership map, launch timeline, and dependency notes
- launch-readiness review for a README, proposal, or rollout plan framed as a decision
The key practical pattern is simple:
- ask a decision question
- attach the files a reviewer would actually want
- let the system say what the current judgment is and what evidence is still missing
What this looks like in a product surface:
**Before governance**

```
Assistant:
Recommendation: Approve the billing-auth hotfix rollout now.
```

**After the Agent Constitution summary gate**

```
Assistant:
Recommendation: Approve the billing-auth hotfix rollout now.
Confidence: 82%
Assessment: 72/100 (Promising)
Adjusted score: 51/100 (Caution)

Governance check triggered.
Verdict: Proceed With Caution
Score delta: -21
Delta severity: Major Concern
Why: Several concerns remain unresolved before the next gate.
Top concern: Canary monitoring is still missing a billing-transaction-specific abort gate.
```
That example is based on a recorded live-model run, not a fabricated mock transcript. If you want to replay it locally without an API key, run:
```bash
python examples/demo_replay.py
```

```
BEFORE                                       AFTER Agent Constitution
─────────────────────────────────────        ─────────────────────────────────────
Planner: "Approve the pricing exception."    Planner: "Approve the pricing exception."
Confidence: 0.86                             Confidence: 0.86
Decision ships with no challenge             Governance check triggered
                                             Critic:   "Margin erosion and precedent risk"
                                             Defender: "Strategic account, capped term"
                                             Judge:    "proceed_with_caution, delta: -21"
                                             Key issue: finance approval path is missing
```
Other frameworks usually focus on how components communicate. Agent Constitution focuses on how important decisions get challenged, judged, and audited.
Agent Constitution sits near several existing categories, but it is trying to solve a slightly different problem:
- Agent orchestration frameworks solve how agents, tools, and workflows run together
- Prompt / role systems shape behavior through instructions, personas, and local rules
- Guardrail / evaluation systems check whether outputs violate rules or meet quality bars
- Debate / self-critique methods improve reasoning by adding challenge or adversarial review
Agent Constitution borrows from all of these, but its center of gravity is different.
It treats high-stakes decisions as governed events.
So the primary unit is not just:
- a message
- a prompt
- a workflow step
- or a final output
The primary unit is a decision that may need challenge, arbitration, and audit history.
That is the main distinction. The goal is not simply:
- better generation
- better collaboration
- better filtering
The goal is to create a more explicit process for producing better judgment when the decision matters.
That difference also matters when people compare "agent governance" projects:
| | Microsoft Agent Governance Toolkit | Agent Constitution |
|---|---|---|
| Focus | Runtime security and policy enforcement | Decision quality and judgment quality |
| When it runs | Around agent actions and tool execution | Around high-stakes recommendations and decisions |
| Core question | Should this action be allowed? | Is this recommendation sound enough to act on? |
| Mechanism | Policy engine, runtime interception, permission checks | Structured review, challenger/defender/judge flow, audit trail |
| Cost profile | Always-on runtime enforcement | Selective extra model calls when a decision merits challenge |
They are complementary, not substitutes.
There are three good ways to try the project, depending on what you want to verify.
1. Fastest public demo from a repo clone
```bash
git clone https://github.com/AgentPolis/agent-constitution.git
cd agent-constitution
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python examples/demo_replay.py
```

This replays a recorded live-model deploy review with real supporting files, so it shows the intended product surface without requiring an API key.
2. Best zero-key proof that attached documents change the judgment
ac debate "Should we deploy the billing-auth hotfix to production tonight?" \
--context-file examples/context/deploy/release-checklist.md \
--context-file examples/context/deploy/rollback-runbook.md \
--context-file examples/context/deploy/deploy-brief.mdExpected shape of the result:
2. Analyst evaluates: Should we deploy the billing-auth hotfix to production tonight?
Context files:
- examples/context/deploy/release-checklist.md
- examples/context/deploy/rollback-runbook.md
- examples/context/deploy/deploy-brief.md
Score: 79/100 (Promising)
3. Score 79 >= 70 — debate triggered
Verdict: proceed
Delta: +8
Final: 79 -> 87
3. Package smoke test after install
```bash
python3.11 -m pip install agent-constitution
ac debate "Should we deploy the billing-auth hotfix to production tonight?"
```

agent-constitution requires Python 3.11+.

That path is still useful, but without attached files it should be read as a first-pass judgment rather than a fully grounded deploy decision.

The zero-config smoke test uses deterministic mock responses. For real-model debate, use `--adapter claude`, `--adapter anthropic`, or `--adapter ollama`.
For public-facing demos, this repo now prefers a recorded live-model replay over a generic mock transcript.
- `python examples/demo_replay.py` replays a real deploy review captured from a live Claude run.
- `ac debate "..." --context-file ...` with no API key uses the mock adapter, but still lets you verify that attached files change the recommendation.
- `ac debate "..."` with no files is still a useful smoke test for CI and local installation.
- `ac debate "..." --adapter anthropic|claude|ollama` runs the real judgment path.
That split is intentional:
- the replay demo shows what a real model can surface when it actually reads context files
- the mock path is still useful for deterministic testing, trigger checks, and hook validation
If you only want one public demo to point people at, use `python examples/demo_replay.py`.
There is also a real self-dogfooding example in docs/case-studies/readme-launch-review.md, where Agent Constitution reviews its own README before launch.
And a governance-method example in docs/case-studies/pre-execution-review-vs-post-audit.md, which compares pre-execution decision review with post-execution audit as evidence models.
`ac debate` no longer uses a single generic scoring template.
The initial analyst pass now changes its rubric based on the decision type:

- `deploy`: impact / readiness / rollback / blast_radius / evidence
- `pricing`: upside / precedent_risk / reversibility / evidence / strategic_fit
- `org_design`: clarity / disruption / timing / reversibility / execution_risk
- `generic`: impact / readiness / risk / evidence / reversibility
Current score bands:
- 0-34: Weak
- 35-49: Borderline
- 50-69: Caution
- 70-84: Promising
- 85-100: Strong
Current judge deltas are discrete:
- +8: strengthens case
- 0: no material change
- -13: notable concern
- -21: major concern
- -34: stop-ship concern
That makes the output easier to read as a decision, not just a score.
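To make the bands and deltas concrete, here is a small illustrative sketch. The `band_for` helper is hypothetical, not a package API; the numbers mirror the product-surface example above (72 → 51).

```python
# Illustrative only: map a score to its band and apply one of the discrete
# judge deltas listed above. band_for is a hypothetical helper, not a
# package API.
def band_for(score: int) -> str:
    if score <= 34:
        return "Weak"
    if score <= 49:
        return "Borderline"
    if score <= 69:
        return "Caution"
    if score <= 84:
        return "Promising"
    return "Strong"

initial = 72             # analyst assessment: Promising
adjusted = initial - 21  # judge applies the "major concern" delta
print(band_for(initial), "->", band_for(adjusted))  # Promising -> Caution
```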
Not every decision needs the same depth of review. DecisionPolicy now supports complexity-aware triggering:
```python
from agent_constitution import TrustProtocol, VerificationTier, BaseAgent

protocol = TrustProtocol(
    challenger=BaseAgent(role="critic", goal="Challenge"),
    judge=BaseAgent(role="judge", goal="Arbitrate"),
    tier=VerificationTier.HIGH,
    min_complexity="high",
)
agent = BaseAgent(role="analyst", goal="Evaluate", hooks=[protocol.hook])
```

Verification tiers:
- `LOW`: skip debate — log trigger only
- `STANDARD`: single-round debate (challenger/defender/judge)
- `HIGH`: full debate with enforced context documents
- `CRITICAL`: multi-round debate with forced context and expanded challenge set
Complexity can come from the agent response (`"complexity": "high"` in JSON output) or be set directly on the policy. The tier maps automatically: low → LOW, medium → STANDARD, high → HIGH, critical → CRITICAL.
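A minimal sketch of that mapping, assuming the enum members listed above; the mapping dict itself is illustrative, not the package's internal table:

```python
# Illustrative only: a possible complexity -> tier mapping mirroring the
# description above. The dict is an assumption, not the package's internals.
from agent_constitution import VerificationTier

COMPLEXITY_TO_TIER = {
    "low": VerificationTier.LOW,
    "medium": VerificationTier.STANDARD,
    "high": VerificationTier.HIGH,
    "critical": VerificationTier.CRITICAL,
}

agent_output = {"recommendation": "ship", "complexity": "high"}  # hypothetical agent JSON
tier = COMPLEXITY_TO_TIER[agent_output["complexity"]]            # VerificationTier.HIGH
```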
Every debate now produces a tamper-evident hash chain:
```
genesis(chain_id) → assessment → challenge → defense → verdict
```
Each record contains a SHA-256 hash linking it to the previous record. If any record is modified after the fact, verification fails.
```python
# After a debate completes
chain = gate.last_chain
artifact = chain.to_artifact()

# Pin this to an external system as a trust anchor
print(artifact["chain_root_hash"])  # SHA-256 hex

# Verify offline — no runtime needed
result = GovernanceChain.verify_artifact(artifact)
assert result.valid
```

What this guarantees:
- If anyone modifies a record after creation, the chain breaks
- The artifact is portable — export as JSON, verify anywhere
- `chain_root_hash` can be pinned to git commits, CI artifacts, or S3 as an external anchor
What this does not guarantee:
- It cannot prevent an attacker with write access from recomputing the entire chain (signatures planned for a future release)
- It cannot replay LLM debates to prove a specific model produced the output (LLM outputs are non-deterministic)
The signature field is reserved on every record for future Ed25519 signing.
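To see why tampering breaks verification, here is a minimal sketch of the linking idea only. It is illustrative, assuming nothing about the package's actual record format: each record hashes its payload together with the previous hash, so editing any record invalidates every later hash.

```python
# A minimal sketch of hash-chain linking; illustrative, not the package's
# internal record format.
import hashlib
import json

def link(prev_hash: str, payload: dict) -> str:
    """Hash the payload together with the previous hash, chaining the records."""
    blob = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

h = link("", {"type": "genesis", "chain_id": "demo"})
for record in ({"type": "assessment", "score": 79},
               {"type": "challenge"},
               {"type": "defense"},
               {"type": "verdict", "delta": -21}):
    h = link(h, record)

print(h)  # root hash: recompute from the stored records to verify offline
```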
The package is meant to be runnable, not just demoed.
For anything real, do not ask it to judge a one-line prompt in a vacuum.
The practical pattern is:
- ask a decision question
- attach background or supporting documents
- let the system tell you what is still missing before you act
If you do not provide context, the output should be treated as a first-pass judgment, not final approval.
You do not have to type `ac debate` in a product surface if the host already wires Agent Constitution in.
But the request still needs to be explicit enough that the system can identify:
- the decision being reviewed
- the downside if it is wrong
- the background or files that should be considered
These are good natural-language requests:
- "Use Agent Constitution to judge whether the billing-auth hotfix should go to production tonight."
- "Run a challenger / defender / judge review on whether this README is ready for public launch."
- "Assess this pricing exception, and tell me what context is missing before we approve it."
- "Do a deploy decision review for this hotfix. Background is in the release checklist, rollback runbook, and deploy brief."
These are weak requests:
- "Can we ship this?"
- "What do you think?"
- "Is this okay?"
- "Can you look at that thing from earlier?"
Weak requests can still be answered, but they should not be treated as fully grounded governance judgments.
You can install it and try these directly:
ac debate "Should we deploy the billing-auth hotfix to production tonight?"
ac debate "Should we approve this pricing exception for a strategic enterprise account?"
ac debate "Should we reorganize product and engineering into vertical pods before the Q4 launch?"
ac debate "Should we publish this README as-is for public launch?"And for a real decision, attach the files a reviewer would actually want:
ac debate "Should we deploy the billing-auth hotfix to production tonight?" \
--context-file examples/context/deploy/release-checklist.md \
--context-file examples/context/deploy/rollback-runbook.md \
--context-file examples/context/deploy/deploy-brief.mdThere are sample deploy context files in:
examples/context/deploy/
The last example is important: you can use Agent Constitution to review a README, launch plan, or proposal as long as you frame it as a decision.
What the package returns today:
- an initial analyst score with scenario-aware dimensions
- a trigger decision based on score threshold, complexity level, or both
- complexity-triggered verification tiers (LOW / STANDARD / HIGH / CRITICAL)
- judgments that change when you attach materially different supporting files
- challenger / defender / judge outputs when debate triggers
- a concrete next-step package: `missing_context`, `next_actions`, `upgrade_condition`, `downgrade_condition` (sketched below)
- a hash-chained governance record (tamper-evident, portable, offline-verifiable)
- an audit trail that can be recorded and revisited later
- a per-run markdown debate record under `workspace/debates/`
By default, those `workspace/` artifacts are created relative to the current working directory where you run `ac`.
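For illustration, a hedged sketch of what the next-step package might look like. The field names come from the list above; the values are invented, and this is not a documented schema.

```python
# Field names from the list above; values invented for illustration only.
next_steps = {
    "missing_context": ["billing-transaction-specific canary abort gate"],
    "next_actions": ["attach rollback runbook", "confirm on-call coverage"],
    "upgrade_condition": "abort gate documented and tested in staging",
    "downgrade_condition": "canary monitoring gap still open at next review",
}
```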
What the mock path is good for:
- verifying trigger behavior
- validating product surfaces
- testing hooks and audit flows
- checking whether scenario rubrics feel sensible
- proving that attached documents affect the result instead of being ignored
What it is not yet:
- a claim of full calibration across all decision types
- proof that any real model backend will match mock behavior out of the box
- a replacement for domain-specific evidence gathering
In other words, it can tell you:
- what the current recommendation is
- what context is missing
- what evidence would upgrade the decision
- what evidence would downgrade it
That is the key boundary: without background and files, it can still give you a first-pass judgment, but it should not pretend the decision is fully grounded.
It cannot honestly tell you:
- that a deploy is safe if you never gave it the deployment prep material
- that a pricing exception is acceptable if finance context was never attached
- that a reorg is wise if the launch timeline and ownership map are missing
If you want fixed prompts and a repeatable smoke test:
- golden prompts: `docs/golden-examples.md`
- decision packet template: `docs/decision-packet-template.md`
- distribution check: `python scripts/check_scenario_distribution.py`
Each `ac debate ...` run also writes a human-readable debate record like:

```
workspace/debates/20260405T113600Z-should-we-deploy-the-billing-auth-hotfix.md
```
Use real LLMs:
```bash
# Anthropic API
export ANTHROPIC_API_KEY=sk-ant-...
ac debate "topic" --adapter anthropic

# Local models (free, private)
ollama serve
ac debate "topic" --adapter ollama --model llama3

# Claude CLI
ac debate "topic" --adapter claude --model sonnet

# Mixed-model debate from the CLI
ac debate "topic" --adapter claude --model sonnet --critic-model opus --judge-model opus
```

What you get right away:
- a zero-config replay demo based on a captured live-model run
- a zero-config debate smoke test with no API key
- strict schema validation for challenger, defender, and judge output
- complexity-triggered verification with four tiers (LOW / STANDARD / HIGH / CRITICAL)
- hash-chained governance records with offline artifact verification
- a `TrustProtocol` facade for one-line integration
- a provisional governance score computed from recorded runs
- support for Mock, Anthropic, Ollama, and Claude CLI backends
Short answers to the obvious questions:
**Can I use this with a single agent?** Yes. A single assistant can still be wrapped with trigger rules, challenger/judge review, and audit history.

**Do I need premium models everywhere?** No. Use stronger models where arbitration quality matters most, usually the critic and judge.

**When does this add value over a single stronger model?** When the decision itself benefits from an explicit review process, not only a higher-quality answer.
What this is and is not:
**This is not a claim that debate always beats the strongest single model.** The point is to add challenge, arbitration, and auditability where a raw answer is not enough.

**This is not for every prompt.** The intended use case is high-stakes or hard-to-reverse decisions such as deploys, auth changes, billing logic, pricing, compliance, and major architecture calls.

**This is not fully calibrated governance out of the box.** Today the package records governance history and computes a provisional score from real runs; retrospective verification and credibility adjustment exist as library primitives and are still early-stage operationally.

**This is not free in token cost.** Structured review adds extra model calls, so it should be triggered selectively where decision quality and auditability are worth the overhead.
If you are working from a local clone instead of the published package:
```bash
git clone https://github.com/AgentPolis/agent-constitution.git
cd agent-constitution
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python -m pytest --tb=short -v
```

`tests/conftest.py` also makes local imports work when pytest is launched from the parent workspace, but editable install remains the recommended development path.
Agent Constitution does not require Sonnet- or Opus-class models to function, but model quality has a direct effect on debate quality.
- For onboarding, CI, and structure checks, `MockAdapter` is enough
- For lightweight product demos or internal prototyping, a smaller real model can be acceptable
- For high-stakes review, use a stronger reasoning model for at least the critic and judge roles
- For launch, security, compliance, pricing, architecture, or memory-conflict decisions, Sonnet / Opus class models or their equivalent are strongly recommended
Current role-model behavior:
- CLI path: `ac debate ... --adapter ... --model ...` still gives you a fast shared default, but now also supports `--analyst-*`, `--critic-*`, and `--judge-*` overrides
- Library path: you can give each `BaseAgent(...)` its own adapter and model, so heterogeneous debate is also supported programmatically
- Product choice today: same-model debates remain the easy path; mixed-model debates are also supported when you want a stronger critic or judge
Example: heterogeneous role setup
```python
from adapters import AnthropicAPIAdapter
from constitution import BaseAgent, Constitution, Debate

rules = Constitution.default()

analyst = BaseAgent(
    role="analyst",
    goal="Produce the initial opportunity assessment",
    adapter=AnthropicAPIAdapter(model="claude-sonnet-4-5"),
    constitution=rules,
)
critic = BaseAgent(
    role="critic",
    goal="Pressure-test the assessment and surface hidden risks",
    adapter=AnthropicAPIAdapter(model="claude-opus-4-1"),
    constitution=rules,
)
judge = BaseAgent(
    role="judge",
    goal="Render the most reliable final verdict",
    adapter=AnthropicAPIAdapter(model="claude-opus-4-1"),
    constitution=rules,
)

debate = Debate(challenger=critic, defender=analyst, judge=judge)
```

Equivalent CLI pattern:
ac debate "Should we ship this pricing change?" \
--adapter claude \
--model sonnet \
--critic-model opus \
--judge-model opusThree practical rules of thumb:
- Developer: start with one shared model to simplify ops, then split judge / critic onto stronger models when accuracy matters more than cost
- Product manager: if the debate outcome changes user-visible recommendations, budget for a stronger judge before you budget for a fancier analyst
- Agent designer: the critic and judge usually benefit most from stronger reasoning; the analyst can often be cheaper if its output is challengeable and schema-validated
One caution: a stronger model does not replace good role design. If the personas, constitutions, and trigger policy are weak, using Opus everywhere mostly makes the process more expensive, not more reliable.
ac debate "topic" does not skip straight to the challenger/defender/judge round.
It runs in two stages:
- the analyst produces an initial scored assessment
- the debate engine checks whether that score crosses the trigger threshold
By default, structured debate triggers only when the initial analyst score is 70/100 or higher.
- If the score is `>= 70`, the critic, defender, and judge run
- If the score is `< 70`, the CLI exits after the initial assessment and records that debate was not triggered
Quick ways to see it:
```bash
# Full CLI path: assessment first, then auto-trigger if threshold is met
ac debate "Should we expand from mid-market to enterprise this year?"

# Standalone mock demo with explicit topic input
python examples/demo_debate.py --topic "Should we expand from mid-market to enterprise this year?"
```

If you want to force a debate programmatically, call `Debate.run(...)` directly after your own scoring step. The library-level trigger check is `Debate.should_trigger(score)`.
In practice, Agent Constitution is often most useful as a library gate inside an existing agent pipeline, not only as a standalone CLI.
Common trigger patterns:
- Score-gated library call

```python
result = my_agent.run("Should we deploy to production?")
score = extract_score(result)

debate = Debate(challenger=critic, defender=my_agent, judge=judge)
if debate.should_trigger(score):
    verdict = debate.run(topic=result, initial_score=score)
```

- Hook-based governance gate
```python
from constitution import BaseAgent, Constitution, DecisionPolicy, GovernanceGateHook

policy = DecisionPolicy(
    action_types={"deploy", "launch", "migrate"},
    environments={"production"},
    critical_keywords={"auth", "billing", "security"},
    match_mode="any",
)

gate = GovernanceGateHook(
    challenger=critic,
    defender=defender,
    judge=judge,
    trigger_policy=policy,
    render_mode="summary",  # "silent" | "summary" | "full_transcript"
    response_formatter=GovernanceGateHook.chat_response_formatter("summary"),
)

agent = BaseAgent(
    role="planner",
    goal="Produce deployment recommendations",
    constitution=Constitution.default(),
    hooks=[gate],
)

response = agent.run("Should we deploy version 1.8.2 to production?")
if gate.last_result is not None:
    print(gate.last_result.verdict)
    print(gate.last_trigger_reasons)
```

- External workflow trigger
Use your own rules to decide when a debate is mandatory, or let the built-in policy gate infer common high-stakes signals from planner output. This is the path that best fits external systems such as PM planners, deploy bots, PR reviewers, or memory pipelines that are not "Agent Constitution agents" themselves.
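A hedged sketch of that pattern: an external deploy bot applies its own rule and forces a debate when it matches. `looks_high_stakes` is a hypothetical helper, not part of the package; `debate` is a `Debate` instance constructed as in the earlier examples.

```python
# Hypothetical external rule: not part of the package.
def looks_high_stakes(plan: dict) -> bool:
    return plan.get("environment") == "production" or "billing" in plan.get("summary", "")

plan = {"environment": "production", "summary": "Roll out billing-auth hotfix"}
if looks_high_stakes(plan):
    # Force a review regardless of the analyst score.
    verdict = debate.run(topic=plan["summary"])
```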
`GovernanceGateHook` is deliberately a review layer over planner output and recommendation text. It is not a runtime permission system or an execution sandbox.
Auto-trigger opportunities usually show up around:
- production deploys
- auth / billing / security-sensitive PRs
- large architectural changes
- expensive or irreversible business decisions
- planner output with `action`, `environment`, or `decision_type` fields
If your upstream system is not debate-aware, pass a dedicated `defender=` agent into `GovernanceGateHook(...)` so the gate can challenge the decision without requiring the original planner to emit debate-shaped JSON.
User-facing render modes:
- `silent`: keep the original response unchanged, but still populate `gate.last_result`
- `summary`: append or inject a compact governance verdict for chat surfaces
- `full_transcript`: attach challenges, defenses, and audit trail for audit-heavy views
For chat products, pair `render_mode="summary"` with `GovernanceGateHook.chat_response_formatter("summary")` so the user sees a polished assistant reply instead of raw JSON enrichment.
If you want the most guided end-user experience, start with `python examples/demo_interactive.py --mock`, then try `python examples/demo_governance_gate.py`, `python examples/demo_user_experience.py`, and `python examples/demo_chat_surface.py` to see what each render mode looks like in practice.
Every agent injects epistemic rules at the system-prompt level.
Speculation must be tagged `[SPECULATION]`. Bad news gets promoted. Confidence is always 0.0-1.0.
Each agent has its own SOUL.md:
```markdown
# Analyst — Nate

## Mission
Evaluate opportunities with calibrated, multi-dimensional assessments.

## Hard Constraints
- Inherits all rules from ../../CONSTITUTION.md
- Must tag any market size estimate above $10B as [SPECULATION] unless sourced
- Always present the bear case before the bull case
```

Rules live in version-controlled markdown instead of inline strings. Edit a markdown file to change how an agent thinks.
High-scoring assessments (score >= 70/100) trigger structured debates so strong recommendations are challenged before action. A challenger raises three specific challenges. The defender rebuts each. A judge renders a verdict with a score delta and full audit trail.
```python
from constitution import BaseAgent, Constitution, Debate

rules = Constitution.default()
analyst = BaseAgent(role="analyst", goal="Evaluate opportunities", constitution=rules)
critic = BaseAgent(role="critic", goal="Challenge assumptions", constitution=rules)
judge = BaseAgent(role="judge", goal="Render fair verdicts", constitution=rules)

debate = Debate(challenger=critic, defender=analyst, judge=judge)
result = debate.run(topic="Should we expand from mid-market to enterprise this year?")

result.verdict      # "proceed_with_caution"
result.score_delta  # -21
result.challenges   # ["Market is more competitive than assessed", ...]
result.audit_trail  # Full debate record
```

Debate-stage LLM responses go through separate validation before they're trusted.
The debate engine uses explicit schema validators (`_validate_challenges`, `_validate_defenses`, `_validate_verdict`) and rejects malformed challenger, defender, and judge output by default. If you want legacy fallback behavior, opt into `strict_validation=False`.
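As a sketch, assuming the flag is accepted by the `Debate` constructor (where exactly it is passed is an assumption, not confirmed above):

```python
# Assumption: strict_validation is a Debate constructor argument; shown only
# to illustrate opting into legacy fallback behavior.
debate = Debate(challenger=critic, defender=analyst, judge=judge,
                strict_validation=False)
```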
This is currently an early library primitive for prediction tracking and credibility adjustment. It is useful for experimentation and API design, but it is not yet a persisted retrospective system with end-to-end operational workflow.
```python
from constitution import Retrospective

retro = Retrospective()
pred = retro.record_prediction("analyst", "Market will grow 3x", confidence=0.75)

# ... time passes ...
retro.verify(pred.id, outcome="correct")  # credibility +0.05
retro.get_credibility("analyst")          # 1.05
```

Plug into any point in the governance pipeline without modifying core code.
```python
from constitution import BaseAgent, Debate, DebateHook, AgentHook

class AuditHook(DebateHook):
    """Log every debate step to an external system."""
    def post_verdict(self, result):
        send_to_datadog(result.audit_trail)
        return result

class CostApprovalHook(AgentHook):
    """Allow cost overruns instead of crashing."""
    def on_cost_limit(self, agent, cost_usd, total_cost):
        return "warn"  # "raise" (default) | "warn" | "allow"

# Built-in gate for existing agent pipelines
from constitution import DecisionPolicy, GovernanceGateHook

# Hooks compose — pass multiple, they chain in order
debate = Debate(challenger, defender, judge, hooks=[AuditHook()])
agent = BaseAgent(role="analyst", goal="Evaluate", hooks=[CostApprovalHook()])

policy = DecisionPolicy.high_stakes_default()
gate = GovernanceGateHook(
    challenger=critic,
    defender=defender,
    judge=judge,
    trigger_policy=policy,
    render_mode="summary",
)
```

Available hook points:
| Hook | When | Can modify |
|---|---|---|
| `AgentHook.pre_call` | Before LLM call | Prompt |
| `AgentHook.post_call` | After LLM call | Response content |
| `AgentHook.on_cost_limit` | Cost would exceed limit | Raise / warn / allow |
| `DebateHook.pre_challenge` | Before challenger runs | Topic |
| `DebateHook.post_challenge` | After challenge validation | Challenges list, revalidated before use |
| `DebateHook.pre_defense` | Before defender runs | Challenges |
| `DebateHook.post_defense` | After defense validation | Defenses list, revalidated before use |
| `DebateHook.pre_verdict` | Before judge runs | Abort (raise) |
| `DebateHook.post_verdict` | After verdict | Full result, revalidated before return |
| `DebateHook.on_validation_error` | Schema validation fails | Raise / fallback |
Hooks are best for logging, policy gates, and controlled transformations. In strict mode, any hook mutation that breaks the validated debate schema is rejected. `DecisionPolicy` lets a gate trigger from score, action type, environment, or critical keywords instead of relying on a single hard-coded score path.
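For instance, a minimal custom hook on the `pre_call` point from the table above. The exact method signature is an assumption for illustration, not a documented contract:

```python
from constitution import AgentHook

class RedactionHook(AgentHook):
    """Rewrite prompts before they reach the LLM (signature assumed for illustration)."""
    def pre_call(self, agent, prompt):
        # e.g. strip internal ticket prefixes before the model sees them
        return prompt.replace("JIRA-", "TICKET-")
```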
Measure how well-governed your agent system is from recorded CLI runs:
```bash
ac score
```

```
Dimension                  Score  Weight
─────────────────────────  ─────  ──────
Epistemic Honesty           8/10    25%
Constitutional Compliance   7/10    25%
Debate Rigor                6/10    20%
Calibration Accuracy        N/A     15%
Audit Completeness          9/10    15%

Provisional Governance Score: 6.3/10
```
The governance score tracks five dimensions: epistemic honesty, constitutional compliance, debate rigor, calibration accuracy, and audit completeness. `ac debate` records governance data to `workspace/governance_history.json`, and `ac score` aggregates those real runs instead of printing placeholders. Until you verify retrospectives, the report stays explicitly uncalibrated and should be treated as a provisional operational snapshot rather than a final grade. It is best read as a process-level proxy for how the workflow is being used, not as proof of real-world decision quality.
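For illustration, one consistent reading of the sample report above is that an unverified dimension (N/A) simply contributes nothing rather than being renormalized; the package's exact aggregation may differ.

```python
# A hedged reading of the sample report above: an N/A dimension contributes
# nothing. This reproduces 6.3/10 but may not match the package's internals.
dimensions = {
    "epistemic_honesty":         (8, 0.25),
    "constitutional_compliance": (7, 0.25),
    "debate_rigor":              (6, 0.20),
    "calibration_accuracy":      (None, 0.15),  # N/A until retrospectives verify
    "audit_completeness":        (9, 0.15),
}
score = sum(v * w for v, w in dimensions.values() if v is not None)
print(f"Provisional Governance Score: {score:.1f}/10")  # 6.3/10
```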
2026 is the year of agent governance. Singapore launched the world's first Agentic AI Governance Framework at WEF 2026. Gartner predicts 40% of enterprise apps will feature AI agents by year-end. There is growing attention on governance, but most public discussion still sits at the framework or policy layer.
Agent Constitution is one attempt to turn that discussion into installable code: explicit constitutional rules, policy-based triggering, structured adversarial review, and auditable outputs. It is not the only way to approach this problem, but it is a concrete starting point.
Agent Constitution is not a replacement for orchestration frameworks. It is a governance layer that can work alongside them.
| Dimension | CrewAI | LangGraph | AutoGen | Agent Constitution |
|---|---|---|---|---|
| Agent coordination | Yes | Yes | Yes | Debate-scoped only |
| Adversarial debate | Not built-in | Not built-in | Via GroupChat | Structured + schema-validated |
| Retrospective verification | Not built-in | Not built-in | Not built-in | Library primitives |
| Human-readable rules (SOUL.md) | Not built-in | Not built-in | Not built-in | Yes |
| Team governance | Not built-in | Not built-in | Limited | Core focus |
| Cost tracking | Via LiteLLM | Via callbacks | Via token tracking | Built-in + hooks |
These frameworks solve how agents coordinate. Agent Constitution focuses on a different question: how decisions get challenged, judged, and audited. They are complementary rather than competing.
The same governance ideas can wrap a single personal agent too.
```python
from constitution import BaseAgent, Constitution
from adapters import OllamaAdapter

personal = BaseAgent(
    role="personal_assistant",
    goal="Help me think clearly",
    constitution=Constitution.from_soul_md("my_soul.md"),
    adapter=OllamaAdapter(model="llama3"),  # Free, local
)
```

Same constitutional governance: epistemic honesty, self-challenge, calibrated confidence.
```bash
python examples/demo_personal.py
```

| Adapter | Requires | Use case |
|---|---|---|
| `MockAdapter` | Nothing | Testing, demos, CI |
| `AnthropicAPIAdapter` | `ANTHROPIC_API_KEY` | Production with API billing |
| `ClaudeCLIAdapter` | Claude Max subscription | Local dev with Claude CLI |
| `OllamaAdapter` | Ollama running locally | Free, private, any open model |
Add your own:
```python
from adapters import LLMAdapter, LLMResponse

class MyAdapter(LLMAdapter):
    def call(self, messages, system_prompt="", tools=None, max_tokens=4096) -> LLMResponse:
        ...
```

| Module | What it does |
|---|---|
| `constitution/debate.py` | Adversarial debate engine + schema validators |
| `constitution/retrospective.py` | Early library primitive for prediction tracking and credibility adjustment |
| `constitution/governance_score.py` | 5-dimension governance scoring from recorded runs |
| `constitution/cost_guard.py` | Token cost accounting with hard-limit enforcement after each call |
| `constitution/base_agent.py` | `BaseAgent` with constitution injection |
| `constitution/hooks.py` | `AgentHook` + `DebateHook` lifecycle system |
| `constitution/cli.py` | `ac` CLI entry point (`ac debate`, `ac score`) |
| `adapters/mock.py` | Debate-aware mock adapter (zero API key) |
| `adapters/anthropic_api.py` | Anthropic API adapter |
| `adapters/ollama.py` | Ollama local models adapter |
| `adapters/claude_cli.py` | Claude CLI adapter |
| Tech | Role |
|---|---|
| Python 3.11+ | Runtime |
| Rich | CLI formatting and tables |
| PyYAML | Constitution / SOUL.md loading |
| httpx | HTTP client for Ollama and API adapters |
| pytest | 215 tests, zero API keys required |
| ruff | Linting and formatting |
```
CONSTITUTION.md                Shared epistemic rules (injected to all agents)
examples/agents/
  analyst/SOUL.md              Analyst identity, values, constraints
  critic/SOUL.md               Critic persona, debate role
  judge/SOUL.md                Judge impartiality rules
adapters/
  mock.py                      Debate-aware mock (zero API key)
  anthropic_api.py             Anthropic API
  claude_cli.py                Claude CLI
  ollama.py                    Ollama local models
constitution/
  base_agent.py                BaseAgent with constitution injection
  constitution.py              Constitution loader (SOUL.md / YAML / default)
  debate.py                    Adversarial debate engine + schema validators
  signal.py + signal_pool.py   Early signal primitives for dedup and pooling
  cost_guard.py                Token cost accounting and hard-limit enforcement
  trace.py                     RunTrace audit trail
  retrospective.py             Early prediction recording primitive
  governance_score.py          Five-dimension governance scoring
  cli.py                       ac CLI
```
- Generator/Validator separation: Debate-stage LLM responses are generated, then validated by a separate function. The debate engine uses `_validate_challenges()`, `_validate_defenses()`, and `_validate_verdict()` and raises `DebateValidationError` on malformed debate output by default.
- Constitution as prompt injection: Rules live in markdown files, not Python strings. `SOUL.md` files are human-readable and version-controllable.
- Cost guard with hard limit: Cost is accounted for after each completed call. If cumulative cost would cross the hard limit, the guard raises `CostLimitExceeded` immediately after that call unless a hook explicitly allows the over-limit result.
Agent Constitution emerged from sustained experimentation with real decision workflows, review gates, and agent-mediated judgment loops.
It reflects a point of view shaped by implementation and repeated testing: for certain classes of decisions, improving model capability alone may not be enough. The decision itself can benefit from explicit challenge, arbitration, and auditability.
This repository does not present itself as a finalized standard for agent governance. It is a concrete, working proposal for how governance can be added to machine-made decisions without rebuilding an entire system from scratch.
We hope it serves both as a usable toolkit and as an invitation:
- to evaluate these ideas in real workflows
- to challenge the assumptions behind them
- to contribute stronger evidence, counterexamples, and better patterns
Multi-agent debate improves factual reasoning and reduces hallucination (Du et al., 2023). Heterogeneous agents with dynamic debate mechanisms outperform homogeneous approaches (FREE-MAD, 2025).
Agent Constitution draws on these findings and adapts them into a practical, installable governance workflow. The research is still evolving, and so is this project.
These are exploratory directions, not shipped features.
Current scope — Governance harness
- Constitutional agent governance via `SOUL.md`
- Adversarial debate engine (challenger/defender/judge)
- Early retrospective primitives for prediction tracking and credibility adjustment
- Personal agent mode
- 4 LLM backends: Mock, Anthropic, Claude CLI, Ollama
- `ac` CLI + Governance Score
Near-term extensions
- Per-turn token budgets (not just session-level hard limits)
- Permission gates on adapter calls (sub-agents get restricted scope)
- Auto-compaction with semantic retention for long-running sessions
- Consolidation engine: background learning extraction during idle time
Longer-term experiments
- Skill auto-creation from experience (with adversarial review before promotion)
- Dream/consolidation cycle: session end → extract learnings → update SOUL.md
- Memory MCP server (recall/store/consolidate across sessions)
- Model Context Protocol integration (tools as MCP servers)
- Cross-framework collaboration patterns
- Multi-platform gateway (Discord, Telegram, Slack)
Why start with governance? Because protocols solve how agents communicate. Agent Constitution focuses on a complementary question: how decisions get challenged, judged, and audited. We believe this layer is worth getting right early.
Apache-2.0 — see LICENSE
Use it freely. Modify it freely. Build on it commercially. Contributions are welcome under the project's CLA.