diff --git a/README.md b/README.md
index 1b35e5c..6e7493b 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,666 @@
-# AREF - Adaptive Resilience Engineering Framework
+
+
+# AREF — Adaptive Resilience Engineering Framework
+
+**A comprehensive systems-level platform for infrastructure resilience, failure recovery, and operational continuity**
+
+*Author: Dennis 'dnoice' Smaltz — digiSpace Technical Studio*
+
+
+
+---
+
+## Table of Contents
+
+- [Overview](#overview)
+- [The Five Pillars of Resilience](#the-five-pillars-of-resilience)
+- [Architecture](#architecture)
+- [Key Features](#key-features)
+- [Tech Stack](#tech-stack)
+- [Quick Start](#quick-start)
+ - [Option 1 — Full Stack (Docker)](#option-1--full-stack-docker)
+ - [Option 2 — Dashboard Only (Local)](#option-2--dashboard-only-local)
+ - [Option 3 — Interactive Demo](#option-3--interactive-demo)
+ - [Option 4 — CLI](#option-4--cli)
+- [Installation](#installation)
+- [Configuration](#configuration)
+- [Microservices](#microservices)
+- [Dashboard & API](#dashboard--api)
+- [CLI Reference](#cli-reference)
+- [Chaos Engineering](#chaos-engineering)
+- [Composite Resilience Score (CRS)](#composite-resilience-score-crs)
+- [Maturity Model](#maturity-model)
+- [Runbooks](#runbooks)
+- [Testing](#testing)
+- [Project Structure](#project-structure)
+- [Contributing](#contributing)
+- [License](#license)
+
+---
+
+## Overview
+
+**AREF** (Adaptive Resilience Engineering Framework) is a production-grade reference implementation of a **self-healing distributed systems platform**. It operationalises the five engineering disciplines required to keep complex services reliable under failure conditions:
+
+1. **Detect** anomalies early, before they cascade
+2. **Absorb** the blast radius and contain the damage
+3. **Adapt** dynamically by reconfiguring in real time
+4. **Recover** in a tiered, runbook-driven fashion
+5. **Evolve** continuously through post-incident learning
+
+AREF ships with a live control-plane dashboard, a rich CLI, five instrumented microservices, a chaos-injection engine, Prometheus/Grafana observability, and a complete test suite — everything needed to study, demo, and extend resilience engineering patterns.
+
+> **Status:** Reference / Framework stage. All state is in-memory. PostgreSQL and Redis are provisioned in Docker but not yet wired to the application layer (planned future phase).
+
+---
+
+## The Five Pillars of Resilience
+
+```
+┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
+│ Pillar I │ │ Pillar II │ │ Pillar III │ │ Pillar IV │ │ Pillar V │
+│ DETECTION │→ │ ABSORPTION │→ │ ADAPTATION │→ │ RECOVERY │→ │ EVOLUTION │
+│ │ │ │ │ │ │ │ │ │
+│ Early │ │ Impact │ │ Real-time │ │ Tiered │ │ Post- │
+│ warning & │ │ containment │ │ reconfig- │ │ service │ │ incident │
+│ anomaly │ │ & blast │ │ uration │ │ restoration │ │ learning │
+│ detection │ │ radius ctrl │ │ │ │ (T0 – T4) │ │ │
+└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
+```
+
+| Pillar | Purpose | Key Components |
+|--------|---------|----------------|
+| **I — Detection** | Identify failures before they become outages | Threshold detector, statistical anomaly detector (Z-score), synthetic probing, SLI/SLO tracking |
+| **II — Absorption** | Contain blast radius and prevent cascades | Circuit breaker (3-state FSM), bulkhead isolation, token-bucket rate limiter, 4-tier graceful degradation, blast-radius graph |
+| **III — Adaptation** | Reconfigure the system in real-time | Feature flags, weighted traffic shifting, horizontal auto-scaler, 6-step decision tree |
+| **IV — Recovery** | Restore service in a structured, time-boxed way | Tiered runbook executor (T0–T4), YAML runbooks, incident commander workflow |
+| **V — Evolution** | Turn every incident into a system improvement | Automated post-incident reviews, action-item tracker, pattern matcher, knowledge base |
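+
+As an illustration of the Z-score technique named under Pillar I, the following minimal sketch flags a sample that sits too many standard deviations away from a baseline. It is not the code in AREF's `anomaly.py`; the function name `is_anomalous`, the threshold of 3.0, and the latency samples are assumptions chosen for the example.
+
+```python
+from statistics import mean, stdev
+
+
+def is_anomalous(baseline: list[float], sample: float, threshold: float = 3.0) -> bool:
+    """Return True when `sample` deviates from the baseline by more than `threshold` sigmas."""
+    if len(baseline) < 2:
+        return False  # not enough history to estimate spread
+    sigma = stdev(baseline)
+    if sigma == 0:
+        return sample != baseline[0]  # flat baseline: any change is a deviation
+    z_score = abs(sample - mean(baseline)) / sigma
+    return z_score > threshold
+
+
+# Hypothetical latency samples in milliseconds.
+latencies_ms = [101.0, 99.0, 100.0, 102.0, 98.0, 100.0]
+print(is_anomalous(latencies_ms, 103.0))  # small deviation, inside the threshold
+print(is_anomalous(latencies_ms, 250.0))  # large spike, outside the threshold
+```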
+
+---
+
+## Architecture
+
+```
+ ┌─────────────────────────────────┐
+ │ AREF Dashboard (port 8080) │
+ │ FastAPI + SPA Web UI │
+ │ REST API / Control Plane │
+ └────────────┬────────────────────┘
+ │ events / polling
+ ┌────────────────────────▼────────────────────────────┐
+ │ AREF Engine Core │
+ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
+ │ │Detection │ │Absorption│ │Adaptation│ │
+ │ │ Engine │ │ Engine │ │ Engine │ │
+ │ └──────────┘ └──────────┘ └──────────┘ │
+ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
+ │ │Recovery │ │Evolution │ │ Maturity │ │
+ │ │ Engine │ │ Engine │ │ Assessor │ │
+ │ └──────────┘ └──────────┘ └──────────┘ │
+ │ Event Bus (pub/sub) │
+ └───────────────────┬─────────────────────────────────┘
+ │ HTTP / health checks
+ ┌─────────────────────────▼──────────────────────────────────┐
+ │ Microservice Layer │
+ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
+ │ │ Gateway │ │ Orders │ │ Payments │ │ Inventory │ │
+ │ │ :8000 │ │ :8001 │ │ :8002 │ │ :8003 │ │
+ │ └──────────┘ └──────────┘ └──────────┘ └──────────────┘ │
+ │ ┌──────────────────┐ │
+ │ │ Notifications │ │
+ │ │ :8004 │ │
+ │ └──────────────────┘ │
+ └────────────────────────────────────────────────────────────┘
+ │
+ ┌─────────────────────────▼──────────────────────────────────┐
+ │ Observability Layer │
+ │ Prometheus :9090 · Grafana :3000 │
+ └────────────────────────────────────────────────────────────┘
+```
+
+### Event-Driven Communication
+
+All engines communicate through an **async in-process event bus** using a `"category.event_type"` topic format:
+
+| Topic | Published by |
+|-------|-------------|
+| `detection.alert_fired` | Detection Engine |
+| `absorption.circuit_breaker_opened` | Absorption Engine |
+| `adaptation.adaptation_executed` | Adaptation Engine |
+| `recovery.recovery_started` | Recovery Engine |
+| `recovery.recovery_resolved` | Recovery Engine |
+| `evolution.post_incident_review_generated` | Evolution Engine |
+
+Engines can subscribe with wildcard patterns: `"detection.*"` matches every detection event, and `"*"` matches all events.
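+
+The topic format and wildcard subscriptions described above can be sketched with a tiny in-process bus. This is a hedged illustration rather than the implementation in `aref/core/events.py`: the class `TinyEventBus`, its methods, and the use of `fnmatch` for pattern matching are all assumptions made for the example.
+
+```python
+import asyncio
+from collections import defaultdict
+from fnmatch import fnmatchcase
+from typing import Awaitable, Callable
+
+Handler = Callable[[str, dict], Awaitable[None]]
+
+
+class TinyEventBus:
+    """Illustrative stand-in, not the AREF event bus."""
+
+    def __init__(self) -> None:
+        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
+
+    def subscribe(self, pattern: str, handler: Handler) -> None:
+        self._subscribers[pattern].append(handler)
+
+    async def publish(self, topic: str, payload: dict) -> int:
+        """Deliver to every handler whose pattern matches; return the delivery count."""
+        delivered = 0
+        for pattern, handlers in self._subscribers.items():
+            if fnmatchcase(topic, pattern):
+                for handler in handlers:
+                    await handler(topic, payload)
+                    delivered += 1
+        return delivered
+
+
+async def main() -> list[str]:
+    bus = TinyEventBus()
+    seen: list[str] = []
+
+    async def on_detection(topic: str, payload: dict) -> None:
+        seen.append(f"detection-handler:{topic}")
+
+    async def on_anything(topic: str, payload: dict) -> None:
+        seen.append(f"catch-all:{topic}")
+
+    bus.subscribe("detection.*", on_detection)
+    bus.subscribe("*", on_anything)
+
+    await bus.publish("detection.alert_fired", {"service": "payments"})
+    await bus.publish("recovery.recovery_started", {"service": "payments"})
+    return seen
+
+
+events = asyncio.run(main())
+print(events)
+```
+
+In this sketch the detection handler receives only the first event, while the catch-all handler receives both.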
+
+---
+
+## Key Features
+
+- **Five-Pillar Resilience Framework** — A complete, end-to-end incident lifecycle implemented in code
+- **Composite Resilience Score (CRS)** — A single weighted metric (0–5) reflecting overall system resilience across configurable risk profiles
+- **Five-Level Maturity Model** — Per-pillar gap analysis from *Reactive* to *Optimizing*
+- **Live Dashboard** — Single-page control plane with real-time CRS, pillar health, incident timeline, and chaos controls
+- **Rich CLI** — Full `aref` command-line interface for status, maturity, chaos, and timeline
+- **Chaos Engineering Engine** — Five pre-defined fault-injection experiments with automatic rollback
+- **Runbook-Driven Recovery** — YAML-defined, version-controlled runbooks with T0–T4 tiering
+- **Prometheus + Grafana** — Out-of-the-box metrics, scrape configs, and Grafana provisioning
+- **Five Instrumented Microservices** — Gateway + Orders + Payments + Inventory + Notifications with shared factory
+- **Structured Logging** — structlog throughout, with correlation IDs
+- **Interactive Demo** — End-to-end Rich terminal demo with scenario walkthroughs
+- **Full Test Suite** — Unit and integration tests covering the complete incident lifecycle
+
+---
+
+## Tech Stack
+
+| Layer | Technology |
+|-------|-----------|
+| Language | Python 3.11+ |
+| Web Framework | FastAPI 0.110+ |
+| ASGI Server | Uvicorn |
+| Data Validation | Pydantic v2, pydantic-settings |
+| CLI | Click + Rich |
+| Metrics | Prometheus Client, prometheus-fastapi-instrumentator |
+| Anomaly Detection | NumPy (Z-score, statistical baseline) |
+| Logging | structlog |
+| Runbooks | PyYAML |
+| HTTP Client | HTTPX |
+| Containerisation | Docker, Docker Compose |
+| Observability | Prometheus, Grafana |
+| Database (provisioned) | PostgreSQL 16 |
+| Cache (provisioned) | Redis 7 |
+| Build | Hatchling |
+| Test | pytest, pytest-asyncio, pytest-cov |
+| Lint / Format | Ruff |
+| Type Check | mypy (strict) |
+
+---
+
+## Quick Start
+
+### Option 1 — Full Stack (Docker)
+
+The fastest way to run everything — five microservices, the AREF control plane, Prometheus, and Grafana:
+
+```bash
+git clone https://github.com/dnoice/AREF.git
+cd AREF
+
+# (Optional) copy and customise environment config
+cp .env.example .env
+
+docker compose up --build
+```
+
+| Service | URL |
+|---------|-----|
+| AREF Dashboard | http://localhost:8080 |
+| API Gateway | http://localhost:8000 |
+| Orders Service | http://localhost:8001 |
+| Payments Service | http://localhost:8002 |
+| Inventory Service | http://localhost:8003 |
+| Notifications Service | http://localhost:8004 |
+| Prometheus | http://localhost:9090 |
+| Grafana | http://localhost:3000 (admin / aref) |
+
+### Option 2 — Dashboard Only (Local)
+
+Run just the AREF control plane locally without Docker:
+
+```bash
+# Install with dev dependencies
+pip install -e ".[dev]"
+
+# Start the dashboard
+uvicorn aref.dashboard.app:app --port 8080 --reload
+```
+
+Open http://localhost:8080 in your browser.
+
+### Option 3 — Interactive Demo
+
+An end-to-end Rich terminal walkthrough of the full incident lifecycle:
+
+```bash
+pip install -e ".[dev]"
+
+# Run the full interactive demo
+python -m scripts.demo
+
+# Run a specific experiment directly
+python -m scripts.demo --experiment payment_provider_failure
+```
+
+### Option 4 — CLI
+
+```bash
+pip install -e ".[dev]"
+
+aref status # Platform overview & CRS
+aref pillars # Per-pillar health & scores
+aref maturity # Maturity assessment & gap analysis
+aref timeline # Recent event history
+aref chaos list # Available chaos experiments
+aref chaos run payment_provider_failure # Inject a fault
+aref serve # Start dashboard from CLI
+```
+
+---
+
+## Installation
+
+**Prerequisites:** Python 3.11+, pip
+
+```bash
+# Clone the repository
+git clone https://github.com/dnoice/AREF.git
+cd AREF
+
+# Install runtime dependencies only
+pip install -e .
+
+# Install with developer tooling (tests, lint, type-check)
+pip install -e ".[dev]"
+```
+
+---
+
+## Configuration
+
+All configuration is driven by environment variables (or a `.env` file). Copy the template to get started:
+
+```bash
+cp .env.example .env
+```
+
+### General Settings
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `AREF_ENVIRONMENT` | `development` | Runtime environment; set to `docker` so services are addressed by container hostname |
+| `AREF_DEBUG` | `true` | Enable debug mode |
+| `AREF_LOG_LEVEL` | `INFO` | Log verbosity |
+| `AREF_RISK_PROFILE` | `balanced` | CRS weighting profile (see [CRS section](#composite-resilience-score-crs)) |
+| `AREF_API_HOST` | `0.0.0.0` | API bind address |
+| `AREF_API_PORT` | `8000` | API listen port |
+| `AREF_DASHBOARD_PORT` | `8080` | Dashboard listen port |
+
+### Pillar-Specific Settings (selected)
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `AREF_DETECTION_MTTD_TARGET_SECONDS` | `300` | MTTD target (< 5 min) |
+| `AREF_ABSORPTION_CIRCUIT_BREAKER_FAILURE_THRESHOLD` | `5` | Failures before circuit opens |
+| `AREF_ABSORPTION_CIRCUIT_BREAKER_RECOVERY_TIMEOUT` | `30` | Seconds before half-open |
+| `AREF_ABSORPTION_RATE_LIMIT_REQUESTS_PER_SECOND` | `100` | Token bucket rate limit |
+| `AREF_ADAPTATION_ADAPTATION_WINDOW_SECONDS` | `120` | Window before escalating to recovery |
+| `AREF_RECOVERY_MTTR_TARGET_SECONDS` | `900` | MTTR target (< 15 min) |
+| `AREF_EVOLUTION_ACTION_COMPLETION_RATE_TARGET` | `85` | Target action completion % |
+
+See [`.env.example`](.env.example) for the full list of 60+ variables covering all five pillars, database, Redis, and Grafana.
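+
+To make the two circuit-breaker settings above concrete, here is a minimal sketch of a three-state breaker that opens after 5 failures and moves to half-open after 30 seconds. It is an assumption-laden illustration, not the code in `aref/absorption/circuit_breaker.py`: the class, method names, and injectable `clock` parameter are hypothetical.
+
+```python
+import time
+from enum import Enum
+
+
+class State(Enum):
+    CLOSED = "closed"
+    OPEN = "open"
+    HALF_OPEN = "half_open"
+
+
+class CircuitBreaker:
+    """Illustrative 3-state breaker; names and behaviour are assumptions."""
+
+    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 30.0,
+                 clock=time.monotonic) -> None:
+        self.failure_threshold = failure_threshold
+        self.recovery_timeout = recovery_timeout
+        self._clock = clock
+        self._failures = 0
+        self._opened_at = 0.0
+        self._state = State.CLOSED
+
+    @property
+    def state(self) -> State:
+        # An open breaker moves to half-open once the recovery timeout has elapsed.
+        if self._state is State.OPEN and self._clock() - self._opened_at >= self.recovery_timeout:
+            self._state = State.HALF_OPEN
+        return self._state
+
+    def allow_request(self) -> bool:
+        return self.state is not State.OPEN
+
+    def record_success(self) -> None:
+        self._failures = 0
+        self._state = State.CLOSED
+
+    def record_failure(self) -> None:
+        self._failures += 1
+        # A failed trial call in half-open, or too many failures, (re)opens the circuit.
+        if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
+            self._state = State.OPEN
+            self._opened_at = self._clock()
+
+
+# Drive the breaker with a fake clock so the example runs instantly.
+now = [0.0]
+breaker = CircuitBreaker(clock=lambda: now[0])
+for _ in range(5):
+    breaker.record_failure()
+print(breaker.state)  # open after five consecutive failures
+now[0] += 30.0
+print(breaker.state)  # half-open once 30 seconds have passed
+breaker.record_success()
+print(breaker.state)  # closed again after a successful trial call
+```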
+
+---
+
+## Microservices
+
+All five services are built through the shared `services/base.py` factory (`create_service()`), which automatically provides:
+
+- Prometheus instrumentation (`/metrics`)
+- Health and readiness probes (`/health`, `/readyz`)
+- Service info endpoint (`/info`)
+- Correlation ID propagation
+- Structured logging
+- CORS middleware
+
+| Service | Port | Responsibility |
+|---------|------|---------------|
+| **Gateway** | 8000 | Request routing, retry logic, circuit-breaker awareness, order pipeline orchestration |
+| **Orders** | 8001 | Order lifecycle state machine, payment callbacks, inventory coordination, audit trail |
+| **Payments** | 8002 | Payment provider integration, failure simulation, queued payments |
+| **Inventory** | 8003 | Stock management, reservation, degradation scenarios |
+| **Notifications** | 8004 | Queue-based notification dispatch |
+
+---
+
+## Dashboard & API
+
+The AREF Dashboard is a **FastAPI** application (`aref/dashboard/app.py`) that serves:
+
+- A **single-page web application** (`/`) with six tabs: Overview, Pillars, Services, Metrics, Chaos, and Timeline
+- A **REST API** under `/api/v1/` with four route groups:
+
+| Route Group | Path Prefix | Description |
+|-------------|------------|-------------|
+| Status | `/api/v1/status/` | Platform status, service health, incident list |
+| Pillars | `/api/v1/pillars/` | Per-pillar status, scores, and active alerts |
+| Metrics | `/api/v1/metrics/` | CRS, MTTD, MTTR, pillar scores |
+| Chaos | `/api/v1/chaos/` | List experiments, inject faults, rollback |
+
+The dashboard polls all five microservices every 10 seconds and refreshes the UI with each result.
+
+---
+
+## CLI Reference
+
+The `aref` CLI is built with Click and Rich:
+
+```
+Usage: aref [OPTIONS] COMMAND [ARGS]...
+
+Commands:
+ status Display platform status and overall CRS score
+ pillars Show per-pillar health, scores, and active alerts
+ maturity Run maturity assessment and gap analysis
+ timeline Display recent event history from the event bus
+ chaos Manage and run chaos experiments
+ serve Start the AREF dashboard server
+```
+
+```bash
+# View platform status with CRS
+aref status
+
+# Detailed pillar breakdown
+aref pillars
+
+# Maturity assessment across all five pillars
+aref maturity
+
+# Recent event timeline (last 20 events)
+aref timeline --limit 20
+
+# List available chaos experiments
+aref chaos list
+
+# Run a specific experiment
+aref chaos run cascading_failure
+
+# Start the dashboard (alternative to uvicorn)
+aref serve --port 8080
+```
+
+---
+
+## Chaos Engineering
+
+AREF ships with a **FaultInjector** (`chaos/injector.py`) and five pre-defined experiments (`chaos/experiments.py`):
+
+| Experiment | Target | Fault Type | Description |
+|------------|--------|-----------|-------------|
+| `payment_provider_failure` | Payments | Error injection | Simulates a payment provider outage |
+| `order_service_latency` | Orders | Latency injection | Adds artificial latency to order processing |
+| `inventory_degradation` | Inventory | Error rate + degradation | Triggers graceful degradation mode |
+| `notification_overload` | Notifications | Load spike | Floods the notification queue |
+| `cascading_failure` | Multiple | Multi-service | Simulates a cascading cross-service failure |
+
+All experiments include **automatic rollback** — the injector restores the original service behaviour when the experiment concludes or times out.
+
+### Running Chaos via the API
+
+```bash
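+# List experiments (the dashboard serves the API on port 8080 by default)
+curl http://localhost:8080/api/v1/chaos/experiments
+
+# Start an experiment; set EXPERIMENT_ID to an ID returned by the list call
+curl -X POST "http://localhost:8080/api/v1/chaos/experiments/${EXPERIMENT_ID}/start"
+
+# Roll back (stop the experiment)
+curl -X POST "http://localhost:8080/api/v1/chaos/experiments/${EXPERIMENT_ID}/stop"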
+```
+
+---
+
+## Composite Resilience Score (CRS)
+
+The **CRS** is a single weighted metric in the range **0.0 – 5.0** that reflects overall system resilience. The weight of each pillar is determined by the active **risk profile**:
+
+| Pillar | Availability Critical | Data Integrity Critical | Balanced | Innovation Heavy |
+|--------|----------------------|------------------------|---------|----------------|
+| Detection | 30% | 20% | 20% | 15% |
+| Absorption | 25% | 20% | 20% | 15% |
+| Adaptation | 20% | 15% | 20% | 25% |
+| Recovery | 15% | 30% | 20% | 15% |
+| Evolution | 10% | 15% | 20% | 30% |
+
+Set the risk profile via the `AREF_RISK_PROFILE` environment variable (options: `balanced`, `availability_critical`, `data_integrity_critical`, `innovation_heavy`).
+
+**Formula:**
+```
+CRS = Σ (pillar_score × weight) for each pillar in {I, II, III, IV, V}
+```
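+
+As a worked example of the formula, the sketch below applies the weights from the table to one set of pillar scores. The weights come from the table above; the pillar scores, the dictionary layout, and the function name `composite_resilience_score` are assumptions for illustration and may differ from the engine in `aref/core/metrics.py`.
+
+```python
+# Weights per risk profile, copied from the table above (as fractions of 1.0).
+RISK_PROFILES = {
+    "availability_critical": {
+        "detection": 0.30, "absorption": 0.25, "adaptation": 0.20,
+        "recovery": 0.15, "evolution": 0.10,
+    },
+    "data_integrity_critical": {
+        "detection": 0.20, "absorption": 0.20, "adaptation": 0.15,
+        "recovery": 0.30, "evolution": 0.15,
+    },
+    "balanced": {
+        "detection": 0.20, "absorption": 0.20, "adaptation": 0.20,
+        "recovery": 0.20, "evolution": 0.20,
+    },
+    "innovation_heavy": {
+        "detection": 0.15, "absorption": 0.15, "adaptation": 0.25,
+        "recovery": 0.15, "evolution": 0.30,
+    },
+}
+
+
+def composite_resilience_score(pillar_scores: dict[str, float], profile: str = "balanced") -> float:
+    """Weighted sum of pillar scores for the chosen risk profile."""
+    weights = RISK_PROFILES[profile]
+    return sum(pillar_scores[pillar] * weight for pillar, weight in weights.items())
+
+
+# Hypothetical pillar scores on the 0-5 scale.
+scores = {"detection": 4.0, "absorption": 3.0, "adaptation": 3.0, "recovery": 2.0, "evolution": 1.0}
+for profile in RISK_PROFILES:
+    print(f"{profile}: {composite_resilience_score(scores, profile):.2f}")
+```
+
+With these assumed scores, the `balanced` profile yields 2.60 and `availability_critical` yields 2.95, because the latter puts more weight on the stronger Detection and Absorption pillars.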
+
+---
+
+## Maturity Model
+
+Each pillar is assessed independently on a **five-level maturity scale**:
+
+| Level | Name | Characteristics |
+|-------|------|----------------|
+| **1** | Reactive | Ad-hoc responses, no documented processes |
+| **2** | Managed | Repeatable processes, basic monitoring in place |
+| **3** | Defined | Documented procedures, standard tooling adopted |
+| **4** | Measured | Quantified metrics, targets tracked and met |
+| **5** | Optimizing | Continuous improvement, full automation, feedback loops |
+
+The **MaturityAssessor** (`aref/maturity/model.py`) calculates a score for each pillar, identifies gaps, and generates prioritised improvement recommendations. Access it via `aref maturity` or the Dashboard *Maturity* tab.
+
+---
+
+## Runbooks
+
+Recovery runbooks are **YAML-defined** files stored in `runbooks/` and executed by the `RunbookExecutor` (`aref/recovery/runbooks.py`).
+
+### Runbook Structure
+
+```yaml
+runbooks:
+ - name: payment_t0_stabilize
+ service: payments
+ tier: 0 # T0 = fully automated, 0–5 minutes
+ version: "1.0.0"
+ description: "Emergency stabilization for payment provider outage"
+ steps:
+ - order: 1
+ action: detect_payment_failures
+ automated: true
+ timeout_seconds: 10
+ - order: 2
+ action: open_circuit_breaker
+ automated: true
+ timeout_seconds: 5
+ - order: 3
+ action: switch_provider
+ automated: true
+ timeout_seconds: 10
+ escalation: "Page on-call if backup provider also fails"
+```
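+
+One way to picture how such a runbook might be executed is to run the steps in `order`, bound each by its `timeout_seconds`, and surface the `escalation` text when a step overruns. The sketch below assumes exactly that; it is not the `RunbookExecutor` itself. The runbook is written as a plain dict to skip YAML parsing, and `run_runbook` and the `_noop` handler are hypothetical names.
+
+```python
+import asyncio
+
+# Mirrors the YAML structure above as a plain dict (parsing is omitted here).
+RUNBOOK = {
+    "name": "payment_t0_stabilize",
+    "steps": [
+        {"order": 1, "action": "detect_payment_failures", "timeout_seconds": 10},
+        {"order": 2, "action": "open_circuit_breaker", "timeout_seconds": 5},
+        {"order": 3, "action": "switch_provider", "timeout_seconds": 10},
+    ],
+    "escalation": "Page on-call if backup provider also fails",
+}
+
+
+async def _noop(action: str) -> str:
+    """Placeholder handler standing in for real remediation work."""
+    await asyncio.sleep(0)
+    return f"{action}: ok"
+
+
+async def run_runbook(runbook: dict, handlers: dict) -> list[str]:
+    """Run steps in ascending order; stop and report the escalation on a timeout."""
+    results: list[str] = []
+    for step in sorted(runbook["steps"], key=lambda s: s["order"]):
+        handler = handlers[step["action"]]
+        try:
+            outcome = await asyncio.wait_for(
+                handler(step["action"]), timeout=step["timeout_seconds"]
+            )
+            results.append(outcome)
+        except asyncio.TimeoutError:
+            results.append(f"{step['action']}: timed out, escalate: {runbook['escalation']}")
+            break
+    return results
+
+
+handlers = {step["action"]: _noop for step in RUNBOOK["steps"]}
+results = asyncio.run(run_runbook(RUNBOOK, handlers))
+print(results)
+```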
+
+### Recovery Tiers
+
+| Tier | Time Window | Automation | Owner |
+|------|-------------|-----------|-------|
+| **T0** | 0 – 5 min | Fully automated | System |
+| **T1** | 5 – 15 min | Mostly automated | Incident Commander |
+| **T2** | 15 – 60 min | Semi-automated | Engineering team |
+| **T3** | 1 – 4 hours | Manual with tooling | Senior engineers |
+| **T4** | 1 – 2 weeks | Process-driven | Leadership + Engineering |
+
+---
+
+## Testing
+
+```bash
+# Install dev dependencies
+pip install -e ".[dev]"
+
+# Run all tests
+pytest tests/ -v
+
+# Run with coverage report
+pytest tests/ -v --cov=aref --cov=services
+
+# Run only unit tests
+pytest tests/unit/ -v
+
+# Run only integration tests
+pytest tests/integration/ -v
+```
+
+### Test Coverage
+
+| Suite | File | What's Tested |
+|-------|------|--------------|
+| Unit | `tests/unit/test_core.py` | Config, event bus, metrics, models |
+| Unit | `tests/unit/test_pillars.py` | Circuit breaker FSM, feature flags, recovery tiers |
+| Integration | `tests/integration/test_full_pipeline.py` | Complete incident lifecycle: detection → absorption → adaptation → recovery → evolution |
+
+### Linting & Type Checking
+
+```bash
+# Lint with Ruff
+ruff check .
+
+# Type check with mypy (strict mode)
+mypy aref/
+```
+
+---
+
+## Project Structure
+
+```
+AREF/
+├── aref/ # Core AREF framework package
+│ ├── core/ # Shared infrastructure
+│ │ ├── config.py # Pydantic-settings configuration (env-driven)
+│ │ ├── events.py # Async event bus (pub/sub + history)
+│ │ ├── metrics.py # Prometheus metrics + CRS formula engine
+│ │ ├── models.py # Domain models (Incident, ActionItem, …)
+│ │ └── logging.py # structlog setup
+│ ├── detection/ # Pillar I — Early warning & anomaly detection
+│ │ ├── engine.py # DetectionEngine orchestrator
+│ │ ├── threshold.py # Metric threshold detection
+│ │ ├── anomaly.py # Statistical anomaly detection (Z-score)
+│ │ ├── synthetic.py # Active HTTP synthetic probing
+│ │ └── sli_tracker.py # SLI/SLO tracking + error budget
+│ ├── absorption/ # Pillar II — Impact containment
+│ │ ├── circuit_breaker.py # 3-state circuit breaker + registry
+│ │ ├── bulkhead.py # Semaphore-based concurrency isolation
+│ │ ├── rate_limiter.py # Token bucket rate limiting
+│ │ ├── blast_radius.py # Dependency graph + blast radius analysis
+│ │ └── degradation.py # 4-tier graceful degradation
+│ ├── adaptation/ # Pillar III — Real-time reconfiguration
+│ │ ├── engine.py # AdaptationEngine orchestrator
+│ │ ├── decision_tree.py # 6-step adaptation decision tree
+│ │ ├── feature_flags.py # Feature flag manager
+│ │ ├── traffic_shifter.py # Weighted route redistribution
+│ │ └── scaler.py # Simulated horizontal auto-scaler
+│ ├── recovery/ # Pillar IV — Tiered service restoration
+│ │ ├── engine.py # RecoveryEngine (T0–T4 orchestration)
+│ │ └── runbooks.py # YAML runbook executor
+│ ├── evolution/ # Pillar V — Post-incident learning
+│ │ ├── engine.py # EvolutionEngine orchestrator
+│ │ ├── post_incident.py # Automated PIR generator
+│ │ ├── tracker.py # Action item tracker
+│ │ ├── patterns.py # Incident pattern matcher
+│ │ └── knowledge_base.py # Lessons-learned repository
+│ ├── maturity/ # Maturity assessment & CRS scoring
+│ │ └── model.py # MaturityAssessor (L1–L5, gap analysis)
+│ ├── dashboard/ # Control plane — FastAPI + SPA
+│ │ ├── app.py # Main FastAPI application
+│ │ ├── routes/ # API route handlers (status, pillars, metrics, chaos)
+│ │ ├── templates/index.html # Single-page application (6 tabs)
+│ │ └── static/ # CSS, JS, SVG assets
+│ └── cli/
+│ └── main.py # Click CLI (status, pillars, maturity, chaos, timeline, serve)
+│
+├── services/ # Five FastAPI microservices
+│ ├── base.py # Shared service factory (health, metrics, CORS, logging)
+│ ├── gateway/gateway_app.py # API Gateway (port 8000)
+│ ├── orders/orders_app.py # Orders service (port 8001)
+│ ├── payments/payments_app.py # Payments service (port 8002)
+│ ├── inventory/inventory_app.py # Inventory service (port 8003)
+│ └── notifications/notifications_app.py # Notifications service (port 8004)
+│
+├── chaos/ # Fault injection & experiments
+│ ├── injector.py # FaultInjector with auto-rollback
+│ └── experiments.py # 5 pre-defined chaos experiments
+│
+├── scripts/
+│ └── demo.py # Interactive end-to-end demo (Rich UI)
+│
+├── tests/
+│ ├── unit/
+│ │ ├── test_core.py # Core infrastructure tests
+│ │ └── test_pillars.py # Pillar unit tests
+│ └── integration/
+│ └── test_full_pipeline.py # Full incident lifecycle integration test
+│
+├── runbooks/
+│ └── payment_failure.yml # T0 & T1 runbooks for payment outage
+│
+├── config/
+│ ├── prometheus.yml # Prometheus scrape config
+│ └── grafana/provisioning/ # Grafana dashboard provisioning
+│
+├── docs/
+│ ├── assets/ # Diagrams and framework documents
+│ ├── blueprint.pdf # Architectural blueprint
+│ └── standards/
+│ └── DOCSTRING_STANDARDS.md # Mandatory file header format
+│
+├── docker-compose.yml # Full stack orchestration (8 services)
+├── Dockerfile # Python 3.11-slim container image
+├── pyproject.toml # Project metadata, dependencies, tool config
+├── .env.example # Environment variable template
+└── LICENSE # Apache 2.0
+```
+
+---
+
+## Contributing
+
+1. Fork the repository and create a feature branch
+2. Follow the [Docstring Standards](docs/standards/DOCSTRING_STANDARDS.md) — every file requires the mandatory metadata header
+3. Ensure your code passes linting and type checks:
+ ```bash
+ ruff check .
+ mypy aref/
+ ```
+4. Add or update tests to cover your changes:
+ ```bash
+ pytest tests/ -v --cov=aref --cov=services
+ ```
+5. Open a pull request with a clear description of your changes
+
+### Development Notes
+
+- **Singletons** are accessed via `get_X()` / `reset_X()` functions (e.g. `get_config()`, `get_event_bus()`, `get_metrics_engine()`)
+- **Engine pattern** — all pillar engines implement `async start()`, `async stop()`, and `get_status() -> dict`
+- **Service files** follow the `{service}_app.py` naming convention
+- **Event bus topics** use the `"category.event_type"` format
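+
+The singleton and engine conventions in the notes above can be sketched as follows. This is a generic illustration of the `get_X()` / `reset_X()` pattern and the `start()` / `stop()` / `get_status()` interface; `ExampleEngine`, `get_example_engine()`, and `reset_example_engine()` are hypothetical names, not part of the AREF codebase.
+
+```python
+import asyncio
+from typing import Optional
+
+
+class ExampleEngine:
+    """Hypothetical engine following the start/stop/get_status convention."""
+
+    def __init__(self) -> None:
+        self._running = False
+
+    async def start(self) -> None:
+        self._running = True
+
+    async def stop(self) -> None:
+        self._running = False
+
+    def get_status(self) -> dict:
+        return {"running": self._running}
+
+
+_engine: Optional[ExampleEngine] = None
+
+
+def get_example_engine() -> ExampleEngine:
+    """Create the shared instance lazily on first access."""
+    global _engine
+    if _engine is None:
+        _engine = ExampleEngine()
+    return _engine
+
+
+def reset_example_engine() -> None:
+    """Drop the shared instance, for example between tests."""
+    global _engine
+    _engine = None
+
+
+async def demo() -> dict:
+    engine = get_example_engine()
+    await engine.start()
+    return engine.get_status()
+
+
+print(asyncio.run(demo()))
+```
+
+A reset function of this kind keeps tests isolated, since each test can start from a fresh instance instead of inheriting state from the previous one.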
+
+---
+
+## License
+
+Copyright © 2025 Dennis 'dnoice' Smaltz — digiSpace Technical Studio
+
+Licensed under the [Apache License, Version 2.0](LICENSE).
+
+---
+
+
+
+*Built with care for the craft of resilience engineering.*
+
+