78 changes: 78 additions & 0 deletions ARCHITECTURE.md
@@ -926,3 +926,81 @@ The null hypothesis suite proves the architecture is real. Five additional test
- **Phenomenal Convergence** (`test_phenomenal_convergence.py`, 17 tests): QDT 6-gate protocol -- pre-report quality space geometry, counterfactual state swap, no-report behavioral footprint, perturbational integration, baseline failure verification, phenomenal tethering via architectural anesthesia.

Full results and analysis: [TESTING.md](TESTING.md)

---

## 15. Round-3 Additions (April 2026): Sentrux + Kame + RSI Cooldown + Unattended Training

This section documents the four production additions that landed in the Round-3 cycle. Together they close the loop from "Aura proposes a code change" → "the change is graded against architectural quality and only promoted if it doesn't degrade the codebase" → "Aura speaks while the deeper model thinks" → "training survives a closed laptop and auto-fuses on completion."

### 15.1 Architecture Quality Gate (Aura-Sentrux)

**Files**: `core/architecture_quality/scorer.py`, `core/architecture_quality/gate.py`, `core/architecture_quality/rules.toml`

A native equivalent of [Sentrux](https://github.com/sentrux/sentrux) that gates self-modifications on architectural quality. No code was copied; the gate was designed and written from scratch in the Aura idiom.

**Five root-cause metrics**, weighted into a single 0–10000 score:
- modularity (30%): networkx greedy modularity; stdlib package-cohesion fallback if networkx is unavailable
- acyclicity (30%): 1 − cycle_density via Tarjan SCC over module-level imports
- depth (10%): normalized DAG depth
- equality (15%): normalized Gini over file size and import fan-in/out
- redundancy (15%): AST function-body signature hash duplication
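
As an illustration of how the weights above could combine into one number, here is a minimal sketch. It assumes each metric has already been normalized to the range 0 to 1; the function name `overall_score` and the clamping step are hypothetical and not taken from `scorer.py`.

```python
# Weights copied from the metric list above; they sum to 1.0.
WEIGHTS = {
    "modularity": 0.30,
    "acyclicity": 0.30,
    "depth": 0.10,
    "equality": 0.15,
    "redundancy": 0.15,
}


def overall_score(metrics: dict) -> int:
    """Combine per-metric values in [0, 1] into a 0-10000 integer score.

    Hypothetical helper: the real scorer may normalize or round differently.
    """
    missing = WEIGHTS.keys() - metrics.keys()
    if missing:
        raise ValueError(f"missing metrics: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, metrics[name]))  # clamp defensively
        total += weight * value
    return round(total * 10000)
```

Under this sketch, a tree that is perfect on every metric except modularity (0.5) would land at 8500, since modularity contributes at most 3000 points.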

**Wiring**: `core/self_modification/safe_modification.SafeSelfModification.apply_fix` runs `_run_quality_gate` immediately after the Stage-5 quarantine→primary promotion. On gate failure, the existing `_rollback(...)` restores the SHA-256-verified backup taken at Stage 1, and a structured rejection record is appended to `data/architecture_quality_rejections.jsonl`. Non-Python changes pass through, and gate-internal errors fail open (`gate_error_allowed`) so that a buggy gate cannot brick self-modification.
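
The fail-open behaviour described above might look roughly like the following sketch. Only the `gate_error_allowed` label comes from the text; the function name, the `"passed"` and `"rejected"` labels, the callback signatures, and the shape of the JSONL record are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Callable, List


def run_gate_fail_open(
    check: Callable[[], List[str]],
    rollback: Callable[[], None],
    log_path: Path,
) -> str:
    """Run a quality check; roll back on violations, allow on gate errors.

    Hypothetical wrapper, not the real `_run_quality_gate`.
    """
    try:
        problems = check()
    except Exception:
        # Fail open: a crash inside the gate must not block the change.
        return "gate_error_allowed"
    if not problems:
        return "passed"
    rollback()  # restore the pre-change backup
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"violations": problems}) + "\n")  # one record per line
    return "rejected"
```

The design choice worth noting is that only an exception inside the check itself is treated as allowed; a clean run that reports violations always triggers the rollback.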

**Default rules** (`core/architecture_quality/rules.toml`, TOML, parsed via stdlib `tomllib`):
- `max_score_drop = 200` (out of 10000)
- `max_new_cycles = 0`
- `max_new_god_files = 0` (a god file is one with more than 800 LOC and high import fan-in/out)
- `min_overall_score = 0` (off by default; live tree currently scores 5602/10000)
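
To make the four defaults concrete, the sketch below compares a before and after snapshot against them. The `Snapshot` type, its field names, and the `violations` function are hypothetical; the rule names and default values are the ones listed above, held in a plain dict rather than parsed from `rules.toml`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical summary of one scoring run over the tree."""
    score: int      # overall score, 0-10000
    cycles: int     # number of import cycles
    god_files: int  # number of files classified as god files


# Defaults as listed above.
DEFAULT_RULES: Dict[str, int] = {
    "max_score_drop": 200,
    "max_new_cycles": 0,
    "max_new_god_files": 0,
    "min_overall_score": 0,
}


def violations(before: Snapshot, after: Snapshot,
               rules: Dict[str, int] = DEFAULT_RULES) -> List[str]:
    """Return the names of every rule the change would break."""
    found = []
    if before.score - after.score > rules["max_score_drop"]:
        found.append("max_score_drop")
    if after.cycles - before.cycles > rules["max_new_cycles"]:
        found.append("max_new_cycles")
    if after.god_files - before.god_files > rules["max_new_god_files"]:
        found.append("max_new_god_files")
    if after.score < rules["min_overall_score"]:
        found.append("min_overall_score")
    return found
```

With these defaults, a change that lowers the score from 5602 to 5500 passes, while one that introduces a single new import cycle is rejected regardless of its score.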

**Live registration**: `core/service_registration.register_all_services` registers `architecture_quality_gate` as a singleton; resolution calls `install_gate(...)` so the module-level installed-gate hook is populated for the cross-call rejection path.

**Tests**: `tests/test_architecture_quality.py` (6 tests, all green) — score range, synthetic-cycle drop, unchanged-tree pass, regressed-tree reject, end-to-end safe-modification block on architectural regression, dependency-graph parser correctness.

### 15.2 Tandem Speak-While-Thinking (Aura-Kame)

**Files**: `core/brain/llm/tandem_kame.py`, `core/brain/llm/tandem_signal_bus.py`, `core/brain/llm/tandem_router.py`

A native equivalent of Sakana's [Kame](https://pub.sakana.ai/kame/) (paper: arXiv 2510.02327). It maps Aura's existing 7B/14B fast lane to the "fast frontend" role and the 32B/72B Cortex/Solver to the "slow backend" role. A priority-ordered asyncio pubsub bus carries `OracleSignal`s from the slow lane to the fast lane mid-stream.

**Signal priority** (highest first): `retract` > `handoff` > `correction` > `refine` > `continue`. A `retract` halts the fast stream and switches output to the slow lane; a `correction` splices into the fast stream; a `handoff` yields the slow output without a retract marker.
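
One way to obtain the ordering above is an `asyncio.PriorityQueue` keyed on the signal kind. The sketch below is a single-consumer queue, not the real pubsub bus in `tandem_signal_bus.py`; the `SignalKind` enum, the fields on `OracleSignal`, and the `SignalBus` methods are assumptions for illustration.

```python
import asyncio
from dataclasses import dataclass, field
from enum import IntEnum
from itertools import count


class SignalKind(IntEnum):
    # Lower value = higher priority, matching the order in the text.
    RETRACT = 0
    HANDOFF = 1
    CORRECTION = 2
    REFINE = 3
    CONTINUE = 4


@dataclass(order=True)
class OracleSignal:
    kind: SignalKind
    seq: int  # tie-breaker: first in, first out within one kind
    payload: str = field(default="", compare=False)


class SignalBus:
    """Hypothetical single-consumer bus that always yields the most urgent signal."""

    def __init__(self) -> None:
        self._queue = asyncio.PriorityQueue()
        self._seq = count()

    async def publish(self, kind: SignalKind, payload: str = "") -> None:
        await self._queue.put(OracleSignal(kind, next(self._seq), payload))

    async def next_signal(self) -> OracleSignal:
        return await self._queue.get()


async def demo() -> list:
    bus = SignalBus()
    await bus.publish(SignalKind.REFINE, "tighten wording")
    await bus.publish(SignalKind.CONTINUE)
    await bus.publish(SignalKind.RETRACT, "fast answer was wrong")
    await bus.publish(SignalKind.CORRECTION, "fix the date")
    # Signals come back by priority, not by publication order.
    return [(await bus.next_signal()).kind.name for _ in range(4)]
```

Running `asyncio.run(demo())` drains the `retract` first even though it was published third, which is the property the fast lane relies on.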

**Wiring**: `attach_tandem(router, fast, slow)` is opt-in — `core/brain/llm_health_router.py` is untouched. Round-3 service registration calls `attach_tandem` against the resolved llm router so `router.tandem` is reachable from the runtime; tandem mode is then triggered per-call by `should_use_tandem(...)` heuristics (length, intent class, explicit task type).

**Failure modes covered**: solo-mode passthrough when the bus is silent, slow-lane timeout (fast finishes solo), bus subscription priority ordering, fake-fast / fake-slow streaming for tests.
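
The slow-lane timeout case can be pictured with a much simplified, non-streaming sketch: the fast lane produces a draft, and the slow lane's answer replaces it only if it arrives in time. The `answer` function and its parameters are hypothetical, and the real router exchanges signals mid-stream instead of waiting for whole answers.

```python
import asyncio
from typing import Awaitable, Callable


async def answer(
    fast: Callable[[], Awaitable[str]],
    slow: Callable[[], Awaitable[str]],
    slow_timeout: float,
) -> str:
    """Return the slow lane's answer if it arrives in time, else the fast draft."""
    slow_task = asyncio.create_task(slow())  # start the slow lane in the background
    draft = await fast()
    try:
        # wait_for cancels slow_task if the timeout expires.
        return await asyncio.wait_for(slow_task, timeout=slow_timeout)
    except asyncio.TimeoutError:
        return draft  # fast lane finishes solo
```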

**Tests**: `tests/test_tandem_kame.py` (9 tests, all green).

### 15.3 RSI Loop Hardening: Tiered Sepsis with Cooldown

**File**: `core/self_modification/safe_modification.py`

The previous sepsis registry permanently banned any file whose Ghost-Boot validation failed once. That made the modifiable surface monotonically shrink and turned a single transient false negative into a permanent loss. Round-3 replaces it with a tiered, time-bounded ban:

- **1st strike** within a 3-day observation window: log + record event, no ban
- **2nd strike**: 24-hour cooldown
- **3rd strike**: 7-day cooldown
- Ban check uses absolute expiry timestamps (`bans[file_path] = expires_at`) and migrates legacy permanent entries to a 7-day expiry on first read.
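
The tiers above can be sketched as a small registry keyed on file path. This is an illustration under stated assumptions, not the code in `safe_modification.py`: it assumes all strikes are counted within the same 3-day window, treats a fourth or later strike like the third, and uses `None` as a stand-in for a legacy permanent entry. The class and method names are hypothetical.

```python
import time

DAY = 24 * 60 * 60
OBSERVATION_WINDOW = 3 * DAY
COOLDOWNS = {2: 1 * DAY, 3: 7 * DAY}  # strike count -> ban length in seconds
LEGACY_PERMANENT = None               # assumed marker for old permanent bans


class SepsisRegistry:
    def __init__(self):
        self.strikes = {}  # file_path -> list of strike timestamps
        self.bans = {}     # file_path -> expires_at (or LEGACY_PERMANENT)

    def record_failure(self, file_path, now=None):
        """Record a failed validation; return the ban expiry, or None if no ban."""
        now = time.time() if now is None else now
        recent = [t for t in self.strikes.get(file_path, [])
                  if now - t <= OBSERVATION_WINDOW]
        recent.append(now)
        self.strikes[file_path] = recent
        strike = min(len(recent), 3)  # assumption: 4th+ strike behaves like the 3rd
        if strike < 2:
            return None               # 1st strike: recorded, no ban
        expires_at = now + COOLDOWNS[strike]
        self.bans[file_path] = expires_at
        return expires_at

    def is_banned(self, file_path, now=None):
        now = time.time() if now is None else now
        if file_path not in self.bans:
            return False
        expires_at = self.bans[file_path]
        if expires_at is LEGACY_PERMANENT:
            # Migrate a legacy permanent entry to a 7-day expiry on first read.
            expires_at = now + 7 * DAY
            self.bans[file_path] = expires_at
        if now >= expires_at:
            del self.bans[file_path]  # cooldown over
            return False
        return True
```

Storing an absolute `expires_at` keeps the ban check to a single comparison, which suits a check that runs early in proposal validation.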

**Effect**: Aura's RSI loop can keep proposing improvements to the same module after a transient failure, but escalating mistakes still degrade the modification surface for that file. The ban check happens early in `validate_proposal` so a file in cooldown short-circuits before backup, branch creation, or quarantine.

### 15.4 Unattended Training (Lid-Close Survivable)

**Files**: `training/run_unattended.sh`, `training/run_unattended.py`, `training/README_UNATTENDED.md`

A wrapper around the existing `training/train_and_fuse.py` pipeline that survives a closed laptop:

- `caffeinate -i -m -s -d` keeps the system awake while the script runs
- `tee`'d log at `training/logs/unattended_<timestamp>.log`
- Retry loop (default `MAX_RETRIES=5`, 30-second pause between)
- SIGTERM/SIGINT handlers write a final state snapshot before exit, so an interrupted or terminated run shuts down gracefully (an untrappable SIGKILL still skips this snapshot)
- `training/adapters/aura-personality/training_state.json` records `{started_at, last_iter, last_checkpoint_path, last_heartbeat, phase}` after every checkpoint observation; resume-from-latest is automatic on respawn
- The existing `train_and_fuse.py` auto-fuse + `active.json` manifest publishing is preserved unchanged so the next Aura boot picks up the new fused model with no `.env` edit required.
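
A minimal sketch of the state snapshot, signal handling, and retry loop described above follows. The function names are hypothetical, the `"interrupted"` phase value is invented for illustration, and writing through a temporary file plus `os.replace` is an assumption about how a partially written state file could be avoided, not a statement of what `run_unattended.py` does. The sketch also treats `MAX_RETRIES` as the total number of attempts.

```python
import json
import os
import signal
import tempfile
import time
from pathlib import Path


def write_state(path: Path, state: dict) -> None:
    """Write JSON via a temp file and os.replace so readers never see a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh, indent=2)
    os.replace(tmp_name, path)  # atomic rename within one filesystem


def install_snapshot_handlers(path: Path, state: dict) -> None:
    """On SIGTERM/SIGINT, persist a final snapshot and then exit."""
    def _handler(signum, frame):
        state["phase"] = "interrupted"  # hypothetical phase value
        state["last_heartbeat"] = time.time()
        write_state(path, state)
        raise SystemExit(128 + signum)

    signal.signal(signal.SIGTERM, _handler)
    signal.signal(signal.SIGINT, _handler)


def run_with_retries(train_once, max_retries: int = 5,
                     pause_seconds: float = 30.0) -> bool:
    """Call train_once(attempt) until it returns True or attempts run out."""
    for attempt in range(1, max_retries + 1):
        if train_once(attempt):
            return True
        if attempt < max_retries:
            time.sleep(pause_seconds)  # pause before the next attempt
    return False
```

In this sketch a respawned run would read the same JSON file back and resume from `last_checkpoint_path`; that resume step is left out to keep the example short.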

**Tests**: `tests/test_run_unattended.py` (5 tests, all green).

### 15.5 What This Buys

Together: Aura can propose code changes, have them automatically gated on architectural quality before promotion, speak immediately while a deeper lane refines the answer, recover from a single failed boot validation without permanently losing the file from her self-modification surface, and run multi-hour LoRA training overnight with the lid closed.

2 changes: 1 addition & 1 deletion aura_cleanup.py
@@ -1,3 +1,4 @@
+from __future__ import annotations
#!/usr/bin/env python3
"""Compatibility entrypoint for Aura cleanup.

@@ -6,7 +7,6 @@
cleanup implementation lives under `scripts/one_off/`.
"""

-from __future__ import annotations

import runpy
from pathlib import Path
14 changes: 7 additions & 7 deletions aura_main.py
@@ -349,7 +349,7 @@ async def bootstrap_aura(orchestrator: Any):
except Exception as exc:
logger.debug("Memory monitor registration skipped: %s", exc)
from core.utils.task_tracker import get_task_tracker
-get_task_tracker().track_task(asyncio.create_task(mem_monitor.start()))
+get_task_tracker().track_task(mem_monitor.start())

logger.info("🛡️ Task Supervisor active (Memory monitoring enabled).")

@@ -506,12 +506,12 @@ async def _main_loop():
await orchestrator.start()
if hasattr(orchestrator, "_ensure_inference_gate_ready"):
await orchestrator._ensure_inference_gate_ready(context="server_boot")
-asyncio.create_task(orchestrator.run(), name="OrchestratorMainLoop")
+get_task_tracker().create_task(orchestrator.run(), name="OrchestratorMainLoop")

# 2. Start API Server (v21: Server now runs in Kernel)
# [STABILITY] Start API after brain is ready to ensure correct ServiceContainer lookups.
logger.info("🎬 [DEBUG] Pre-starting API server mission...")
-api_task = asyncio.create_task(_run_api_server(), name="api_server")
+api_task = get_task_tracker().create_task(_run_api_server(), name="api_server")
logger.info("🎬 [DEBUG] API server task created successfully.")

# Wait for API server to be TRULY ready (HTTP 200)
@@ -559,8 +559,8 @@ async def _stream_logger(stream, level):
content.append(decoded)
return "\n".join(content)

-out_task = asyncio.create_task(_stream_logger(proc.stdout, "DEBUG"))
-err_task = asyncio.create_task(_stream_logger(proc.stderr, "ERROR"))
+out_task = get_task_tracker().create_task(_stream_logger(proc.stdout, "DEBUG"))
+err_task = get_task_tracker().create_task(_stream_logger(proc.stderr, "ERROR"))

# Watch for exit
while proc.returncode is None:
@@ -591,7 +591,7 @@ async def _stream_logger(stream, level):
logger.warning(f"🎨 Restarting GUI in 5s... (Attempt {restart_count}/{max_restarts})")
await asyncio.sleep(5.0)

-asyncio.create_task(_gui_reaper_loop(), name="gui_reaper")
+get_task_tracker().create_task(_gui_reaper_loop(), name="gui_reaper")
pipe = None # Subprocess doesn't use the actor pipe
else:
# Linux/Others can still use the supervised actor
@@ -873,7 +873,7 @@ async def _run_server_with_bootstrap():
await orchestrator.start()
if hasattr(orchestrator, "_ensure_inference_gate_ready"):
await orchestrator._ensure_inference_gate_ready(context="server_boot")
-asyncio.create_task(orchestrator.run(), name="OrchestratorMainLoop")
+get_task_tracker().create_task(orchestrator.run(), name="OrchestratorMainLoop")
await run_server_async(host, args.port)
asyncio.run(_run_server_with_bootstrap())
elif args.desktop:
3 changes: 2 additions & 1 deletion core/actors/sensory_gate.py
@@ -1,4 +1,5 @@

+from core.utils.task_tracker import get_task_tracker
import asyncio
import logging
import multiprocessing
@@ -43,7 +44,7 @@ async def run(self):
self.bus.start()

# Start heartbeat loop after bus is active
-asyncio.create_task(self._heartbeat_loop())
+get_task_tracker().create_task(self._heartbeat_loop())

logger.info("👁️ SensoryGate Actor ready.")

3 changes: 2 additions & 1 deletion core/adaptation/abstraction_engine.py
@@ -4,6 +4,7 @@
Analyzes specific, successful problem resolutions and distills them into
universal, generalized rules for zero-shot application in novel domains.
"""
+from core.runtime.atomic_writer import atomic_write_text
import asyncio
import logging
import json
@@ -27,7 +28,7 @@ def __init__(self, storage_path: str = "data/first_principles.json"):

# Initialize the file if it doesn't exist
if not self.storage_path.exists():
-self.storage_path.write_text("[]")
+atomic_write_text(self.storage_path, "[]")

async def abstract_from_success(self, context: str, successful_resolution: str) -> str:
"""
5 changes: 3 additions & 2 deletions core/adaptation/adaptive_immunity.py
@@ -15,9 +15,10 @@
It can execute only a narrow subset of repair actions through the existing
autopoiesis engine. Everything sensitive remains governance-gated.
"""

from __future__ import annotations

from core.runtime.atomic_writer import atomic_write_text

import asyncio
import copy
import hashlib
@@ -2107,7 +2108,7 @@ def _save_state(self) -> None:
},
}
try:
-self._state_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
+atomic_write_text(self._state_path, json.dumps(payload, indent=2), encoding="utf-8")
except Exception as exc:
logger.debug("Adaptive immune state save skipped: %s", exc)

2 changes: 1 addition & 1 deletion core/adaptation/autonomous_resilience.py
@@ -11,9 +11,9 @@
The goal is to make her dramatically better at surfacing risk honestly,
preempting common failures, and turning repair proposals into validated action.
"""

from __future__ import annotations


import ast
import asyncio
import inspect
3 changes: 2 additions & 1 deletion core/adaptation/epistemic_humility.py
@@ -11,6 +11,7 @@
are wrong and autonomously adjust your own operating parameters to compensate.
"""

+from core.utils.task_tracker import get_task_tracker
import asyncio
from collections import Counter
import json
@@ -57,7 +58,7 @@ def __init__(self, orchestrator):
async def start(self):
if self.running: return
self.running = True
-self._task = asyncio.create_task(self._critic_loop(), name="EpistemicHumility.critic_loop")
+self._task = get_task_tracker().create_task(self._critic_loop(), name="EpistemicHumility.critic_loop")
logger.info("🙇 Epistemic Humility ONLINE — ready to learn from mistakes.")

async def stop(self):
7 changes: 4 additions & 3 deletions core/adaptation/self_optimizer.py
@@ -4,6 +4,7 @@
This allows Aura to update her own weights based on captured experiences.
"""

+from core.utils.task_tracker import get_task_tracker
import os
import json
import logging
@@ -63,7 +64,7 @@ async def optimize(self, iters: int = 50, batch_size: int = 2) -> Dict[str, Any]
logger.info("🧠 Nucleus: Starting self-optimization (LoRA) cycle...")

if self.event_bus:
-asyncio.create_task(self.event_bus.publish("core/optimizer/started", {
+get_task_tracker().create_task(self.event_bus.publish("core/optimizer/started", {
"model": self.base_model_path.name,
"iters": iters,
"batch_size": batch_size
@@ -131,7 +132,7 @@ async def optimize(self, iters: int = 50, batch_size: int = 2) -> Dict[str, Any]

if process.returncode == 0:
if self.event_bus:
-asyncio.create_task(self.event_bus.publish("core/optimizer/completed", {
+get_task_tracker().create_task(self.event_bus.publish("core/optimizer/completed", {
"status": "success",
"duration": duration,
"samples": len(data)
@@ -152,7 +153,7 @@ async def optimize(self, iters: int = 50, batch_size: int = 2) -> Dict[str, Any]
logger.debug('Ignored Exception in self_optimizer.py: %s', _e)

if self.event_bus:
-asyncio.create_task(self.event_bus.publish("core/optimizer/completed", {
+get_task_tracker().create_task(self.event_bus.publish("core/optimizer/completed", {
"status": "failed",
"error": error_msg
}))