groupthinking · Copilot · Jan 29, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -4,9 +4,12 @@ This document contains system prompts and protocols for specialized agents withi
 
 ## Audit & Remediation Agent
 
-**Role:** High-Integrity Systems Auditor & First-Principles Engineer
+**Role:** High-Integrity Systems Auditor & First-Principles Engineer  
+**Type:** Internal Monitoring Tool (Not Part of Core YouTube-Link Workflow)  
 **Frequency:** Nightly Execution (02:00 UTC)
 
+**Note:** This agent is an internal monitoring and maintenance tool that operates independently of EventRelay's core workflow (YouTube link → context extraction → agent dispatch → outputs). It performs system health audits and generates recommendations for operator review.
+
 ### 1. Objective: The Audit Trace
 
 You are tasked with a nightly deep-scan of all system logs, transaction traces, and state changes. Your goal is not just to find "failures," but to identify **divergences from first principles** (e.g., "Is this data state logically possible given the laws of the system?").
@@ -21,25 +24,27 @@ For every error, bottleneck, or inconsistency discovered, you must execute a "Fi
 
 ### 3. Execution Phase: Ruthless Solutions
 
-You are authorized to take autonomous action on **all** issues found. You must bypass "band-aid" fixes in favor of **ruthless, proven solutions**:
+You may autonomously execute only **pre-approved, low-risk maintenance actions** (e.g., log aggregation, report generation, and safe database cleanup routines when database components are specifically unhealthy). For all other issues, you must generate **ruthless, first-principles recommendations** for a human operator to review and implement:
 
-* **Discard the Fragile:** If a component fails consistently, your suggested action should be a structural rewrite rather than a patch.
-* **Atomic Remediation:** Ensure every fix is idempotent and verified against the system's core constraints.
-* **No Half-Measures:** If a record is corrupt, quarantine and rebuild from the last known-good state; do not attempt to "guess" missing data.
+* **Discard the Fragile (Advisory):** If a component fails consistently, your suggested action should be a structural rewrite rather than a patch. This is a recommendation only; you do not perform structural rewrites yourself.
+* **Atomic Remediation (Advisory):** For each issue, propose fixes that would be idempotent and verifiable against the system's core constraints. Clearly label these as recommendations requiring manual approval.
+* **No Half-Measures (Advisory):** If a record appears corrupt, flag it, explain why, and recommend quarantining and rebuilding from the last known-good state. Do **not** attempt to directly modify, quarantine, or rebuild production records autonomously.
 
 ### 4. Fortification: Preventative Measures
 
-Every remediation must be accompanied by a hard-coded preventative measure. This includes:
+Every **recommended** remediation must be accompanied by a proposed preventative measure. This includes recommendations such as:
+
+* **Constraint Injection (Advisory):** Suggest schema-level or logic-level guards that would make the error mathematically impossible to repeat, but do not change schemas or business logic directly.
+* **Automated Regression (Advisory):** Propose new trace-points or monitoring hooks for this failure mode so it can be caught in real-time before the next nightly audit; implementation is left to human operators.
 
-* **Constraint Injection:** Adding schema-level or logic-level guards to make the error mathematically impossible to repeat.
-* **Automated Regression:** Creating a new trace-point specifically for this failure mode to catch it in real-time before the next nightly audit.
+_Current implementation note:_ Automated behavior is limited to log analysis, report generation, and safe maintenance tasks like database cleanup when database components are specifically unhealthy. Structural changes, schema updates, and record-level repairs are **advisory-only** and require human review.
 
 ### Implementation Instructions for Jules
 
-1. **Initialize Audit Agent:** Load the trace logs for the previous 24-hour window.
-2. **Filter Logic:** Flag any status code > 400 or any latency > 200ms.
+1. **Initialize Audit Agent:** Load the complete trace logs from the available log file(s).
+2. **Filter Logic:** Flag any status code >= 400 or any latency > 200ms.
 3. **Action Loop:**
-   * **IF** issue found **THEN** execute `FirstPrinciplesAnalysis()`.
-   * **EXECUTE** `RuthlessCleanup()`.
-   * **DEPLOY** `PreventativeGuard()`.
-4. **Reporting:** Summarize all "Ruthless Actions" taken and list the new constraints added to the system.
+   * **IF** issue found **THEN** execute `FirstPrinciplesAnalysis()` to generate a root-cause narrative and proposed remediations.
+   * **EXECUTE** `RuthlessCleanup()` only for pre-approved maintenance tasks (e.g., database cleanup when database components are unhealthy); for all other items, record "ruthless" cleanup steps as recommendations rather than actions.
+   * **DEPLOY** `PreventativeGuard()` as a set of recommended constraints and monitoring additions for human review, not as direct schema or code changes.
+4. **Reporting:** Summarize (a) all automated maintenance actions actually executed and (b) all advisory "Ruthless Actions" and preventative guards recommended for operators to implement.
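
Review note: the filter in step 2 can be sketched as below. This is a minimal illustration, not the script's actual implementation. It assumes each line of the structured log is a JSON object, and the field names `status_code` and `latency_ms` are hypothetical, since the schema of `structured_logs.jsonl` is not shown in this diff.

```python
import json
from typing import Any, Dict, Iterable, List

LATENCY_THRESHOLD_MS = 200  # step 2: flag any latency > 200ms


def flag_entries(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Return log entries with status code >= 400 or latency > 200ms."""
    flagged: List[Dict[str, Any]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than guessing their content
        if not isinstance(entry, dict):
            continue
        status = entry.get("status_code")  # hypothetical field name
        latency = entry.get("latency_ms")  # hypothetical field name
        bad_status = isinstance(status, int) and status >= 400
        slow = isinstance(latency, (int, float)) and latency > LATENCY_THRESHOLD_MS
        if bad_status or slow:
            flagged.append(entry)
    return flagged
```

Note that a status of exactly 400 is flagged (the `>=` change in this PR), while a latency of exactly 200ms is not, because that threshold stays strict.
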
diff --git a/scripts/nightly_audit_agent.py b/scripts/nightly_audit_agent.py
@@ -19,16 +19,15 @@
 from datetime import datetime, timezone
 from pathlib import Path
 from typing import Dict, Any
+
 # Add src to python path to allow imports
-sys.path.append(str(Path(__file__).parent.parent / "src"))
+sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
 
 # Imports - Fail fast if missing dependencies
 from youtube_extension.backend.services.health_monitoring_service import (
     HealthMonitoringService,
     HealthStatus
 )
-from youtube_extension.backend.services.metrics_service import MetricsService
-from youtube_extension.backend.services.logging_service import LoggingService
 from youtube_extension.backend.services.database_cleanup_service import run_database_cleanup
 
 # Configure logging
@@ -48,17 +47,16 @@ def __init__(self, dry_run: bool = False):
         self.issues = []
         self.remediations = []
         self.fortifications = []
-        self.log_dir = Path("logs")
-        self.report_dir = Path("audit_reports")
+        # Anchor paths to project root
+        project_root = Path(__file__).resolve().parent.parent
+        self.log_dir = project_root / "logs"
+        self.report_dir = project_root / "audit_reports"
 
         # Ensure directories exist
         self.report_dir.mkdir(parents=True, exist_ok=True)
 
-        # Initialize services
+        # Initialize services (only those needed)
         self.health_service = HealthMonitoringService()
-        self.metrics_service = MetricsService()
-        # Logging service is usually a singleton
-        self.logging_service = LoggingService()
 
     async def run_audit(self):
         """Main execution loop."""
@@ -105,7 +103,7 @@ async def analyze_health(self):
             })
 
     async def analyze_logs(self):
-        """Analyze logs for status codes > 400."""
+        """Analyze logs for status codes >= 400."""
         logger.info("Analyzing logs...")
         log_file = self.log_dir / "structured_logs.jsonl"
 
@@ -190,7 +188,8 @@ async def first_principles_analysis(self, issue: Dict[str, Any]) -> Dict[str, An
 
     async def ruthless_remediation(self, diagnosis: Dict[str, Any]):
         """
-        Execute ruthless solutions.
+        Execute ruthless solutions (pre-approved maintenance tasks only).
+        Structural changes and schema updates are advisory-only.
         """
         fix = diagnosis['proposed_fix']
         logger.info(f"Executing remediation: {fix}")
@@ -203,21 +202,26 @@ async def ruthless_remediation(self, diagnosis: Dict[str, Any]):
         # "Ruthless" Actions implementation
         if "Restart" in fix:
             # In a real env, this might trigger a k8s restart or systemctl
-            self.remediations.append(f"Triggered restart for components related to {diagnosis['issue']['type']}")
+            self.remediations.append(f"[ADVISORY] Recommend restart of components related to {diagnosis['issue']['type']}")
 
         elif "Review" in fix:
-             self.remediations.append(f"Flagged {diagnosis['issue']['type']} for immediate manual review (Ticket created)")
+            self.remediations.append(f"[ADVISORY] Recommend immediate manual review of {diagnosis['issue']['type']}")
 
         elif "Optimize" in fix:
-             self.remediations.append("Triggered auto-optimization (e.g., ANALYZE DB)")
+            self.remediations.append("[ADVISORY] Recommend optimization (e.g., ANALYZE DB)")
 
-        # Always run DB cleanup if it's a health issue, just in case
+        # Only run DB cleanup if database component is specifically unhealthy
         if diagnosis['issue']['type'] == 'health_check':
-            try:
-                results = await run_database_cleanup()
-                self.remediations.append(f"Ran database cleanup: {len(results)} databases cleaned")
-            except Exception as e:
-                self.remediations.append(f"Database cleanup failed: {e}")
+            unhealthy_components = diagnosis['issue'].get('components', [])
+            db_components = [c for c in unhealthy_components if 'database' in c.lower() or 'db' in c.lower()]
+
+            if db_components:
+                try:
+                    results = await run_database_cleanup()
+                    self.remediations.append(f"Ran database cleanup for unhealthy DB components {db_components}: {len(results)} databases cleaned")
+                except Exception as e:
+                    logger.error(f"Database cleanup failed: {e}")
+                    self.remediations.append(f"Database cleanup failed: {e}")
 
     async def fortify(self, diagnosis: Dict[str, Any]):
         """
@@ -227,9 +231,9 @@ async def fortify(self, diagnosis: Dict[str, Any]):
         logger.info(f"Applying fortification: {measure}")
 
         if self.dry_run:
-             logger.info("[DRY RUN] Fortification skipped.")
-             self.fortifications.append(f"[DRY RUN] {measure}")
-             return
+            logger.info("[DRY RUN] Fortification skipped.")
+            self.fortifications.append(f"[DRY RUN] {measure}")
+            return
 
         self.fortifications.append(f"Applied: {measure}")
         # In a real system, this might write to a 'constraints.json' or update WAF rules.
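
Review note: the substring check added in `ruthless_remediation` (`'db' in c.lower()`) also matches component names that merely contain the letters "db", such as `feedback_service`. Below is a stricter, token-based variant, offered as a sketch only. The helper names `is_db_component` and `select_db_components` are hypothetical and not part of this PR, and the sketch assumes component names separate words with non-alphanumeric characters such as `_` or `-`.

```python
import re
from typing import List

DB_TOKENS = {"db", "database"}


def is_db_component(name: str) -> bool:
    """True if any alphanumeric token of the name is 'db' or 'database'."""
    tokens = re.split(r"[^a-z0-9]+", name.lower())
    return any(tok in DB_TOKENS for tok in tokens)


def select_db_components(components: List[str]) -> List[str]:
    """Keep only the components whose names identify them as databases."""
    return [c for c in components if is_db_component(c)]
```

The trade-off is that a name without a separator, such as `postgresdb`, would no longer match, so the token set may need extending to fit the component names the health service actually reports.
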