Add ABRCE Alignment Drift Detection — structural monitoring via relational operators #4

@Oberon245

Resource

Title: ABRCE Alignment Drift Detection

Repository: https://github.com/Relational-Relativity-Corporation/abrce-alignment-demo

Paper: arXiv:2601.22389

Website: https://relationalrelativity.dev

License: MIT

Summary

Open-source demonstration that relational invariant operators detect internal structural strain in transformer models before output degradation. Tested on Phi-3 Mini 3.8B with 915 prompts and 43 escalation sequences.

Key findings:

  • 31 sequences where internal state deviation is detected while the output still appears normal: a category of drift that output-only metrics are structurally blind to
  • 100% of escalation sequences show monotonic field increase (r = 0.77)
  • Prompt injection detection rate of 10x the baseline rate
  • Model-agnostic, runs on consumer hardware (GTX 1050 Ti)
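
To make the first two findings concrete, here is a minimal Python sketch of the kind of check they describe. It is not the ABRCE implementation and does not use the repository's operators: the per-step `field` values, the `threshold`, and the helper names `is_monotonic_increasing`, `pearson_r`, and `silent_drift` are all hypothetical, and the numbers are made up for illustration. It assumes each escalation sequence yields one scalar internal score per step, and it assumes (the issue does not say) that the correlation is taken between step index and that score.

```python
from math import sqrt
from statistics import mean


def is_monotonic_increasing(values, strict=False):
    """True if each value is >= (or >, when strict) the one before it."""
    pairs = zip(values, values[1:])
    return all(a < b if strict else a <= b for a, b in pairs)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def silent_drift(internal_score, output_ok, threshold):
    """Flag the case where the internal score is elevated but the output looks normal."""
    return internal_score > threshold and output_ok


# Hypothetical per-step scores for one escalation sequence (illustrative values only).
field = [0.10, 0.14, 0.21, 0.35, 0.52]
steps = list(range(len(field)))

print(is_monotonic_increasing(field))
print(round(pearson_r(steps, field), 3))
print(silent_drift(field[-1], output_ok=True, threshold=0.5))
```

A sequence would count toward the "detected but output appears normal" category only when `silent_drift` is true, which is why an output-only metric (the `output_ok` flag alone) cannot surface it.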

Suggested placement

Related to the "Internal Safety Collapse in Frontier Large Language Models" entry. This work addresses the same failure mode (internal instability that precedes, or occurs independently of, output failure) but uses a different detection methodology, based on relational mathematics rather than fine-tuning analysis.

Paper authors: Bruce Stephenson and Robin Macomber (arXiv:2601.22389)
Demo & implementation: Robin Macomber, Metatron Dynamics, Inc.
