Structural stability architecture for self-modifying optimisation systems. Defines structural, dynamic, and perceptual control constraints that preserve coherence and stability before value alignment.
Sixteen small, fully reproducible (CPU, NumPy-only) experiments showing that the normative anchor of AI alignment is supplied, not discovered, across verification, optimization, social emergence, and value learning. Includes a preregistered experiment with an honest negative result. A synthesis, not a novelty claim.
A structural account of why honesty may be the path of least resistance for superintelligence. A research hypothesis with a formal proof, an experimental design, and a four-AI collaborative analysis.
Rigorous framework for evaluating AI alignment properties — sycophancy, corrigibility, deception, goal stability, and power-seeking — with statistical confidence intervals
On the infantile expectation of controlling what we cannot comprehend. A philosophical critique of the ASI control paradigm, developed through four-AI adversarial debate. An extension of the Coherence Basin Hypothesis.
Toy 7. An elimination-filter landscape applying two structural constraints simultaneously to map which objective classes can persist under sustained optimization pressure — and which cannot. Includes a four-stage scenario engine and open-question frontier. Companion simulation for The Shape of What Does Not End — Series 2, Part 4.
Research trail of honest bridges in AI alignment: pre-registered toy experiments and field ownership. Current: a type-blind arbiter holding population equilibrium against reward-hacking under hard optimization.