This file separates current evidence, historical negative controls, and research targets. Passing a small-scale gate does not imply the same result at larger scale.
Current north-star certificate:
| Claim | Evidence | Result |
|---|---|---|
| Matched general quality, two LayerCake seeds | results/northstar_mobile_certificate.json | 2.0446/2.0457 vs BPE 2.0492 BPB |
| Smaller core | same | 14.792M vs 14.844M parameters |
| Lower fixed-budget mean training time | same | 121.4 s vs 131.5 s |
| Faster batch-1 prefill | same | 2.96 ms vs 5.63 ms |
| Better exact cached-generation quality | same | 1.9953/1.9836 vs 2.0492 BPB |
| Faster one-thread generation | same | 2.91x/2.96x BPE |
| Exact migration into independent smaller host | same | max logit diff 0; PPL ratio 1.0 |
| Migrated domain beats transformer adapter | same | 1.4418/1.4436 vs 2.1101/2.0951 BPB |
Protocol:
- 8 MB fixed local general-text stream;
- 2 MB fixed local Python-source stream;
- held-out tails used consistently for evaluation;
- d_abi=64;
- fixed four-byte patches;
- continuous causal local decoder;
- deterministic causal ABI anchors;
- fixed canonical ABI-to-byte-logit brick head;
- predeclared maximum general-PPL ratio of 1.05.
Selected evidence:
| Claim | Evidence | Result |
|---|---|---|
| General byte-patch quality reaches BPE parity | results/research_gate_certificate.json | 2.4165 vs 2.4243 BPB |
| Compact core | same | 349,888 vs 629,376 BPE parameters |
| Faster base inference | results/final_inference_benchmark.json | 2.089M vs 1.458M bytes/s |
| Faster active-brick inference | same | 1.601M vs 1.458M bytes/s |
| Source domain adaptation | results/sparse_brick_continuous2028_r16_p2.json | PPL 213.70 -> 55.59 |
| Bounded cross-seed transfer | results/final_transfer_seed314.json | domain ratio 0.708; general 1.044 |
| Bounded cross-size transfer | results/final_transfer_large2718.json | domain ratio 0.484; general 1.049 |
| Bounded int8 transfer | results/final_transfer_seed314_int8.json | domain ratio 0.703; general 1.045 |
| Sparse activation | selected brick config | 8 installed, top-2 active |
| Exact same-PPL lossless mode | results/lossless_domain_small.json | PPL 2.8553 on both cores; ratio 1.0 |
| Exact same-PPL cross-size mode | results/lossless_domain_scale5m_to_2m.json | PPL 2.7143 on both cores; ratio 1.0 |
| Exact transfer through 15.45M tier | results/lossless_domain_scale15m_to_5m.json | PPL 2.7143; logits/generation identical |
| Compact int8 transfer artifact | results/lossless_domain_scale15m_to_5m_int8.json | 148,808 bytes; PPL 2.7165; ratio 1.0 |
| Filesystem-disjoint Python transfer | results/lossless_domain_external_python_int8.json | PPL 5.8296; 57.72% byte accuracy; ratio 1.0 |
| Mobile CPU domain win vs transformer adapter | results/mobile_domain_win_certificate.json | Better BPB across two adaptation seeds; 3.57x faster isolated training, 2.57x smaller artifact, 4.43x CPU throughput |
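The bounded-transfer rows above can be read against the predeclared general-PPL cap of 1.05 from the protocol. The following is a minimal sketch of such a check, not the repository's verifier: the function name, the dictionary layout, and the rule that a domain ratio below 1.0 counts as an improvement are assumptions made for illustration, as is applying the 1.05 cap to these particular rows.

```python
# Hypothetical gate check. Names and the "domain ratio < 1.0" rule are
# assumptions for illustration, not the project's actual verifier.
MAX_GENERAL_PPL_RATIO = 1.05  # predeclared cap stated in the protocol

# (domain ratio, general ratio) as reported in the table above.
REPORTED = {
    "cross-seed": (0.708, 1.044),
    "cross-size": (0.484, 1.049),
    "int8": (0.703, 1.045),
}


def passes_bounded_gate(domain_ratio, general_ratio,
                        max_general=MAX_GENERAL_PPL_RATIO):
    """Return True when the domain improves and general PPL stays under the cap."""
    return domain_ratio < 1.0 and general_ratio <= max_general


gate_results = {name: passes_bounded_gate(*ratios)
                for name, ratios in REPORTED.items()}

for name, ok in gate_results.items():
    print(f"{name}: {'PASS' if ok else 'FAIL'}")
```

Under these assumptions all three reported rows pass, while a general ratio of 1.06 would fail regardless of the domain gain.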
Reproduce the certificate:
    python scripts/verify_research_gates.py

Legacy LayerCake bricks copied exactly but did not generalize across independently trained cores. The failure was real and remains a useful negative control:
- tensor copy could have max_diff=0;
- the target core could still produce a different ABI distribution;
- target-side ABI decoding could assign a different meaning to the same delta;
- domain PPL could therefore regress catastrophically.
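The failure mode listed above can be illustrated with a toy example. This is a sketch under invented assumptions: the two-dimensional "ABI", the brick matrix, and both decoders are made up for the illustration and do not come from the LayerCake code. It shows that a bit-exact brick copy says nothing about what the target core feeds into the brick or how it decodes the result.

```python
# Toy illustration only: the 2-d "ABI", the brick, and both decoders are invented.

def matvec(matrix, vector):
    """Multiply a small dense matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


# The brick is a single weight matrix, copied value-for-value to the target.
source_brick = [[2.0, 0.0], [0.0, -1.0]]
target_brick = [row[:] for row in source_brick]
max_diff = max(abs(a - b)
               for src_row, tgt_row in zip(source_brick, target_brick)
               for a, b in zip(src_row, tgt_row))

# Independently trained cores: the same input lands on different ABI axes,
# and each core decodes ABI space to logits in its own way.
source_abi, target_abi = [1.0, 0.0], [0.0, 1.0]
source_decoder = [[1.0, 0.0], [0.0, 1.0]]
target_decoder = [[0.0, 1.0], [1.0, 0.0]]

source_logits = matvec(source_decoder, matvec(source_brick, source_abi))
target_logits = matvec(target_decoder, matvec(target_brick, target_abi))

# Even the identical delta is read differently by the target-side decoder.
same_delta = matvec(source_brick, source_abi)
forced_target_logits = matvec(target_decoder, same_delta)

print("max_diff:", max_diff)
print("source prediction:", argmax(source_logits))
print("target prediction:", argmax(target_logits))
print("target prediction with forced delta:", argmax(forced_target_logits))
```

The copy is exact, yet the two cores disagree on the predicted symbol, both end to end and when the same delta is forced through the target decoder.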
V2 adds two seed-independent contracts:
- Canonical input coordinates: deterministic byte-prefix anchors supervise every core.
- Canonical output semantics: brick deltas use a fixed ABI-to-byte-logit head.
It also fixes temporal alignment: byte state after a completed patch aligns with the context for the following patch. With these changes, unchanged bricks pass bounded cross-seed and cross-size tests locally.
These bounded transfers do not preserve absolute PPL. Strict evaluation measured target/source PPL ratios of 1.74 on the small pair and 2.08 on the 5.40M-to-2.19M pair. Router agreement was only 50-60%, and different base logits remained even when the same correction was forced.
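For readers unfamiliar with the metric, a target/source PPL ratio can be computed from per-byte negative log-likelihoods as sketched below. This is a generic illustration, assuming perplexity is measured per byte and losses are in nats; the function names and the loss values are made up and are not the project's evaluation code or data.

```python
import math


def perplexity(nll_nats):
    """Per-byte perplexity from per-byte negative log-likelihoods in nats."""
    return math.exp(sum(nll_nats) / len(nll_nats))


def bits_per_byte(nll_nats):
    """Mean loss converted from nats to bits; equals log2 of per-byte perplexity."""
    return sum(nll_nats) / len(nll_nats) / math.log(2)


def ppl_ratio(target_nll, source_nll):
    """Target/source perplexity ratio; 1.0 means the transfer preserved PPL."""
    return perplexity(target_nll) / perplexity(source_nll)


# Made-up losses for the same held-out bytes scored on two cores.
source_nll = [0.9, 1.1, 1.0, 1.0]
target_nll = [1.5, 1.6, 1.4, 1.7]

ratio = ppl_ratio(target_nll, source_nll)
print(f"source BPB: {bits_per_byte(source_nll):.4f}")
print(f"target/source PPL ratio: {ratio:.3f}")
```

A ratio above 1.0 means the target core is worse on the same bytes, which is the sense in which 1.74 and 2.08 fail a strict contract.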
The strict contract is implemented separately as a portable recurrent domain decoder driven by raw bytes and deterministic causal anchors. It owns the domain logits, so host-core size, seed, ABI width, and base predictions cannot change its output. This 148,736-parameter mode measured held-out Python PPL 2.71-2.86 and 72.6-73.8% top-1 byte accuracy with bit-exact logits, generation, and PPL ratio 1.0 through the 15.45M tier.
The correct conclusion is not that arbitrary neural networks now share semantics. It is that cores trained under this explicit canonical protocol can share a measured ABI.
| Level | Contract | Status |
|---|---|---|
| L0 | Exact tensor copy | Proven |
| L1 | Exact brick function on equal ABI inputs | Proven |
| L2 | Same-core token-generation identity | Proven on legacy path |
| L3 | Cross-size structural/function portability | Proven; bounded v2 end-to-end local PASS |
| L4 | Cross-seed bounded semantic transfer | Small-scale PASS |
| L5 | Quantized bounded transfer | Small-scale int8 PASS |
| L6 | Byte/byte-patch tokenizer-independent bounded transfer | Small-scale PASS |
| PX | Exact core-independent portable-domain transfer | PASS through 15.45M tier |
| L7 | Orchestrated bounded transfer | Not yet task-validated |
Exact definitions and thresholds are in RUBRIC.md.
The original tokenized model established:
- bit-exact domain tensor paste;
- exact domain-function output for equal ABI inputs;
- same-core generation identity;
- matched-parameter LM parity at the 48M class;
- domain adaptation with fewer trainable parameters than full fine-tuning.
Those results remain valid in their original scope. They do not substitute for the v2 cross-seed or tokenizer-free evidence.
| Historical claim | Result artifact |
|---|---|
| Exact structural paste | results/paste_proof.json |
| Same-core generation/function identity | tests/test_paste_lossless.py |
| 48M matched LM comparison | results/fair_comparison.json |
| Legacy domain adaptation | results/domain_paste_functional.json |
| Not claimed | Reason |
|---|---|
| Universal tokenizer-free superiority | One small local point estimate is insufficient. |
| 25M-1B scaling validation | Those runs have not completed. |
| Arbitrary pretrained-model compatibility | Cores must implement the canonical ABI contract. |
| Exact additive-brick semantic losslessness | Additive outputs still depend on host ABI states and logits. |
| Host-assisted exact PPL equivalence | Current exact mode is deliberately core-independent. |
| Autonomous coding competence | Free-running held-out completion quality is not sufficient. |
| Production mobile performance | Current CPU result is an x86 proxy, not phone/NPU evidence. |
| GPU generation superiority | LayerCake reaches 0.62x BPE in the selected RTX benchmark. |
| Native int8 speedup | Current L5 artifact is quantize/dequantize evidence. |
| Dynamic BLT-quality patching | Current selected model uses fixed patches. |
| Production readiness | Distributed training, serving, and security hardening remain. |
| L7 swarm equivalence | Packet/router interfaces exist, but task evidence is pending. |
| Rolling-training substrate proves scale dominance | It only proves rollbackable training mechanics and auditability. |
| Preview-guided smoke demo proves transformer dominance | It only proves preview artifact generation, syllabus compilation, tiny CPU training, baseline harness execution, and rollback mechanics. |
| Tier-1 smoke dominance proves scale dominance | It is a tiny deterministic methodology gate; Tier 1 local and Tier 2 serious runs are still required. |
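To make the "Native int8 speedup" row concrete: quantize/dequantize evidence means weights are stored as int8 but converted back to floats before any arithmetic, so storage shrinks while compute stays in floating point. The sketch below uses symmetric per-tensor quantization as an assumed scheme; the source does not state which scheme the L5 artifact uses, and the function names are hypothetical.

```python
# Assumed scheme: symmetric per-tensor int8 quantization. The actual L5
# artifact format is not described in this document.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with one shared scale."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid a zero scale
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats; all later math runs on these floats."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 values:", quantized)
print(f"max round-trip error: {max_error:.6f}")
```

Because the model still computes on `restored`, this path demonstrates bounded accuracy loss and a smaller artifact, not faster integer kernels.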
The 5.40M patch-core checkpoint confirms that size, throughput, sparse adaptation, cross-size/seed transfer, and int8 bounded transfer continue to work on a larger local model. It does not match the 6.90M BPE baseline's general BPB:
- byte-patch: 2.2612 BPB;
- BPE: 2.0747 BPB.
Those models are historical negative controls. The newer 14.79M architecture reaches 2.0446/2.0457 BPB against the 14.84M BPE baseline at 2.0492. Validation beyond this local 15M-class protocol remains open.
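The size of each gap follows directly from the figures above. The snippet below is simple arithmetic on the reported BPB values; the function name is illustrative, and a single point estimate per seed says nothing about variance.

```python
# Values copied from the text above; the helper name is illustrative.

def bpb_gap(candidate_bpb, baseline_bpb):
    """Positive means the candidate needs more bits per byte (worse)."""
    return candidate_bpb - baseline_bpb


# 5.40M patch core vs 6.90M BPE baseline (historical negative control).
historical_gap = bpb_gap(2.2612, 2.0747)

# 14.79M architecture, two seeds, vs 14.84M BPE baseline.
current_gaps = [bpb_gap(seed_bpb, 2.0492) for seed_bpb in (2.0446, 2.0457)]

print(f"historical gap: {historical_gap:+.4f} BPB")
for seed_index, gap in enumerate(current_gaps):
    print(f"current seed {seed_index}: {gap:+.4f} BPB")
```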
Use:
- bit-exact structural paste;
- canonical ABI;
- bounded cross-seed transfer;
- bounded cross-size transfer;
- small-scale tokenizer-free parity by point estimate;
- portable sparse domain brick;
- cost-adjusted domain adaptation;
- replicated 15M-class mobile CPU win under the frozen local protocol;
- exact portable-domain migration into an independent smaller LayerCake host;
- rollbackable model commits and semantic CI for future architecture experiments;
- preview-guided rolling-training smoke path with tiny LayerCake and tiny byte-transformer comparison harness;
- Tier 0/Tier 1 smoke dominance certificate against the closest matched-parameter tiny byte transformer;
- CPU/mobile-proxy source and receiver dominance under locked 15M/6.8M certificates.
Do not use without larger evidence:
- universal lossless semantic transfer;
- tokenizer-free dominance;
- frontier-model replacement;
- mobile model has the same intelligence as a server model.