Self-Developed Algorithm · Iterative Refinement Architecture · Spiral Memory Mechanism
Apex is a fully self-developed iterative refinement reasoning framework. Its core innovation lies in formalizing the Proposal → Review → Refinement multi-round self-correction pipeline as a trainable neural network loop. By leveraging self-critique and verification feedback, the model continuously refines its output during inference, overcoming the single-pass forward limitation of standard Transformer architectures.
Despite their scale, major LLMs (GPT, Claude, LLaMA, etc.) share fundamental architectural weaknesses:
| Issue | Description |
|---|---|
| Single-Pass Inference | Each token is processed once; no built-in self-correction |
| No Feedback Loop | Output does not feed back into input for revision |
| Error Accumulation | Early-token errors compound along the autoregressive chain |
| Linear Reasoning | Chain-of-Thought strategies are unidirectional, lacking divergent review |
Rather than scaling up parameters and data, Apex innovates at the reasoning architecture level:
```
Standard LLM: Input → [Transformer × N] → Output (single pass)

Apex: Input → Prelude Encoding → [Refinement Loop × K]:
  ├─ Proposal Head
  ├─ Review Head
  ├─ Refinement Head
  ├─ Scoring Verifier
  └─ Spiral Memory Update
  → Decode Output (self-corrective reasoning)
```
A differentiable self-correction loop that generates three distinct representations per step: Proposal (candidate generation), Review (defect detection), and Refinement (fusion and improvement). All three heads share the same Transformer backbone and project the same hidden state into different subspaces, forming a self-adversarial and convergent refinement loop. The number of loop steps K is configurable via `loop_steps` (default 3).
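The three-head projection can be sketched as follows. All module and parameter names here are illustrative assumptions, not the actual API of `apex/model/heads.py`:

```python
import torch
import torch.nn as nn

class ReasoningHeads(nn.Module):
    """Sketch: three heads projecting one shared hidden state into
    proposal, review, and refinement subspaces."""

    def __init__(self, dim: int):
        super().__init__()
        self.proposal = nn.Linear(dim, dim)   # candidate generation
        self.review = nn.Linear(dim, dim)     # defect detection
        # Refinement fuses the hidden state with the other two outputs.
        self.refine = nn.Linear(3 * dim, dim)

    def forward(self, h: torch.Tensor):
        p = self.proposal(h)
        c = self.review(h)
        r = self.refine(torch.cat([h, p, c], dim=-1))
        return p, c, r

h = torch.randn(2, 16, 512)          # (batch, seq, dim)
p, c, r = ReasoningHeads(512)(h)
print(p.shape, c.shape, r.shape)     # all torch.Size([2, 16, 512])
```

Because the heads read the same hidden state, the review representation can contradict the proposal, and the refinement head learns to reconcile the two.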
Unlike standard RNNs, which update state from only the previous step, Spiral Memory compresses five information streams (proposal, review, refinement, verifier score, and a task invariant) into a unified memory state:

    M_{t+1} = SpiralUpdate(M_t, P_t, C_t, R_t, s_t, I)

where M_t is the memory state at step t, P_t, C_t, and R_t are the proposal, review, and refinement representations, s_t is the verifier score, and I is the invariant component.
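A minimal sketch of such a five-stream gated update, assuming a GRU-style overwrite gate; all names are hypothetical, and the project's actual cell lives in `apex/model/memory.py` and `apex/model/recurrent.py`:

```python
import torch
import torch.nn as nn

class SpiralMemory(nn.Module):
    """Sketch: gate-controlled compression of five input streams
    (proposal, review, refinement, score, invariant) into one state."""

    def __init__(self, dim: int):
        super().__init__()
        # 5 feature streams of width dim, plus a scalar score per example.
        self.gate = nn.Linear(5 * dim + 1, dim)
        self.cand = nn.Linear(5 * dim + 1, dim)

    def forward(self, m, p, c, r, score, invariant):
        x = torch.cat([m, p, c, r, invariant, score], dim=-1)
        g = torch.sigmoid(self.gate(x))                  # how much to overwrite
        return g * torch.tanh(self.cand(x)) + (1 - g) * m

mem = SpiralMemory(64)
m = torch.zeros(2, 64)
new_m = mem(m, torch.randn(2, 64), torch.randn(2, 64),
            torch.randn(2, 64), torch.rand(2, 1), torch.randn(2, 64))
print(new_m.shape)  # torch.Size([2, 64])
```

The gate lets low-quality steps (low score in the concatenated input) leave the memory mostly unchanged, while high-information steps overwrite it.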
Verifier(refinement) → score ∈ [0, 1]

A trainable scoring network that assesses the quality of each round's refinement output. The score is fed back into the Spiral Memory, driving the model to improve low-scoring outputs in subsequent steps, forming a closed-loop feedback system for reasoning quality.
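One plausible shape for such a scoring network, sketched as a mean-pooled MLP with a sigmoid output; this is an assumption for illustration, not the actual interface of `apex/runtime/verifier.py`:

```python
import torch
import torch.nn as nn

class Verifier(nn.Module):
    """Sketch: pool the refinement representation over the sequence,
    then map it to a quality score in [0, 1]."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim // 2),
            nn.GELU(),
            nn.Linear(dim // 2, 1),
        )

    def forward(self, refinement: torch.Tensor) -> torch.Tensor:
        pooled = refinement.mean(dim=1)          # (batch, dim)
        return torch.sigmoid(self.net(pooled))   # (batch, 1), in [0, 1]

score = Verifier(512)(torch.randn(4, 16, 512))
print(score.shape)  # torch.Size([4, 1])
```

The sigmoid guarantees the [0, 1] range the text describes, so the score can be consumed directly by the memory update and the training loss.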
| Mechanism | Function |
|---|---|
| GQA (Grouped Query Attention) | Reduces KV cache footprint, accelerates inference |
| Sliding Window Attention | Local attention with O(wn) complexity |
| Full Attention (every 4 layers) | Global attention every 4th layer, balancing local efficiency with global context |
| Memory Cross-Attention | Cross-attends to Spiral Memory at every layer, injecting historical reasoning context |
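The GQA-plus-sliding-window combination in the table can be sketched with a grouped KV expansion and a windowed causal mask. This is a simplified illustration, not the repository's `apex/model/attention.py` implementation:

```python
import torch
import torch.nn.functional as F

def gqa_sliding_attention(q, k, v, window: int):
    """Sketch: grouped-query attention under a causal sliding-window mask.
    q: (B, Hq, T, d); k, v: (B, Hkv, T, d), with Hq a multiple of Hkv."""
    groups = q.shape[1] // k.shape[1]
    k = k.repeat_interleave(groups, dim=1)   # share each KV head across a group
    v = v.repeat_interleave(groups, dim=1)
    t = q.shape[2]
    i = torch.arange(t)
    # attend only to positions j with j <= i and i - j < window
    mask = (i[:, None] >= i[None, :]) & (i[:, None] - i[None, :] < window)
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

q = torch.randn(1, 8, 32, 64)    # 8 query heads
k = torch.randn(1, 2, 32, 64)    # 2 KV heads (4-way grouping, as in the defaults)
v = torch.randn(1, 2, 32, 64)
out = gqa_sliding_attention(q, k, v, window=8)
print(out.shape)  # torch.Size([1, 8, 32, 64])
```

With window size w, each query touches at most w keys, which is where the O(wn) complexity in the table comes from; the KV cache only stores the 2 KV heads rather than all 8.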
Replaces the standard FFN with SwiGLU for enhanced non-linear expressiveness:

    FFN_SwiGLU(x) = (SiLU(x W1) ⊙ x W3) W2
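A minimal SwiGLU feed-forward block, sketched with an arbitrary hidden width:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """Sketch of a SwiGLU feed-forward block: (SiLU(x W1) * x W3) W2."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden, bias=False)  # gate branch
        self.w3 = nn.Linear(dim, hidden, bias=False)  # value branch
        self.w2 = nn.Linear(hidden, dim, bias=False)  # output projection

    def forward(self, x):
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

y = SwiGLU(512, 2048)(torch.randn(2, 16, 512))
print(y.shape)  # torch.Size([2, 16, 512])
```

The SiLU-gated product replaces the single ReLU/GELU branch of a standard FFN, at the cost of one extra projection matrix.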
Self-implemented RoPE with zero external dependencies, supporting extrapolation to sequence lengths unseen during training.
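A self-contained sketch of the rotary rotation itself (in practice it is applied to attention queries and keys); the interface is illustrative, not that of `apex/model/rope.py`:

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Sketch of rotary position embedding: rotate each feature pair by a
    position-dependent angle. x: (batch, seq, dim) with even dim."""
    b, t, d = x.shape
    inv_freq = base ** (-torch.arange(0, d, 2).float() / d)   # (d/2,)
    angles = torch.arange(t).float()[:, None] * inv_freq      # (t, d/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin                      # 2-D rotation
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

x = torch.randn(1, 8, 64)
print(rope(x).shape)  # torch.Size([1, 8, 64])
```

Because positions enter only through the rotation angle, the same formula evaluates at any position index, which is what makes extrapolation beyond training lengths possible.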
```
                 ┌──────────────────────┐
Input ──────────►│   SimpleTokenizer    │
                 └──────────┬───────────┘
                            │
                 ┌──────────▼───────────┐
                 │   Embedding Table    │
                 └──────────┬───────────┘
                            │
          ┌─────────────────▼─────────────────┐
          │        Prelude Transformer        │
          │    (GQA + SWA + Cross-Attn) × N   │
          └─────────────────┬─────────────────┘
                            │
   ┌────────────────────────▼─────────────────────────┐
   │               Refinement Loop × K                │
   │                                                  │
   │      ┌────────────────────────────────────┐      │
   │      │   Shared Core Transformer Blocks   │      │
   │      │  (GQA + SWA/Full + Mem Cross-Attn) │      │
   │      └─────────────────┬──────────────────┘      │
   │                        │                         │
   │        ┌───────────────┼───────────────┐         │
   │        ▼               ▼               ▼         │
   │    Proposal         Review        Refinement     │
   │        │               │               │         │
   │        └───────────────┼───────────────┘         │
   │                        ▼                         │
   │               ┌─────────────────┐                │
   │               │    Verifier     │──► Score       │
   │               └────────┬────────┘                │
   │                        │                         │
   │       ┌────────────────▼─────────────────┐       │
   │       │       Spiral Memory Update       │       │
   │       │    (P⊕C⊕R⊕Score⊕Invariant)       │       │
   │       └────────────────┬─────────────────┘       │
   │                        │                         │
   │                        ▼ (next step)             │
   └────────────────────────┬─────────────────────────┘
                            │
                 ┌──────────▼───────────┐
                 │    Linear Decoder    │
                 └──────────┬───────────┘
                            │
                 ┌──────────▼───────────┐
                 │     Output Text      │
                 └──────────────────────┘
```
| Feature | Standard Transformer | Chain-of-Thought | Tree-of-Thought | Apex (Self-Dev) |
|---|---|---|---|---|
| Inference passes | 1 | 1 + prompt | N tree searches | K loop steps |
| Self-correction | ✗ | ✗ (prompt-dependent) | Partial | ✓ Built-in |
| Quality feedback | ✗ | ✗ | External | ✓ Built-in scorer |
| Cross-step memory | ✗ | ✗ | ✗ | ✓ Spiral Memory |
| Compute overhead | Baseline | +prompt length | +tree nodes | +K × shared layers |
| End-to-end differentiable | ✓ | ✗ (no special design) | ✗ | ✓ Full pipeline |
```
Apex/
├── README.md              # This file (English)
├── README-CN.md           # Chinese documentation
├── LICENSE                # MIT License
├── pyproject.toml         # Python package config
├── requirements.txt       # Dependencies (torch>=2.0.0)
├── configs/               # Hyperparameter configs
│
├── apex/                  # Core package
│   ├── model/             # Model components
│   │   ├── rope.py            # Rotary Position Embedding (self-dev)
│   │   ├── attention.py       # GQA + Sliding Window + SwiGLU (self-dev)
│   │   ├── transformer.py     # Shared Transformer Block (self-dev)
│   │   ├── memory.py          # Spiral Memory compression (core innovation)
│   │   ├── heads.py           # Three-way reasoning heads + decoder (core innovation)
│   │   ├── dialectic.py       # Refinement step + ApexMVP model (core innovation)
│   │   └── recurrent.py       # Gated recurrent state cell
│   │
│   ├── runtime/           # Runtime control
│   │   ├── loop.py            # Training / validation loop
│   │   ├── verifier.py        # Scoring verifier interface (core innovation)
│   │   ├── scheduler.py       # Loop-step / LR scheduler (self-dev)
│   │   └── controller.py      # Inference controller
│   │
│   ├── data/              # Data pipeline
│   │   ├── tokenizer.py       # Character-level tokenizer (self-dev, zero deps)
│   │   ├── dataset.py         # Dataset loaders
│   │   └── preprocess.py      # Preprocessing utilities
│   │
│   ├── train/             # Training system
│   │   ├── trainer.py         # Trainer
│   │   ├── losses.py          # Combined loss (verification + consistency)
│   │   └── optim.py           # Optimizer factory
│   │
│   └── utils/             # Utility functions
│
├── scripts/               # Run scripts
│   ├── train.sh           # Training entrypoint
│   ├── eval.sh            # Evaluation entrypoint
│   └── benchmark.sh       # Performance benchmark
│
├── examples/              # Usage examples
│   ├── code_repair.py     # Code repair example
│   ├── math_reasoning.py  # Math reasoning example
│   └── verifier_loop.py   # Verifier loop analysis
│
├── docs/                  # Detailed docs
│   ├── architecture.md    # Architecture design doc
│   ├── runtime.md         # Runtime mechanics
│   └── training.md        # Training guide
│
├── benchmarks/            # Evaluation benchmarks
├── experiments/           # Experiment configs
├── checkpoints/           # Model checkpoints
├── outputs/               # Output directory
└── datasets/              # Dataset directory
```
- Python >= 3.10
- PyTorch >= 2.0.0
```bash
git clone <repo-url>
cd Apex
pip install -r requirements.txt
```

```bash
# Code repair
python examples/code_repair.py

# Math reasoning
python examples/math_reasoning.py

# Detailed verifier loop analysis
python examples/verifier_loop.py
```

```bash
bash scripts/train.sh
```

Or in Python:
```python
from apex import ApexMVP
from apex.data import make_toy_dataset
from apex.train import Trainer

model = ApexMVP(
    vocab_size=32000,
    dim=512,
    prelude_layers=2,
    shared_layers=4,
    num_heads=8,
    num_kv_heads=2,
    window_size=128,
    loop_steps=3,
)

dataset = make_toy_dataset()
trainer = Trainer(model, device="cuda", lr=1e-4)
trainer.fit(dataset, epochs=50)
```

```python
from apex import ApexMVP
from apex.runtime import RuntimeController

model = ApexMVP(dim=512, loop_steps=3)
controller = RuntimeController(model, device="cuda")

result, scores, history = controller.run("Your question...")
print(f"Result: {result}")
print(f"Verification scores: {[round(s.item(), 4) for s in scores]}")
```

Given input text x, inference proceeds in four steps:
Step 1: Tokenization & Embedding

    h_0 = Embed(Tokenize(x))

Step 2: Prelude Encoding

    h = Prelude(h_0)

Step 3: Iterative Refinement Loop (repeated K times)

    P_t = ProposalHead(h_t),  C_t = ReviewHead(h_t),  R_t = RefinementHead(h_t, P_t, C_t)
    s_t = Verifier(R_t)
    M_{t+1} = SpiralUpdate(M_t, P_t, C_t, R_t, s_t, I)

where h_t is the hidden state at loop step t, P_t, C_t, and R_t are the proposal, review, and refinement representations, s_t ∈ [0, 1] is the verifier score, M_t is the spiral memory state, and I is the invariant component carried across steps.

Step 4: Final Decoding

The final hidden state is fused with the spiral memory via residual addition before the linear decoder generates the output.
Apex uses a three-component loss:

| Term | Formula | Purpose |
|---|---|---|
| Cross-entropy | CrossEntropy(logits, targets) | Standard next-token prediction |
| Verification | - | Encourages high verifier scores on refinements |
| Consistency | MSE(Proposal, Refinement) + MSE(Review, Refinement) | Keeps representations semantically consistent |
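The three terms can be combined as in this sketch; the weights `w_verify` and `w_consist` and the exact form of the verification term are assumptions, not the contents of `apex/train/losses.py`:

```python
import torch
import torch.nn.functional as F

def combined_loss(logits, targets, proposal, review, refinement, scores,
                  w_verify: float = 0.1, w_consist: float = 0.1):
    """Sketch of a three-term objective: next-token cross-entropy,
    a verifier-score term, and head-consistency MSE."""
    ce = F.cross_entropy(logits.reshape(-1, logits.shape[-1]),
                         targets.reshape(-1))
    verify = (1.0 - scores).mean()                   # push scores toward 1
    consist = (F.mse_loss(proposal, refinement)
               + F.mse_loss(review, refinement))
    return ce + w_verify * verify + w_consist * consist

loss = combined_loss(torch.randn(2, 8, 100), torch.randint(0, 100, (2, 8)),
                     torch.randn(2, 8, 64), torch.randn(2, 8, 64),
                     torch.randn(2, 8, 64), torch.rand(2, 1))
print(loss.item() > 0)  # True
```

Since the verifier is itself a network inside the loop, all three terms backpropagate through the refinement steps, which is what the "end-to-end differentiable" row in the comparison table refers to.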
| Parameter | Default | Description |
|---|---|---|
dim |
512 | Hidden dimension |
prelude_layers |
2 | Number of prelude Transformer layers |
shared_layers |
4 | Number of shared core layers |
num_heads |
8 | Number of query attention heads |
num_kv_heads |
2 | Number of KV heads (GQA groups) |
window_size |
128 | Sliding window size |
loop_steps |
3 | Number of refinement loop steps |
vocab_size |
32000 | Vocabulary size |
| Domain | Apex Advantage |
|---|---|
| Code Repair | Multi-round self-review detects and fixes defects |
| Math Reasoning | Verifier scores intermediate conclusions, selects correct paths |
| Logical Reasoning | Review head identifies inconsistencies in reasoning chains |
| Text Quality | Iteratively corrects grammar, logic, and style |
| Multi-Step Planning | Spiral Memory stores intermediate planning states |
```bibtex
@misc{apex2025,
  title={Apex: Self-Refining Reasoning via Iterative Refinement Loops and Spiral Memory},
  author={Apex Contributors},
  year={2025},
  note={Self-developed innovative algorithm},
}
```

- MVP core architecture (iterative refinement loop + spiral memory + scoring verifier)
- GQA + sliding window attention + SwiGLU activation
- Complete train / eval / inference pipeline
- Real dataset training (CodeNet, GSM8K)
- Dynamic loop-step scheduling
- Multi-task fine-tuning support
- Distributed training (FSDP)
- ONNX / TensorRT inference acceleration
- Open-weight release
MIT License. See LICENSE.