within provides high-performance solvers for projecting out high-dimensional fixed effects from regression problems.
By the Frisch-Waugh-Lovell theorem, estimating a regression of the form y = Xβ + Dα + ε reduces to a sequence of least-squares projections, one for y and one for each column of X, followed by a cheap regression fit on the resulting residuals. The projection step of solving the normal equations D'Dx = D'z is the computational bottleneck, which is the problem within is designed to solve.
within's solvers are tailored to the structure of fixed effects problems, which can be represented as a graph (as first noted by Correia, 2016). Concretely, within uses modified LSMR with a domain decomposition (Schwarz) preconditioner, backed by approximate Cholesky local solvers (Gao et al, 2025).
You can install Python bindings from PyPi by running
pip install within_pywithin's main user-facing function is solve. Provide a 2-D uint32 array of category codes (one column per fixed-effect factor) and a response vector y. The solver finds x in the normal equations D'D x = D'y, where D is the sparse categorical design matrix.
import numpy as np
from within import solve, solve_batch, LsmrOptions
np.random.seed(1)
n = 100_000
fe = np.asfortranarray(np.column_stack([
np.random.randint(0, 500, n).astype(np.uint32),
np.random.randint(0, 200, n).astype(np.uint32),
]))
y = np.random.randn(n)
# Default: additive Schwarz + LSMR
result = solve(fe, y)
# Custom tolerance / iteration cap
result = solve(fe, y, options=LsmrOptions(tol=1e-10, maxiter=2000))
# Weighted solve
result = solve(fe, y, weights=np.ones(n))beta_true = np.array([1.0, -2.0, 0.5])
X = np.random.randn(n, 3)
y = X @ beta_true + np.random.randn(n)
result = solve_batch(fe, np.column_stack([y, X]))
y_tilde, X_tilde = result.demeaned[:, 0], result.demeaned[:, 1:]
beta_hat = np.linalg.lstsq(X_tilde, y_tilde, rcond=None)[0]
print(np.round(beta_hat, 4)) # [ 0.9982 -2.006 0.5005]| Function | Description |
|---|---|
solve(categories, y, options?, weights?, preconditioner?) |
Solve a single right-hand side. Returns SolveResult. |
solve_batch(categories, Y, options?, weights?, preconditioner?) |
Solve multiple RHS vectors in parallel. Y has shape (n_obs, k). Returns BatchSolveResult. |
categories is a 2-D uint32 array of shape (n_obs, n_factors). A UserWarning is emitted when a C-contiguous array is passed — use np.asfortranarray(categories) for best performance.
For repeated solves with the same design matrix, Solver builds the preconditioner once and reuses it.
from within import Solver
solver = Solver(fe)
r = solver.solve(y) # reuses preconditioner
r = solver.solve_batch(np.column_stack([y, X]))
precond = solver.preconditioner # picklable property
solver2 = Solver(fe, preconditioner=precond) # skip re-factorization| Property / Method | Description |
|---|---|
Solver(categories, weights?, preconditioner?) |
Build solver. Factorizes the preconditioner at construction. |
.solve(y, options?) |
Solve a single RHS with the given LSMR tuning. Returns SolveResult. |
.solve_batch(Y, options?) |
Solve multiple RHS columns in parallel. Returns BatchSolveResult. |
.preconditioner |
Return the built Preconditioner (picklable), or None. Reuse via Solver(fe, preconditioner=p). |
| Class | Description |
|---|---|
LsmrOptions(tol=1e-8, maxiter=1000, local_size=None) |
Modified LSMR. local_size enables windowed reorthogonalization. |
The preconditioner argument accepts any of:
| Form | Meaning |
|---|---|
None (default) |
Library default — Additive Schwarz with sensible defaults. |
PreconditionerConfig.Off |
Explicit identity — solve unpreconditioned. |
PreconditionerConfig.Additive |
Additive Schwarz shortcut, equivalent to None. |
AdditiveSchwarz(local_solver?, reduction?) |
Tuned Schwarz config — import from within.config. |
Preconditioner instance |
Reuse a previously-built preconditioner across solvers. |
| Class | Description |
|---|---|
LocalSolverConfig(approx_chol?, approx_schur?, dense_threshold=24) |
Schur reduction + approximate Cholesky. Omit approx_schur for the library-default approximate variant; pass approx_schur=None to request an exact Schur (slower, used for validation). |
ApproxCholConfig(seed=0, split=1) |
Approximate Cholesky parameters. |
ApproxSchurConfig(seed=0, split=1) |
Approximate Schur complement sampling parameters. |
ReductionStrategy enum |
Auto (default), AtomicScatter, ParallelReduction. |
SolveResult: x (coefficients), demeaned (residuals), converged, iterations, residual, time_total, time_setup, time_solve.
BatchSolveResult: Same fields, with converged, iterations, residual, and time_solve as lists (one entry per RHS).
use ndarray::Array2;
use within::{solve, LsmrOptions, PreconditionerConfig};
use within::config::{LocalSolverConfig, ReductionStrategy};
let categories = /* Array2<u32> of shape (n_obs, n_factors) */;
let y: &[f64] = /* response vector */;
// Default: LSMR + additive Schwarz (None → library default)
let r = solve(categories.view(), &y, None, &LsmrOptions::default(), None)?;
assert!(r.converged);
// Tighter tolerance with an explicit additive preconditioner
let lsmr = LsmrOptions { tol: 1e-10, ..LsmrOptions::default() };
let precond = PreconditionerConfig::Additive {
local_solver: LocalSolverConfig::default(),
reduction: ReductionStrategy::default(),
};
let r = solve(categories.view(), &y, None, &lsmr, Some(&precond))?;Persistent solver — build once, solve many:
use within::Solver;
let solver = Solver::new(categories.view(), None, None)?;
let r1 = solver.solve(&y, &LsmrOptions::default())?;
let r2 = solver.solve(&another_y, &LsmrOptions::default())?; // reuses preconditionerTwo-channel preconditioner signaling: Option<&PreconditionerConfig> where
None is the library default and Some(PreconditionerConfig::Off) is the
explicit identity preconditioner.
| Type | Variants / Fields |
|---|---|
LsmrOptions |
{ tol: f64, maxiter: usize, local_size: Option<usize> } |
PreconditionerConfig |
Off | Additive { local_solver: LocalSolverConfig, reduction: ReductionStrategy } (#[non_exhaustive]) |
LocalSolverConfig |
{ approx_chol, approx_schur, dense_threshold } |
Preconditioner |
Opaque built handle — reuse via Solver::new(.., precond) (owned or &) |
| Module | Visibility | Key types |
|---|---|---|
within::config |
public | LsmrOptions, PreconditionerConfig, LocalSolverConfig, ApproxCholConfig, ApproxSchurConfig, ReductionStrategy |
within::observation |
public | Store trait, FactorMajorStore, ArrayStore, FactorMeta |
within::error |
public | WithinError, BuildError, SolveError |
domain / operator / solver / orchestrate |
pub(crate) |
implementation layers — public items are re-exported at the crate root |
| Feature | Default | Effect |
|---|---|---|
ndarray |
yes | Enables from_array constructors for ndarray::ArrayView2 interop. |
crates/
schwarz-precond/ Generic domain decomposition library (traits, solvers, Schwarz preconditioners)
within/ Core fixed-effects solver (observation stores, domains, operators, orchestration)
within-py/ PyO3 bridge (cdylib → within._within)
python/within/ Python package re-exporting the Rust extension
benchmarks/ Python benchmark framework
Uses pixi as the task runner.
pixi run develop # Build Rust extension (release mode)
pixi run test # Rebuild + pytest
cargo test --workspace # Rust tests only
cargo bench -p within # Criterion benchmarks
pixi run bench run all # Python benchmarksRust changes require rebuilding before running Python code (pixi run develop).
MIT
- Correia, Sergio. "A feasible estimator for linear models with multi-way fixed effects." Preprint at http://scorreia.com/research/hdfe.pdf (2016).
- Gao, Y., Kyng, R. & Spielman, D. A. (2025). AC(k): Robust Solution of Laplacian Equations by Randomized Approximate Cholesky Factorization. SIAM Journal on Scientific Computing.
- Toselli & Widlund (2005). Domain Decomposition Methods — Algorithms and Theory. Springer.
- Xu, J. (1992). Iterative Methods by Space Decomposition and Subspace Correction. SIAM Review, 34(4), 581--613.