Architecture — fect

Generated by scriber for run REQ-20260401-104739 on 2026-04-01.

Overview

fect is an R package for estimating causal effects in panel data with counterfactual imputation methods (Fixed Effects Counterfactual Estimators). It targets causal panel analysis with binary treatments under the parallel trends assumption and supports treatment switching and limited carryover effects. The core abstraction is counterfactual imputation: impute the missing potential outcomes Y(0) for treated units using control units, then compute the Average Treatment Effect on the Treated (ATT) as the gap between observed and imputed outcomes.

The package is an R/C++ hybrid that uses Rcpp and RcppArmadillo for numerically intensive linear algebra (SVD, EM iterations, matrix factorization). Key external dependencies are fixest (initial FE regression), ggplot2 (visualization), doParallel/doFuture/future.apply (parallel bootstrap), MASS (generalized inverse), and mvtnorm (multivariate normal draws).

Estimation methods include FE (fixed effects), IFE (interactive fixed effects / factor model), MC (matrix completion via nuclear norm regularization), CFE (complex fixed effects with structured covariates), and wrappers for modern DID estimators.

Version: 2.2.0. References: Liu, Wang, and Xu (2024); Chiu et al. (2025).
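
To make the imputation idea concrete, here is a minimal Python sketch of the additive FE case. It is an illustration only, not the package's R/C++ code: the function names `impute_fe_counterfactual` and `att`, the nested-list layout, and the toy panel are all assumptions made for this example. Treated cells are treated as missing and filled by a simple EM-style loop, and the ATT is the mean gap between observed and imputed outcomes.

```python
def impute_fe_counterfactual(Y, D, tol=1e-10, max_iter=1000):
    """Impute Y(0) with an additive two-way FE model fitted on untreated cells.

    Y and D are T x N nested lists (rows = periods, columns = units);
    D[t][i] == 1 marks a treated observation. Illustrative sketch only.
    """
    T, N = len(Y), len(Y[0])
    untreated = [Y[t][i] for t in range(T) for i in range(N) if D[t][i] == 0]
    start = sum(untreated) / len(untreated)
    fit = [[start] * N for _ in range(T)]
    for _ in range(max_iter):
        # E-step: keep observed untreated outcomes, plug in the current fit elsewhere.
        W = [[Y[t][i] if D[t][i] == 0 else fit[t][i] for i in range(N)]
             for t in range(T)]
        # M-step: grand mean plus unit and time effects of the completed matrix.
        mu = sum(sum(row) for row in W) / (T * N)
        alpha = [sum(W[t][i] for t in range(T)) / T - mu for i in range(N)]
        xi = [sum(W[t]) / N - mu for t in range(T)]
        new_fit = [[mu + alpha[i] + xi[t] for i in range(N)] for t in range(T)]
        dif = max(abs(new_fit[t][i] - fit[t][i])
                  for t in range(T) for i in range(N))
        fit = new_fit
        if dif <= tol:
            break
    return fit


def att(Y, D, Y_ct):
    """Average gap between observed and imputed outcomes over treated cells."""
    gaps = [Y[t][i] - Y_ct[t][i]
            for t in range(len(Y)) for i in range(len(Y[0])) if D[t][i] == 1]
    return sum(gaps) / len(gaps)


# Toy panel: 4 periods x 3 units with exactly additive outcomes.
# Unit 0 is treated in the last two periods with a constant effect of 2.0.
unit_fe = [1.0, 2.0, 3.0]
time_fe = [0.0, 0.5, 1.0, 1.5]
D = [[1 if (i == 0 and t >= 2) else 0 for i in range(3)] for t in range(4)]
Y = [[unit_fe[i] + time_fe[t] + 2.0 * D[t][i] for i in range(3)]
     for t in range(4)]

Y_ct = impute_fe_counterfactual(Y, D)
print(round(att(Y, D, Y_ct), 6))
```

Because the toy outcomes are exactly additive, the imputed counterfactuals recover the untreated values and the estimated ATT matches the built-in effect up to the iteration tolerance. IFE, MC, and CFE replace the additive model with richer ones but keep the same impute-then-difference structure.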

Module Structure

%%{init: {'theme': 'neutral'}}%%
graph TD
    subgraph API["API Layer"]
        A1["default.R — fect() entry"]
        A2["interFE.R — interFE()"]
        A3["did_wrapper.R — DID wraps"]
        A4["fect_mspe.R — MSPE comp"]
    end

    subgraph Est["Estimation Layer"]
        E1["fe.R — fect_fe() IFE"]
        E2["mc.R — fect_mc() MC"]
        E3["cfe.R — fect_cfe() CFE"]
        E4["fect_nevertreated.R"]
    end

    subgraph CV["Cross-Validation"]
        V1["cv.R — fect_cv()"]
        V2["cv_binary.R — binary CV"]
    end

    subgraph Inf["Inference Layer"]
        I1["boot.R — fect_boot()"]
    end

    subgraph Diag["Diagnostics & Sensitivity"]
        D1["diagtest.R — pre-trend"]
        D2["fittest.R — fitness test"]
        D3["fect_sens.R — sensitivity"]
        D4["fect_iden.R — identification"]
    end

    subgraph Viz["Visualization Layer"]
        P1["plot.R — plot.fect()"]
        P2["esplot.R — esplot()"]
    end

    subgraph Cpp["C++ Core (RcppArmadillo)"]
        C1["ife.cpp / ife_sub.cpp"]
        C2["mc.cpp"]
        C3["cfe.cpp / cfe_sub.cpp"]
        C4["fe_sub.cpp — shared utils"]
        C5["auxiliary.cpp — EM helpers"]
        C6["binary_*.cpp — probit"]
    end

    subgraph Util["Support & Data"]
        U1["support.R — data helpers"]
        U2["polynomial.R — trends"]
        U3["effect.R / cumu.R — ATT"]
        U4["score.R / permutation.R"]
        U5["getcohort.R — cohorts"]
        U6["print.R — S3 print"]
    end

    A1 --> E1
    A1 --> E2
    A1 --> E3
    A1 --> E4
    A1 --> V1
    A1 --> I1
    A1 --> D1
    A1 --> P1
    V1 --> E1
    V1 --> E2
    I1 --> E1
    I1 --> E2
    I1 --> E3
    E1 --> C1
    E2 --> C2
    E3 --> C3
    C1 --> C4
    C2 --> C4
    C3 --> C4
    C1 --> C5
    E1 --> U1
    E2 --> U1
    E3 --> U1
    A1 --> U1
    A1 --> U2

    style A1 fill:#1e90ff,stroke:#1565c0,color:#fff
    style V1 fill:#1e90ff,stroke:#1565c0,color:#fff
    style E1 fill:#1e90ff,stroke:#1565c0,color:#fff
    style U1 fill:#1e90ff,stroke:#1565c0,color:#fff
    style C1 fill:#1e90ff,stroke:#1565c0,color:#fff

Module Reference

Module / File	Layer	Purpose	Key Exports	Changed
`R/default.R` (2,919 lines)	API	Main entry point, parameter validation, method routing; added `n.init` parameter for multi-start initialization	`fect()`, `fect.formula()`, `fect.default()`	yes
`R/interFE.R` (515 lines)	API	Standalone interactive fixed effects estimator	`interFE()`	no
`R/did_wrapper.R` (656 lines)	API	Modern DID estimator wrappers (did, DIDmultiplegtDYN)	`did_wrapper()`	no
`R/fect_mspe.R` (344 lines)	API	MSPE computation for model comparison	`fect_mspe()`	no
`R/fe.R` (954 lines)	Estimation	Interactive Fixed Effects / factor model estimation; added convergence warning on non-convergence	`fect_fe()`	yes
`R/mc.R` (804 lines)	Estimation	Matrix Completion via nuclear norm regularization	`fect_mc()`	no
`R/cfe.R` (1,172 lines)	Estimation	Complex Fixed Effects with structured covariates	`fect_cfe()`	no
`R/fect_nevertreated.R` (3,166 lines)	Estimation	Never-treated comparison group variant	`fect_nevertreated()`	no
`R/cv.R` (1,526 lines)	Cross-Validation	Hyperparameter selection (r, lambda) via MSPE/PC; added warm-start CV, multi-start initialization, convergence warning	`fect_cv()`	yes
`R/cv_binary.R` (421 lines)	Cross-Validation	Cross-validation for binary/probit models	`fect_cv_binary()`	no
`R/boot.R` (4,884 lines)	Inference	Bootstrap/jackknife/parametric inference with parallel support	`fect_boot()`	no
`R/diagtest.R` (215 lines)	Diagnostics	Pre-trend F-test, equivalence (TOST), placebo, carryover tests	`diagtest()`	no
`R/fittest.R` (636 lines)	Diagnostics	Fitness/wild bootstrap test	`fect_test()`	no
`R/fect_sens.R` (232 lines)	Diagnostics	Sensitivity analysis via HonestDiDFEct	`fect_sens()`	no
`R/fect_iden.R` (224 lines)	Diagnostics	Identification analysis	`fect_iden()`	no
`R/plot.R` (5,019 lines)	Visualization	Comprehensive ggplot2 plotting (gap, equiv, status, exit, factors, loadings, calendar, counterfactual, heterogeneous)	`plot.fect()`	no
`R/esplot.R` (1,118 lines)	Visualization	Standalone event-study plots	`esplot()`	no
`R/plot_return.R` (9 lines)	Visualization	Plot return object class definition	(internal)	no
`R/support.R` (676 lines)	Utilities	Data manipulation, initial FE fit, helper functions; added `perturbedFit()` for multi-start initialization	`get_term()`, `align_beta0()`, `perturbedFit()` (internal)	yes
`R/polynomial.R` (844 lines)	Utilities	Polynomial/B-spline trend specification	`fect_polynomial()`	no
`R/effect.R` (397 lines)	Utilities	Treatment effect decomposition by sub-group	`effect()`	no
`R/cumu.R` (206 lines)	Utilities	Cumulative ATT computation	`att.cumu()`	no
`R/score.R` (105 lines)	Utilities	Score-based inference	(internal)	no
`R/permutation.R` (264 lines)	Utilities	Permutation test for treatment effects	(internal)	no
`R/getcohort.R` (264 lines)	Utilities	Treatment cohort identification	`get.cohort()`	no
`R/print.R` (111 lines)	Utilities	S3 print methods for fect and interFE objects	`print.fect()`, `print.interFE()`	no
`R/RcppExports.R` (191 lines)	Utilities	Auto-generated Rcpp function bindings	(auto-generated)	no
`src/ife.cpp` (534 lines)	C++ Core	IFE algorithm: `inter_fe()`, `inter_fe_ub()`, `inter_fe_d()`; added `converged` flag propagation	(Rcpp exports)	yes
`src/ife_sub.cpp` (577 lines)	C++ Core	IFE sub-routines: SVD factor estimation, EM iterations, alternating minimization; burn-in fix preserves converged fit, `converged` flag in all 5 iteration functions	(internal)	yes
`src/mc.cpp` (223 lines)	C++ Core	Matrix completion: `inter_fe_mc()`, nuclear norm penalization	(Rcpp exports)	no
`src/cfe.cpp` (203 lines)	C++ Core	Complex FE: `complex_fe_ub()`	(Rcpp exports)	no
`src/cfe_sub.cpp` (564 lines)	C++ Core	Complex FE sub-routines: `cfe_iter()`, structured covariate handling	(internal)	no
`src/fe_sub.cpp` (291 lines)	C++ Core	Shared FE utilities: `Y_demean()`, `panel_beta()`, `panel_factor()`, `panel_FE()`, `XXinv()`	(internal)	no
`src/binary_sub.cpp` (539 lines)	C++ Core	Probit model sub-routines for binary outcomes	(internal)	no
`src/binary_qr.cpp` (347 lines)	C++ Core	QR-based probit estimation	(internal)	no
`src/binary_svd.cpp` (302 lines)	C++ Core	SVD-based probit estimation	(internal)	no
`src/auxiliary.cpp` (396 lines)	C++ Core	EM helpers, matrix utilities, log-likelihood computation	(internal)	no
`src/fect.h` (60 lines)	C++ Core	Header file with all C++ function declarations	(header)	no

Function Call Graph

Main Estimation Pipeline

%%{init: {'theme': 'neutral'}}%%
graph TD
    F1["fect()"]
    F2["fect.formula()"]
    F3["fect.default()"]
    F4["fect_cv()"]
    F5["fect_fe()"]
    F6["fect_mc()"]
    F7["fect_cfe()"]
    F8["fect_nevertreated()"]
    C1["inter_fe_ub() [C++]"]
    C2["inter_fe_mc() [C++]"]
    C3["complex_fe_ub() [C++]"]
    C4["inter_fe_d_qr_ub() [C++]"]
    S1["panel_factor() [C++]"]
    S2["panel_FE() [C++]"]
    S3["Y_demean() [C++]"]
    S4["cfe_iter() [C++]"]

    F1 --> F2
    F2 --> F3
    F3 -->|"CV=TRUE"| F4
    F3 -->|"method=ife/fe"| F5
    F3 -->|"method=mc"| F6
    F3 -->|"method=cfe"| F7
    F3 -->|"nevertreated"| F8
    F4 --> F5
    F4 --> F6
    F8 --> F5
    F8 --> F6
    F8 --> F7
    F5 --> C1
    F5 -->|"binary=TRUE"| C4
    F6 --> C2
    F7 --> C3
    C1 --> S1
    C1 --> S3
    C2 --> S2
    C2 --> S3
    C3 --> S4
    C3 --> S3
    F4 -.->|"warm-start"| C1
    U4["perturbedFit()"]
    F4 -->|"n.init > 1"| U4
    U4 --> C1

    style F1 fill:#1e90ff,stroke:#1565c0,color:#fff
    style F3 fill:#1e90ff,stroke:#1565c0,color:#fff
    style F4 fill:#1e90ff,stroke:#1565c0,color:#fff
    style F5 fill:#1e90ff,stroke:#1565c0,color:#fff
    style C1 fill:#1e90ff,stroke:#1565c0,color:#fff
    style U4 fill:#1e90ff,stroke:#1565c0,color:#fff

Inference and Diagnostics

%%{init: {'theme': 'neutral'}}%%
graph TD
    F3["fect.default()"]
    B1["fect_boot()"]
    D1["diagtest()"]
    D2["fittest()"]
    D3["fect_sens()"]
    D4["fect_iden()"]
    F5["fect_fe()"]
    F6["fect_mc()"]
    F7["fect_cfe()"]
    PL["plot.fect()"]
    ES["esplot()"]

    F3 -->|"se=TRUE"| B1
    F3 --> D1
    F3 --> D2
    B1 --> F5
    B1 --> F6
    B1 --> F7
    F3 --> PL
    F3 --> ES
    D3 -.->|"optional"| F3
    D4 -.->|"optional"| F3

Function Reference

Function	Defined In	Called By	Calls	Changed	Purpose
`fect()`	`R/default.R`	user / exported	`UseMethod("fect")`	yes	S3 generic entry point; added `n.init` parameter
`fect.formula()`	`R/default.R`	`fect()`	`fect.default()`	yes	Parse formula, added `n.init` pass-through
`fect.default()`	`R/default.R`	`fect.formula()`, user	`fect_cv()`, `fect_fe()`, `fect_mc()`, `fect_cfe()`, `fect_boot()`, `diagtest()`	yes	Added `n.init` validation and threading
`fect_fe()`	`R/fe.R`	`fect.default()`, `fect_cv()`, `fect_boot()`	`inter_fe_ub()`, `inter_fe_d_qr_ub()` (C++)	yes	IFE estimation; added convergence warning check
`fect_mc()`	`R/mc.R`	`fect.default()`, `fect_cv()`, `fect_boot()`	`inter_fe_mc()` (C++)	no	Matrix completion estimation (nuclear norm regularization)
`fect_cfe()`	`R/cfe.R`	`fect.default()`, `fect_boot()`	`complex_fe_ub()` (C++)	no	Complex FE with structured covariates (Z, Q, gamma, kappa)
`fect_nevertreated()`	`R/fect_nevertreated.R`	`fect.default()`	`fect_fe()`, `fect_mc()`, `fect_cfe()`	no	Wrapper for never-treated-only estimation sample
`fect_cv()`	`R/cv.R`	`fect.default()`	`fect_fe()`, `fect_mc()`, `perturbedFit()`	yes	CV with warm-start across r/lambda candidates, multi-start init, convergence warning
`perturbedFit()`	`R/support.R`	`fect_cv()`	`rnorm()`, `sd()`	yes	Generate perturbed initial values for multi-start robustness (internal)
`fect_boot()`	`R/boot.R`	`fect.default()`	`fect_fe()`, `fect_mc()`, `fect_cfe()`	no	Bootstrap/jackknife inference engine with parallel support
`interFE()`	`R/interFE.R`	user / exported	`inter_fe()` (C++)	no	Standalone interactive fixed effects estimator
`did_wrapper()`	`R/did_wrapper.R`	user / exported	`fixest::feols()`, `did::att_gt()`	no	Modern DID estimator wrappers
`plot.fect()`	`R/plot.R`	user / exported	ggplot2 functions	no	Comprehensive visualization with 10+ plot types
`esplot()`	`R/esplot.R`	user / exported	ggplot2 functions	no	Standalone event-study plot
`effect()`	`R/effect.R`	user / exported	(internal helpers)	no	Treatment effect decomposition by sub-group
`att.cumu()`	`R/cumu.R`	user / exported	(internal helpers)	no	Cumulative ATT computation
`diagtest()`	`R/diagtest.R`	`fect.default()`	(statistical computations)	no	Pre-trend, placebo, carryover, equivalence tests
`fect_sens()`	`R/fect_sens.R`	user / exported	HonestDiDFEct functions	no	Sensitivity analysis
`fect_iden()`	`R/fect_iden.R`	user / exported	(internal helpers)	no	Identification analysis
`inter_fe_ub()`	`src/ife.cpp`	`fect_fe()`	`panel_factor()`, `fe_ub()`, `Y_demean()`, `fe_ad_inter_iter()`	yes	C++ IFE with unbalanced panels; now returns `converged` flag
`inter_fe_mc()`	`src/mc.cpp`	`fect_mc()`	`panel_FE()`, `Y_demean()`	no	C++ matrix completion with nuclear norm
`complex_fe_ub()`	`src/cfe.cpp`	`fect_cfe()`	`cfe_iter()`, `Y_demean()`	no	C++ complex FE estimation
`panel_factor()`	`src/fe_sub.cpp`	`inter_fe_ub()`, others	SVD routines	no	Extract latent factors via SVD
`panel_FE()`	`src/fe_sub.cpp`	`inter_fe_mc()`, others	soft-thresholding	no	Nuclear norm regularization / soft-thresholding
`Y_demean()`	`src/fe_sub.cpp`	most C++ estimators	(arma operations)	no	Remove unit and/or time fixed effects

Data Flow

%%{init: {'theme': 'neutral'}}%%
graph TD
    IN["User Input (formula/data + params)"]
    FP["Formula Parsing (fect.formula)"]
    PV["Parameter Validation (fect.default)"]
    DP["Data Preprocessing (long to T x N matrices)"]
    INIT["initialFit() + perturbedFit()"]
    MI{{"n.init > 1?"}}
    MS["Multi-Start: trial runs, select best sigma2"]
    CV{{"CV=TRUE?"}}
    CVR["CV with Warm-Start (fect_cv)"]
    OPT["Optimal r/lambda selected"]
    NT{{"nevertreated?"}}
    NTW["fect_nevertreated() wrapper"]
    MR{{"Method?"}}
    IFE["fect_fe() -> inter_fe_ub() C++"]
    MC["fect_mc() -> inter_fe_mc() C++"]
    CFE["fect_cfe() -> complex_fe_ub() C++"]
    CI["Counterfactual Imputation (Y.ct)"]
    ATT["ATT = Y.obs - Y.ct"]
    SE{{"se=TRUE?"}}
    BOOT["fect_boot() — resample + re-estimate"]
    SECI["SEs, CIs, p-values"]
    DIAG["Diagnostic Tests (diagtest)"]
    OBJ["S3 Object Assembly (class fect)"]
    OUT["Output (print / plot / esplot)"]

    IN --> FP
    FP --> PV
    PV --> DP
    DP --> INIT
    INIT --> MI
    MI -- yes --> MS
    MS --> CV
    MI -- no --> CV
    CV -- yes --> CVR
    CVR --> OPT
    OPT --> NT
    CV -- no --> NT
    NT -- yes --> NTW
    NTW --> MR
    NT -- no --> MR
    MR -- ife/fe --> IFE
    MR -- mc --> MC
    MR -- cfe --> CFE
    IFE --> CC{{"converged?"}}
    CC -- no --> WARN["Emit convergence warning"]
    CC -- yes --> CI
    WARN --> CI
    MC --> CI
    CFE --> CI
    CI --> ATT
    ATT --> SE
    SE -- yes --> BOOT
    BOOT --> SECI
    SECI --> DIAG
    SE -- no --> DIAG
    DIAG --> OBJ
    OBJ --> OUT

    style INIT fill:#1e90ff,stroke:#1565c0,color:#fff
    style MI fill:#1e90ff,stroke:#1565c0,color:#fff
    style MS fill:#1e90ff,stroke:#1565c0,color:#fff
    style CVR fill:#1e90ff,stroke:#1565c0,color:#fff
    style CC fill:#1e90ff,stroke:#1565c0,color:#fff
    style WARN fill:#1e90ff,stroke:#1565c0,color:#fff
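
The "Data Preprocessing (long to T x N matrices)" step in the flow above can be sketched in a few lines. This is a plain-Python illustration rather than the package's R code: the function `long_to_matrices`, the column names, and the observed-indicator matrix are assumptions made for the example, and covariates (which would form a T x N x p array) are omitted.

```python
def long_to_matrices(rows, unit_key="id", time_key="time",
                     outcome_key="Y", treat_key="D"):
    """Pivot long-form records into T x N matrices (rows = periods, columns = units)."""
    units = sorted({r[unit_key] for r in rows})
    times = sorted({r[time_key] for r in rows})
    col = {u: j for j, u in enumerate(units)}
    row = {t: k for k, t in enumerate(times)}
    T, N = len(times), len(units)
    Y = [[float("nan")] * N for _ in range(T)]  # outcomes, NaN where unobserved
    D = [[0] * N for _ in range(T)]             # treatment indicator
    I = [[0] * N for _ in range(T)]             # 1 where the cell is observed
    for r in rows:
        t, i = row[r[time_key]], col[r[unit_key]]
        Y[t][i] = float(r[outcome_key])
        D[t][i] = int(r[treat_key])
        I[t][i] = 1
    return Y, D, I, units, times


# Hypothetical long-form input; unit "c" has no record for period 1.
rows = [
    {"id": "a", "time": 1, "Y": 1.0, "D": 0},
    {"id": "a", "time": 2, "Y": 3.5, "D": 1},
    {"id": "b", "time": 1, "Y": 2.0, "D": 0},
    {"id": "b", "time": 2, "Y": 2.4, "D": 0},
    {"id": "c", "time": 2, "Y": 0.7, "D": 0},
]
Y_mat, D_mat, I_mat, units, times = long_to_matrices(rows)
```

Keeping a separate observed indicator lets downstream steps distinguish cells that are missing from the data from cells that are masked because they are treated.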

Architectural Patterns

S3 Dispatch with Formula Interface: fect() uses UseMethod() to support both formula and direct (Y, D, X) interfaces. fect.formula() parses the formula into variable names; fect.default() does the computation. interFE() follows the same pattern.
R/C++ Layered Computation: All numerically intensive operations (SVD, EM iterations, demeaning, matrix factorization) are implemented in C++ via RcppArmadillo. R handles data wrangling, parameter validation, control flow, and result assembly. The boundary is at the estimation functions: R fect_fe() calls C++ inter_fe_ub().
Method-Agnostic Pipeline: fect.default() provides a single pipeline for preprocessing, CV, estimation, inference, and diagnostics. Method-specific logic is encapsulated in fect_fe(), fect_mc(), and fect_cfe(), so adding a new estimation method requires only a new estimation function and a routing entry.
Matrix-Oriented Data Representation: Panel data is converted from long-form data frames to T x N matrices early in fect.default(). Covariates become T x N x p arrays. All downstream computation operates on these matrix forms, enabling efficient C++ computation.
Two-Tier Tolerance: Cross-validation uses a looser tolerance (max(tol, 1e-3)) for speed during hyperparameter search, while final estimation uses the user-specified tolerance for precision.
Warm-Start CV: When sweeping over consecutive r candidates (IFE) or lambda candidates (MC), the fitted values from the previous candidate are reused as the starting point for the next. Per-fold and full-data caches (warm_fit_cv, warm_fit_full) store the $fit matrix between iterations, reducing EM iteration counts for adjacent hyperparameter values. Unobserved entries are zeroed before reuse to prevent stale value leakage.
Multi-Start Initialization: The n.init parameter (default 1, preserving existing behavior) controls the number of perturbed starting points. When n.init > 1, perturbedFit() generates Gaussian-perturbed copies of the base Y0 (5% of data SD) and beta0 (10% of coefficient magnitude), runs trial estimations, and selects the initialization with the lowest residual variance (sigma2). This reduces the EM algorithm's sensitivity to local optima.
Burn-in Warm-Start: In the weighted IFE estimation, the burn-in phase progressively reduces the rank from d down to r. Previously, upon convergence during burn-in, both fit and fit_old were reset to the initial Y0, discarding the converged solution. Now only fit_old is reset (to the current fit), preserving the converged state as the starting point for the real estimation phase.
Convergence Diagnostics: All five C++ iteration functions (fe_ad_iter, fe_ad_covar_iter, fe_ad_inter_iter, fe_ad_inter_covar_iter, beta_iter) return a converged flag (1 if dif <= tol, 0 if max_iter reached). This flag propagates through inter_fe_ub() to R, where fect_cv() and fect_fe() emit a warning() on non-convergence. CV inner-loop calls do not warn (non-convergence at loose tolerance is expected).
Parallel Bootstrap via foreach: fect_boot() uses foreach with doParallel/doFuture backends to run bootstrap replications in parallel. A trim_closure_env() optimization reduces serialization overhead by keeping only referenced symbols in function environments.
Counterfactual Imputation as Core Abstraction: All methods share the same conceptual framework: impute Y(0) for treated units using untreated observations, then compute the ATT as the gap between observed and imputed outcomes. FE uses additive fixed effects, IFE adds latent factors (F * L'), MC uses nuclear norm regularization, and CFE adds structured covariates.
Never-Treated vs Not-Yet-Treated Estimation Samples: The package supports two estimation sample strategies: "notyettreated" includes not-yet-treated observations (requiring EM for missing data), while "nevertreated" uses only never-treated units (allowing direct SVD). The fect_nevertreated() wrapper handles the latter.
Comprehensive Diagnostic Suite: Built-in tests (F-test, TOST equivalence, placebo, carryover) let users assess the parallel trends assumption without external tools. Sensitivity analysis is available through optional HonestDiDFEct integration.
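
As an illustration of the Multi-Start Initialization pattern, the sketch below generates perturbed starting points and keeps the one with the lowest residual variance. It is written in Python and is not the package's perturbedFit() code. The names `perturbed_starts`, `best_start`, and `toy_sigma2` are hypothetical, the trial estimation is replaced by a toy objective, and including the unperturbed base among the candidates is an assumption.

```python
import random
import statistics


def perturbed_starts(Y0, beta0, n_init, seed=0):
    """Return n_init starting points: the base (Y0, beta0) plus perturbed copies.

    Noise scales follow the pattern description: 5% of the SD of Y0 for the
    fit matrix and 10% of each coefficient's magnitude for beta0.
    """
    rng = random.Random(seed)
    sd_y = statistics.stdev([v for row in Y0 for v in row])
    starts = [(Y0, beta0)]  # assumption: the first start is the unperturbed base
    for _ in range(n_init - 1):
        Y_pert = [[v + rng.gauss(0.0, 0.05 * sd_y) for v in row] for row in Y0]
        b_pert = [b + rng.gauss(0.0, 0.10 * abs(b)) for b in beta0]
        starts.append((Y_pert, b_pert))
    return starts


def best_start(starts, trial_fit):
    """Run trial_fit on every start and return (sigma2, start) with the lowest sigma2."""
    results = [(trial_fit(Y0, beta0), (Y0, beta0)) for Y0, beta0 in starts]
    return min(results, key=lambda pair: pair[0])


def toy_sigma2(Y0, beta0):
    """Hypothetical stand-in for a trial estimation that returns sigma2."""
    return (beta0[0] - 1.2) ** 2 + 0.5


Y0 = [[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]]
beta0 = [1.0, -2.0]
starts = perturbed_starts(Y0, beta0, n_init=4, seed=0)
best_sigma2, (best_Y0, best_beta0) = best_start(starts, toy_sigma2)
```

With n_init = 1 the candidate list contains only the base start, which mirrors the stated default of leaving existing behavior unchanged.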
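
The Convergence Diagnostics pattern can be sketched as an iteration routine that reports a converged flag, plus a wrapper that warns unless asked to stay quiet. This is a Python illustration under stated assumptions, not the package's C++ or R code: `iterate`, `fit_with_warning`, the `quiet` argument, and the scalar toy update are all hypothetical.

```python
import warnings


def iterate(step, x0, tol=1e-5, max_iter=100):
    """Run a fixed-point iteration and report whether it converged.

    The converged flag is 1 if the last change was <= tol and 0 if the
    loop stopped because max_iter was reached.
    """
    x, dif, n = x0, float("inf"), 0
    while dif > tol and n < max_iter:
        x_new = step(x)
        dif = abs(x_new - x)
        x, n = x_new, n + 1
    return {"value": x, "niter": n, "converged": 1 if dif <= tol else 0}


def fit_with_warning(step, x0, tol=1e-5, max_iter=100, quiet=False):
    """Warn on non-convergence unless quiet is set (as in a CV inner loop)."""
    out = iterate(step, x0, tol, max_iter)
    if not out["converged"] and not quiet:
        warnings.warn(f"Estimation did not converge within {max_iter} iterations.")
    return out


def toy_step(x):
    """Toy contraction with fixed point 2.0, standing in for one EM update."""
    return 0.5 * x + 1.0


full = fit_with_warning(toy_step, 0.0)                          # converges
short = fit_with_warning(toy_step, 0.0, max_iter=5, quiet=True)  # hits the cap silently
print(full["converged"], short["converged"])  # prints: 1 0
```

Returning the flag rather than warning inside the iteration keeps the decision with the caller, which is what allows inner-loop calls at a loose tolerance to stay silent.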
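
For the Parallel Bootstrap pattern, a rough Python analogue is shown below. It is a sketch under several assumptions and does not reproduce fect_boot(): a thread pool stands in for the foreach workers, the per-replicate re-estimation is replaced by a simple mean of hypothetical unit-level effects, and the names `one_replicate` and `bootstrap_se` are invented for the example. The trim_closure_env() optimization has no counterpart here.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def one_replicate(task):
    """Resample units with replacement and re-estimate (here: a plain mean)."""
    unit_effects, seed = task
    rng = random.Random(seed)  # per-replicate RNG keeps results reproducible
    sample = [rng.choice(unit_effects) for _ in unit_effects]
    return sum(sample) / len(sample)


def bootstrap_se(unit_effects, nboots=200, workers=4, seed=1):
    """Bootstrap standard error of the mean effect, with replicates run in a pool."""
    tasks = [(unit_effects, seed + b) for b in range(nboots)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        estimates = list(pool.map(one_replicate, tasks))  # map preserves task order
    return statistics.stdev(estimates)


# Hypothetical unit-level effect estimates.
unit_effects = [1.8, 2.1, 2.4, 1.9, 2.2, 2.0, 1.7, 2.3]
se = bootstrap_se(unit_effects)
```

Seeding each replicate separately means the result does not depend on how many workers run or in which order they finish.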

Notes

FE is internally treated as IFE with r = 0 (zero latent factors). The code sets method = "ife" when method = "fe" and r = 0.
The gsynth method is a compatibility alias that forces time.component.from = "nevertreated" and em = FALSE, matching the behavior of the gsynth package.
boot.R (4,884 lines) and plot.R (5,019 lines) are the two largest files. Both could benefit from modular decomposition in future refactors.
The binary option (probit models) is only available with method = "ife" and has dedicated C++ implementations (binary_qr.cpp, binary_svd.cpp, binary_sub.cpp).
The package uses fixest::feols() for initial OLS regression to obtain starting values for iterative estimation.
Vignettes are organized as a Quarto book (vignettes/_quarto.yml) with 9 chapters covering getting started, FE, IFE/MC, CFE, heterogeneous effects, plots, gsynth compatibility, panel diagnostics, and sensitivity analysis.
10 bundled datasets (simdata, sim_base, sim_gsynth, sim_linear, sim_region, sim_trend, turnout, gs2020, hh2019, simgsynth) support examples and testing.
11 exported functions and 8 S3 methods registered in NAMESPACE.
Total R source: 27,872 lines across 27 files. Total C++ source: 4,848 lines across 12 files (plus header).
