75 changes: 59 additions & 16 deletions ARCHITECTURE.md
@@ -1,6 +1,6 @@
# Architecture — fect

> Generated by scriber for run `20260329-arch-docs` on 2026-03-29.
> Generated by scriber for run `REQ-20260401-104739` on 2026-04-01.

## Overview

@@ -91,21 +91,27 @@ graph TD
E3 --> U1
A1 --> U1
A1 --> U2

style A1 fill:#1e90ff,stroke:#1565c0,color:#fff
style V1 fill:#1e90ff,stroke:#1565c0,color:#fff
style E1 fill:#1e90ff,stroke:#1565c0,color:#fff
style U1 fill:#1e90ff,stroke:#1565c0,color:#fff
style C1 fill:#1e90ff,stroke:#1565c0,color:#fff
```

### Module Reference

| Module / File | Layer | Purpose | Key Exports | Changed |
| --- | --- | --- | --- | --- |
| `R/default.R` (2,919 lines) | API | Main entry point, parameter validation, method routing | `fect()`, `fect.formula()`, `fect.default()` | no |
| `R/default.R` (2,919 lines) | API | Main entry point, parameter validation, method routing; added `n.init` parameter for multi-start initialization | `fect()`, `fect.formula()`, `fect.default()` | **yes** |
| `R/interFE.R` (515 lines) | API | Standalone interactive fixed effects estimator | `interFE()` | no |
| `R/did_wrapper.R` (656 lines) | API | Modern DID estimator wrappers (did, DIDmultiplegtDYN) | `did_wrapper()` | no |
| `R/fect_mspe.R` (344 lines) | API | MSPE computation for model comparison | `fect_mspe()` | no |
| `R/fe.R` (954 lines) | Estimation | Interactive Fixed Effects / factor model estimation | `fect_fe()` | no |
| `R/fe.R` (954 lines) | Estimation | Interactive Fixed Effects / factor model estimation; added convergence warning on non-convergence | `fect_fe()` | **yes** |
| `R/mc.R` (804 lines) | Estimation | Matrix Completion via nuclear norm regularization | `fect_mc()` | no |
| `R/cfe.R` (1,172 lines) | Estimation | Complex Fixed Effects with structured covariates | `fect_cfe()` | no |
| `R/fect_nevertreated.R` (3,166 lines) | Estimation | Never-treated comparison group variant | `fect_nevertreated()` | no |
| `R/cv.R` (1,526 lines) | Cross-Validation | Hyperparameter selection (r, lambda) via MSPE/PC | `fect_cv()` | no |
| `R/cv.R` (1,526 lines) | Cross-Validation | Hyperparameter selection (r, lambda) via MSPE/PC; added warm-start CV, multi-start initialization, convergence warning | `fect_cv()` | **yes** |
| `R/cv_binary.R` (421 lines) | Cross-Validation | Cross-validation for binary/probit models | `fect_cv_binary()` | no |
| `R/boot.R` (4,884 lines) | Inference | Bootstrap/jackknife/parametric inference with parallel support | `fect_boot()` | no |
| `R/diagtest.R` (215 lines) | Diagnostics | Pre-trend F-test, equivalence (TOST), placebo, carryover tests | `diagtest()` | no |
@@ -115,7 +121,7 @@ graph TD
| `R/plot.R` (5,019 lines) | Visualization | Comprehensive ggplot2 plotting (gap, equiv, status, exit, factors, loadings, calendar, counterfactual, heterogeneous) | `plot.fect()` | no |
| `R/esplot.R` (1,118 lines) | Visualization | Standalone event-study plots | `esplot()` | no |
| `R/plot_return.R` (9 lines) | Visualization | Plot return object class definition | (internal) | no |
| `R/support.R` (676 lines) | Utilities | Data manipulation, initial FE fit, helper functions | `get_term()`, `align_beta0()` | no |
| `R/support.R` (676 lines) | Utilities | Data manipulation, initial FE fit, helper functions; added `perturbedFit()` for multi-start initialization | `get_term()`, `align_beta0()`, `perturbedFit()` (internal) | **yes** |
| `R/polynomial.R` (844 lines) | Utilities | Polynomial/B-spline trend specification | `fect_polynomial()` | no |
| `R/effect.R` (397 lines) | Utilities | Treatment effect decomposition by sub-group | `effect()` | no |
| `R/cumu.R` (206 lines) | Utilities | Cumulative ATT computation | `att.cumu()` | no |
@@ -124,8 +130,8 @@ graph TD
| `R/getcohort.R` (264 lines) | Utilities | Treatment cohort identification | `get.cohort()` | no |
| `R/print.R` (111 lines) | Utilities | S3 print methods for fect and interFE objects | `print.fect()`, `print.interFE()` | no |
| `R/RcppExports.R` (191 lines) | Utilities | Auto-generated Rcpp function bindings | (auto-generated) | no |
| `src/ife.cpp` (534 lines) | C++ Core | IFE algorithm: `inter_fe()`, `inter_fe_ub()`, `inter_fe_d()` | (Rcpp exports) | no |
| `src/ife_sub.cpp` (577 lines) | C++ Core | IFE sub-routines: SVD factor estimation, EM iterations, alternating minimization | (internal) | no |
| `src/ife.cpp` (534 lines) | C++ Core | IFE algorithm: `inter_fe()`, `inter_fe_ub()`, `inter_fe_d()`; added `converged` flag propagation | (Rcpp exports) | **yes** |
| `src/ife_sub.cpp` (577 lines) | C++ Core | IFE sub-routines: SVD factor estimation, EM iterations, alternating minimization; burn-in fix preserves converged fit, `converged` flag in all 5 iteration functions | (internal) | **yes** |
| `src/mc.cpp` (223 lines) | C++ Core | Matrix completion: `inter_fe_mc()`, nuclear norm penalization | (Rcpp exports) | no |
| `src/cfe.cpp` (203 lines) | C++ Core | Complex FE: `complex_fe_ub()` | (Rcpp exports) | no |
| `src/cfe_sub.cpp` (564 lines) | C++ Core | Complex FE sub-routines: `cfe_iter()`, structured covariate handling | (internal) | no |
@@ -184,6 +190,17 @@ graph TD
C2 --> S3
C3 --> S4
C3 --> S3
F4 -.->|"warm-start"| C1
U4["perturbedFit()"]
F4 -->|"n.init > 1"| U4
U4 --> C1

style F1 fill:#1e90ff,stroke:#1565c0,color:#fff
style F3 fill:#1e90ff,stroke:#1565c0,color:#fff
style F4 fill:#1e90ff,stroke:#1565c0,color:#fff
style F5 fill:#1e90ff,stroke:#1565c0,color:#fff
style C1 fill:#1e90ff,stroke:#1565c0,color:#fff
style U4 fill:#1e90ff,stroke:#1565c0,color:#fff
```
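
The `warm-start` edge in the diagram above refers to reusing the previous candidate's fitted matrix as the starting point for the next candidate, with unobserved entries zeroed before reuse. The following is a minimal C++ sketch of that zeroing step only. The function name `warm_start` and the flat-vector layout are hypothetical and chosen for illustration; the package's own caching code is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: build the starting values for the next r/lambda
// candidate from the previous candidate's fitted values.
// observed[i] != 0 marks an observed cell; all other cells are reset to 0
// so that stale imputed values are not carried into the next fit.
std::vector<double> warm_start(const std::vector<double>& prev_fit,
                               const std::vector<int>& observed) {
    assert(prev_fit.size() == observed.size());
    std::vector<double> start(prev_fit.size(), 0.0);
    for (std::size_t i = 0; i < prev_fit.size(); ++i) {
        if (observed[i] != 0) start[i] = prev_fit[i];
    }
    return start;
}
```

In a sweep over candidates, the caller would keep the returned vector between iterations and fall back to the default initial fit for the first candidate.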

### Inference and Diagnostics
@@ -219,14 +236,15 @@ graph TD

| Function | Defined In | Called By | Calls | Changed | Purpose |
| --- | --- | --- | --- | --- | --- |
| `fect()` | `R/default.R` | user / exported | `UseMethod("fect")` | no | S3 generic entry point for counterfactual estimation |
| `fect.formula()` | `R/default.R` | `fect()` | `fect.default()` | no | Parse formula, extract variable names, delegate to default method |
| `fect.default()` | `R/default.R` | `fect.formula()`, user | `fect_cv()`, `fect_fe()`, `fect_mc()`, `fect_cfe()`, `fect_boot()`, `diagtest()` | no | Workhorse: validation, preprocessing, method routing, inference, diagnostics |
| `fect_fe()` | `R/fe.R` | `fect.default()`, `fect_cv()`, `fect_boot()` | `inter_fe_ub()`, `inter_fe_d_qr_ub()` (C++) | no | IFE estimation (factor model with r latent factors) |
| `fect()` | `R/default.R` | user / exported | `UseMethod("fect")` | **yes** | S3 generic entry point; added `n.init` parameter |
| `fect.formula()` | `R/default.R` | `fect()` | `fect.default()` | **yes** | Parse formula; added `n.init` pass-through |
| `fect.default()` | `R/default.R` | `fect.formula()`, user | `fect_cv()`, `fect_fe()`, `fect_mc()`, `fect_cfe()`, `fect_boot()`, `diagtest()` | **yes** | Added `n.init` validation and threading |
| `fect_fe()` | `R/fe.R` | `fect.default()`, `fect_cv()`, `fect_boot()` | `inter_fe_ub()`, `inter_fe_d_qr_ub()` (C++) | **yes** | IFE estimation; added convergence warning check |
| `fect_mc()` | `R/mc.R` | `fect.default()`, `fect_cv()`, `fect_boot()` | `inter_fe_mc()` (C++) | no | Matrix completion estimation (nuclear norm regularization) |
| `fect_cfe()` | `R/cfe.R` | `fect.default()`, `fect_boot()` | `complex_fe_ub()` (C++) | no | Complex FE with structured covariates (Z, Q, gamma, kappa) |
| `fect_nevertreated()` | `R/fect_nevertreated.R` | `fect.default()` | `fect_fe()`, `fect_mc()`, `fect_cfe()` | no | Wrapper for never-treated-only estimation sample |
| `fect_cv()` | `R/cv.R` | `fect.default()` | `fect_fe()`, `fect_mc()` | no | Cross-validation to select r (IFE) or lambda (MC) |
| `fect_cv()` | `R/cv.R` | `fect.default()` | `fect_fe()`, `fect_mc()`, `perturbedFit()` | **yes** | CV with warm-start across r/lambda candidates, multi-start init, convergence warning |
| `perturbedFit()` | `R/support.R` | `fect_cv()` | `rnorm()`, `sd()` | **yes** | Generate perturbed initial values for multi-start robustness (internal) |
| `fect_boot()` | `R/boot.R` | `fect.default()` | `fect_fe()`, `fect_mc()`, `fect_cfe()` | no | Bootstrap/jackknife inference engine with parallel support |
| `interFE()` | `R/interFE.R` | user / exported | `inter_fe()` (C++) | no | Standalone interactive fixed effects estimator |
| `did_wrapper()` | `R/did_wrapper.R` | user / exported | `fixest::feols()`, `did::att_gt()` | no | Modern DID estimator wrappers |
@@ -237,7 +255,7 @@ graph TD
| `diagtest()` | `R/diagtest.R` | `fect.default()` | (statistical computations) | no | Pre-trend, placebo, carryover, equivalence tests |
| `fect_sens()` | `R/fect_sens.R` | user / exported | HonestDiDFEct functions | no | Sensitivity analysis |
| `fect_iden()` | `R/fect_iden.R` | user / exported | (internal helpers) | no | Identification analysis |
| `inter_fe_ub()` | `src/ife.cpp` | `fect_fe()` | `panel_factor()`, `fe_ub()`, `Y_demean()` | no | C++ IFE with unbalanced panels (EM algorithm) |
| `inter_fe_ub()` | `src/ife.cpp` | `fect_fe()` | `panel_factor()`, `fe_ub()`, `Y_demean()`, `fe_ad_inter_iter()` | **yes** | C++ IFE with unbalanced panels; now returns `converged` flag |
| `inter_fe_mc()` | `src/mc.cpp` | `fect_mc()` | `panel_FE()`, `Y_demean()` | no | C++ matrix completion with nuclear norm |
| `complex_fe_ub()` | `src/cfe.cpp` | `fect_cfe()` | `cfe_iter()`, `Y_demean()` | no | C++ complex FE estimation |
| `panel_factor()` | `src/fe_sub.cpp` | `inter_fe_ub()`, others | SVD routines | no | Extract latent factors via SVD |
Expand All @@ -255,8 +273,11 @@ graph TD
FP["Formula Parsing (fect.formula)"]
PV["Parameter Validation (fect.default)"]
DP["Data Preprocessing (long to T x N matrices)"]
INIT["initialFit() + perturbedFit()"]
MI{{"n.init > 1?"}}
MS["Multi-Start: trial runs, select best sigma2"]
CV{{"CV=TRUE?"}}
CVR["Cross-Validation (fect_cv)"]
CVR["CV with Warm-Start (fect_cv)"]
OPT["Optimal r/lambda selected"]
NT{{"nevertreated?"}}
NTW["fect_nevertreated() wrapper"]
@@ -276,7 +297,11 @@ graph TD
IN --> FP
FP --> PV
PV --> DP
DP --> CV
DP --> INIT
INIT --> MI
MI -- yes --> MS
MS --> CV
MI -- no --> CV
CV -- yes --> CVR
CVR --> OPT
OPT --> NT
@@ -287,7 +312,10 @@ graph TD
MR -- ife/fe --> IFE
MR -- mc --> MC
MR -- cfe --> CFE
IFE --> CI
IFE --> CC{{"converged?"}}
CC -- no --> WARN["Emit convergence warning"]
CC -- yes --> CI
WARN --> CI
MC --> CI
CFE --> CI
CI --> ATT
@@ -298,6 +326,13 @@ graph TD
SE -- no --> DIAG
DIAG --> OBJ
OBJ --> OUT

style INIT fill:#1e90ff,stroke:#1565c0,color:#fff
style MI fill:#1e90ff,stroke:#1565c0,color:#fff
style MS fill:#1e90ff,stroke:#1565c0,color:#fff
style CVR fill:#1e90ff,stroke:#1565c0,color:#fff
style CC fill:#1e90ff,stroke:#1565c0,color:#fff
style WARN fill:#1e90ff,stroke:#1565c0,color:#fff
```
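
The `Multi-Start` node in the flowchart above (perturb the initial fit, run trial estimations, keep the start with the lowest `sigma2`) can be sketched as follows. This is an illustration in C++ under stated assumptions, not the package's `perturbedFit()`: the names `sample_sd`, `perturb_start`, and `best_start` are hypothetical, the noise is assumed to be added independently to every entry, and `frac` stands in for the perturbation scale (for example 0.05 for 5% of the data SD).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Sample standard deviation with an (n - 1) denominator.
double sample_sd(const std::vector<double>& x) {
    assert(x.size() > 1);
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(x.size());
    double ss = 0.0;
    for (double v : x) ss += (v - mean) * (v - mean);
    return std::sqrt(ss / static_cast<double>(x.size() - 1));
}

// Hypothetical helper: copy y0 and add N(0, (frac * sd(y0))^2) noise
// to each entry. A non-positive scale returns y0 unchanged.
std::vector<double> perturb_start(const std::vector<double>& y0, double frac,
                                  std::mt19937& rng) {
    const double scale = frac * sample_sd(y0);
    if (scale <= 0.0) return y0;
    std::normal_distribution<double> noise(0.0, scale);
    std::vector<double> out(y0);
    for (double& v : out) v += noise(rng);
    return out;
}

// Hypothetical helper: index of the trial run with the lowest
// residual variance (ties resolve to the earliest trial).
std::size_t best_start(const std::vector<double>& sigma2) {
    assert(!sigma2.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < sigma2.size(); ++i) {
        if (sigma2[i] < sigma2[best]) best = i;
    }
    return best;
}
```

A caller would generate `n.init` starts, fit each one, collect the resulting `sigma2` values, and continue from the start at index `best_start(sigma2)`.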

---
Expand All @@ -314,6 +349,14 @@ graph TD

- **Two-Tier Tolerance**: Cross-validation uses a looser tolerance (`max(tol, 1e-3)`) for speed during hyperparameter search, while final estimation uses the user-specified tolerance for precision.

- **Warm-Start CV**: When sweeping over consecutive `r` candidates (IFE) or `lambda` candidates (MC), the fitted values from the previous candidate are reused as the starting point for the next. Per-fold and full-data caches (`warm_fit_cv`, `warm_fit_full`) store the `$fit` matrix between iterations, reducing EM iteration counts for adjacent hyperparameter values. Unobserved entries are zeroed before reuse to prevent stale value leakage.

- **Multi-Start Initialization**: The `n.init` parameter (default 1, preserving existing behavior) controls the number of perturbed starting points. When `n.init > 1`, `perturbedFit()` generates Gaussian-perturbed copies of the base `Y0` (5% of data SD) and `beta0` (10% of coefficient magnitude), runs trial estimations, and selects the initialization with the lowest residual variance (`sigma2`). This reduces the EM algorithm's sensitivity to local optima.

- **Burn-in Warm-Start**: In the weighted IFE estimation, the burn-in phase progressively reduces the rank from `d` down to `r`. Previously, upon convergence during burn-in, both `fit` and `fit_old` were reset to the initial `Y0`, discarding the converged solution. Now only `fit_old` is reset (to the current `fit`), preserving the converged state as the starting point for the real estimation phase.

- **Convergence Diagnostics**: All five C++ iteration functions (`fe_ad_iter`, `fe_ad_covar_iter`, `fe_ad_inter_iter`, `fe_ad_inter_covar_iter`, `beta_iter`) return a `converged` flag (1 if `dif <= tol`, 0 if `max_iter` reached). This flag propagates through `inter_fe_ub()` to R, where `fect_cv()` and `fect_fe()` emit a `warning()` on non-convergence. CV inner-loop calls do not warn (non-convergence at loose tolerance is expected).

- **Parallel Bootstrap via foreach**: `fect_boot()` uses `foreach` with `doParallel`/`doFuture` backends for parallel bootstrap replication. Includes `trim_closure_env()` optimization to reduce serialization overhead by keeping only referenced symbols in function environments.

- **Counterfactual Imputation as Core Abstraction**: All methods share the same conceptual framework: impute Y(0) for treated units using untreated observations, compute ATT as the gap. FE uses additive fixed effects, IFE adds latent factors (F * L'), MC uses nuclear norm regularization, CFE adds structured covariates.
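
The two-tier tolerance and the `converged` flag described in the notes above can be illustrated with a small C++ sketch. It is written under assumptions: `IterResult`, `iterate_to_tol`, and `cv_tolerance` are hypothetical names, the iteration is a scalar fixed-point loop rather than the package's matrix updates, and `dif` is taken to be an absolute change between iterates.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical result type for illustration only.
struct IterResult {
    double value;
    int niter;
    int converged;  // 1 if dif <= tol, 0 if max_iter was reached first
};

// Repeat x <- step(x) until the change between iterates is at most tol
// or max_iter iterations have run, and report which of the two happened.
IterResult iterate_to_tol(const std::function<double(double)>& step,
                          double x0, double tol, int max_iter) {
    assert(tol > 0.0 && max_iter > 0);
    double x = x0;
    double dif = tol + 1.0;  // force at least one iteration
    int niter = 0;
    while (dif > tol && niter < max_iter) {
        const double x_new = step(x);
        dif = std::fabs(x_new - x);
        x = x_new;
        ++niter;
    }
    return {x, niter, dif <= tol ? 1 : 0};
}

// Tolerance for the cross-validation search: never tighter than 1e-3.
double cv_tolerance(double tol) { return std::max(tol, 1e-3); }
```

A caller would check `converged` after the final fit and warn when it is 0, while skipping that warning for fits run at `cv_tolerance(tol)` inside the CV loop.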
2 changes: 1 addition & 1 deletion NAMESPACE
@@ -3,7 +3,7 @@ useDynLib(fect, .registration=TRUE)
importFrom(Rcpp, evalCpp)
importFrom("stats", "na.omit", "quantile", "sd", "var", "cov", "pchisq", "lm",
"as.formula", "median", "pnorm", "predict", "qnorm", "reshape",
"dnorm", "pf", "rbinom", "loess", "aggregate", "pt", "qt", "setNames", "time")
"dnorm", "pf", "rbinom", "rnorm", "loess", "aggregate", "pt", "qt", "setNames", "time")
importFrom("foreach","foreach","%dopar%")
importFrom("parallel", "detectCores", "stopCluster", "makeCluster")
importFrom("doRNG","registerDoRNG")