Research notebooks and optimization tools for the EstWarden Baltic Security Monitor. All analysis runs against the public dataset.
```shell
git clone https://github.com/Estwarden/research.git
git clone https://github.com/Estwarden/dataset.git data
cd research
pip install -r requirements.txt
jupyter notebook notebooks/
```

Run the notebooks in order: notebook 01 produces `daily_matrix.parquet`, which all the others load.
| # | Notebook | Question | Method |
|---|---|---|---|
| 01 | Data Profile | What shape is the data? | Align 20K signals with 497 indicator labels, build daily matrix |
| 02 | Lead Indicators | Which sources spike before YELLOW? | Point-biserial correlation at lag 0-3, Random Forest importance |
| 03 | Anomaly Thresholds | What z-score = real anomaly? | Per-source ROC, Youden's J, bootstrap stability |
| 04 | Narrative Velocity | How fast do narratives spread? | Velocity, amplification ratio, campaign predictor |
| 05 | Source Independence | Are sources redundant? | Correlation matrix, mutual information, PCA decomposition |
| 06 | CTI Rebuild | Can we beat hand-tuned weights? | Logistic regression vs gradient boosting, time-series CV |
From notebook 03 — per-source anomaly score:
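The snippet itself is not reproduced here; below is a minimal sketch of the kind of rolling z-score notebook 03 thresholds. The window size, the example data, and the function name are illustrative, and the actual per-source cut-off comes from the notebook's ROC / Youden's J analysis, not from anything shown here:

```python
import numpy as np

def anomaly_score(counts, window=30):
    """Rolling z-score for one source: how far today's signal count
    sits from its trailing baseline. Notebook 03 picks the per-source
    alert threshold via ROC curves and Youden's J; nothing here is a
    fitted value."""
    counts = np.asarray(counts, dtype=float)
    baseline = counts[-(window + 1):-1]   # trailing window, excluding today
    mu, sigma = baseline.mean(), baseline.std(ddof=1)
    return (counts[-1] - mu) / (sigma if sigma > 0 else 1.0)

history = [4, 5, 3, 6, 4, 5, 4, 3, 5, 21]   # spike on the last day
z = anomaly_score(history, window=9)         # z is large and positive
```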
From notebook 04 — campaign prediction:
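Again a hedged sketch rather than the notebook's fitted model: velocity as the first difference of daily mention counts, amplification as reposts per original mention, and a toy two-threshold rule standing in for the actual campaign predictor. All names, thresholds, and data below are illustrative:

```python
import numpy as np

def narrative_velocity(mentions):
    """Day-over-day change in mention counts (mentions/day)."""
    return np.diff(np.asarray(mentions, dtype=float))

def likely_campaign(mentions, amplifier_posts,
                    velocity_thresh=10.0, amp_thresh=3.0):
    """Toy predictor: flag a narrative when peak velocity AND the
    amplification ratio (amplifier reposts per original mention)
    both exceed thresholds. The rule and thresholds are illustrative,
    not notebook 04's fitted predictor."""
    peak_velocity = narrative_velocity(mentions).max()
    amp_ratio = sum(amplifier_posts) / max(sum(mentions), 1)
    return peak_velocity >= velocity_thresh and amp_ratio >= amp_thresh

mentions = [2, 3, 5, 30, 80]        # organic mentions per day
amplifier = [1, 2, 10, 150, 400]    # amplifier reposts per day
flag = likely_campaign(mentions, amplifier)
```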
From notebook 06 — logistic CTI (deployable):
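A minimal stand-in for the deployable model: a logistic regression whose positive-class probability is rescaled to a 0-100 composite index. The features and labels below are synthetic; notebook 06 trains on the daily matrix with time-series cross-validation, which this sketch does not reproduce:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the daily matrix: 200 days x 5 source features,
# with a label loosely driven by the first two features.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Deployable score: probability of the alert class, scaled to 0-100.
cti = 100 * model.predict_proba(X)[:, 1]
```

A logistic model is easy to deploy because the score reduces to a weighted sum plus a sigmoid, with no model runtime needed in production.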
Automated CTI optimization (Karpathy pattern):
```shell
cd autoresearch
python3 optimize.py   # Phase 1: 85K trials, 3-fold CV
./run.sh              # Phase 2: LLM structural improvements
```

- Composite Threat Index — formula, weights, thresholds
- Findings — what changed in production
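Phase 1's trial loop can be sketched as a random search over candidate index weights scored by 3-fold cross-validated accuracy. Everything below (the data, the linear scoring rule, the fold layout) is illustrative; the real `optimize.py` runs 85K trials against the actual daily matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 4))                       # toy daily features
y = (X @ np.array([1.0, -0.5, 0.25, 0.0]) > 0).astype(int)

def cv_accuracy(w, X, y, folds=3):
    """3-fold accuracy of a fixed weight vector: the weights ARE the
    candidate, so each fold only evaluates, nothing is trained."""
    idx = np.arange(len(y))
    scores = []
    for k in range(folds):
        test = idx[k::folds]                       # every folds-th row held out
        pred = (X[test] @ w > 0).astype(int)
        scores.append((pred == y[test]).mean())
    return float(np.mean(scores))

best_w, best_acc = None, -1.0
for _ in range(500):                               # 85_000 in the real run
    w = rng.normal(size=4)
    acc = cv_accuracy(w, X, y)
    if acc > best_acc:
        best_w, best_acc = w, acc
```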
| Repo | What |
|---|---|
| Dataset | 27K signals, 20 sources, indicators, campaigns |
| Collectors | Data collection pipelines (Dagu DAGs) |
| Integrations | MCP server, Home Assistant, CLI |
| estwarden.eu | Live dashboard |
MIT