Replication package for "Evaluating Probabilistic Classifiers: The Triptych"

Timo Dimitriadis and Alexander I. Jordan

Overview & contents

The code in this replication material generates the 12 figures and 3 tables for the paper "Evaluating Probabilistic Classifiers: The Triptych". Each figure and table is generated separately by its corresponding script file Figure_[xx]_*.R or Table_[xx]_*.R, respectively.

The main contents of the repository are the following:

plots/: folder of generated plots as PDF files
tables/: folder of generated tables as txt files
data-raw/: folder of raw data files and the functions for processing them
data/: folder of processed data files
Figure_[xx]_*.R: R scripts to create the respective figures
Table_[xx]_*.R: R scripts to create the respective tables

Instructions & computational requirements

All file paths are relative to the root of the replication package. Please set your working directory accordingly, or open the .Rproj file using RStudio.

The analysis files Figure_[xx]_*.R and Table_[xx]_*.R can be run individually, in any order.
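To regenerate all figures and tables in one pass, the scripts can be sourced in a loop. The following is a minimal sketch and not part of the replication package: the helper names find_analysis_scripts and run_analysis_scripts are hypothetical, and the working directory is assumed to be the root of the package.

```r
# Hypothetical helper: list the analysis scripts in the package root.
# The pattern mirrors the naming scheme Figure_[xx]_*.R and Table_[xx]_*.R.
find_analysis_scripts <- function(root = ".") {
  list.files(
    path = root,
    pattern = "^(Figure|Table)_[0-9]{2}_.*\\.R$",
    full.names = TRUE
  )
}

# Hypothetical helper: run each script in its own environment,
# so that objects created by one script do not leak into the next.
run_analysis_scripts <- function(scripts) {
  for (script in scripts) {
    message("Running ", basename(script))
    source(script, local = new.env(parent = globalenv()))
  }
  invisible(scripts)
}

# Usage, from the package root (not run here):
# run_analysis_scripts(find_analysis_scripts())
```

Because the scripts are independent, the order returned by list.files() should not matter.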

These analyses were run on R 4.3.1, and we explicitly use the following packages in the analysis files: triptych (0.1.2), ggplot2 (3.4.3), patchwork (1.1.3), dplyr (1.1.3), tidyr (1.3.0), purrr (1.0.2), grid (base R), lubridate (1.9.2).
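If renv is not used, the installed package versions can be compared against the ones listed above. This is a minimal sketch using only base R; the function name check_versions and the layout of the returned data frame are illustrative choices, not part of the replication package.

```r
# Package versions as stated in this README (grid ships with base R and is omitted).
expected_versions <- c(
  triptych = "0.1.2", ggplot2 = "3.4.3", patchwork = "1.1.3",
  dplyr = "1.1.3", tidyr = "1.3.0", purrr = "1.0.2", lubridate = "1.9.2"
)

# Hypothetical helper: report installed versus expected versions.
# Packages that are not installed get NA in the 'installed' column.
check_versions <- function(expected) {
  installed <- vapply(names(expected), function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1))
  data.frame(
    package = names(expected),
    expected = unname(expected),
    installed = unname(installed),
    matches = !is.na(installed) & unname(installed) == unname(expected),
    row.names = NULL
  )
}

print(check_versions(expected_versions))
```

A mismatch does not necessarily break the analysis, but the versions above are the ones the results were produced with.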

A comprehensive list of dependencies can be found in the renv.lock file. For a convenient setup in a (local) R session, we recommend using the renv package. The following steps are required once:

# install.packages("renv")
renv::activate()
renv::restore() # install dependencies
renv::status() # check environment

Data availability and provenance

Solar Flare Forecasts

The prepared forecast-observation data are located at data/C1_flares.rda and data/M1_flares.rda, for the classes C1.0+ and M1.0+ of solar flare intensity. These files are generated by the script data-raw/prepare_SolarFlares.R using the pre-processed data files SF.FC.C1.rda and SF.FC.M1.rda from Dimitriadis and Jordan (2021, https://doi.org/10.5281/zenodo.4699945). That replication package contains a description of the pre-processing of the original data on solar flares from Leka and Park (2019, https://doi.org/10.7910/DVN/HYP74O).
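The prepared .rda files can be inspected before running any analysis script. The sketch below loads a file into a separate environment, so that the names of the stored objects do not have to be known in advance; the helper name load_rda is hypothetical, and the names and structure of the objects inside the files are not assumed here.

```r
# Hypothetical helper: load an .rda file and return its objects as a named list,
# without writing anything into the global environment.
load_rda <- function(path) {
  env <- new.env()
  object_names <- load(path, envir = env)  # load() returns the names of the restored objects
  mget(object_names, envir = env)
}

# Usage, from the package root (not run here):
# flares <- load_rda("data/C1_flares.rda")
# str(flares, max.level = 1)
```

The same helper should work for the other files in data/, since they are also stored in the .rda format.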

SPF Forecasts for Economic Recessions

The prepared forecast-observation data are located at data/spf.gpd.long.rda. They are also available from Dimitriadis and Jordan (2021, https://doi.org/10.5281/zenodo.4699945), a replication package that contains a description of the pre-processing of the original data from the Federal Reserve Bank of Philadelphia (https://www.philadelphiafed.org/surveys-and-data/).

Fragile Family Challenge

The Fragile Family Challenge (FFC) is a scientific mass collaboration in which 160 teams built predictions for six variables, of which we analyze two binary ones (eviction and job training). The prepared forecast and outcome data are located in the data/ folder as the files FFC_Eviction.rda and FFC_JobTraining.rda.

The forecasts (submissions) of the 160 teams, together with the realizations, originate from Salganik et al. (2020, https://doi.org/10.7910/DVN/CXSECU), located in the data/derived/submissions.csv.zip file. The 9 benchmark forecasts have to be generated separately by obtaining data files from https://opr.princeton.edu/archive/ as described in Salganik et al. (2020). We prepare the FFC data from these two files, which are not included in this repository, within the script prepare_FragileFamilyChallenge.R.

References

Dimitriadis T, Jordan AI. 2021. Replication package for "Stable reliability diagrams for probabilistic classifiers" (v1.0.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4699946

Leka KD, Park S. 2019. A Comparison of Flare Forecasting Methods II: Data and Supporting Code. Harvard Dataverse, V1, UNF:6:yz1noMojlzL7SZM+9flXhQ== [fileUNF]. https://doi.org/10.7910/DVN/HYP74O

Salganik M, Lundberg I, Kindel A, McLanahan S. 2020. Replication materials for "Measuring the predictability of life outcomes using a scientific mass collaboration". Harvard Dataverse, V3, UNF:6:Cj8wiioSf8JGyRLcDo5d3w== [fileUNF]. https://doi.org/10.7910/DVN/CXSECU

