Independent Python implementation of the Global Macro Database pipeline, aligned to Stata do/ado semantics but not dependent on the Stata repository runtime.
Install in editable mode:

```shell
pip install -e .
```

Then import the package:

```python
import global_macro_data as gmd
```

Public API:

- `gmd.download_source(source, ...)`: download one supported raw source.
- `gmd.clean_source(source, ...)`: clean one source into GMD clean format.
- `gmd.rebuild_clean_sources(...)`: batch clean sources (from `source_list.csv` or an explicit list).
- `gmd.combine_variable(varname, ...)`: build one final chainlinked variable.
- `gmd.combine_all(...)`: build all combine outputs from the bundled pipeline specs.
- `gmd.build_documentation_all(...)`: build documentation outputs.
- `gmd.merge_final_data(...)`: produce `data_final.dta`.
- `gmd.run_master_pipeline(...)`: run the initialize/clean/combine/document/final stages.
```python
from pathlib import Path

import global_macro_data as gmd

data_root = Path("path/to/data")  # contains raw/ and helpers/
work = Path("path/to/work")

gmd.rebuild_clean_sources(
    data_raw_dir=data_root / "raw",
    data_clean_dir=work / "clean",
    data_helper_dir=data_root / "helpers",
    data_temp_dir=work / "tempfiles",
)
gmd.combine_all(
    data_clean_dir=work / "clean",
    data_temp_dir=work / "tempfiles",
    data_final_dir=work / "final",
)
gmd.merge_final_data(
    data_temp_dir=work / "tempfiles",
    data_final_dir=work / "final",
    data_helper_dir=data_root / "helpers",
)
```