Add acled-sub-saharan-africa-events: ACLED Sub-Saharan Africa event-level conflict data #119
Open
jonaraphael wants to merge 1 commit into
New private release-oriented vector asset: ACLED Sub-Saharan Africa event-level conflict data (47,937 points, events 2025-06-30 through 2026-06-30). feature_id is copied from the verified-unique source field event_id_cnty. Canonical FGB, metadata-lookup PMTiles, metadata sidecar, schema, and manifest are staged under _scratch/pending-publishes for promotion via the reviewed publish plan in the PR. This intake also serves as the end-to-end test of the publish pipeline on the repo-hardening-ci-publish-schema branch.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Summary
Publish new shared dataset `acled-sub-saharan-africa-events`: ACLED Sub-Saharan Africa Event-Level Conflict Data.
Validation
Dataset Admission
Publish Plan
Repo Changes
- `docs/assets/acled-sub-saharan-africa-events.md` (new asset doc with admission evidence, data profile, feature identity, and release metadata contract)
- `catalog/shared-datasets-catalog.csv` (regenerated, +1 row)
- `docs/assets/index.md` (regenerated, +1 row)
Feature Identity Decision
`feature_id` strategy: `source_field` copied from `event_id_cnty`. Exact full-row profile over all 47,937 rows: 47,937 distinct values, 0 blank, 0 non-URL-safe, no duplicates. Generated-sequence fallback not needed. Maintainer confirmed this choice and chose no autogenerated metadata translations for the first upload.
Source Evidence (not promoted)
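The identity profile reported under Feature Identity Decision can be approximated with a short script. This is a minimal sketch, not the tooling used for this PR: `profile_ids` and `is_safe_source_field` are hypothetical names, and "URL-safe" is assumed here to mean the RFC 3986 unreserved character set, which may differ from the repo's own definition.

```python
import re
from collections import Counter

# Assumption: "URL-safe" means RFC 3986 unreserved characters only.
URL_SAFE = re.compile(r"[A-Za-z0-9._~-]+")


def profile_ids(values):
    """Profile a candidate feature_id column over every row (hypothetical helper)."""
    cleaned = ["" if v is None else str(v).strip() for v in values]
    non_blank = [v for v in cleaned if v]
    counts = Counter(non_blank)
    return {
        "rows": len(cleaned),
        "distinct": len(counts),
        "blank": len(cleaned) - len(non_blank),
        # Counted per row, so a repeated unsafe value is counted each time.
        "non_url_safe": sum(1 for v in non_blank if not URL_SAFE.fullmatch(v)),
        "duplicates": sorted(v for v, n in counts.items() if n > 1),
    }


def is_safe_source_field(profile):
    """True when the column can be copied as feature_id without a fallback."""
    return (
        profile["distinct"] == profile["rows"]
        and profile["blank"] == 0
        and profile["non_url_safe"] == 0
    )
```

A column passes only when the distinct count equals the row count with no blank or unsafe values, which is the shape of the result reported for `event_id_cnty`.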
`gs://skytruth-shared-datasets-1/_scratch/pending-publishes/acled-sub-saharan-africa-events/add-acled-sub-saharan-africa-events/source/ACLED_Dataset_Sub-SaharanAfrica.xlsx` (generation `1783047721882269`, sha256 `76271676f26740fad075acc3c0f8d4144a8e41747ca47b296e1679360ca97a36`). Staged as review evidence only; XLSX is not an approved canonical format, so the canonical artifacts above were built from it.
Review / Merge Notes
- jonaraphael; GitHub blocks self-review requests, so this PR records restricted self-acceptance per `AGENTS.md` instead of a reviewer request.
- Targets `repo-hardening-ci-publish-schema` (PR #118, "Add CI quality gates and extract tested publish workflow modules") because this PR is the end-to-end test of that branch's publish pipeline. It must be retargeted to `main` (or re-verified) after #118 merges; promotion only runs after this PR's plan reaches `main`.
- `access_tier: private`. The SkyTruth ACLED authorization recorded 2026-05-08 for the aggregated assets is assumed to cover this event-level export; reviewer should confirm.
- `check-schema-compatibility` could not run locally (GDAL 3.6.2 lacks `ogrinfo -json` vector support, added in 3.7). This is a new asset slug, so there is no prior schema snapshot; the protected workflow's schema preflight is expected to pass with no waiver.
Pipeline-Test Observations
- `scripts/vector_asset.py build` mis-detects canonical-JSON GeoJSONSeq sources (lines start with `{"geometry":...` because keys are sorted), so GDAL's plain GeoJSON driver silently read 1 feature. Ingestion jobs avoid this by forcing the `GeoJSONSeq:` driver prefix; the manual path had to route through a GPKG intermediate. Worth hardening in `vector_asset.py`.
- `scripts/catalog_docs.py generate` rewraps prose in four unrelated asset docs (`drc-cami-mining-cadastre`, `lsib`, `world-polygons`, `wri-forest-atlas-drc-mining-permits`) in this working tree. Those files were intentionally left unstaged to keep this PR focused; if the branch CI check fails on doc drift, that is generator/env drift, not this asset.
- `publishing_concierge.py` staged-object validation reports 1-based indexes in error messages, which reads as an off-by-one when debugging evidence lists.
Validation Commands Run
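One possible hardening for the GeoJSONSeq mis-detection noted under Pipeline-Test Observations is to classify the source by parsing its first lines rather than by how they begin. The sketch below is an assumption about how that could look, not the current behavior of `vector_asset.py`; `looks_like_geojsonseq` and `gdal_source_name` are hypothetical names. Only the `GeoJSONSeq:` prefix comes from the PR text.

```python
import json

RS = "\x1e"  # RFC 8142 record separator, which may precede each record


def looks_like_geojsonseq(text_lines, sample=5):
    """Return True when the first non-empty lines are each a standalone Feature.

    Parsing each line makes the check independent of key order, so
    sorted-key output starting with {"geometry": ...} is still recognized.
    """
    seen = 0
    for raw in text_lines:
        line = raw.strip().lstrip(RS).strip()
        if not line:
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            return False  # pretty-printed GeoJSON, or not JSON at all
        if not isinstance(obj, dict) or obj.get("type") != "Feature":
            return False  # e.g. a single-line FeatureCollection
        seen += 1
        if seen >= sample:
            break
    return seen > 0


def gdal_source_name(path, text_lines):
    """Force the GeoJSONSeq driver prefix when the content is line-delimited."""
    if looks_like_geojsonseq(text_lines):
        return f"GeoJSONSeq:{path}"
    return path
```

Passing an open text file handle as `text_lines` reads only as many lines as the sample needs, so the check stays cheap on large sources.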
- `uv run ogr2ogr -f GeoJSONSeq ... -dialect sqlite -sql 'SELECT MakePoint(...)'` (xlsx -> WGS84 points)
- `PYTHONPATH=. uv run python enrich_acled.py` (`feature_id` / `geometry_hash` / `properties_hash` + sidecar + schema via `ingestion.common.feature_metadata`)
- `uv run python scripts/vector_asset.py build ... --metadata-lookup --pmtiles-detail-hint detailed` (FGB + PMTiles, auto maxzoom -> 12)
- `pmtiles verify` / `pmtiles show` / magic-byte check / zoom-0 decode via `validate_release_vector_contract`
- `uv run python scripts/catalog_docs.py generate` && `check` (25 asset docs current) and `export-readmes`
- `uv run python scripts/catalog_site.py --out .../catalog-web` (`catalog.json` includes the new asset with `metadata_sidecar_schema` colorizer)
- Tools: ogr2ogr (`/Users/jonathanraphael/miniforge3/bin/ogr2ogr`), Tippecanoe v2.79.0 (`/usr/local/bin/tippecanoe`), pmtiles CLI dev (`/usr/local/bin/pmtiles`)
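For reference, the magic-byte check listed above can be illustrated as follows. This is a sketch based on the PMTiles v3 header layout (7 ASCII bytes `PMTiles` followed by a 1-byte spec version), not the implementation inside `validate_release_vector_contract`; the function names are hypothetical.

```python
PMTILES_MAGIC = b"PMTiles"  # first 7 bytes of a PMTiles archive


def pmtiles_spec_version(header: bytes):
    """Return the spec version byte, or None if the magic bytes do not match."""
    if len(header) < 8 or header[:7] != PMTILES_MAGIC:
        return None
    return header[7]


def read_pmtiles_spec_version(path):
    """Read only the first 8 bytes of the file at `path` and check them."""
    with open(path, "rb") as fh:
        return pmtiles_spec_version(fh.read(8))
```

A return value of `3` indicates a v3 archive; `None` means the file is not PMTiles, for example when an FGB or a truncated upload was staged under the wrong name.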