Improve metadata: add descriptions, fix field types, add wikidata_id to schema#102

Merged
olayway merged 3 commits into main from improve-metadata on May 8, 2026

@olayway olayway commented May 8, 2026

Summary

  • Add description at package level (was missing)
  • Add description at resource level for country-codes (was missing)
  • Add wikidata_id field to schema.fields — column exists in data/country-codes.csv (written by wd_countries.py) but had no schema entry
  • Fix M49 field type from number to integer (all values are whole numbers)
  • Fix Geoname ID field type from number to integer (all values are whole numbers)
  • Add Wikidata SPARQL endpoint to sources (used by wd_countries.sh to populate wikidata_id but previously absent from sources)
  • Fix README UN Protocol & Liaison Service link from the Permanent Missions page to the actual UNTERM Excel file URL
  • Add Wikidata to README data sources section
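The wikidata_id gap (a column present in the CSV but absent from schema.fields) is the kind of drift a small check can catch. A minimal sketch, assuming the schema is a list of field objects with a `name` key; the helper name `schema_csv_mismatch`, the trimmed sample header, and the sample row are illustrative stand-ins and not taken from the repository:

```python
import csv
import io


def schema_csv_mismatch(header, schema_fields):
    """Return (CSV columns missing from the schema, schema fields missing from the CSV)."""
    field_names = [field["name"] for field in schema_fields]
    missing_in_schema = [col for col in header if col not in field_names]
    missing_in_csv = [name for name in field_names if name not in header]
    return missing_in_schema, missing_in_csv


# Hypothetical, heavily trimmed stand-ins for data/country-codes.csv
# and the schema.fields list as they looked before this PR.
sample_csv = "M49,Geoname ID,wikidata_id\n4,1149361,Q889\n"
sample_fields = [
    {"name": "M49", "type": "integer"},
    {"name": "Geoname ID", "type": "integer"},
]

# Read only the header row of the CSV text.
header = next(csv.reader(io.StringIO(sample_csv)))
print(schema_csv_mismatch(header, sample_fields))
```

With the sample above, the first list in the result names wikidata_id as a column without a schema entry. Running a check like this against the real files would require loading the actual CSV header and schema instead of the in-memory samples.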

olayway and others added 3 commits May 8, 2026 11:41
… Wikidata source

- Add package-level description
- Add resource-level description for country-codes
- Add wikidata_id field to schema (was in CSV but missing from schema)
- Fix M49 and Geoname ID types from number to integer
- Add Wikidata SPARQL endpoint to sources
- Fix README UN Protocol URL to match actual data source
- Add Wikidata to README data sources

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Use https for ODC-PDDL license URL
- Fix ISO3166-1-numeric, Global Code, Intermediate Region Code, Sub-region Code,
  Region Code types from string to integer (all values are whole numbers)
- Update CLDR source path to raw content URL (matches what scripts actually fetch)
- Remove orphaned Developed/Developing Countries from config.py COLUMN_NAMES
  (not present in CSV output or schema)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

These are identifier codes, not quantities. ISO3166-1-numeric in particular
is a 3-digit code with leading zeros per the standard (004, 008) — integer
type would silently drop those if the data is ever corrected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
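The leading-zero concern in the last commit message can be illustrated with a short sketch. It models a generic string-to-integer cast in plain Python, not any specific Data Package library call; the code list is a small illustrative sample and the variable names are hypothetical:

```python
# Three-digit numeric codes as they would appear when stored as strings.
codes = ["004", "008", "840"]

# Casting to integer, as an integer-typed schema field implies.
as_int = [int(code) for code in codes]

# Writing the integers back out loses the leading zeros.
round_tripped = [str(number) for number in as_int]

# Padding can restore them, but only if every consumer knows the fixed width.
restored = [str(number).zfill(3) for number in as_int]

print(as_int)
print(round_tripped)
print(restored)
```

Keeping such fields typed as string avoids relying on every consumer to re-pad the values.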
@olayway olayway merged commit 49b38b7 into main May 8, 2026
1 check failed