feat(schema): audit and update base schema files #9215
Draft
gkatre wants to merge 2 commits into
- Fix type mismatches: app_version (INTEGER->STRING), os_version (INTEGER->STRING) in global.yaml
- Add mode: NULLABLE to all 43 fields in global.yaml and 76 fields in ads_derived.yaml that were missing explicit mode
- Fix BOOL->BOOLEAN for targets_default_site and targets_default_zone in ads_derived.yaml
- Fill missing/minimal descriptions: payout (was null), price (was "Price."), creative_type (was describing flight_name), legacy_telemetry_client_id, uapi_clicks in ads_derived.yaml
- Add missing type: STRING for sites and zones fields in ads_derived.yaml
- Promote 13 cross-dataset fields to global.yaml: activated, android_sdk_version, app_build, app_channel, city, days_since_first_seen, days_since_seen, default_search_engine, device_manufacturer, device_model, is_new_profile, metric_date, new_profiles
- Create search_derived.yaml with 10 search-specific fields (tagged_sap, tagged_follow_on, search_with_ads, ad_click, organic, sap, normalized_engine, unknown, ad_click_organic, search_with_ads_organic)
- Create firefox_desktop_derived.yaml with 5 desktop-specific fields (attribution_dlsource, attribution_ua, is_dau, windows_build_number, windows_version)
- Add SCHEMA_AUDIT_RECOMMENDATIONS.md with 13 items requiring human review

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Comprehensive audit and update of all base schema files in bigquery_etl/schema/, performed by the base-schema-curator agent.

Changes Applied (Safe, Mechanical)
- Fix type mismatches: app_version (INTEGER -> STRING) and os_version (INTEGER -> STRING) in global.yaml -- descriptions clearly indicate these hold version strings like "1.0.3" and "100.9.11"
- Add mode: NULLABLE to all 43 fields in global.yaml and 76 fields in ads_derived.yaml that lacked a mode key
- Change BOOL to BOOLEAN for targets_default_site and targets_default_zone in ads_derived.yaml
- Add type: STRING to sites and zones in ads_derived.yaml
- Fill missing or minimal descriptions in ads_derived.yaml: payout (was null), price (was "Price."), creative_type (was describing flight_name), legacy_telemetry_client_id (was generic), uapi_clicks (was incorrectly labeled "impressions")
- Promote 13 cross-dataset fields to global.yaml: activated, android_sdk_version, app_build, app_channel, city, days_since_first_seen, days_since_seen, default_search_engine, device_manufacturer, device_model, is_new_profile, metric_date, new_profiles -- each appears in 4+ distinct datasets
- Create search_derived.yaml (10 fields): tagged_sap, tagged_follow_on, search_with_ads, ad_click, organic, sap, normalized_engine, unknown, ad_click_organic, search_with_ads_organic
- Create firefox_desktop_derived.yaml (5 fields): attribution_dlsource, attribution_ua, is_dau, windows_build_number, windows_version

Items Requiring Human Review (13)
See bigquery_etl/schema/SCHEMA_AUDIT_RECOMMENDATIONS.md for the full list. Key items:
- source_file canonical/alias conflict between global.yaml and ads_derived.yaml
- creative_type description vs. field name mismatch (needs source verification)
- dau and profile_group_id cross-file duplicates between global.yaml and ads_derived.yaml
- app_newtab.yaml, app_mobile.yaml, and dataset-specific schema files for 7 additional datasets

Test plan
- Validate YAML parsing with yaml.safe_load()
- Run ./bqetl query schema update <table> --use-global-schema on a sample table to verify new global fields are applied correctly

🤖 Generated with Claude Code
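For reviewers, the mechanical fixes listed under Changes Applied could be sketched roughly as below. This is an illustration only, not the base-schema-curator agent's actual code: `normalize_field`, `TYPE_ALIASES`, the three-word description threshold, and the sample field entries (including their types and descriptions) are all assumptions. The field dicts stand in for entries as they might look after loading a base schema file with yaml.safe_load().

```python
# Hypothetical sketch of the "safe, mechanical" checks described in this PR.
# Names, thresholds, and sample data are assumptions made for illustration.

TYPE_ALIASES = {"BOOL": "BOOLEAN"}  # alias normalization mentioned in the PR
MIN_DESCRIPTION_WORDS = 3           # assumed cutoff for a "minimal" description


def normalize_field(field):
    """Return (fixed_copy, notes) for a single schema field dict."""
    fixed = dict(field)  # work on a copy so the input is left untouched
    notes = []

    # Add an explicit mode where the key is missing.
    if "mode" not in fixed:
        fixed["mode"] = "NULLABLE"
        notes.append("added mode: NULLABLE")

    # Replace type aliases with the canonical spelling.
    field_type = fixed.get("type")
    if field_type in TYPE_ALIASES:
        fixed["type"] = TYPE_ALIASES[field_type]
        notes.append(f"type {field_type} -> {fixed['type']}")

    # A missing type is only flagged; picking one requires human judgment.
    if "type" not in fixed:
        notes.append("missing type (needs review)")

    # Null, empty, or very short descriptions are flagged, not rewritten.
    description = (fixed.get("description") or "").strip()
    if len(description.split()) < MIN_DESCRIPTION_WORDS:
        notes.append("missing or minimal description (needs review)")

    return fixed, notes


if __name__ == "__main__":
    # Illustrative entries only; not copied from the repository.
    sample_fields = [
        {"name": "targets_default_site", "type": "BOOL",
         "description": "Whether the flight targets the default site."},
        {"name": "price", "type": "FLOAT", "mode": "NULLABLE",
         "description": "Price."},
        {"name": "sites", "description": "Sites targeted by the flight."},
    ]
    for entry in sample_fields:
        _, entry_notes = normalize_field(entry)
        print(entry["name"], entry_notes)
```

The sketch only auto-applies changes that cannot alter meaning (explicit mode, alias spelling) and reports everything else, which mirrors the split between the applied changes and the items left for human review.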