feat: Support device_id as bucketing identifier for local evaluation #424
Add support for `bucketing_identifier` field on feature flags to allow using `device_id` instead of `distinct_id` for hashing/bucketing in local evaluation.

- When `bucketing_identifier: "device_id"`, use device_id for hash calculations instead of distinct_id
- device_id can be passed as method parameter or resolved from context via `get_context_device_id()`
- If device_id is required but not provided, raises InconclusiveMatchError to trigger server fallback
- Group flags ignore bucketing_identifier and always use group identifier
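The rules in this commit message can be sketched as a small resolver. This is a minimal illustration, not the code in the PR: `resolve_bucketing_identifier` is a hypothetical helper name, and `InconclusiveMatchError` is redefined here as a stand-in so the snippet runs on its own. Group flags are left out because they always bucket on the group identifier.

```python
from typing import Optional


class InconclusiveMatchError(Exception):
    """Stand-in for the SDK error meaning local evaluation cannot decide."""


def resolve_bucketing_identifier(
    flag: dict, distinct_id: str, device_id: Optional[str] = None
) -> str:
    """Pick the value to hash for a person flag (hypothetical helper)."""
    filters = flag.get("filters") or {}
    if filters.get("bucketing_identifier") == "device_id":
        # In the SDK the caller could first try the explicit parameter and
        # then the context value; here we only look at what was passed in.
        if not device_id:
            raise InconclusiveMatchError(
                "flag buckets on device_id, but no device_id was provided"
            )
        return device_id
    # "distinct_id", null, or a missing field all default to distinct_id.
    return distinct_id
```

Raising instead of silently substituting `distinct_id` keeps the misconfiguration visible to the caller, which can then fall back to the server.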
83cbc1f to 03d2ff8
posthog-python Compliance Report

Date: 2026-02-09 11:06:29 UTC

| Test | Status | Duration |
|---|---|---|
| Format Validation.Event Has Required Fields | ✅ | 517ms |
| Format Validation.Event Has Uuid | ✅ | 1506ms |
| Format Validation.Event Has Lib Properties | ✅ | 1507ms |
| Format Validation.Distinct Id Is String | ✅ | 1507ms |
| Format Validation.Token Is Present | ✅ | 1507ms |
| Format Validation.Custom Properties Preserved | ✅ | 1507ms |
| Format Validation.Event Has Timestamp | ✅ | 1506ms |
| Retry Behavior.Retries On 503 | ✅ | 8020ms |
| Retry Behavior.Does Not Retry On 400 | ✅ | 3509ms |
| Retry Behavior.Does Not Retry On 401 | ✅ | 3508ms |
| Retry Behavior.Respects Retry After Header | ❌ | 6782ms |
| Retry Behavior.Implements Backoff | ✅ | 20509ms |
| Retry Behavior.Retries On 500 | ✅ | 7175ms |
| Retry Behavior.Retries On 502 | ✅ | 6546ms |
| Retry Behavior.Retries On 504 | ✅ | 6972ms |
| Retry Behavior.Max Retries Respected | ✅ | 22730ms |
| Deduplication.Generates Unique Uuids | ✅ | 1496ms |
| Deduplication.Preserves Uuid On Retry | ✅ | 7092ms |
| Deduplication.Preserves Uuid And Timestamp On Retry | ✅ | 12689ms |
| Deduplication.Preserves Uuid And Timestamp On Batch Retry | ✅ | 6695ms |
| Deduplication.No Duplicate Events In Batch | ✅ | 1503ms |
| Deduplication.Different Events Have Different Uuids | ✅ | 1507ms |
| Compression.Sends Gzip When Enabled | ✅ | 1507ms |
| Batch Format.Uses Proper Batch Structure | ✅ | 1507ms |
| Batch Format.Flush With No Events Sends Nothing | ✅ | 1005ms |
| Batch Format.Multiple Events Batched Together | ✅ | 1504ms |
| Error Handling.Does Not Retry On 403 | ✅ | 3509ms |
| Error Handling.Does Not Retry On 413 | ✅ | 3507ms |
| Error Handling.Retries On 408 | ❌ | 6507ms |
Failures
retry_behavior.respects_retry_after_header
Retry delay too short: 272ms < 2500ms
error_handling.retries_on_408
Expected at least 2 requests, got 1
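For context on the two failures, here is a minimal sketch of a retry policy that would treat 408 as retryable and never wait less than a numeric `Retry-After` value. It is not posthog-python's retry code: `RETRYABLE_STATUSES`, `should_retry`, and `retry_delay_seconds` are hypothetical names, the status set is taken only from the rows of the table above, and the backoff constants are assumptions.

```python
from typing import Mapping

# Statuses the compliance table expects a retry for (408 is the failing one).
RETRYABLE_STATUSES = frozenset({408, 500, 502, 503, 504})


def should_retry(status: int) -> bool:
    """Return True when the response status should trigger another attempt."""
    return status in RETRYABLE_STATUSES


def retry_delay_seconds(
    attempt: int,
    headers: Mapping[str, str],
    base: float = 0.5,
    cap: float = 30.0,
) -> float:
    """Exponential backoff that never undercuts a numeric Retry-After header."""
    backoff = min(cap, base * (2 ** attempt))
    # Plain dict lookup is case-sensitive; real HTTP clients usually expose
    # case-insensitive header mappings.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(backoff, float(retry_after))
        except ValueError:
            pass  # The HTTP-date form of Retry-After is not handled here.
    return backoff
```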
…aluation

- Rename `_hash` param from `distinct_id` to `identifier` since it now receives device IDs, group keys, and distinct IDs
- Replace `skip_bucketing_identifier` boolean with explicit `hashing_identifier` param so callers pass the resolved identifier directly for group flags
- Make `hashing_identifier` a required keyword arg in `is_condition_match`, removing a dead fallback that silently masked potential bugs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dmarticus left a comment
A few questions but nothing blocking.
Also, re: your open question about design, I agree with option (a). Silently falling back to distinct_id would mask configuration bugs and defeat the purpose of device_id bucketing. The current behavior of raising InconclusiveMatchError → server fallback is the right call. It's observable (you can log/monitor fallback rates) and self-correcting (server has the device_id from the request context or can handle it differently).
    focused_group_properties,
    self.feature_flags_by_key,
    evaluation_cache,
    bucketing_value=groups[group_name],
I don't think this breaks anything, but it feels a little confusing technically: this is passed positionally to `match_feature_flag_properties`, but `bucketing_value` is a keyword-only arg (or at least should be treated as one given the `*` separator in `is_condition_match`). More importantly, you're adding `bucketing_value` as a kwarg to `match_feature_flag_properties` but passing it positionally here; I'd double-check this actually lands in the right parameter. Looking at the signature, it seems fine since it's a named kwarg, but worth verifying.
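As a quick check on the concern above: in Python, an argument written as `name=value` binds by name wherever it appears in the call, and parameters declared after a bare `*` reject positional values outright. The function below is a hypothetical stand-in, not the real signature of `match_feature_flag_properties`.

```python
def match_flag(flag_key: str, distinct_id: str, *, bucketing_value: str) -> str:
    """Stand-in with a keyword-only parameter after the `*` separator."""
    return f"{flag_key}:{bucketing_value}"


# Passing by name works and cannot land in the wrong parameter.
result = match_flag("beta", "user-1", bucketing_value="org-42")

# Passing the same value positionally fails loudly instead of misbinding.
try:
    match_flag("beta", "user-1", "org-42")
except TypeError as exc:
    positional_error = str(exc)
```

So a call ending in `bucketing_value=groups[group_name]` either binds to the parameter of that name or raises a `TypeError`; it cannot silently fill a different slot.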
Oh, and also: if `group_name` isn't in `groups`, this will raise a `KeyError`. The existing code presumably handles this elsewhere, but it's worth confirming since you're now accessing it in a new location (before, `group_name` was only used in the properties lookup path).
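One defensive shape for that lookup, sketched with a hypothetical helper: `group_bucketing_value` returns `None` when the group was not passed in, leaving it to the caller to decide what a missing group means. Whether the PR needs this depends on whether an earlier check already guarantees the key exists, which the diff excerpt here does not show.

```python
from typing import Mapping, Optional


def group_bucketing_value(
    groups: Mapping[str, str], group_name: str
) -> Optional[str]:
    """Return the group key to hash, or None if the group was not supplied."""
    # .get avoids the KeyError that groups[group_name] would raise.
    return groups.get(group_name)


groups = {"organization": "org-42"}
present = group_bucketing_value(groups, "organization")
missing = group_bucketing_value(groups, "project")
```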
Problem

Feature flags need to support using `device_id` instead of `distinct_id` for bucketing/hashing in local evaluation. This allows consistent flag experiences across anonymous and identified users on the same device.

Changes

- `bucketing_identifier` field support on `flag["filters"]` - can be `"distinct_id"`, `"device_id"`, or null/missing (defaults to `"distinct_id"`)
- When `bucketing_identifier: "device_id"`, use device_id for hash calculations instead of distinct_id
- device_id can be passed as method parameter or resolved from context via `get_context_device_id()`
- If device_id is required but not provided, raises `InconclusiveMatchError` to trigger server fallback
- Group flags ignore `bucketing_identifier` and always use group identifier (via `skip_bucketing_identifier` parameter)

Question to the feature flags team: If a flag is configured to use `device_id` as bucketing identifier, but none is provided, should we:

a) raise an error and return "false" (current behavior)
b) fall back to use `distinct_id` and return whatever that resolves to

Personally I prefer a), as I would like to know when this happens and fix my code.
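The control flow around option (a) can be sketched as follows. This is an assumption-laden outline, not the client's actual method: `evaluate_flag`, `local_eval`, and `server_eval` are hypothetical names, and `InconclusiveMatchError` is redefined as a stand-in so the snippet is self-contained.

```python
from typing import Callable, Optional


class InconclusiveMatchError(Exception):
    """Stand-in for the SDK error meaning local evaluation cannot decide."""


def evaluate_flag(
    local_eval: Callable[[], bool],
    server_eval: Callable[[], bool],
    only_evaluate_locally: bool = False,
) -> Optional[bool]:
    """Try local evaluation first; fall back to the server if inconclusive."""
    try:
        return local_eval()
    except InconclusiveMatchError:
        if only_evaluate_locally:
            # No network call allowed, so the result is unknown.
            return None
        return server_eval()
```

Under option (b) the exception would never be raised for a missing device ID, so neither the fallback nor the `None` result would surface the misconfiguration.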
Testing

- `bucketing_identifier: "device_id"` uses device_id for hashing
- Missing `bucketing_identifier` uses distinct_id
- `get_context_device_id()` is used when param not provided
- Group flag with `aggregation_group_type_index` continues to use group identifier
- `only_evaluate_locally=True` returns None when device_id required but missing
- `get_all_flags` properly handles device_id bucketing