feat: Support device_id as bucketing identifier for local evaluation #424

Open
andehen wants to merge 3 commits into master from flags/support-device-id-in-local-eval

@andehen andehen commented Feb 5, 2026

Problem

Feature flags need to support using device_id instead of distinct_id for bucketing/hashing in local evaluation. This allows consistent flag experiences across anonymous and identified users on the same device.

Changes

  • Add bucketing_identifier field support on flag["filters"] - can be "distinct_id", "device_id", or null/missing (defaults to "distinct_id")
  • When bucketing_identifier: "device_id", use device_id for hash calculations instead of distinct_id
  • device_id can be passed as a method parameter or resolved from context via get_context_device_id()
  • If device_id is required but not provided, raise InconclusiveMatchError to trigger server fallback
  • Group flags ignore bucketing_identifier and always use group identifier (via skip_bucketing_identifier parameter)
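
The identifier resolution described in the list above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `resolve_hashing_identifier` is a hypothetical helper, `InconclusiveMatchError` is a local stand-in for the SDK exception of the same name, and treating unknown `bucketing_identifier` values like `"distinct_id"` is an assumption.

```python
from typing import Optional


class InconclusiveMatchError(Exception):
    """Local stand-in for the SDK's exception of the same name."""


def resolve_hashing_identifier(
    flag: dict, distinct_id: str, device_id: Optional[str] = None
) -> str:
    """Pick the identifier to hash on (hypothetical helper, not the SDK's API)."""
    filters = flag.get("filters") or {}
    # null/missing defaults to "distinct_id", as described in the PR.
    bucketing_identifier = filters.get("bucketing_identifier") or "distinct_id"

    if bucketing_identifier == "device_id":
        if not device_id:
            # Cannot bucket locally without a device_id; the caller is
            # expected to fall back to server evaluation.
            raise InconclusiveMatchError(
                f"Flag {flag.get('key')!r} buckets on device_id but none was provided"
            )
        return device_id

    # Assumption: any other value is treated like "distinct_id".
    return distinct_id
```

Group flags would bypass this helper and pass the group key directly, per the last bullet.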

Question to the feature flags team: If a flag is configured to use device_id as the bucketing identifier, but none is provided, should we:
a) raise an error and return "false" (current behavior)
b) fall back to distinct_id and return whatever that resolves to

Personally I prefer a), as I would like to know when this happens and fix my code.

Testing

  • Basic device_id bucketing - flag with bucketing_identifier: "device_id" uses device_id for hashing
  • Same device_id with different distinct_ids produces same result
  • Missing device_id fallback - raises InconclusiveMatchError, triggers server evaluation
  • Default to distinct_id - null/missing bucketing_identifier uses distinct_id
  • Multivariate with device_id - variant selection uses device_id hash
  • device_id from context - get_context_device_id() is used when param not provided
  • Group flags unaffected - aggregation_group_type_index continues to use group identifier
  • only_evaluate_locally=True returns None when device_id required but missing
  • get_all_flags properly handles device_id bucketing
  • All existing tests pass (631 passed)
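
To illustrate why the same device_id yields the same result across different distinct_ids, here is a small sketch of hash-based bucketing. The SHA-1 scheme and the `bucket` and `is_rolled_out` names are assumptions for illustration only; the hash function actually used by the SDK is not shown in this PR description.

```python
import hashlib

# 15 hex digits of the digest are used, so dividing by this maps into [0, 1].
LONG_SCALE = float(0xFFFFFFFFFFFFFFF)


def bucket(flag_key: str, identifier: str, salt: str = "") -> float:
    """Map (flag, identifier) deterministically to a float in [0, 1]."""
    digest = hashlib.sha1(f"{flag_key}.{identifier}{salt}".encode("utf-8")).hexdigest()
    return int(digest[:15], 16) / LONG_SCALE


def is_rolled_out(flag_key: str, identifier: str, rollout_percentage: float) -> bool:
    """True when the identifier falls inside the rollout percentage."""
    return bucket(flag_key, identifier) <= rollout_percentage / 100
```

Because only the flag key and the chosen identifier feed the hash, an anonymous user and an identified user on the same device land in the same bucket whenever device_id is the identifier passed in.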

Add support for `bucketing_identifier` field on feature flags to allow
using `device_id` instead of `distinct_id` for hashing/bucketing in
local evaluation.

- When `bucketing_identifier: "device_id"`, use device_id for hash
  calculations instead of distinct_id
- device_id can be passed as method parameter or resolved from context
  via `get_context_device_id()`
- If device_id is required but not provided, raises InconclusiveMatchError
  to trigger server fallback
- Group flags ignore bucketing_identifier and always use group identifier
@andehen andehen force-pushed the flags/support-device-id-in-local-eval branch from 83cbc1f to 03d2ff8 Compare February 5, 2026 12:40

github-actions bot commented Feb 5, 2026

posthog-python Compliance Report

Date: 2026-02-09 11:06:29 UTC
Duration: 145416ms

⚠️ Some Tests Failed

27/29 tests passed, 2 failed


Capture Tests

⚠️ 27/29 tests passed, 2 failed

| Test | Status | Duration |
| --- | --- | --- |
| Format Validation.Event Has Required Fields | Passed | 517ms |
| Format Validation.Event Has Uuid | Passed | 1506ms |
| Format Validation.Event Has Lib Properties | Passed | 1507ms |
| Format Validation.Distinct Id Is String | Passed | 1507ms |
| Format Validation.Token Is Present | Passed | 1507ms |
| Format Validation.Custom Properties Preserved | Passed | 1507ms |
| Format Validation.Event Has Timestamp | Passed | 1506ms |
| Retry Behavior.Retries On 503 | Passed | 8020ms |
| Retry Behavior.Does Not Retry On 400 | Passed | 3509ms |
| Retry Behavior.Does Not Retry On 401 | Passed | 3508ms |
| Retry Behavior.Respects Retry After Header | Failed | 6782ms |
| Retry Behavior.Implements Backoff | Passed | 20509ms |
| Retry Behavior.Retries On 500 | Passed | 7175ms |
| Retry Behavior.Retries On 502 | Passed | 6546ms |
| Retry Behavior.Retries On 504 | Passed | 6972ms |
| Retry Behavior.Max Retries Respected | Passed | 22730ms |
| Deduplication.Generates Unique Uuids | Passed | 1496ms |
| Deduplication.Preserves Uuid On Retry | Passed | 7092ms |
| Deduplication.Preserves Uuid And Timestamp On Retry | Passed | 12689ms |
| Deduplication.Preserves Uuid And Timestamp On Batch Retry | Passed | 6695ms |
| Deduplication.No Duplicate Events In Batch | Passed | 1503ms |
| Deduplication.Different Events Have Different Uuids | Passed | 1507ms |
| Compression.Sends Gzip When Enabled | Passed | 1507ms |
| Batch Format.Uses Proper Batch Structure | Passed | 1507ms |
| Batch Format.Flush With No Events Sends Nothing | Passed | 1005ms |
| Batch Format.Multiple Events Batched Together | Passed | 1504ms |
| Error Handling.Does Not Retry On 403 | Passed | 3509ms |
| Error Handling.Does Not Retry On 413 | Passed | 3507ms |
| Error Handling.Retries On 408 | Failed | 6507ms |

Failures

retry_behavior.respects_retry_after_header

Retry delay too short: 272ms < 2500ms

error_handling.retries_on_408

Expected at least 2 requests, got 1

…aluation

- Rename _hash param from distinct_id to identifier since it now receives
  device IDs, group keys, and distinct IDs
- Replace skip_bucketing_identifier boolean with explicit hashing_identifier
  param so callers pass the resolved identifier directly for group flags
- Make hashing_identifier a required keyword arg in is_condition_match,
  removing a dead fallback that silently masked potential bugs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@andehen andehen requested review from a team February 6, 2026 11:37
@posthog-project-board-bot posthog-project-board-bot bot moved this to In Review in Feature Flags Feb 6, 2026

@dmarticus dmarticus left a comment

A few questions but nothing blocking.

Also, re: your open question about design, I agree with option (a). Silently falling back to distinct_id would mask configuration bugs and defeat the purpose of device_id bucketing. The current behavior of raising InconclusiveMatchError → server fallback is the right call. It's observable (you can log/monitor fallback rates) and self-correcting (server has the device_id from the request context or can handle it differently).
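
As a rough sketch of what "observable" could look like on the caller side: the wrapper below counts how often local evaluation is inconclusive before deferring to the server. `evaluate_flag`, the two callables, and the `fallbacks` counter are hypothetical names for illustration, and `InconclusiveMatchError` is a local stand-in for the SDK exception.

```python
from collections import Counter
from typing import Callable, Optional


class InconclusiveMatchError(Exception):
    """Local stand-in for the SDK's exception of the same name."""


# Simple in-process counter; a real setup would likely emit a metric instead.
fallbacks: Counter = Counter()


def evaluate_flag(
    evaluate_locally: Callable[[], bool],
    evaluate_on_server: Callable[[], bool],
    only_evaluate_locally: bool = False,
) -> Optional[bool]:
    """Try local evaluation first and record every inconclusive result."""
    try:
        return evaluate_locally()
    except InconclusiveMatchError:
        fallbacks["inconclusive"] += 1
        if only_evaluate_locally:
            # No server round trip allowed, so the result stays unknown.
            return None
        return evaluate_on_server()
```

A rising `fallbacks["inconclusive"]` count would then point at call sites that forgot to pass a device_id.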

focused_group_properties,
self.feature_flags_by_key,
evaluation_cache,
bucketing_value=groups[group_name],

I don't think this breaks anything, but it feels a little confusing technically: this is passed positionally to match_feature_flag_properties, but bucketing_value is a keyword-only arg (or at least should be treated as one, given the * separator in is_condition_match). More importantly, you're adding bucketing_value as a kwarg to match_feature_flag_properties but passing it positionally here; I'd double-check that this actually lands in the right parameter. Looking at the signature, it seems fine since it's a named kwarg, but it's worth verifying.
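
For reference, a quick way to check how a `*` separator behaves. The signature below is a simplified, hypothetical stand-in and not the real `is_condition_match`:

```python
def is_condition_match(flag, condition, *, bucketing_value):
    """Hypothetical signature: everything after `*` must be passed by name."""
    return f"hashing on {bucketing_value}"


groups = {"company": "acme"}

# Passing by keyword binds unambiguously, regardless of parameter order.
by_keyword = is_condition_match({}, {}, bucketing_value=groups["company"])

# Passing the same value positionally is rejected at call time.
try:
    is_condition_match({}, {}, groups["company"])
    positional_error = None
except TypeError as exc:
    positional_error = exc
```

So a call written as `bucketing_value=groups[group_name]` cannot land in the wrong parameter, and a truly positional call would fail loudly rather than bind silently.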

Oh, and also: if group_name isn't in groups, this will raise a KeyError. The existing code presumably handles this elsewhere, but it's worth confirming, since you're now accessing it in a new location (before, group_name was only used in the properties lookup path).
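
If it turns out the lookup is not guarded upstream, one defensive option might look like the sketch below. `group_bucketing_value` is a hypothetical helper, and what the SDK should do when the group is missing is not settled in this thread.

```python
from typing import Mapping, Optional


def group_bucketing_value(groups: Mapping[str, str], group_name: str) -> Optional[str]:
    """Return the group key to hash on, or None when the group was not passed in.

    Hypothetical helper: `groups[group_name]` would raise KeyError here instead.
    """
    return groups.get(group_name)


groups = {"company": "acme"}
known = group_bucketing_value(groups, "company")
missing = group_bucketing_value(groups, "project")  # no exception raised
```

The caller can then decide explicitly how to treat a `None` result instead of relying on an exception escaping from the lookup.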

@github-project-automation github-project-automation bot moved this from In Review to Approved in Feature Flags Feb 9, 2026