Experimental support for MSC4429: Profile Updates for Legacy Sync#19556
Experimental support for MSC4429: Profile Updates for Legacy Sync#19556anoadragon453 wants to merge 11 commits intodevelopfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Implements experimental MSC4429 support by introducing a new profile_updates stream (persisted + replicated) and surfacing profile field changes in legacy /sync when enabled.
Changes:
- Add
profile_updatesstream tracking (DB schema, stream token field, notifier + replication stream plumbing). - Emit MSC4429 profile updates in
/syncresponses and add filter support for selecting profile fields. - Record profile field updates on profile mutations (set/unset displayname/avatar/custom fields, deactivation cleanup).
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/rest/client/test_rooms.py | Updates hardcoded stream tokens to match the new StreamToken shape. |
| tests/rest/admin/test_room.py | Updates hardcoded stream tokens to match the new StreamToken shape. |
| synapse/types/init.py | Adds PROFILE_UPDATES stream key and extends StreamToken serialization/parsing. |
| synapse/streams/events.py | Includes profile_updates_key in current/bounded stream tokens. |
| synapse/storage/schema/main/delta/94/01_profile_updates_seq.sql.postgres | Adds Postgres sequence for the new stream. |
| synapse/storage/schema/main/delta/94/01_profile_updates.sql | Adds profile_updates table + indexes. |
| synapse/storage/schema/init.py | Bumps schema version to 94 and documents the change. |
| synapse/storage/databases/main/profile.py | Adds profile update persistence/query APIs and stream ID generator wiring. |
| synapse/rest/client/versions.py | Advertises org.matrix.msc4429 in unstable_features. |
| synapse/rest/client/sync.py | Encodes MSC4429 profile update payload into /sync response. |
| synapse/replication/tcp/streams/_base.py | Adds ProfileUpdatesStream replication stream. |
| synapse/replication/tcp/streams/init.py | Registers ProfileUpdatesStream for replication. |
| synapse/replication/tcp/handler.py | Replicates ProfileUpdatesStream from configured writers. |
| synapse/replication/tcp/client.py | Handles incoming profile update replication and notifies listeners. |
| synapse/notifier.py | Allows StreamKeyType.PROFILE_UPDATES to wake /sync waiters. |
| synapse/handlers/sync.py | Generates initial + incremental MSC4429 profile update sections in /sync. |
| synapse/handlers/profile.py | Records profile update markers on profile mutations and notifies. |
| synapse/config/experimental.py | Adds msc4429_enabled config flag. |
| synapse/api/filtering.py | Adds /sync filter support for selecting profile fields (MSC4429-gated). |
| docker/complement/conf/workers-shared-extra.yaml.j2 | Enables MSC4429 in complement worker test config. |
| changelog.d/19556.feature | Adds changelog entry for MSC4429 experimental support. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
83e699f to
a846945
Compare
dc8e9f8 to
a309372
Compare
We create a stream table dedicated to tracking profile updates. The sync machinary can this stream to understand whether a profile update needs to be included in a client's sync response.
Add a new stream type for profile updates. This allows sync processes to determine which profile updates a given user has or hasn't seen yet.
Replicate changes to the profile updates stream so that sync workers know there's a new profile update, and wake up sync waiters accordingly. Updates are keyed by `room_id` - sync waiters should only wake up if a user they share a room with updates their profile.
Based on the provided since token and filter.
And update relevant documentation/integration test config.
Prevent the `ProfileRestServlet` from handling requests intended for the `ProfileFieldRestServlet`. If a requester tried to call PUT `.../profile/@user:domain/(key_id?)` a worker that did not mount `ProfileFieldRestServlet`, they would then receive a `405 Method Not Allowed` instead of the expected `404 Not Found`.
a309372 to
f27ca70
Compare
| the following endpoints to a worker: | ||
|
|
||
| ``` | ||
| /_matrix/client/(api/v1|r0|v3)/profile/<user_id>/(<field_name>?) |
There was a problem hiding this comment.
This isn't ideal, as it breaks backwards compat for people. I reckon you'll see quite a lot of breakage if we require this.
What we do elsewhere is to route requests that go to the wrong worker to the right worker internally via http replication endpoint.
| sync_result_builder: | ||
| profile_fields: The list of field IDs to filter for. | ||
| """ | ||
| user_ids = await self.store.get_users_who_share_room_with_user(user_id) |
There was a problem hiding this comment.
I wonder if we should instead use all the members that we have pulled out when calculating the room membership lists? That way we'd also only pull out profiles for lazy loaded members?
There was a problem hiding this comment.
This feels expensive. Not against it as a prototyping measure but maybe the sort of thing we should track to improve before stabilisation. (Not sure off the top of my head what other streams do, but maybe updates are tracked at room granularity generally?)
| return | ||
|
|
||
| updated_user_ids = {update.user_id for update in updates} | ||
| shared_user_ids = await self.store.do_users_share_a_room( |
There was a problem hiding this comment.
This feels very similar to how we used to do device list updates. We ended up moving to a world where we recorded the device lists changes that happened in each room, we may want to do the same thing with profiles?
reivilibre
left a comment
There was a problem hiding this comment.
A lot to digest here, I think probably the most tricky thing is going to be making the sync performance less scary.
Probably we will wind up needing an extra table synchronised with the stream profile_updates_rooms (stream_id, instance_name, room_id)?
Pity I don't have the room NIDs ready yet :-)
| -- Synapse streams start at 2, because the default position is 1 | ||
| -- so any item inserted at position 1 is ignored. | ||
| -- We have to use nextval not START WITH 2, see https://github.com/element-hq/synapse/issues/18712 | ||
| SELECT nextval('profile_updates_sequence'); |
| ## Profile Updates Stream Writer Workers | ||
|
|
||
| This version of Synapse adds a new `profile_updates` writer stream. The | ||
| following endpoints may now only be handled by either the main process, or a |
There was a problem hiding this comment.
I think this wording is confusing; if the main process isn't designated as a stream writer anymore then you can't use the main process for that. The current wording implies main process OR stream_writer is acceptable (imo)
| # MSC4133: Custom profile fields | ||
| self.msc4133_enabled: bool = experimental.get("msc4133_enabled", False) | ||
|
|
||
| # MSC4429: Profile updates for legacy /sync |
There was a problem hiding this comment.
please add an Experimental Feature Tracking issue (see Tracked lines later): https://element-hq.github.io/synapse/latest/development/experimental_features.html#tracking-issues-for-experimental-features
| stream_id BIGINT NOT NULL PRIMARY KEY, | ||
| instance_name TEXT NOT NULL, | ||
|
|
||
| user_id TEXT NOT NULL, |
There was a problem hiding this comment.
Just out of paranoia around the profiles tables which currently don't do a good job of this, it'd be nice to add a comment on this column (-- ) saying this is the full user ID
| instance_name TEXT NOT NULL, | ||
|
|
||
| user_id TEXT NOT NULL, | ||
| field_name TEXT NOT NULL, |
There was a problem hiding this comment.
a small comment here would be decent too IMO
| sync_result_builder.profile_updates = all_updates | ||
| return | ||
|
|
||
| async def _generate_sync_entry_for_profile_updates( |
There was a problem hiding this comment.
it's not clear that this only returns profile updates for local users.
(Also not clear on the MSC that this is a constraint clients should expect?)
At least, I think it's only doing that, because the profile updates table has a constraint on local users and I don't see any mechanism that we have to track profile updates for remote users
| ) | ||
| self._worker_locks = hs.get_worker_locks_handler() | ||
|
|
||
| async def _notify_profile_update(self, user_id: UserID, stream_id: int) -> None: |
There was a problem hiding this comment.
would benefit from docstring.
Not at first glance sure why this is pulled out from recordprofile_updates
| StreamKeyType.PROFILE_UPDATES, stream_id, rooms=room_ids | ||
| ) | ||
|
|
||
| async def _record_profile_updates( |
There was a problem hiding this comment.
would benefit from docstring
| # Typing: `user_ids_to_room_ids.values` is a frozenset, but one | ||
| # can still iterate over it, hence the ignore. | ||
| for batched_room_ids in user_ids_to_room_ids.values(): # type: ignore[assignment] |
There was a problem hiding this comment.
This type ignore and the comment doesn't make sense to me and I can't see why it would be necessary. For a start, .values() is being called on the dict, not on the `frozenset`. Happy to share the error?
| # TODO: Is it possible to still allow any generic_worker to handle the | ||
| # `GET` endpoint? |
There was a problem hiding this comment.
I feel like I've seen this around, but it feels like it should be possible. If not it might be preferable to have the condition check inside the on_PUT handler
This PR implements experimental support for MSC4429: Profile Updates for Legacy Sync.
Recommended to review commit-by-commit.
Paired with matrix-org/complement#849
Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.