Skip to content

Experimental support for MSC4429: Profile Updates for Legacy Sync#19556

Open
anoadragon453 wants to merge 11 commits intodevelopfrom
anoa/msc4429
Open

Experimental support for MSC4429: Profile Updates for Legacy Sync#19556
anoadragon453 wants to merge 11 commits intodevelopfrom
anoa/msc4429

Conversation

@anoadragon453
Copy link
Copy Markdown
Member

@anoadragon453 anoadragon453 commented Mar 13, 2026

This PR implements experimental support for MSC4429: Profile Updates for Legacy Sync.

Recommended to review commit-by-commit.

Paired with matrix-org/complement#849

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Code style is correct (run the linters)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements experimental MSC4429 support by introducing a new profile_updates stream (persisted + replicated) and surfacing profile field changes in legacy /sync when enabled.

Changes:

  • Add profile_updates stream tracking (DB schema, stream token field, notifier + replication stream plumbing).
  • Emit MSC4429 profile updates in /sync responses and add filter support for selecting profile fields.
  • Record profile field updates on profile mutations (set/unset displayname/avatar/custom fields, deactivation cleanup).

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/rest/client/test_rooms.py Updates hardcoded stream tokens to match the new StreamToken shape.
tests/rest/admin/test_room.py Updates hardcoded stream tokens to match the new StreamToken shape.
synapse/types/init.py Adds PROFILE_UPDATES stream key and extends StreamToken serialization/parsing.
synapse/streams/events.py Includes profile_updates_key in current/bounded stream tokens.
synapse/storage/schema/main/delta/94/01_profile_updates_seq.sql.postgres Adds Postgres sequence for the new stream.
synapse/storage/schema/main/delta/94/01_profile_updates.sql Adds profile_updates table + indexes.
synapse/storage/schema/init.py Bumps schema version to 94 and documents the change.
synapse/storage/databases/main/profile.py Adds profile update persistence/query APIs and stream ID generator wiring.
synapse/rest/client/versions.py Advertises org.matrix.msc4429 in unstable_features.
synapse/rest/client/sync.py Encodes MSC4429 profile update payload into /sync response.
synapse/replication/tcp/streams/_base.py Adds ProfileUpdatesStream replication stream.
synapse/replication/tcp/streams/init.py Registers ProfileUpdatesStream for replication.
synapse/replication/tcp/handler.py Replicates ProfileUpdatesStream from configured writers.
synapse/replication/tcp/client.py Handles incoming profile update replication and notifies listeners.
synapse/notifier.py Allows StreamKeyType.PROFILE_UPDATES to wake /sync waiters.
synapse/handlers/sync.py Generates initial + incremental MSC4429 profile update sections in /sync.
synapse/handlers/profile.py Records profile update markers on profile mutations and notifies.
synapse/config/experimental.py Adds msc4429_enabled config flag.
synapse/api/filtering.py Adds /sync filter support for selecting profile fields (MSC4429-gated).
docker/complement/conf/workers-shared-extra.yaml.j2 Enables MSC4429 in complement worker test config.
changelog.d/19556.feature Adds changelog entry for MSC4429 experimental support.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread synapse/replication/tcp/client.py Outdated
Comment thread synapse/storage/schema/main/delta/94/01_profile_updates.sql Outdated
Comment thread synapse/storage/databases/main/profile.py
Comment thread synapse/handlers/sync.py
@anoadragon453 anoadragon453 force-pushed the anoa/msc4429 branch 2 times, most recently from 83e699f to a846945 Compare March 17, 2026 16:21
anoadragon453 added a commit to matrix-org/sytest that referenced this pull request Mar 17, 2026
anoadragon453 added a commit to matrix-org/sytest that referenced this pull request Mar 17, 2026
@anoadragon453 anoadragon453 marked this pull request as ready for review March 18, 2026 15:58
@anoadragon453 anoadragon453 requested a review from a team as a code owner March 18, 2026 15:58
We create a stream table dedicated to tracking profile updates. The sync
machinary can this stream to understand whether a profile update
needs to be included in a client's sync response.
Add a new stream type for profile updates. This allows sync processes to
determine which profile updates a given user has or hasn't seen yet.
Replicate changes to the profile updates stream so that sync workers
know there's a new profile update, and wake up sync waiters accordingly.

Updates are keyed by `room_id` - sync waiters should only wake up if
a user they share a room with updates their profile.
Based on the provided since token and filter.
And update relevant documentation/integration test config.
Prevent the `ProfileRestServlet` from handling requests intended for the
`ProfileFieldRestServlet`.

If a requester tried to call PUT `.../profile/@user:domain/(key_id?)` a
worker that did not mount `ProfileFieldRestServlet`, they would then
receive a `405 Method Not Allowed` instead of the expected `404 Not
Found`.
Comment thread docs/upgrade.md
the following endpoints to a worker:

```
/_matrix/client/(api/v1|r0|v3)/profile/<user_id>/(<field_name>?)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't ideal, as it breaks backwards compat for people. I reckon you'll see quite a lot of breakage if we require this.

What we do elsewhere is to route requests that go to the wrong worker to the right worker internally via http replication endpoint.

Comment thread synapse/handlers/sync.py
sync_result_builder:
profile_fields: The list of field IDs to filter for.
"""
user_ids = await self.store.get_users_who_share_room_with_user(user_id)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should instead use all the members that we have pulled out when calculating the room membership lists? That way we'd also only pull out profiles for lazy loaded members?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels expensive. Not against it as a prototyping measure but maybe the sort of thing we should track to improve before stabilisation. (Not sure off the top of my head what other streams do, but maybe updates are tracked at room granularity generally?)

Comment thread synapse/handlers/sync.py
return

updated_user_ids = {update.user_id for update in updates}
shared_user_ids = await self.store.do_users_share_a_room(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels very similar to how we used to do device list updates. We ended up moving to a world where we recorded the device lists changes that happened in each room, we may want to do the same thing with profiles?

Copy link
Copy Markdown
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot to digest here, I think probably the most tricky thing is going to be making the sync performance less scary.

Probably we will wind up needing an extra table synchronised with the stream profile_updates_rooms (stream_id, instance_name, room_id)?
Pity I don't have the room NIDs ready yet :-)

-- Synapse streams start at 2, because the default position is 1
-- so any item inserted at position 1 is ignored.
-- We have to use nextval not START WITH 2, see https://github.com/element-hq/synapse/issues/18712
SELECT nextval('profile_updates_sequence');
Copy link
Copy Markdown
Contributor

@reivilibre reivilibre Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New since the PR opened: we need to add portdb logic for new sequences; see #19671 and #19675

Comment thread docs/upgrade.md
## Profile Updates Stream Writer Workers

This version of Synapse adds a new `profile_updates` writer stream. The
following endpoints may now only be handled by either the main process, or a
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this wording is confusing; if the main process isn't designated as a stream writer anymore then you can't use the main process for that. The current wording implies main process OR stream_writer is acceptable (imo)

# MSC4133: Custom profile fields
self.msc4133_enabled: bool = experimental.get("msc4133_enabled", False)

# MSC4429: Profile updates for legacy /sync
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stream_id BIGINT NOT NULL PRIMARY KEY,
instance_name TEXT NOT NULL,

user_id TEXT NOT NULL,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of paranoia around the profiles tables which currently don't do a good job of this, it'd be nice to add a comment on this column (-- ) saying this is the full user ID

instance_name TEXT NOT NULL,

user_id TEXT NOT NULL,
field_name TEXT NOT NULL,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a small comment here would be decent too IMO

Comment thread synapse/handlers/sync.py
sync_result_builder.profile_updates = all_updates
return

async def _generate_sync_entry_for_profile_updates(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not clear that this only returns profile updates for local users.

(Also not clear on the MSC that this is a constraint clients should expect?)


At least, I think it's only doing that, because the profile updates table has a constraint on local users and I don't see any mechanism that we have to track profile updates for remote users

)
self._worker_locks = hs.get_worker_locks_handler()

async def _notify_profile_update(self, user_id: UserID, stream_id: int) -> None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would benefit from docstring.

Not at first glance sure why this is pulled out from recordprofile_updates

StreamKeyType.PROFILE_UPDATES, stream_id, rooms=room_ids
)

async def _record_profile_updates(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would benefit from docstring

Comment on lines +276 to +278
# Typing: `user_ids_to_room_ids.values` is a frozenset, but one
# can still iterate over it, hence the ignore.
for batched_room_ids in user_ids_to_room_ids.values(): # type: ignore[assignment]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type ignore and the comment doesn't make sense to me and I can't see why it would be necessary. For a start, .values() is being called on the dict, not on the `frozenset`. Happy to share the error?

Comment on lines +293 to +294
# TODO: Is it possible to still allow any generic_worker to handle the
# `GET` endpoint?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I've seen this around, but it feels like it should be possible. If not it might be preferable to have the condition check inside the on_PUT handler

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants