Introduce spam_checker_spammy internal event metadata. #19453

Merged
reivilibre merged 12 commits into develop from rei/sticky_events_spam_checker_spammy on Apr 15, 2026

Conversation

@reivilibre
Contributor

Follows: #19365

Part of: MSC4354 Sticky Events (experimental feature #19409)

This PR introduces a spam_checker_spammy flag, analogous to policy_server_spammy, as an explicit flag
that an event was decided to be spammy by a spam-checker module.

The original Sticky Events PR (#18968) just reused policy_server_spammy, but it didn't sit right with me
because we (at least appear to be experimenting with features that) allow users to opt-in to seeing
policy_server_spammy events (presumably for moderation purposes).

It therefore felt best to keep these flags separate.

As for why we need this flag: soon soft-failed status won't be permanent, at least for sticky events.
The spam checker modules currently work by making events soft-failed.
We want to prevent spammy events from getting reconsidered/un-soft-failed, so it seems like we need
a flag to track spam-checker spamminess separately from soft-failed.

Should be commit-by-commit friendly, but is also small.
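For illustration, the relationship between the three pieces of metadata might be sketched like this. This is a minimal stand-in only; `EventInternalMetadata` here is a hypothetical dataclass, not Synapse's actual class, and `is_spammy` is an invented helper:

```python
from dataclasses import dataclass


@dataclass
class EventInternalMetadata:
    """Illustrative stand-in for per-event internal metadata."""

    # Soft-failure hides the event from clients; for sticky events this
    # status may be re-evaluated later rather than being permanent.
    soft_failed: bool = False
    # Flagged as spam by a policy server; users may be allowed to opt in
    # to seeing these events for moderation purposes.
    policy_server_spammy: bool = False
    # Flagged as spam by a spam-checker module; kept separate so the
    # verdict survives any re-evaluation of soft-failed status.
    spam_checker_spammy: bool = False

    def is_spammy(self) -> bool:
        # Either flag on its own marks the event as spam.
        return self.policy_server_spammy or self.spam_checker_spammy
```

The point of the separate flags is visible in the last method: opting in to `policy_server_spammy` events need not also expose events a spam-checker module rejected.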

@reivilibre reivilibre marked this pull request as ready for review February 12, 2026 15:49
@reivilibre reivilibre requested a review from a team as a code owner February 12, 2026 15:49
@reivilibre
Contributor Author

reivilibre commented Feb 12, 2026

Opening this for feedback; still aware I need to add a test, but I doubt I will have time today or tomorrow. (Edit: test added.)

Comment thread changelog.d/19453.misc
Comment on lines 57 to +58
PolicyServerSpammy(bool),
SpamCheckerSpammy(bool),
Contributor


Should this be an enum, which would allow future adaptations like this? It makes sense that these could both be true, so that may not apply in any case. I'm guessing we have some early returns if either thing marks the event as spammy, though?

Contributor Author


An enum isn't a bad idea actually, but I can't think of a pleasant way of doing that migration, which leads me to think we might be better off keeping the 2 bools but also addressing the 'early-returns' point you make.

Contributor Author


Early return removed in be8c05b, so we now evaluate both attributes.
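The "evaluate both attributes" resolution could look roughly like this. This is a hypothetical helper sketching the idea, not the actual change in be8c05b; the function and parameter names are invented:

```python
def evaluate_spam_flags(
    meta,
    policy_server_says_spam: bool,
    spam_checker_says_spam: bool,
) -> None:
    """Record both spam verdicts on the event's internal metadata.

    Deliberately no early return: even when the policy server has
    already flagged the event, the spam-checker verdict is still
    recorded, so both flags end up accurate when both sources
    consider the event spam.
    """
    if policy_server_says_spam:
        meta.policy_server_spammy = True
    if spam_checker_says_spam:
        meta.spam_checker_spammy = True
```

With an early `return` after the first `if`, an event flagged by both sources would only carry `policy_server_spammy`, which is exactly the loss of information the reviewer was worried about.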

Comment on lines 57 to +58
PolicyServerSpammy(bool),
SpamCheckerSpammy(bool),
Contributor


Is "spam checker" only for the module API?

Contributor Author


Yes; strictly speaking there is also the pre-module-API 'spam checker API', but it was superseded by the module API. I think we have a compatibility shim.

Comment on lines +197 to +198
# Mark this as spam so we don't re-evaluate soft-failure status.
redacted_event.internal_metadata.spam_checker_spammy = True
Contributor

@MadLittleMods, Feb 23, 2026


Based on how this was used in #18968, perhaps insert_sticky_events_txn should be updated to instead skip any soft-failed events 🤔

Especially as the reasoning there is "Skipping the insertion of these types of 'invalid' events is useful for performance reasons because they would fill up the table yet we wouldn't show them to clients anyway." and soft_failed covers the behavior of showing events to clients.

Contributor


Ahh, this comment below clarifies why we distinguish this:

Note: Soft-failed sticky events ARE inserted, as their soft-failed status could be re-evaluated later.
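That distinction could be sketched as an insertion filter like the following. This is a hypothetical helper illustrating the rule discussed above, not the actual insert_sticky_events_txn code:

```python
def should_insert_sticky_event(meta) -> bool:
    """Decide whether a sticky event belongs in the sticky-events table.

    Spammy events are skipped: their verdict is final, they would fill
    up the table, and we would never show them to clients anyway.
    Soft-failed events ARE inserted, because for sticky events the
    soft-failed status may be re-evaluated later.
    """
    if getattr(meta, "policy_server_spammy", False):
        return False
    if getattr(meta, "spam_checker_spammy", False):
        return False
    # Soft-failed (but not spammy) events still pass the filter.
    return True
```

This is why `spam_checker_spammy` cannot simply be folded into `soft_failed`: the two statuses lead to different insertion behavior.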

Comment thread tests/module_api/test_spamchecker.py
{self.remote_user_join_event.event_id},
)

def test_federated_events_with_spam_checker_metadata(self) -> None:
Contributor


It would be nice to add an in-repo Complement test for this sort of thing. The only hiccup would be setting up/configuring the spam checker module, but we could always configure one and have some specific constant (e.g. SPAM_CHECKER_SPAM) that triggers this behavior.

Contributor Author


I'm not sure how much it fits; I feel the trial test is adequate and is nicer to develop on.
The spam_checker_spammy field is also internal so wouldn't be visible externally.
Am I missing something really beneficial that would be covered?

Contributor


To make it more clear, I think the trial test is sufficient 👍

It was more of an ideal. One benefit would be avoiding the room version details (like this problem). We could also avoid the internal details of manually puppeting federation here.

In terms of checking, we would probably verify via /sync (send a sentinel event after the spammy one) and/or the admin APIs, if the spam_checker_spammy value were actually important.

Comment thread tests/module_api/test_spamchecker.py Outdated
Comment thread synapse/storage/databases/main/sticky_events.py
@reivilibre reivilibre force-pushed the rei/sticky_events_spam_checker_spammy branch from 07b866c to 71137f6 on April 10, 2026 12:33
Comment thread tests/storage/test_sticky_events.py Outdated
token = self.login(user_id, "pass")
room_id = self.helper.create_room_as(user_id, tok=token)

start_id = self.store.get_max_sticky_events_stream_id()
Contributor


Not exactly a problem of this PR specifically but the name get_max_sticky_events_stream_id() feels a bit off.

get_current_sticky_events_stream_id seems more appropriate. I see how it got the name "max" because that's what the underlying get_current_token() says but then it gets confusing when we also have get_max_allocated_token()

Recently dealt with this in #19558 / #19677 and ended up with

    async def get_current_quarantined_media_stream_id(self) -> int:
        """Gets the position of the quarantined media changes stream.

        Returns:
            int - the current stream ID
        """
        return self._quarantined_media_changes_id_gen.get_current_token()

    async def get_max_allocated_quarantined_media_stream_id(self) -> int:
        """Gets the maximum allocated position of the quarantined media changes stream.

        Returns:
            int - the maximum stream ID
        """
        return await self._quarantined_media_changes_id_gen.get_max_allocated_token()

Contributor Author

@reivilibre, Apr 13, 2026


The reason this name got introduced is that it's what account_data does, which I was using as a reference for thread subscriptions (IIRC).

And of course, since the 'how to add a stream' cheatsheet was documented, everyone has been cribbing off it, copying the pattern further.

You are very correct though: in a multi-writer stream, max_stream_id (or thereabouts) sounds like it would be the maximum position any writer had persisted.

Whereas the current one actually sounds like it would make more sense being called 'min', based on this docstring of the underlying function:

    def get_persisted_upto_position(self) -> int:
        """Get the max position where all previous positions have been
        persisted.

        Note: In the worst case scenario this will be equal to the minimum            # <--------------------
        position across writers. This means that the returned position here can
        lag if one writer doesn't write very often.
        """

I do prefer the 'current' idea and I think I will update all streams to match that (in a subsequent PR of course)

Contributor Author


We have even more variants; I have asked internally to give people a chance to weigh in on whether any one of them has a better reason to be the standard than the others.

  • stream_token e.g. get_to_device_stream_token
  • current_key e.g. self.sources.account_data.get_current_key() (TIL about this variant)
  • max_stream_id e.g. get_max_push_rules_stream_id
  • current_stream_id e.g. get_current_quarantined_media_stream_id

It might still make sense to exclude the events stream from the discussion/any potential change, since we have both a min and max there and therefore it's hard to drop the 'max' name.... (although we call those the room_min and room_max even though it's about the events stream :S)

After people have a chance to have some input I will go through and try to unify these

@reivilibre reivilibre merged commit 52c05c5 into develop Apr 15, 2026
109 of 115 checks passed
@reivilibre reivilibre deleted the rei/sticky_events_spam_checker_spammy branch April 15, 2026 15:53