[Core] Publish platform events via Ray Event Recorder #63329

Merged
edoakes merged 3 commits into ray-project:master from richabanker:platform-events-ray-event-recorder on Jun 2, 2026

@richabanker

Contributor

Description

Add support for publishing platform events via the Python Ray event exporter framework.

@richabanker richabanker requested review from a team, MengjinYan, dayshah and edoakes as code owners May 13, 2026 22:30

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces the PlatformEventBuilder class to support infrastructure platform events, such as those from Kubernetes, and integrates it into the Ray dashboard's observability module. The changes include initializing the EventRecorder in the dashboard head and emitting events during processing callbacks. Feedback suggests allowing the EventRecorder to generate unique IDs for event updates to avoid deduplication issues, refactoring environment variable checks for efficiency, and moving imports out of the event processing hot path.
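
To illustrate the suggestion about refactoring the environment variable check, here is a minimal sketch that parses the flag once and caches the result. Only the variable name `RAY_ENABLE_PYTHON_RAY_EVENT` and the accepted values (`"true"`, `"1"`) come from this PR; `env_flag` and `PlatformEventHeadSketch` are hypothetical names used for illustration and are not part of Ray.

```python
import os

# Assumption: the accepted spellings mirror the check quoted in the review
# ("true" or "1", compared case-insensitively).
_TRUTHY = ("true", "1")


def env_flag(name: str, default: str = "False") -> bool:
    """Return True when the environment variable holds a truthy value."""
    return os.environ.get(name, default).lower() in _TRUTHY


class PlatformEventHeadSketch:
    """Illustrative stand-in, not the real dashboard module."""

    def __init__(self) -> None:
        # Read the flag once at construction time instead of once per event.
        self.emit_enabled = env_flag("RAY_ENABLE_PYTHON_RAY_EVENT")

    def process(self, event: dict) -> bool:
        # The per-event path only consults the cached boolean.
        return self.emit_enabled
```

The trade-off is that a flag change after construction is not picked up, which is usually acceptable for a process-level feature switch.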

Comment thread python/ray/dashboard/modules/platform_events/platform_event_head.py
Comment thread python/ray/dashboard/modules/platform_events/platform_event_head.py Outdated
Comment on lines +134 to +142
if os.environ.get("RAY_ENABLE_PYTHON_RAY_EVENT", "False").lower() in (
    "true",
    "1",
):
    try:
        from ray._common.observability.platform_events import (
            PlatformEventBuilder,
        )
        from ray._raylet import EventRecorder

Contributor

medium

Several modules are imported inside _process_event_callback for every event. Since this callback can be invoked frequently, these imports introduce unnecessary overhead. It is recommended to move these imports to the top of the file or at least outside the hot path of the callback.
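
One way to keep the lookup out of the hot path while still tolerating a missing dependency is to resolve the module once and cache the result. The sketch below uses only the standard library; `load_optional` and `process_event_callback` are hypothetical names, and `json` stands in for the real Ray imports, which are not reproduced here.

```python
import importlib
from functools import lru_cache
from types import ModuleType
from typing import Optional


@lru_cache(maxsize=None)
def load_optional(module_name: str) -> Optional[ModuleType]:
    """Import a module at most once per name; return None if it is unavailable."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def process_event_callback(event: dict) -> str:
    # "json" is a placeholder dependency. After the first call this is a
    # cache hit, so no import machinery runs per event.
    codec = load_optional("json")
    if codec is None:
        return repr(event)
    return codec.dumps(event, sort_keys=True)
```

When the dependency is always available, a plain top-of-file import is simpler and achieves the same effect.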

@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch from 8718785 to b35d418 Compare May 13, 2026 22:40
Comment thread python/ray/dashboard/modules/aggregator/tests/test_ray_platform_events.py Outdated
@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch from b35d418 to bac7c35 Compare May 13, 2026 23:25
Comment thread python/ray/dashboard/modules/platform_events/platform_event_head.py
@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch from bac7c35 to 271db30 Compare May 13, 2026 23:36
Comment thread python/ray/dashboard/modules/platform_events/tests/test_platform_event_head.py Outdated
@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch 3 times, most recently from a8a923c to 1e59f09 Compare May 14, 2026 00:21
Comment thread python/ray/dashboard/modules/platform_events/platform_event_head.py
@richabanker

Contributor Author

@sampan-s-nayak could you please help take a pass at this PR whenever possible for you? Thanks!

@@ -69,32 +103,56 @@ def thread_safe_callback(ray_event: RayEvent):
         )

     def _process_event_callback(self, ray_event: RayEvent):
-        """Callback running in the main asyncio loop to cache events."""
+        """Thread-safe entry point that dispatches event caching to the main asyncio loop."""

Contributor Author

change driven by this comment
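
For readers unfamiliar with the pattern the new docstring describes, here is a minimal sketch of a thread-safe entry point that hands work to the main asyncio loop. It assumes `loop.call_soon_threadsafe` as the dispatch mechanism, which this thread does not confirm; `EventCacheSketch` and `demo` are hypothetical, and events are plain strings rather than `RayEvent` objects.

```python
import asyncio
import threading
from typing import List


class EventCacheSketch:
    """Illustrative stand-in for the dashboard module, not the real class."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
        self.cached: List[str] = []

    def process_event_callback(self, event: str) -> None:
        """Thread-safe entry point: may be called from any thread."""
        self._loop.call_soon_threadsafe(self._process_event_internal, event)

    def _process_event_internal(self, event: str) -> None:
        """Runs on the loop thread, so the cache needs no lock."""
        self.cached.append(event)


async def demo() -> List[str]:
    loop = asyncio.get_running_loop()
    cache = EventCacheSketch(loop)
    worker = threading.Thread(
        target=cache.process_event_callback, args=("pod-scheduled",)
    )
    worker.start()
    # Join in an executor so the event loop itself is never blocked.
    await loop.run_in_executor(None, worker.join)
    # Yield once so any callback still queued on the loop gets to run.
    await asyncio.sleep(0)
    return cache.cached
```

Running `asyncio.run(demo())` drives the example: the worker thread only schedules the call, and the mutation of `cached` happens on the loop thread.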

@ray-gardener ray-gardener Bot added the core, observability, and community-contribution labels May 14, 2026
@edoakes

edoakes commented May 15, 2026

Collaborator

@sampan-s-nayak PTAL

Comment thread python/ray/_private/ray_constants.py Outdated
Comment thread python/ray/dashboard/modules/platform_events/platform_event_head.py Outdated
Comment thread python/ray/dashboard/modules/platform_events/platform_event_head.py Outdated
@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch 2 times, most recently from 64e7752 to 07e69ed Compare May 19, 2026 22:58
@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch 2 times, most recently from 8a3ecd9 to 61c73d8 Compare May 28, 2026 00:02
@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch 2 times, most recently from 38f99f7 to 7286bd3 Compare May 28, 2026 00:12
@richabanker

Contributor Author

@sampan-s-nayak I think all comments are now addressed, any other concerns or feedback you have for this?

@sampan-s-nayak sampan-s-nayak left a comment

Contributor

thanks for working on this, LGTM

@sampan-s-nayak

Contributor

@richabanker looks like there are a couple of CI failures

@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch from ccd1cc8 to 961765a Compare June 1, 2026 18:56

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 961765aa39c29585ca1a20a9ca786f02368da2bb.

    )
    EventRecorder.emit(cython_event)
except Exception as e:
    logger.warning(f"Failed to emit platform event via EventRecorder: {e}")

Missing initialization guard causes repeated emit failures per event

Low Severity

If EventRecorder.initialize() is never called (because head_node_id_hex is None or the agent address lookup returns None), _process_event_internal still attempts EventRecorder.emit() on every single incoming event. Each call hits the except branch and logs a warning. In a busy Kubernetes cluster this could produce hundreds of warning-level log messages per minute with no mechanism to stop retrying.
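
A guard of the kind this comment asks for could look roughly like the sketch below: track whether initialization succeeded, skip emission when it did not, and log the warning only once. `GuardedEmitter` is a hypothetical wrapper, and `emit_fn` stands in for the recorder's emit call; none of this is the code that was merged.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("platform_events_sketch")


class GuardedEmitter:
    """Hypothetical wrapper; `emit_fn` stands in for the recorder's emit call."""

    def __init__(self, emit_fn: Callable[[dict], None]) -> None:
        self._emit_fn = emit_fn
        self._initialized = False
        self._warned = False

    def initialize(self, head_node_id_hex: Optional[str]) -> None:
        # Mirrors the failure mode described above: without a head node id
        # the recorder is never set up.
        self._initialized = head_node_id_hex is not None

    def emit(self, event: dict) -> bool:
        if not self._initialized:
            if not self._warned:
                # Warn a single time instead of once per incoming event.
                logger.warning(
                    "Recorder not initialized; skipping platform event emission."
                )
                self._warned = True
            return False
        self._emit_fn(event)
        return True
```

Returning a boolean lets the caller count skipped events without relying on log output.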

Signed-off-by: Richa Banker <richabanker@google.com>
Signed-off-by: Richa Banker <richabanker@google.com>
Signed-off-by: Richa Banker <richabanker@google.com>
@richabanker richabanker force-pushed the platform-events-ray-event-recorder branch from e9beaf2 to 5936b11 Compare June 1, 2026 21:08
@richabanker

Contributor Author

> @richabanker looks like there are a couple of CI failures

Fixed them now, thanks!

@edoakes edoakes added the go label Jun 2, 2026
@edoakes edoakes enabled auto-merge (squash) June 2, 2026 19:02
@edoakes edoakes merged commit 080c195 into ray-project:master Jun 2, 2026
9 checks passed
rueian pushed a commit to rueian/ray that referenced this pull request Jun 4, 2026
)

## Description
Add support for publishing Platform events via the python ray event
exporter framework

---------

Signed-off-by: Richa Banker <richabanker@google.com>
@richabanker richabanker deleted the platform-events-ray-event-recorder branch June 4, 2026 18:39
Labels

community-contribution: Contributed by the community
core: Issues that should be addressed in Ray Core
go: add ONLY when ready to merge, run all tests
observability: Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling

3 participants