Skip to content

mlog: Add async I/O and sampling support to MLogger infrastructure#209

Open
akash-a-n wants to merge 3 commits into
mainfrom
feature/mlog-sampling
Open

mlog: Add async I/O and sampling support to MLogger infrastructure#209
akash-a-n wants to merge 3 commits into
mainfrom
feature/mlog-sampling

Conversation

@akash-a-n
Copy link
Copy Markdown

@akash-a-n akash-a-n commented May 13, 2026

Enhance MoQ-level logging (mlog) with off-thread I/O and per-session sampling:

  • Async off-thread disk I/O: FileMLogger now supports optional setWriteExecutor() for asynchronous file writes. Pre-formatting of log events happens on the calling thread (safe on session's event loop), while disk I/O happens on a background executor. Avoids this capture by moving all data into the lambda.

  • Path-agnostic factory: Simplified FileMLoggerFactory constructor to accept only VantagePoint. Callers now set path per-logger via setPath(), enabling per-session file naming without factory reconstruction.

  • Session sampling: New SamplingMLoggerFactory wraps any MLoggerFactory and applies probabilistic sampling via folly::Random::oneIn() (thread-safe). Enables selective logging (e.g., 1% of sessions) for production deployments.

  • moxygen/mlog/FileMLogger.{h,cpp}: Add setWriteExecutor() and async outputLogs()

  • moxygen/mlog/FileMLoggerFactory.h: Path-agnostic API, remove path constructor param

  • moxygen/mlog/SamplingMLoggerFactory.{h,cpp}: New sampling decorator (generic utility)

  • moxygen/mlog/CMakeLists.txt: Add moxygen_mlog_sampling_mlogger_factory target

  • moxygen/moqtest/MoQTestServerMain.cpp: Update to use new path-agnostic API

  • moxygen/samples/date/MoQDateServer.cpp: Explicit setPath() call for sample code

All data passed to background executor is moved (no references, no this capture):

  • Pre-format logs on event thread (where formatLog() is safe)
  • Move serialized strings + path into lambda on background executor
  • Safe for logger destruction before async task completes

Sampling uses xoshiro256pp PRNG via folly::Random::oneIn(), providing:

  • Thread-safe sampling across EventBase threads
  • Better statistical properties than modulo-based approaches
  • O(1) session creation overhead regardless of sample rate

This change is Reviewable

@akash-a-n akash-a-n force-pushed the feature/mlog-sampling branch from fe14f0b to d0a073c Compare May 13, 2026 14:32
@akash-a-n akash-a-n changed the title mlog: Add async I/O and sampling support to MLogger infrastructure [WIP] mlog: Add async I/O and sampling support to MLogger infrastructure May 14, 2026
@akash-a-n akash-a-n force-pushed the feature/mlog-sampling branch 2 times, most recently from b5baec9 to 0a7bdb5 Compare May 14, 2026 18:45
@akash-a-n akash-a-n changed the title [WIP] mlog: Add async I/O and sampling support to MLogger infrastructure mlog: Add async I/O and sampling support to MLogger infrastructure May 14, 2026
@akash-a-n akash-a-n force-pushed the feature/mlog-sampling branch 2 times, most recently from 57d0ded to 37aa9fc Compare May 20, 2026 17:00
Copy link
Copy Markdown
Collaborator

@afrind afrind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The core functionality looks good. I think some of these changes can go upstream -- at least FileMLogger/Factory? If so having a stack of commits like:

FileMLogger changes
SamplingMlogger
Add MLogger to pico

As long as there's a named branch for the upstreamable piece, I can make a PR from it.

@afrind reviewed 11 files and all commit messages, and made 5 comments.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on akash-a-n, gmarzot, michalhosna, mondain, Oxyd, peterchave, suhasHere, and TimEvens).


moxygen/mlog/FileMLogger.cpp line 18 at r1 (raw file):

  std::string path;
  if (dir_) {
    // Try DCID first, then fall back to server's own CID (srcCid), otherwise

Can we require dcid and remove the fallback logic?


moxygen/mlog/FileMLogger.cpp line 49 at r1 (raw file):

  if (writeExecutor_) {
    // Pre-format logs on the calling thread (where formatLog() is safe to

Maybe make this a lamba that's executed in both cases, either inline or via ->add


moxygen/mlog/SamplingMLoggerFactory.cpp line 16 at r1 (raw file):

    : inner_(std::move(inner)), sampleRate_(sampleRate) {
  // bucketSize = how many sessions per one logged session, e.g. 0.01 -> 100
  bucketSize_ = static_cast<uint32_t>(1.0f / sampleRate_);

Watch for divide by 0 here? Can you handle the sample rate checks you have in createMLogger here so it's always sane there (would oneIn(0) always return false?


moxygen/mlog/test/SamplingMLoggerFactoryTest.cpp line 74 at r1 (raw file):

// Use a large sample to stay within a wide tolerance (30–70%) with
// overwhelming probability, avoiding flakiness.
TEST_F(SamplingMLoggerFactoryTest, HalfRate_ApproximatelyHalfSampled) {

Seems like it's just testing oneIn - not sure it's necessary? Actually that may be more or less true for most of these tests?

@akash-a-n akash-a-n force-pushed the feature/mlog-sampling branch 3 times, most recently from dfab932 to 78c2dfa Compare May 26, 2026 15:27
Copy link
Copy Markdown
Author

@akash-a-n akash-a-n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split it into 3 commits as suggested.
created: #236 that contains only FileMLogger changes to be upstreamed

@akash-a-n made 5 comments.
Reviewable status: 6 of 11 files reviewed, 4 unresolved discussions (waiting on afrind, gmarzot, michalhosna, mondain, Oxyd, peterchave, suhasHere, and TimEvens).


moxygen/mlog/FileMLogger.cpp line 18 at r1 (raw file):

Previously, afrind wrote…

Can we require dcid and remove the fallback logic?

Done.


moxygen/mlog/FileMLogger.cpp line 49 at r1 (raw file):

Previously, afrind wrote…

Maybe make this a lamba that's executed in both cases, either inline or via ->add

Done.


moxygen/mlog/SamplingMLoggerFactory.cpp line 16 at r1 (raw file):

Previously, afrind wrote…

Watch for divide by 0 here? Can you handle the sample rate checks you have in createMLogger here so it's always sane there (would oneIn(0) always return false?

Done.


moxygen/mlog/test/SamplingMLoggerFactoryTest.cpp line 74 at r1 (raw file):

Previously, afrind wrote…

Seems like it's just testing oneIn - not sure it's necessary? Actually that may be more or less true for most of these tests?

Done.

Copy link
Copy Markdown
Author

@akash-a-n akash-a-n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akash-a-n made 1 comment.
Reviewable status: 6 of 11 files reviewed, 4 unresolved discussions (waiting on afrind, gmarzot, michalhosna, mondain, Oxyd, peterchave, suhasHere, and TimEvens).


moxygen/mlog/FileMLogger.cpp line 18 at r1 (raw file):

Previously, akash-a-n wrote…

Done.

@afrind I was doing some more testing today with conformance scripts and noticed that mvfst allows setting 0 len connection-id which is used by all conformance tests.
Is this not a real production use case? Do we still want to require DCID and fail logging if it isn't present?
PS: the current impl has a small bug, it checks for dcid only; where as the dcid_->hex() can be "".

@akash-a-n akash-a-n force-pushed the feature/mlog-sampling branch from 78c2dfa to 27bd07d Compare May 29, 2026 07:26
akash-a-n added 3 commits May 29, 2026 17:26
- Add setWriteExecutor() for asynchronous off-thread disk I/O
- Pre-format logs on calling thread, move serialized data to executor
- Add setDir() for directory prefix support via derivePath()
- Avoid this capture by moving all data into lambda
- Update FileMLoggerFactory to support new features
- Add test coverage for FileMLogger and FileMLoggerFactory
- New wrapper factory that applies per-session sampling
- Uses folly::Random::oneIn() for thread-safe sampling
- Enables selective logging (e.g., 1%% of sessions) in production
- Add test coverage for SamplingMLoggerFactory
- Log DCID for incoming QUIC connections in onNewConnectionImpl()
- Log DCID for WebTransport CONNECT in onWebTransportConnectImpl()
- Enable MLogger creation and attachment to MoQ sessions
@akash-a-n akash-a-n force-pushed the feature/mlog-sampling branch from 27bd07d to 6ae1cc9 Compare May 29, 2026 11:57
Copy link
Copy Markdown
Collaborator

@afrind afrind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@afrind reviewed 9 files and all commit messages, made 2 comments, and resolved 4 discussions.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on gmarzot, michalhosna, mondain, Oxyd, peterchave, suhasHere, and TimEvens).


moxygen/mlog/FileMLogger.cpp line 18 at r1 (raw file):

Previously, akash-a-n wrote…

@afrind I was doing some more testing today with conformance scripts and noticed that mvfst allows setting 0 len connection-id which is used by all conformance tests.
Is this not a real production use case? Do we still want to require DCID and fail logging if it isn't present?
PS: the current impl has a small bug, it checks for dcid only; where as the dcid_->hex() can be "".

I guess we could fallback to a hash of the address tuple?


moxygen/mlog/FileMLogger.cpp line 20 at r3 (raw file):

    fileObj << jsonLog << std::endl;
  }
  fileObj.close();

Is this implicit in fileObj dtor?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants