DI: support per-method iseqs for line probes on pre-loaded files#5501
DI: support per-method iseqs for line probes on pre-loaded files#5501
Conversation
BenchmarksBenchmark execution time: 2026-04-08 21:43:25 Comparing candidate commit bc3339c in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 45 metrics, 1 unstable metrics.
|
9012b29 to
dee647b
Compare
Typing analysisNote: Ignored files are excluded from the next sections.
|
* origin/di-per-method-iseq: (23 commits) Improve DITargetNotInRegistry error messages Update remote config test for new error message format Fix throwable integration test to include stacktrace Add integration test for line probe via per-method iseq Support per-method iseqs for line probes on pre-loaded files Fix Steep: allow nil for @current_components Fix StandardRB: add parens to ternary, remove extra blank line Fix backfill_registry test failures Disable GC during backfill integration test to prevent iseq collection Fix backfill_registry tests on Ruby < 3.1 (iseq_type unavailable) Initialize @current_components to suppress Ruby 2.6/2.7 warning Return nil explicitly from backfill_registry Remove respond_to?(:all_iseqs) guard from backfill_registry Add tests for calling backfill_registry twice Fix inaccurate comment: first_lineno == 0 heuristic matches iseq_type Document iseq_type Ruby 3.1 dependency and two-strategy backfill Review fixes: doc comments, error handling test coverage, spec_helper require Guard rb_iseq_type behind have_func for Ruby < 3.1 compat Add DI.iseq_type C extension; use type instead of first_lineno in backfill Stub backfill_registry in pre-existing tests ...
* origin/di-per-method-iseq: (23 commits) Improve DITargetNotInRegistry error messages Update remote config test for new error message format Fix throwable integration test to include stacktrace Add integration test for line probe via per-method iseq Support per-method iseqs for line probes on pre-loaded files Fix Steep: allow nil for @current_components Fix StandardRB: add parens to ternary, remove extra blank line Fix backfill_registry test failures Disable GC during backfill integration test to prevent iseq collection Fix backfill_registry tests on Ruby < 3.1 (iseq_type unavailable) Initialize @current_components to suppress Ruby 2.6/2.7 warning Return nil explicitly from backfill_registry Remove respond_to?(:all_iseqs) guard from backfill_registry Add tests for calling backfill_registry twice Fix inaccurate comment: first_lineno == 0 heuristic matches iseq_type Document iseq_type Ruby 3.1 dependency and two-strategy backfill Review fixes: doc comments, error handling test coverage, spec_helper require Guard rb_iseq_type behind have_func for Ruby < 3.1 compat Add DI.iseq_type C extension; use type instead of first_lineno in backfill Stub backfill_registry in pre-existing tests ...
dee647b to
4107334
Compare
|
✨ Fix all issues with BitsAI or with Cursor
|
873a846 to
af1e513
Compare
* origin/di-per-method-iseq: (27 commits) Fix iseq_type stub on Ruby < 3.1: guard with respond_to? Fix Steep: update RBS for raise_if_probe_in_loaded_features Improve DITargetNotInRegistry error messages Update remote config test for new error message format Fix throwable integration test to include stacktrace Add integration test for line probe via per-method iseq Support per-method iseqs for line probes on pre-loaded files Document iseq lifecycle and GC interactions for DI backfill Fix backfill integration test: keep top-level iseq alive across tests Fix Steep: allow nil for @current_components Fix StandardRB: add parens to ternary, remove extra blank line Fix backfill_registry test failures Disable GC during backfill integration test to prevent iseq collection Fix backfill_registry tests on Ruby < 3.1 (iseq_type unavailable) Initialize @current_components to suppress Ruby 2.6/2.7 warning Return nil explicitly from backfill_registry Remove respond_to?(:all_iseqs) guard from backfill_registry Add tests for calling backfill_registry twice Fix inaccurate comment: first_lineno == 0 heuristic matches iseq_type Document iseq_type Ruby 3.1 dependency and two-strategy backfill ...
When CodeTracker starts, use the all_iseqs C extension to populate the registry with instruction sequences for files that were loaded before tracking began. This enables line probes on third-party code and application code loaded at boot time. Only whole-file iseqs (first_lineno == 0) are backfilled — per-method iseqs require instrumenter changes to select the correct iseq for a target line and will be supported in a follow-up. Backfill does not overwrite entries from :script_compiled, which are authoritative. The C extension availability is checked via DI.respond_to?(:all_iseqs) so the code gracefully degrades when the extension is not compiled. - Added CodeTracker#backfill_registry - Called from CodeTracker#start after trace point is enabled - Added RBS signature - Added tests for backfill behavior and C extension fallback Co-Authored-By: Claude <noreply@anthropic.com>
- Added rescue block around backfill_registry so failures are best-effort (logged + telemetry) rather than propagating - Replaced all skip-based tests with mock-based tests that exercise backfill logic without requiring the compiled C extension - Added tests for: mixed iseq types, multiple files, error handling, suffix/exact lookup on backfilled entries, start ordering - 27 examples, 0 failures, 0 pending, 0 skipped Co-Authored-By: Claude <noreply@anthropic.com>
Tests the end-to-end flow: test class loaded before code tracking starts → CodeTracker#start triggers backfill via all_iseqs C extension → iseq recovered from object space → line probe installed on backfilled iseq → probe fires and captures local variables. Runs under rake spec:di_with_ext (requires compiled C extension). Three test cases: - Probe installs successfully on backfilled iseq - Probe fires when target line executes - Snapshot captures local variables from backfilled iseq Co-Authored-By: Claude <noreply@anthropic.com>
On macOS CI the C extension is compiled, so backfill_registry populates the CodeTracker registry with pre-loaded files during start. This broke existing tests that expect the registry to be empty after start or to contain exactly N explicitly-loaded files. Fix by stubbing backfill_registry in test contexts that exercise :script_compiled behavior. Backfill is tested separately in its own describe blocks. Affected contexts: - CodeTracker #start (before block) - CodeTracker shared context 'when code tracker is running' - CodeTracker #iseqs_for_path_suffix (around block) - Instrumenter shared context 'with code tracking' Co-Authored-By: Claude <noreply@anthropic.com>
…kfill The backfill filter used first_lineno == 0 to identify whole-file iseqs, but most whole-file iseqs from all_iseqs have first_lineno == 1. The new DI.iseq_type method reads the iseq type directly from the Ruby VM struct and returns a symbol (:top, :method, :block, :class, etc.). The backfill now filters by type == :top || type == :main, which correctly identifies whole-file iseqs regardless of first_lineno. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rb_iseq_type is an internal Ruby function that only exists in Ruby 3.1+. On Ruby 2.7 and 3.0, referencing it causes an undefined symbol error at load time, crashing the entire C extension (including all_iseqs and exception_message which work fine on those versions). Use have_func in extconf.rb to detect rb_iseq_type at compile time, and wrap the iseq_type function + registration in #ifdef HAVE_RB_ISEQ_TYPE. The Ruby code in code_tracker.rb already handles the missing method via DI.respond_to?(:iseq_type) with a first_lineno fallback. Co-Authored-By: Claude <noreply@anthropic.com>
… require - Add doc comments for rb_iseqw_new and rb_iseqw_to_iseq prototypes in di.c (internal Ruby functions used without documentation) - Add error handling test coverage for backfill_registry: verify logger.debug is called with the error message and telemetry.report is called when DI.current_component is available - Add test coverage for the first_lineno == 0 fallback path when iseq_type is unavailable (Ruby versions without rb_iseq_type) - Add missing require "datadog/di/spec_helper" to iseq_type_spec.rb for consistency with other ext specs - Fix skip message: iseq_type availability depends on rb_iseq_type in the Ruby runtime, not on the DI C extension Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- di.c: Document that rb_iseq_type was added in Ruby 3.1, explain the HAVE_RB_ISEQ_TYPE compile-time guard, and note the fallback path - code_tracker.rb: Replace "first_lineno == 0" YARD doc with full description of both strategies (iseq_type on 3.1+, first_lineno heuristic on older Rubies) and their tradeoffs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The YARD doc claimed the first_lineno == 0 fallback "can match top-level eval iseqs" but this is wrong. InstructionSequence.compile passes first_lineno = 1 (not 0), and require/load passes INT2FIX(0) in Ruby's rb_iseq_new_top/rb_iseq_new_main. Both strategies produce the same result in practice. Verified by reading Ruby 3.0 source (iseq.c lines 813-822): rb_iseq_new_with_opt(ast, name, path, realpath, INT2FIX(0), ...) → ISEQ_TYPE_TOP with first_lineno = 0 And compile path (iseq.c line 1064): rb_iseq_new_with_opt(&ast->body, label, file, realpath, line, ...) → line defaults to INT2FIX(1) for compile/eval Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify idempotency: calling backfill_registry a second time with the same iseqs doesn't duplicate entries (registry.key? guard). Also verify that a second call with new iseqs adds them without overwriting entries from the first call. Co-Authored-By: Claude <noreply@anthropic.com>
- Remove load_iseq from prepended module in after block instead of leaving the prepend as a permanent global mutation. Uses the same pattern as the DI instrumenter for method probe cleanup: the module stays in the ancestor chain but becomes a no-op. - Regenerate ruby_3.2_rails8_mysql2.gemfile to include bootsnap. Other Ruby version gemfiles will be regenerated in their respective CI environments. Co-Authored-By: Claude <noreply@anthropic.com>
5c06278 to
c8f8bf9
Compare
- Rewrite `unless` with mixed logical operators to `if` with negated conditions in backfill_registry (Style/UnlessLogicalOperators) - Remove redundant begin block and line continuations in Bootsnap test Co-Authored-By: Claude <noreply@anthropic.com>
…oads The skip guard only caught LoadError, but bootsnap can load successfully while its iseq cache fails to initialize (e.g. C extension not compiled for this environment). This caused test failures in CI jobs where bootsnap was available as a transitive dependency but couldn't install its iseq cache. Fix: probe-test the iseq cache in before(:all) — install it in a temp dir, clean up, and skip if any error occurs. Co-Authored-By: Claude <noreply@anthropic.com>
c8f8bf9 to
31c6a23
Compare
The require was inside before(:all), which loads bootsnap into the process at test execution time — after other test infrastructure has initialized. Moving it to file load time makes the side effect visible and predictable. If bootsnap isn't available or its iseq cache can't initialize, the constant BOOTSNAP_AVAILABLE is false and the entire describe block skips via before(:all). Also eliminates the standard:disable comment for Style/RedundantBegin since the rescue is now in a top-level begin/rescue/end, not a block. Co-Authored-By: Claude <noreply@anthropic.com>
… RBS type - compile_file_backfill_spec.rb: fix skip message "Ruby < 3.1" → "Ruby >= 3.1 only" - iseq_type_spec.rb: rename "eval with top-level code" → "compiled top-level code" (the test uses compile(), not eval) - code_tracker_spec.rb: remove two comments that just restate the expectation below them - code_tracker_spec.rb: rename "calls backfill_registry after trace point is enabled" test — it didn't actually verify ordering, only that active? is true after start - sig/datadog/di.rbs: iseq_type return type Symbol → Symbol? (C code returns Qnil when rb_iseqw_to_iseq returns NULL) Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…oaded code The prior text said "Code loaded before the tracer initializes cannot be instrumented with line probes." This is no longer accurate: #5496 backfills whole-file iseqs from object space at startup, and #5501 adds per-method iseq fallback for files whose whole-file iseq was garbage collected. Together these cover most pre-loaded application and gem code. The small remaining exception is files where all iseqs (whole-file and per-method) were GC'd before DI started. The updated text documents this accurately without overpromising. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
ade72cf to
22e7d9c
Compare
The guard fires in zero CI jobs — the di:bootsnap job always has bootsnap in the gemfile. It only fires when a developer runs the file manually without the correct gemfile, in which case a LoadError is the right signal. More importantly, the StandardError rescue around install!() could silently skip all tests if bootsnap's C extension fails on a new Ruby version, hiding a real compatibility break as green CI. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
This reverts commit ce589f5.
This reverts commit 55ae8c3.
The test "does not register compile_file iseqs" passed vacuously when the code was correct — the `if result` block meant zero assertions ran when result was nil. Add `expect(result).to be_nil` to directly assert the intended behavior. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
When a file's whole-file (:top) iseq has been garbage collected, per-method iseqs from all_iseqs can still be used to target line probes. This covers 86% of files that were previously untargetable. Changes: - backfill_registry stores per-method iseqs in per_method_registry (grouped by path) instead of discarding them - New iseq_for_line(suffix, line) method tries whole-file iseq first, then searches per-method iseqs for one whose trace_points include the target line - Instrumenter uses iseq_for_line when available, falls back to iseqs_for_path_suffix for compatibility Verified: 37 code_tracker tests pass, lint clean, types clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Loads a test class, GCs the top iseq, then verifies that the backfill finds the surviving method iseq and a line probe can be installed, fired, and captures local variables through it. Precondition checks skip the test if GC didn't collect the top iseq or if the C extension is unavailable. Verified: 3 integration tests pass (install, fire, capture locals). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The throwable now includes a stacktrace array (from the C extension commit). Also update error message assertion for the new raise_if_probe_in_loaded_features format. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The raise_if_probe_in_loaded_features now reports whether per-method iseqs exist or not, instead of the generic "not in code tracker registry" message. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Distinguish between "has per-method iseqs but none cover this line" and "has no surviving iseqs at all". Include the target line number in the error. Helps users understand why a line probe failed and whether the file is partially targetable. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The method gained line_no and code_tracker parameters in the per-method iseq support commit, but the RBS declaration was not updated. Co-Authored-By: Claude <noreply@anthropic.com>
On Ruby < 3.1, DI.iseq_type is not defined (the C extension gates it with have_func for rb_iseq_type). RSpec's partial double verification rejects stubs for undefined methods, causing 10 failures in the iseq_for_line and per-method iseq tests. Fix: guard iseq_type stubs with `if Datadog::DI.respond_to?(:iseq_type)`, matching the existing pattern in backfill_registry tests (lines 246-255). On Ruby < 3.1, the code under test falls back to first_lineno == 0. Co-Authored-By: Claude <noreply@anthropic.com>
The all_iseqs C extension is compiled by the di_with_ext CI task. Its absence means a broken build, not an unsupported platform. Tests should fail loudly if the extension is missing, not skip silently. The iseq_type skip (Ruby < 3.1) is retained — that is a genuine platform limitation. Co-Authored-By: Claude <noreply@anthropic.com>
Verifies the full DI stack works when compile_file was called on a file but the compile_file iseqs were GC'd (no reference held). The per-method iseq fallback installs and fires the probe correctly. Documents a known limitation: if compile_file child iseqs survive in object space (reference held), they are indistinguishable from require-produced children and backfill may register the wrong one, causing probes to silently not fire. Co-Authored-By: Claude <noreply@anthropic.com>
The backfill_registry docstring said "per-method iseqs require instrumenter changes and will be supported in a follow-up." This PR is that follow-up — the comment is now stale. Co-Authored-By: Claude <noreply@anthropic.com>
…ent section The section said "Code loaded before the tracer initializes cannot be instrumented with line probes." After #5496 (whole-file iseq backfill) and this PR (per-method iseq fallback), that limitation no longer exists. Remove the section entirely rather than rewrite it. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
22e7d9c to
bc3339c
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bc3339ceee
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| else | ||
| # Store per-method/block/class iseqs as fallback for files | ||
| # whose whole-file iseq was GC'd. These can be used to | ||
| # target line probes on lines within their range. | ||
| (per_method_registry[path] ||= []) << iseq |
There was a problem hiding this comment.
Skip compile_file top-level ISeqs in per-method fallback
This branch adds every non-whole_file iseq to per_method_registry, which includes :top compile_file iseqs (first_lineno != 0). Later, iseq_for_line can select one of these for a line probe, and the probe installs successfully even though targeted TracePoints on compile_file iseqs do not fire when the runtime executes the require/load-produced code. In environments where compile_file artifacts are retained, this leads to silent “installed but never firing” probes for pre-loaded files.
Useful? React with 👍 / 👎.
| matching = iseqs.find do |iseq| | ||
| iseq.trace_points.any? { |tp_line, _event| tp_line == line } |
There was a problem hiding this comment.
Ignore non-executable trace point events when matching lines
Line matching currently checks only tp_line and not the event kind, so it can select an iseq where the requested line appears only as :call (e.g., a def line). hook_line then enables only :line/:return/:b_return, so that probe is recorded as installed but never executes. This creates false-positive installations for line numbers that exist in trace metadata but are not executable for the event types DI actually subscribes to.
Useful? React with 👍 / 👎.
require "bootsnap" alone does not load Bootsnap::CompileCache::ISeq — that submodule must be required explicitly. Without it, the spec fails with NameError when referencing Bootsnap::CompileCache::ISeq. Also restores the BOOTSNAP_AVAILABLE skip guard that was removed, so the tests degrade gracefully when bootsnap or its ISeq cache is unavailable. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
What does this PR do?
When a file's whole-file (:top) iseq has been garbage collected, the backfill now stores surviving per-method iseqs and the instrumenter can use them to target line probes.
Motivation:
86% of files loaded before DI starts have their whole-file iseq GC'd but still have surviving per-method iseqs (measured on a Rails app with 146 gems: 1987 of 2309 missing files had 22739 surviving method iseqs). Previously these files were completely untargetable by line probes.
Change log entry
Yes. Dynamic Instrumentation line probes can now target more pre-loaded files whose whole-file instruction sequence was garbage collected.
Additional Notes:
backfill_registrynow stores per-method iseqs in a separateper_method_registrykeyed by pathiseq_for_line(suffix, line)method checks whole-file iseq first, falls back to per-method iseqs by checkingtrace_pointsfor the target lineiseq_for_linewhen available, falls back toiseqs_for_path_suffixfor compatibility with older code trackersHow to test the change?
Unit tests added + manual testing via gobo targeting stdlib with partial iseqs:
The screenshot shows
Set#addcaptured with argument1. The gobo demo callsSet.new([1, 2, 3])thens.add(4)ands.add(2), butSet.newinternally callsaddfor each element (1, 2, 3), so the probe fires onadd(1)first. Due to the DI rate limit (one snapshot per second by default), only the first invocation is captured.