Skip to content

Trace analysis notebook#33

Open
EoghanOConnor wants to merge 3 commits into
redhat-et:mainfrom
EoghanOConnor:analysis
Open

Trace analysis notebook#33
EoghanOConnor wants to merge 3 commits into
redhat-et:mainfrom
EoghanOConnor:analysis

Conversation

@EoghanOConnor

@EoghanOConnor EoghanOConnor commented May 1, 2026

Copy link
Copy Markdown
Collaborator

Summary by CodeRabbit

  • New Features
    • Added trace analysis notebook for examining MLflow experiments. Features include data retrieval, session aggregation, and visualizations of cost-per-job and job count metrics across sessions.

@review-notebook-app

Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@coderabbitai

coderabbitai Bot commented May 1, 2026

Copy link
Copy Markdown

Warning

Review limit reached

@EoghanOConnor, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 52 minutes and 46 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: eb0b20f2-f7d2-4148-998c-87fa2c6a9078

📥 Commits

Reviewing files that changed from the base of the PR and between 12105f7 and da082f3.

📒 Files selected for processing (1)
  • analysis/trace_analysis.ipynb
📝 Walkthrough

Walkthrough

This pull request adds a new Jupyter notebook for analyzing MLflow trace data. The notebook connects to an MLflow server via environment-configured parameters, retrieves all traces for a specified experiment, normalizes trace and request/response metadata into pandas DataFrames, extracts job identifiers from response text, aggregates metrics at the session level, and generates matplotlib visualizations of cost per job.

Changes

MLflow Trace Analysis Notebook

Layer / File(s) Summary
MLflow connection and trace retrieval
analysis/trace_analysis.ipynb
Environment variables configure the MLflow tracking URI and experiment name. The notebook creates an MLflow client, looks up the experiment by name, and retrieves all traces using paginated search_traces calls, aggregating results into all_traces.
Trace data normalization and exploration
analysis/trace_analysis.ipynb
Each trace is parsed to extract request/response JSON, span data, token counts, costs, session ID, and timestamps into normalized dicts. Results are converted to a DataFrame, displayed, and iterated through for console inspection of all fields and values.
Session aggregation and cost-per-job computation
analysis/trace_analysis.ipynb
Traces are grouped by session ID. Helper functions extract_job_ids and count_job_ids parse "Job " occurrences from response text. Augmented traces with job counts are used to compute per-session total tokens and costs, select the representative row per session (max job count), merge totals back, and derive cost_per_job with division-by-zero handling.
Matplotlib plots and analysis visualization
analysis/trace_analysis.ipynb
Multiple matplotlib figures are generated from session-level aggregated statistics, including line plots and bar charts showing average cost per job as a function of job count, with an annotated bar chart that includes session count labels.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main change: adding a Jupyter notebook for trace analysis with MLflow integration, data transformation, and visualization.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@EoghanOConnor EoghanOConnor force-pushed the analysis branch 2 times, most recently from 3a832ea to 9134d90 Compare June 3, 2026 16:11
@EoghanOConnor EoghanOConnor changed the title [WIP]: Trace analysis notebook Trace analysis notebook Jun 3, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
analysis/trace_analysis.ipynb (2)

54-68: ⚡ Quick win

Cells 2 and 3 duplicate the experiment-lookup logic.

The tracking-URI setup, client creation, and get_experiment_by_name block here repeat what cell 2 (lines 29–44) already does, including a re-import of mlflow/MlflowClient. Consolidate into a single setup cell so experiment_id has one source of truth and the notebook is easier to run top-to-bottom.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@analysis/trace_analysis.ipynb` around lines 54 - 68, Remove the duplicated
MLflow setup and experiment lookup by consolidating the repeated imports and
logic into a single initialization cell: keep one mlflow import and MlflowClient
creation, one mlflow.set_tracking_uri(MLFLOW_TRACKING_URI), one client =
MlflowClient(), and a single mlflow.get_experiment_by_name(...) block that
assigns experiment_id; delete the redundant cell contents (the second import,
set_tracking_uri, client creation, and try/except that recreates experiment_id)
so experiment_id is sourced only once and the notebook runs top-to-bottom.

197-198: 💤 Low value

This cell renders a meaningless DataFrameGroupBy object.

df_traces.groupby('session_id') produces a lazy GroupBy, and display of it shows only its repr, not data. df_sessions is also reassigned in the next cell, making this assignment dead. Consider removing the cell or replacing it with an actual aggregation/.size() view.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@analysis/trace_analysis.ipynb` around lines 197 - 198, The cell assigns
df_sessions = df_traces.groupby('session_id') and then display(df_sessions),
which just shows a GroupBy repr and is dead because df_sessions is reassigned
later; replace or remove it: either remove the entire cell, or change the
assignment to an actual aggregation/view such as df_sessions =
df_traces.groupby('session_id').size().reset_index(name='count') (or another
meaningful aggregation like .agg(...)) and then display(df_sessions) so the
DataFrame is informative; update references to df_sessions accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@analysis/trace_analysis.ipynb`:
- Around line 120-139: The parsing loop accesses trace.data.response and
trace.info.trace_metadata keys unsafely (response_obj, response_text,
token_usage, total_cost, session_id) which can raise TypeError/KeyError and
abort the whole batch; update the logic around response_obj, span_names,
token_usage and session_id to defensively validate and default values: use
trace.data.get('response') or getattr(trace.data, 'response', None), check
isinstance(response_obj, (str, dict)) and wrap json.loads in try/except (skip or
continue on parse failure), use
trace.info.trace_metadata.get('mlflow.trace.tokenUsage') and
.get('mlflow.trace.cost') and .get('mlflow.trace.session') with sensible
defaults or skip traces missing required metadata, and only set response_text
when response_obj is a dict containing the 'response' key to avoid crashes
(apply these guards where spans, span_names, token_usage, total_cost, and
session_id are computed).
- Around line 93-98: The exception handler leaves the variable traces undefined
causing a NameError later; before the try (or inside the except) ensure traces
is initialized to an empty list so downstream cells can iterate safely. Locate
the block that assigns all_traces to traces (and the surrounding try/except) and
set traces = [] before entering the try or add traces = [] in the except
alongside the existing print/traceback so that references to traces (e.g., the
subsequent for trace in traces) degrade gracefully.
- Around line 235-240: Remove the early unguarded assignment of
df_sessions['cost_per_job'] (the line computing session_total_costs / job_count)
to avoid divide-by-zero warnings; keep the later guarded calculation that checks
job_count > 0 before dividing. Specifically, delete the initial cost_per_job
assignment referencing df_sessions, session_total_costs and job_count and rely
solely on the protected computation already present further down.

---

Nitpick comments:
In `@analysis/trace_analysis.ipynb`:
- Around line 54-68: Remove the duplicated MLflow setup and experiment lookup by
consolidating the repeated imports and logic into a single initialization cell:
keep one mlflow import and MlflowClient creation, one
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI), one client = MlflowClient(), and a
single mlflow.get_experiment_by_name(...) block that assigns experiment_id;
delete the redundant cell contents (the second import, set_tracking_uri, client
creation, and try/except that recreates experiment_id) so experiment_id is
sourced only once and the notebook runs top-to-bottom.
- Around line 197-198: The cell assigns df_sessions =
df_traces.groupby('session_id') and then display(df_sessions), which just shows
a GroupBy repr and is dead because df_sessions is reassigned later; replace or
remove it: either remove the entire cell, or change the assignment to an actual
aggregation/view such as df_sessions =
df_traces.groupby('session_id').size().reset_index(name='count') (or another
meaningful aggregation like .agg(...)) and then display(df_sessions) so the
DataFrame is informative; update references to df_sessions accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 118f7269-c342-4106-b40e-d2c450578e1a

📥 Commits

Reviewing files that changed from the base of the PR and between 108dc23 and 12105f7.

📒 Files selected for processing (1)
  • analysis/trace_analysis.ipynb

Comment on lines +93 to +98
" traces = all_traces\n",
"\n",
"except Exception as e:\n",
" print(f\"Error: {e}\")\n",
" import traceback\n",
" traceback.print_exc()"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

traces is undefined when fetching fails.

On the exception path traces is never assigned, so the next cell's for trace in traces raises NameError instead of surfacing the original failure. Initialize traces = [] before the try (or in the except) so downstream cells degrade gracefully.

Proposed fix
+    traces = []
     # Fetch ALL traces with pagination
     all_traces = []
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@analysis/trace_analysis.ipynb` around lines 93 - 98, The exception handler
leaves the variable traces undefined causing a NameError later; before the try
(or inside the except) ensure traces is initialized to an empty list so
downstream cells can iterate safely. Locate the block that assigns all_traces to
traces (and the surrounding try/except) and set traces = [] before entering the
try or add traces = [] in the except alongside the existing print/traceback so
that references to traces (e.g., the subsequent for trace in traces) degrade
gracefully.

Comment on lines +120 to +139
" response_obj = trace.data.response\n",
"\n",
" # If it's a string, parse it\n",
" if isinstance(response_obj, str):\n",
" response_obj = json.loads(response_obj)\n",
"\n",
" # Get the trace breakdown (spans)\n",
" spans = trace.data.spans\n",
"\n",
" span_names = [span.name for span in trace.data.spans]\n",
"\n",
" token_usage = json.loads(trace.info.trace_metadata['mlflow.trace.tokenUsage'])\n",
" total_tokens = token_usage['total_tokens']\n",
"\n",
" token_usage = json.loads(trace.info.trace_metadata['mlflow.trace.cost'])\n",
" total_cost = token_usage['total_cost']\n",
"\n",
" session_id = trace.info.trace_metadata['mlflow.trace.session']\n",
" # Now get the actual response text\n",
" response_text = response_obj['response']\n",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unguarded response/metadata access can crash the parse loop.

Two reachable failures here:

  • response_obj is only JSON-parsed when it is a str; if trace.data.response is None (or a non-dict), response_obj['response'] at line 139 raises TypeError/KeyError.
  • The trace.info.trace_metadata[...] lookups (lines 131–137) assume mlflow.trace.tokenUsage, mlflow.trace.cost, and mlflow.trace.session are always present; any trace missing these keys aborts the whole loop via the outer except, dropping all already-parsed rows.

Use .get() with defaults and skip/guard traces lacking the required metadata so one malformed trace doesn't fail the entire batch.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@analysis/trace_analysis.ipynb` around lines 120 - 139, The parsing loop
accesses trace.data.response and trace.info.trace_metadata keys unsafely
(response_obj, response_text, token_usage, total_cost, session_id) which can
raise TypeError/KeyError and abort the whole batch; update the logic around
response_obj, span_names, token_usage and session_id to defensively validate and
default values: use trace.data.get('response') or getattr(trace.data,
'response', None), check isinstance(response_obj, (str, dict)) and wrap
json.loads in try/except (skip or continue on parse failure), use
trace.info.trace_metadata.get('mlflow.trace.tokenUsage') and
.get('mlflow.trace.cost') and .get('mlflow.trace.session') with sensible
defaults or skip traces missing required metadata, and only set response_text
when response_obj is a dict containing the 'response' key to avoid crashes
(apply these guards where spans, span_names, token_usage, total_cost, and
session_id are computed).

Comment on lines +235 to +240
"# Add the session total tokens and costs\n",
"df_sessions = df_sessions.merge(session_tokens, on='session_id', how='left')\n",
"df_sessions = df_sessions.merge(session_costs, on='session_id', how='left')\n",
"\n",
"# Calculate cost per job\n",
"df_sessions['cost_per_job'] = df_sessions['session_total_costs'] / df_sessions['job_count']\n",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In pandas, does DataFrame.merge with a named Series whose index name matches the on key raise KeyError, requiring reset_index?

💡 Result:

In pandas, DataFrame.merge treats a named Series as a DataFrame with a single column, where the column name is the name of the Series [1][2]. The merge parameters on, left_on, and right_on can reference either column names or index level names [3]. If you attempt to merge a DataFrame and a named Series using a key that exists as an index name in the Series, pandas should correctly identify it as a join key without requiring you to manually reset the index [3]. The merge operation treats index levels as valid merge keys, and you do not need to convert the index to a column to perform the merge [3]. If you encounter a KeyError during a merge, it is typically because the specified key does not exist in the object being merged or because of an ambiguity in how the merge parameters are configured [4]. You do not need to reset the index simply because the join key is an index name; you only need to ensure the merge parameters (such as left_on or right_on) correctly point to the intended index levels or columns [3].

Citations:


Remove dead cost_per_job assignment (avoid divide-by-zero warnings)

df_sessions['cost_per_job'] is computed once at ~line 240 as session_total_costs / job_count, then overwritten later (~248–251) with a guarded version (job_count > 0 ? ...). Remove the first computation and keep only the guarded calculation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@analysis/trace_analysis.ipynb` around lines 235 - 240, Remove the early
unguarded assignment of df_sessions['cost_per_job'] (the line computing
session_total_costs / job_count) to avoid divide-by-zero warnings; keep the
later guarded calculation that checks job_count > 0 before dividing.
Specifically, delete the initial cost_per_job assignment referencing
df_sessions, session_total_costs and job_count and rely solely on the protected
computation already present further down.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant