feat: anonymizer measurement instrumentation and benchmark tooling by binaryaaron · Pull Request #177 · NVIDIA-NeMo/Anonymizer

binaryaaron · 2026-06-03T16:25:19Z

Summary

Adds in-repo measurement sessions, stage timers, DataDesigner workflow metrics, direct model workflow metrics, and sanitized per-record safety metrics.
Records sanitized evaluation_record rows when benchmark replace configs set evaluate: true, preserving judge verdict booleans and invalid-item counts without persisting evaluated dataframes, raw judge traces, prompts, entity values, or replacement strings.
Refactors measurement internals into the anonymizer.measurement package while preserving the public anonymizer.measurement import surface.
Adds the core benchmark runner for repeatable workloads: suite preflight validation, row slicing, case retries, per-case raw measurement shards, combined measurements.jsonl, table export, detection-artifact sidecars, raw DataDesigner message trace capture, and sanitized DataDesigner scheduler task traces.
Uses DataDesigner native LLM column tracing for standard LLM columns, adds a temporary Anonymizer private model-registry/facade shim for model-backed CustomColumnConfig traces, and emits safe dd_trace_coverage records that show native, private-facade, and unsupported coverage.
Adds the front-door measurement table exporter plus first-order benchmark and detection-artifact analysis tools, including case/group rollups for sanitized replace judge evaluation rows.
Factors shared measurement-tool support into tools/measurement/measurement_tools/ for CLI logging, export formats, table writing, manifests, and small aggregation helpers. Scripts keep their row models and metric semantics local.
Documents the in-repo observability system and puts the measurements.jsonl to Parquet/CSV/JSONL workflow front and center in the measurement tool README.
Refactors benchmark-output tests to use checked-in measurement fixtures, and uses the repo synthetic biography sample for valid benchmark workload tests.

This PR intentionally does not carry the larger derivative strategy/probe/comparison tools. Those are split to a stacked follow-up so this PR stays focused on measurement capture, export, and the basic benchmark harness.
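
To illustrate the stage-timer idea from the summary, here is a minimal sketch of a context manager that times a block and emits one record on exit. It is not the code in this PR: the signature, the `records` list, and field names such as `record_type` and `elapsed_sec` are assumptions made for the example.

```python
import time
from contextlib import contextmanager


@contextmanager
def stage_timer(records, stage, **fields):
    """Time a block and append one 'stage' record when it exits (illustrative only)."""
    updates = {}  # callers may add counts here once the work is done
    start = time.perf_counter()
    try:
        yield updates
    finally:
        # The record is emitted even if the timed block raises.
        records.append(
            {
                "record_type": "stage",  # hypothetical field name
                "stage": stage,
                "elapsed_sec": time.perf_counter() - start,
                **fields,
                **updates,
            }
        )


records = []
with stage_timer(records, "detect", input_row_count=3) as extra:
    rows = [r for r in ("a", "b", "c") if r != "b"]
    extra["output_row_count"] = len(rows)
```

Yielding a mutable dict lets the caller report values that are only known after the timed work finishes.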

Stack

feat: anonymizer measurement instrumentation and benchmark tooling #177: core measurement instrumentation, benchmark runner, export, and basic analysis
Follow-up stacked PR: derivative measurement probes, strategy comparisons, trace analysis, and signature-delta tools
feat: add local structured substitute benchmark strategy #184: local structured substitute replacement-map strategy
feat: add regex-backed detection benchmark strategies #183: regex-backed detection benchmark strategies

Alignment

Distributed DataDesigner execution is outside this PR. Detection export APIs, such as the work in #182, should build configs for external runtimes; the measurement tools here should consume the resulting measurement JSONL, detection artifacts, and trace sidecars.

Validation

Latest checks after benchmark analysis evaluation rollups:

uv run --frozen pytest tests/tools/test_measurement_tools.py tests/tools/test_benchmark_output_analysis.py -q
- Result: 39 passed, 6 existing DataDesigner model-config deprecation warnings
uv run --frozen pytest tests/tools/test_benchmark_output_analysis.py -q
- Result: 10 passed
uv run --frozen ruff check tools/measurement/analyze_benchmark_output.py tests/tools/test_benchmark_output_analysis.py
uv run --frozen ruff format --check tools/measurement/analyze_benchmark_output.py tests/tools/test_benchmark_output_analysis.py
uv run tools/codestyle/format.sh --check
tynav --ty-bin /root/.local/share/uv/tools/ty/bin/ty diagnostics tools/measurement/analyze_benchmark_output.py
- Result: no diagnostics for the edited file; existing pyproject.toml deprecated tool.ty.src.root warning remains
git diff --check

Earlier checks after sanitized evaluation metrics:

uv run --frozen pytest tests/test_measurement.py tests/tools/test_measurement_tools.py -q
- Result: 75 passed, 9 existing DataDesigner model-config deprecation warnings
uv run --frozen ruff check src/anonymizer/measurement/records/row.py src/anonymizer/measurement/__init__.py tools/measurement/run_benchmarks.py tests/tools/test_measurement_tools.py
uv run --frozen ruff format --check src/anonymizer/measurement/records/row.py src/anonymizer/measurement/__init__.py tools/measurement/run_benchmarks.py tests/tools/test_measurement_tools.py
tynav --ty-bin /root/.local/share/uv/tools/ty/bin/ty diagnostics src/anonymizer/measurement/records/row.py
- Result: no diagnostics for the edited file; existing pyproject.toml deprecated tool.ty.src.root warning remains
git diff --check

Earlier checks after the custom-column DD trace shim:

uv run --frozen pytest tests/test_measurement.py tests/tools/test_measurement_tools.py tests/tools/test_benchmark_output_analysis.py -q
- Result: 71 passed, 6 existing DataDesigner model-config deprecation warnings
uv run --frozen ruff check src/anonymizer/engine/ndd/adapter.py src/anonymizer/measurement/sinks.py tests/test_measurement.py
uv run --frozen ruff format --check src/anonymizer/engine/ndd/adapter.py src/anonymizer/measurement/sinks.py tests/test_measurement.py
uv run tools/codestyle/format.sh --check
git diff --check

Earlier checks after the split:

uv run --frozen pytest tests/test_measurement.py tests/engine/test_ndd_adapter.py tests/tools/test_measurement_tools.py tests/tools/test_benchmark_output_analysis.py tests/tools/test_detection_artifact_analysis.py -q
- Result: 82 passed, 9 existing DataDesigner model-config deprecation warnings
uv run --frozen ruff check tools/measurement/run_benchmarks.py tools/measurement/export_measurements.py tools/measurement/analyze_benchmark_output.py tools/measurement/analyze_detection_artifacts.py tools/measurement/measurement_tools tests/tools/test_measurement_tools.py tests/tools/test_benchmark_output_analysis.py tests/tools/test_detection_artifact_analysis.py
uv run tools/codestyle/format.sh --check
git diff --cached --check
CLI smoke:
- uv run python tools/measurement/run_benchmarks.py --help
- uv run python tools/measurement/export_measurements.py --help
- uv run python tools/measurement/analyze_benchmark_output.py --help
- uv run python tools/measurement/analyze_detection_artifacts.py --help

Earlier branch checks also covered docs build, benchmark dry-run validation for tools/measurement/examples/repo-data-smoke.yaml, shell syntax validation for the DD-trace smoke script, and tynav diagnostics on the measurement module.

Dogfood

Dogfood with local vLLM endpoint:

Endpoint: http://nemotron-3-super-h100-svc.aagonzales-dev.svc.cluster.local:8000/v1
Model: nvidia/nemotron-3-super
Output: /tmp/anonymizer-dogfood-mini-h100
Result: 2/2 cases completed, 0 errors
biographies__biographies-redact-default__r000: ~12.0s, 21 final entities
legal__legal-hash-agent-labels__r000: ~9.1s, 14 final entities
measurements.jsonl: 14 records
task traces: 10 records per case
table export: 5 tables from 14 records (run, dd_trace_coverage, ndd_workflow, stage, record)
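
For readers unfamiliar with the table export step, the sketch below shows the general idea of splitting a combined JSONL stream into one table per record type. It assumes each row carries a `record_type` key, which is a guess for illustration; the real exporter is tools/measurement/export_measurements.py and may use different field names.

```python
import json
from collections import defaultdict


def split_by_record_type(lines):
    """Group JSONL rows into per-type tables, keyed by an assumed 'record_type' field."""
    tables = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        row = json.loads(line)
        tables[row.get("record_type", "unknown")].append(row)
    return dict(tables)


sample = [
    '{"record_type": "run", "run_id": "r1"}',
    '{"record_type": "stage", "run_id": "r1", "stage": "detect"}',
    '{"record_type": "stage", "run_id": "r1", "stage": "replace"}',
]
tables = split_by_record_type(sample)
```

Each resulting list can then be written to Parquet, CSV, or JSONL by whatever table writer is in use.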

Notes

Checked-in benchmark suites use repository-relative paths and env-backed runtime config; they avoid machine-specific endpoints and absolute local paths.
DD message traces, raw case shards, and DataDesigner artifacts are sensitive run artifacts and may contain prompts, model outputs, secrets, or PII.
DataDesigner scheduler task traces are sanitized timing sidecars. They include queue/execution/total durations and error presence, but intentionally omit raw DD error strings.
Optional replace judge evaluation metrics are sanitized into evaluation_record rows. They include verdict booleans and invalid-item counts, but not original text, entity values, replacement values, raw judge outputs, prompts, or model responses.
Standard DD message tracing uses native DataDesigner trace side effects for LLMTextColumnConfig and LLMStructuredColumnConfig.
Model-backed CustomColumnConfig traces currently use a temporary Anonymizer shim that instruments the per-run private DataDesigner model registry and returned model facades. This is intentionally documented as brittle and should be replaced by a public DataDesigner model-call trace sink. No DataDesigner issue or PR has been opened from this PR.
Runtime timings, retry counts, token counts, and local endpoint names are environment-dependent and should not be treated as portable fixtures.
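
The sanitization rule for evaluation rows can be pictured as an allow-list: copy only verdict booleans and a count, and never copy anything else from the judge result. The sketch below is an illustration under assumptions. The verdict names, the `invalid_items` key, and the row shape are hypothetical and do not come from the PR.

```python
# Hypothetical verdict names, used only for this example.
ALLOWED_VERDICTS = ("leak_detected", "utility_preserved")


def sanitize_evaluation(raw, record_id):
    """Build an evaluation_record row from an allow-list of safe fields."""
    row = {"record_type": "evaluation_record", "record_id": record_id}
    for name in ALLOWED_VERDICTS:
        value = raw.get(name)
        if isinstance(value, bool):  # drop anything that is not a plain boolean
            row[name] = value
    # Keep only how many items were invalid, not the items themselves.
    row["invalid_item_count"] = len(raw.get("invalid_items", []))
    return row


raw_result = {
    "leak_detected": False,
    "utility_preserved": True,
    "invalid_items": ["<raw item 1>", "<raw item 2>"],
    "judge_output": "<raw judge text>",
    "original_text": "<source text>",
}
safe_row = sanitize_evaluation(raw_result, record_id="abc")
```

An allow-list fails closed: a new raw field added upstream is dropped by default instead of leaking into the measurement output.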

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

greptile-apps · 2026-06-08T21:22:46Z

Greptile Summary

This PR introduces a comprehensive measurement and benchmarking system for the Anonymizer project. It adds a new anonymizer.measurement package with session management, streaming JSONL sinks, per-record metrics, and sanitized evaluation capture, plus a full benchmark runner (run_benchmarks.py) that orchestrates repeatable workloads with per-case retries, detection-artifact sidecars, and DataDesigner trace shims.

Measurement core: MeasurementCollector with HMAC-keyed record hashing, ContextVar-backed session propagation, streaming vs. batch write modes, and three separate sinks (records, DD message traces, DD task traces) isolated to protect against PII leakage.
NDD adapter instrumentation: Instance-level _DataDesignerUsageProbe that patches _create_resource_provider, wraps ModelFacade methods per-instance, and captures native DD column traces — replacing the class-level monkey-patching flagged in a previous review.
Benchmark runner: Suite YAML spec validation, workload row-slicing, case retry loop, combined measurements.jsonl, table export (Parquet/CSV/JSONL), and detection-artifact analysis sidecars.
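
As background on HMAC-keyed record hashing, the sketch below derives a stable identifier from record text with Python's standard `hmac` module. The function name, the choice of SHA-256, and hashing the full text are assumptions for illustration; the summary above does not say which inputs or digest MeasurementCollector uses.

```python
import hashlib
import hmac


def hash_record(key: bytes, record_text: str) -> str:
    """Return a keyed digest so records can be correlated without storing their text."""
    return hmac.new(key, record_text.encode("utf-8"), hashlib.sha256).hexdigest()


run_key = b"per-run-secret"  # placeholder; a real key should come from secure config
digest_a = hash_record(run_key, "Jane Doe was born in 1980.")
digest_b = hash_record(run_key, "Jane Doe was born in 1980.")
digest_other_key = hash_record(b"different-secret", "Jane Doe was born in 1980.")
```

Unlike a plain hash, a keyed digest cannot be reproduced by someone who guesses the input text but does not hold the key.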

Confidence Score: 5/5

Safe to merge. The measurement instrumentation is opt-in and observability-only; no core anonymization paths are altered.

The class-level monkey-patching race and enum value mismatch flagged in earlier reviews are both resolved. The new _DataDesignerUsageProbe patches at the resource_provider and ModelFacade instance level, with patches properly restored in reverse order. ContextVar propagation, streaming sink thread-safety, and error-priority ordering in configured_measurement_session are all correct. The two findings are a minor error-drop in close() and a documentation gap on a placeholder function, neither of which affects measurement correctness.
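
The difference between class-level and instance-level patching, and the reverse-order restore mentioned above, can be sketched with `contextlib.ExitStack`. This is a generic illustration, not the `_DataDesignerUsageProbe` code; `Facade`, `patch_instance_attr`, and `counting_generate` are stand-in names.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def patch_instance_attr(obj, name, replacement):
    """Shadow one attribute on a single instance and restore it on exit."""
    had_own = name in vars(obj)
    original = vars(obj).get(name)
    setattr(obj, name, replacement)
    try:
        yield
    finally:
        if had_own:
            setattr(obj, name, original)
        else:
            delattr(obj, name)  # uncover the class-level method again


class Facade:  # stand-in for a model facade
    def generate(self, prompt):
        return prompt.upper()


calls = []
facade = Facade()
other = Facade()


def counting_generate(prompt, _orig=facade.generate):
    calls.append(len(prompt))  # record only the size, not the prompt
    return _orig(prompt)


# ExitStack unwinds registered patches in reverse (LIFO) order.
with ExitStack() as stack:
    stack.enter_context(patch_instance_attr(facade, "generate", counting_generate))
    patched_result = facade.generate("hi")
    untouched_result = other.generate("yo")
```

Because only `facade` is touched, other instances of the same class, including ones used by concurrent runs, are unaffected.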

No files require special attention beyond the minor close() error-drop in src/anonymizer/measurement/collector.py.
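
One possible way to avoid dropping errors when several sinks fail to close is to attempt every close and collect the failures. The sketch below is a suggestion and not the current `close()` implementation; the sink classes are hypothetical.

```python
def close_all(sinks):
    """Close every sink, even if earlier ones fail, and return all errors seen."""
    errors = []
    for sink in sinks:
        try:
            sink.close()
        except Exception as exc:  # keep going so later sinks still get closed
            errors.append(exc)
    return errors


class GoodSink:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FailingSink:
    def __init__(self, label):
        self.label = label

    def close(self):
        raise OSError(f"cannot close {self.label}")


good = GoodSink()
errors = close_all([FailingSink("records"), good, FailingSink("task traces")])
```

A caller could then raise the first error and log or attach the rest, so no failure is silently lost.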

Important Files Changed

Filename	Overview
src/anonymizer/measurement/collector.py	New MeasurementCollector with HMAC-keyed record hashing, three separate sinks, and streaming support. Minor: close() only preserves the first error when multiple sinks fail to close.
src/anonymizer/measurement/session.py	ContextVar-backed session management with correct error propagation — body errors take priority over write/close errors, write errors take priority over close errors.
src/anonymizer/engine/ndd/adapter.py	Large instrumentation layer added: instance-level _DataDesignerUsageProbe replacing the class-level patching flagged in a prior review, _DDMessageTracePlan for native column tracing, and _temporary_dd_task_trace. RLock serialises concurrent run_workflow calls on the same adapter instance.
src/anonymizer/measurement/records/row.py	Per-row record and evaluation metric capture. Evaluation records preserve only verdict booleans and invalid-item counts, correctly excluding raw text and entity values.
tools/measurement/run_benchmarks.py	Full benchmark runner with suite validation, row slicing, retry loop, per-case streaming measurement, combined JSONL output, table export, and detection-artifact sidecars. CLI flags correctly match DDTraceMode enum values.
.github/workflows/benchmark-ci.yml	workflow_dispatch CI with correct enum choices (last_message/all_messages matching DDTraceMode), proper secret gating, always-run summary/upload steps, and appropriate self-hosted runner config.
src/anonymizer/measurement/sinks.py	Thread-safe line-buffered JSONL streaming sink with Lock, plus batch JSONL/JSON writers. Parent directory creation on init ensures directories exist before streaming starts.
src/anonymizer/measurement/recorders.py	stage_timer yields a mutable dict so callers can inject output_row_count/failed_record_count after the timed block; the finally clause then spreads those updates into the measurement record correctly.
src/anonymizer/engine/replace/llm_replace_workflow.py	Adds synthetic-original collision repair for replacement maps and a new COL_REPLACEMENT_MAP_SOURCE column; collision candidates are generated via an index-incrementing loop with a protected-values guard.
src/anonymizer/measurement/_coerce.py	Coercion helpers for JSON-safe output, text token counting (tiktoken with word-count fallback), and size bucketing. Buckets are non-overlapping and correctly labelled.
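
The sinks.py row describes a thread-safe, line-buffered JSONL sink that creates its parent directory on init. A minimal sketch of that pattern follows. The class name `JsonlSink` and its methods are illustrative and are not the `_JsonlMeasurementSink` API.

```python
import json
import os
import tempfile
import threading


class JsonlSink:
    """Append-only JSONL writer guarded by a lock (illustrative only)."""

    def __init__(self, path):
        # Create the parent directory up front so streaming never fails on a missing dir.
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        self._lock = threading.Lock()
        # buffering=1 selects line buffering for text-mode files.
        self._file = open(path, "a", buffering=1, encoding="utf-8")

    def write(self, record):
        line = json.dumps(record, sort_keys=True)
        with self._lock:  # one whole line per critical section
            self._file.write(line + "\n")

    def close(self):
        with self._lock:
            self._file.close()


out_path = os.path.join(tempfile.mkdtemp(), "nested", "measurements.jsonl")
sink = JsonlSink(out_path)
threads = [threading.Thread(target=sink.write, args=({"i": i},)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
sink.close()

with open(out_path, encoding="utf-8") as f:
    written = [json.loads(line) for line in f]
```

Serializing outside the lock keeps the critical section short; only the file write itself is guarded.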

Sequence Diagram

sequenceDiagram
    participant CLI as run_benchmarks.py
    participant Session as configured_measurement_session
    participant Collector as MeasurementCollector
    participant Anon as Anonymizer._run_internal
    participant NDD as NddAdapter.run_workflow
    participant Probe as _DataDesignerUsageProbe
    participant Sink as _JsonlMeasurementSink

    CLI->>Session: "MeasurementConfig(streaming=True)"
    Session->>Sink: open JSONL sinks
    Session->>Collector: create collector
    Session-->>CLI: yield collector

    CLI->>Anon: anonymizer.run(config, data)
    Anon->>Collector: record_run_metadata()
    Anon->>NDD: run_workflow(...)
    NDD->>Probe: patch _create_resource_provider
    NDD->>NDD: acquire _run_lock
    NDD->>NDD: DataDesigner.create
    Probe->>Probe: wrap ModelFacade per instance
    NDD->>Collector: record_ndd_workflow()
    NDD->>Probe: flush_private_trace_records()
    Probe->>Collector: record_dd_message_trace()
    Anon->>Collector: record_record_metrics() per row

    opt config.evaluate
        CLI->>Collector: record_evaluation_metrics()
    end

    Session->>Collector: close() all sinks
    CLI->>CLI: combine_measurements()
    CLI->>CLI: export_measurement_tables()

Reviews (19): Last reviewed commit: "Add evaluation rollups to benchmark anal..."

andreatgretel · 2026-06-09T01:07:14Z

This overlaps with my benchmark CI PR #162 enough that I’m happy to close mine and let this be the main benchmark tooling direction.

The one thing I’d want to preserve from #162 is the CI/workflow shape: a manual GitHub Actions workflow, NVIDIA_API_KEY setup, benchmark artifact upload, and a step summary. This PR has the more complete runner/measurement stack, so it probably makes sense to retire my bespoke scripts/benchmark_ci.py path and have the workflow call tools/measurement/run_benchmarks.py instead.

andreatgretel · 2026-06-10T20:07:03Z

One more related follow-up, since GitHub does not let me anchor this on the unchanged chunked_validation._dispatch_chunk() call site in this PR diff:

_dispatch_chunk() was the main blind spot in the offline replace-mode async profile. On the biographies sample, each measured row made one validation call through that path, and those calls dominated pipeline wall time.

As an incremental path before a general DD model-call hook exists, it would be useful to record a sanitized validator_chunk_model_call event around the facade.generate() call there. Useful fields would be alias, chunk_index, attempt_index, elapsed_sec, ok, error_type, and prompt_char_count for the final prompt after PydanticResponseRecipe is applied. Token usage does not appear to be available from facade.generate() directly, so the workflow-level aggregate usage probe can keep covering that part.
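
A rough sketch of what such an event wrapper could look like, using the field names proposed above. The wrapper `timed_model_call`, the `events` list, and the `generate` callable are hypothetical; this does not reflect the real `facade.generate()` signature.

```python
import time


def timed_model_call(generate, prompt, *, alias, chunk_index, attempt_index, events):
    """Call generate(prompt) and append one sanitized event, whether it succeeds or fails."""
    start = time.perf_counter()
    ok, error_type = True, None
    try:
        return generate(prompt)
    except Exception as exc:
        ok, error_type = False, type(exc).__name__  # class name only, no message text
        raise
    finally:
        events.append(
            {
                "event": "validator_chunk_model_call",
                "alias": alias,
                "chunk_index": chunk_index,
                "attempt_index": attempt_index,
                "elapsed_sec": time.perf_counter() - start,
                "ok": ok,
                "error_type": error_type,
                "prompt_char_count": len(prompt),  # size only; the prompt is not stored
            }
        )


def failing_generate(prompt):
    raise TimeoutError("simulated provider timeout")


events = []
timed_model_call(lambda p: "valid", "hello", alias="validator",
                 chunk_index=0, attempt_index=0, events=events)
try:
    timed_model_call(failing_generate, "hello again", alias="validator",
                     chunk_index=0, attempt_index=1, events=events)
except TimeoutError:
    pass
```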

lipikaramaswamy · 2026-06-11T00:18:23Z

This is awesome! Thank you @binaryaaron for setting up measurement 🤩

Took me a while, but I reviewed the measurement core, benchmark runner, exporters/analyzers, benchmark CI workflow, and the Anonymizer/NDD instrumentation. I also tried running the smoke suite locally. Some notes from that run are below.

The dry-run passed and planned 2 cases. Running the default smoke against build/integrate completed the biographies/redact case and produced the expected measurement outputs: run metadata, stage timings, NDD workflow request/token usage, tokens/sec, and per-record entity/replacement counts.

The legal/hash case initially failed on a transient openai/gpt-oss-120b health-check rate limit. I reran that case with explicit build provider/model config, skip_health_check: true, and lower model parallelism, and it completed successfully:

5/5 rows completed
~69s elapsed
observed_total_tokens: 38168
observed_tokens_per_sec: ~553
outputs included measurements.jsonl and normalized parquet tables for run, ndd_workflow, stage, and record

A few things I think would be good to have before merge:

Can we add runner-level support for arbitrary run_tags? I left an inline comment on this, but for our GitLab flow we’ll want to stamp metadata like anonymizer_ref, commit_sha, benchmark_suite_ref, benchmark_suite_commit_sha, and pipeline_id into every measurement row.
Can we make provider/model selection explicit in the benchmark docs/examples? Model IDs are provider-specific: build/integrate uses names like openai/gpt-oss-120b, while internal inference uses names like nvidia/openai/gpt-oss-120b, and other providers can use entirely different names. I think benchmark suites should treat model_configs and model_providers as a matched pair for reproducibility.
Can we clarify the recommended health-check behavior for benchmark/smoke runs? The smoke can fail before producing useful measurements if a provider health check hits rate limits. It would be helpful to document when benchmark suites should use skip_health_check, retries/backoff, and lower parallelism, perhaps in AGENTS.md.
Can we add an optional evaluate step to the runner if quality benchmarking is intended to be in scope? Right now the runner measures anonymizer.run(...), but does not call Anonymizer.evaluate(...), so LLM judge quality metrics are not produced or stored by the benchmark runner.
Can we document the emitted record types and key fields? The outputs are useful, but downstream consumers will need a stable contract for run, ndd_workflow, stage, and record rows.
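
To make the run_tags request concrete, here is a small sketch of parsing `key=value` pairs and stamping them onto every row. The function names and the `run_tags` row field are assumptions for illustration; the runner may expose this differently.

```python
def parse_run_tags(pairs):
    """Parse 'key=value' strings into a dict, rejecting malformed or duplicate keys."""
    tags = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        if key in tags:
            raise ValueError(f"duplicate run tag: {key!r}")
        tags[key] = value
    return tags


def stamp_rows(rows, run_tags):
    """Return copies of the rows with the same run_tags attached to each."""
    return [{**row, "run_tags": dict(run_tags)} for row in rows]


tags = parse_run_tags(["commit_sha=abc123", "pipeline_id=42"])
stamped = stamp_rows([{"record_type": "run"}, {"record_type": "stage"}], tags)
```

Values stay strings here, so downstream tables get a stable column type regardless of what the pipeline passes in.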

binaryaaron · 2026-06-11T18:23:09Z

Thanks for the detailed run notes. I addressed the merge-blocking items in 3afb44c: runner-level run_tags, explicit provider/model examples and docs, optional replace-mode evaluate, and documentation for emitted record types/key fields. The runner already has retries/backoff, and the example provider config shows skip_health_check so smoke runs can avoid provider health-check rate-limit noise when appropriate.
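
For reference, the general shape of a retry loop with exponential backoff is sketched below. It is not the runner's implementation: the function name, defaults, and the injectable `sleep` parameter are assumptions for illustration.

```python
import time


def run_with_retries(fn, *, max_attempts=3, base_delay_sec=1.0, sleep=time.sleep):
    """Call fn until it succeeds or attempts run out; return (result, attempts_used)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(), attempt
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay_sec * 2 ** (attempt - 1))  # 1x, 2x, 4x, ...


state = {"calls": 0}


def flaky_case():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("simulated transient rate limit")
    return "done"


delays = []
outcome = run_with_retries(flaky_case, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the example fast to run and makes the backoff schedule easy to inspect.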

binaryaaron · 2026-06-11T18:23:15Z

Agreed that _dispatch_chunk() remains an important observability gap. I did not add a dedicated validator_chunk_model_call event in this PR because the current branch now leans on DD/native/private model-call tracing where possible, and the validator chunk path deserves a small focused follow-up. I would keep the proposed sanitized fields: alias, chunk index, attempt index, elapsed time, ok/error type, and final prompt character count.

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

lipikaramaswamy

Looks great, just a small comment on the analysis tables :) Thanks!!

binaryaaron added 2 commits June 2, 2026 21:38

feat: add anonymizer measurement instrumentation

b75ab36

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

feat: add measurement benchmark tooling

019d852

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

binaryaaron changed the title ~~Add anonymizer measurement instrumentation and benchmark tooling~~ feat: anonymizer measurement instrumentation and benchmark tooling Jun 3, 2026

binaryaaron added 3 commits June 3, 2026 18:07

Add opt-in DD tracing for benchmarks

4745a4b

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

feat: add benchmark analysis strategy probes

6e7a30c

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Harden benchmark runtime portability

756c41d

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

binaryaaron marked this pull request as ready for review June 8, 2026 21:15

binaryaaron requested review from a team as code owners June 8, 2026 21:15

greptile-apps Bot reviewed Jun 8, 2026

Comment thread src/anonymizer/measurement.py Outdated

Comment thread src/anonymizer/engine/ndd/adapter.py Outdated

binaryaaron added 2 commits June 8, 2026 21:54

Add richer benchmark quality metrics

b157a1d

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Split regex detection into stacked branch

b348582

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

binaryaaron mentioned this pull request Jun 8, 2026

feat: add regex-backed detection benchmark strategies #183

Draft

binaryaaron added 2 commits June 8, 2026 22:52

Harden DD trace measurement hooks

c015b87

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Move local structured substitute to stacked branch

f4fbe73

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

binaryaaron mentioned this pull request Jun 8, 2026

feat: add local structured substitute benchmark strategy #184

Draft

binaryaaron added 3 commits June 9, 2026 00:02

Format staged detection analysis tool

00a5f81

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Clarify measurement docs and fixtures

d011a7c

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Factor measurement tool support helpers

3a2ea40

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

andreatgretel reviewed Jun 9, 2026

Comment thread src/anonymizer/engine/ndd/adapter.py Outdated

andreatgretel reviewed Jun 9, 2026

Comment thread tools/measurement/run_benchmarks.py

andreatgretel reviewed Jun 9, 2026

Comment thread tools/measurement/run_benchmarks.py Outdated

andreatgretel reviewed Jun 9, 2026

Comment thread src/anonymizer/measurement.py Outdated

binaryaaron added 5 commits June 9, 2026 04:58

Fix measurement cleanup and benchmark metrics

ed8ba48

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Use native DataDesigner message traces

a852659

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Add sanitized DataDesigner task traces

8d9fab8

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Cover structured DataDesigner message traces

eddb291

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Add manual benchmark CI workflow

aa96fa6

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

Limit measurement PR to core benchmark tools

fabdc78

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

binaryaaron mentioned this pull request Jun 9, 2026

chore: add derivative measurement analysis tools #187

Draft

andreatgretel reviewed Jun 10, 2026

Comment thread src/anonymizer/engine/ndd/adapter.py Outdated

andreatgretel reviewed Jun 10, 2026

Comment thread src/anonymizer/engine/ndd/adapter.py Outdated

Trace custom DataDesigner model calls

7d58fd2

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

lipikaramaswamy reviewed Jun 10, 2026

Comment thread tools/measurement/run_benchmarks.py

lipikaramaswamy reviewed Jun 10, 2026

Comment thread src/anonymizer/engine/ndd/adapter.py Outdated

lipikaramaswamy reviewed Jun 10, 2026

Comment thread tools/measurement/run_benchmarks.py

lipikaramaswamy reviewed Jun 10, 2026

Comment thread tools/measurement/examples/repo-data-smoke.yaml

Improve benchmark observability tracing

3afb44c

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

greptile-apps Bot reviewed Jun 11, 2026

Comment thread src/anonymizer/engine/ndd/adapter.py Outdated

Track unsupported DD trace columns

1d88389

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

greptile-apps Bot reviewed Jun 11, 2026

Comment thread .github/workflows/benchmark-ci.yml

lipikaramaswamy reviewed Jun 11, 2026

Comment thread src/anonymizer/engine/ndd/adapter.py

Harden DD trace benchmark plumbing

00c7293

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

binaryaaron requested review from andreatgretel and lipikaramaswamy June 11, 2026 22:52

binaryaaron self-assigned this Jun 11, 2026

lipikaramaswamy reviewed Jun 11, 2026

Comment thread tools/measurement/run_benchmarks.py

Add sanitized evaluation measurement records

2a49d18

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

lipikaramaswamy approved these changes Jun 12, 2026

Comment thread tools/measurement/analyze_benchmark_output.py

lipikaramaswamy mentioned this pull request Jun 12, 2026

feat(detection): export detection workflow for distributed/SLURM exec… #182

Merged

9 tasks

Add evaluation rollups to benchmark analysis

6f843af

Signed-off-by: Aaron Gonzales <aagonzales@nvidia.com>

binaryaaron merged commit f6dd05d into main Jun 12, 2026
13 checks passed

binaryaaron deleted the binaryaaron/perf-epic branch June 12, 2026 20:54

lipikaramaswamy mentioned this pull request Jun 12, 2026

fix(detection): pass single chunk validation flag to exports #190

Open

9 tasks
