Skip to content
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
245 changes: 245 additions & 0 deletions plans/cli-display/plan.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,245 @@
# CLI Display Improvements — Implementation Plan

## Problem

The CLI output is bare:

- `run` prints only `"Output written to: <path>"` — no stats, no entity summary, no failure info
- `preview` calls `result.dataframe.to_string(max_colwidth=80)` — raw pandas dump, hard to read, no summary
- `failed_records` are never shown to the user
- No way to save the `trace_dataframe` to a file

## Scope

No new dependencies. All rendering uses stdlib + pandas only.

---

## Files Changed

| File | Change |
|---|---|
| `src/anonymizer/interface/cli/_output.py` | Major expansion — new display functions |
| `src/anonymizer/interface/cli/main.py` | Add `--trace` flag; wire up new display calls |
| `tests/interface/cli/test_cli_output.py` | Update 1 existing test; add ~12 new tests |

`src/anonymizer/interface/anonymizer.py` — no changes needed. The private helpers `_unwrap_entities`, `_count_labels`, `_count_entities`, `_count_labels_for_row` are module-level functions and can be imported directly by `_output.py`. The import graph stays acyclic (`cli/_output.py` → `interface/anonymizer.py` → `engine/`).

---

## Step 1 — Expand `_output.py`

Keep `write_result()` signature identical (existing tests depend on it). Add:

### New public functions

```python
def write_trace(result: AnonymizerResult | PreviewResult, trace_path: str | Path) -> Path
def print_run_summary(result: AnonymizerResult, written: Path, *, trace_written: Path | None = None) -> None
def print_preview(result: PreviewResult) -> None
```

### New private helpers

```python
def _is_rewrite_mode(df: pd.DataFrame) -> bool
# True if any column ends with "_rewritten"

def _entity_stats(df: pd.DataFrame) -> tuple[int, Counter[str]]
# Reads COL_FINAL_ENTITIES, falls back to COL_DETECTED_ENTITIES
# Returns (total_count, per_label_counter)
# Returns (0, Counter()) when neither column is present (rewrite mode)

def _format_entity_table(total: int, labels: Counter[str]) -> list[str]
# Returns lines; top 20 labels sorted by count descending
# Appends " ... and N more" if truncated

def _format_failed_records(failed: list[FailedRecord]) -> list[str]
# Returns lines; empty list when len(failed) == 0

def _truncate(text: str, max_len: int = 120) -> str
# Appends "..." if text exceeds max_len

def _format_preview_record(idx: int, row: pd.Series, text_col: str, rewrite_mode: bool) -> list[str]
# Returns lines for one record block
```

Helpers return `list[str]` so they are unit-testable without capturing stdout.

### `print_run_summary` output — replace mode

```
Output written to: /path/to/data_anonymized.csv
Trace written to : /path/to/trace.csv ← only when --trace given

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

say "trace dataset"


--- Summary ---
Records processed : 1,234
Failed records : 0
Total entities : 4,521
first_name : 891
email : 312
ssn : 47
... and 3 more
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to add some context about what happened besides what the user ran. sth like:

Mode : replace (redact) 
Source : data.csv (1,234 rows) 
Output : /path/to/data_anonymized.csv (1,234 rows, 1.2 MB) 
Trace : /path/to/trace.csv ← only when --trace given Elapsed : 1m 23s


### `print_run_summary` output — rewrite mode

```
Output written to: /path/to/data_rewritten.csv

--- Summary ---
Records processed : 1,234
Failed records : 0
Avg utility score : 0.87
Avg leakage mass : 0.03
Avg weighted leakage : 0.01
Needs human review : 12 (1.0%)
```

### Failed records block (appended when failures > 0)

```
--- Failed Records (3) ---
record_id=abc123 step=detect reason=LLM timeout

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this as the group failures? sth like Group by (step, reason, count):

--- Failed Records (12) ---
  detect    LLM timeout         × 8
  replace   JSON parse error    × 3
  detect    GLiNER OOM          × 1

Tip: run with --debug to see per-record stack traces, or --failed-output failed.csv to export failures for triage.
Add --failed-output PATH (parallel to --trace): writes a small dataframe of record_id, step, reason for downstream triage.
The "Tip" footer turns a wall of text into a next action.

record_id=def456 step=replace reason=JSON parse error
```

### `print_preview` output — replace mode

```
--- Preview (3 records, replace mode) ---

Record 1
original : Alice works at Acme Corp, call 555-1212
replaced : [REDACTED_FIRST_NAME] works at [REDACTED_COMPANY_NAME], ...
entities : first_name, company_name, phone_number

Record 2
...

--- 0 failed records ---
```

### `print_preview` output — rewrite mode

```
--- Preview (3 records, rewrite mode) ---

Record 1
original : Alice works at Acme Corp
rewritten : A healthcare professional works at a technology company
utility : 0.91
leakage : 0.00
review? : No

--- 0 failed records ---
```

### Edge cases

| Scenario | Behaviour |
|---|---|
| 0 entities detected | Print `"Total entities : 0"` with no label block |
| Rewrite mode — no entity column | Skip entity section entirely |
| Metric column absent in dataframe | Skip that metric line (guard with `if col in df.columns`) |
| 0 rows in dataframe (all failed) | Means are NaN — skip metric lines |
| Text > 120 chars | Truncate with `"..."` |
| > 20 entity labels | Show top 20 + `"... and N more"` |

---

## Step 2 — Update `main.py`

### Add `--trace` to `CliOpts` (under `# -- shared --`)

```python
trace: Annotated[
str | None,
cyclopts.Parameter(help="Write full pipeline trace to this path (.csv or .parquet)."),
] = None
```

`--trace` requires an explicit path argument (consistent with `--output`). No auto-derivation.

### Update `run` command

```python
# before
written = write_result(result, output)
print(f"Output written to: {written}")

# after
written = write_result(result, output)
trace_written = write_trace(result, opts.trace) if opts.trace else None
print_run_summary(result, written, trace_written=trace_written)
```

### Update `preview` command

```python
# before
print(result.dataframe.to_string(max_colwidth=80))

# after
print_preview(result)
```

### No `--trace` on `preview`

Preview is a quick inspection tool. Trace export belongs to `run`.

### Updated imports from `_output`

```python
from anonymizer.interface.cli._output import print_preview, print_run_summary, write_result, write_trace
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some TTY-aware visual polish could be very helpful to distinguish different sections of the output in CLI. The plan's mockups are pure ASCII, which is fine — but a few cheap conditionals (no new deps) make the output much more scannable:
This could be sth like, green for the output path line, red for the failures count when > 0, dim for label rows. Bold section headers (--- Summary ---) so they don't disappear in long terminal scrollback.
Auto-disable on non-TTY, NO_COLOR env, or adding a --no-color flag.

---

## Step 3 — Tests

### Update existing: `test_cli_output.py`

- `test_preview_prints_dataframe`: currently asserts `"bio_replaced" in out` (column name). After the change `print_preview` shows values, not column headers. Update assertion to check for a value from the replaced text (e.g. `"REDACTED_0"`).
- `test_run_default_output_path` and `test_run_explicit_output`: assert the output path appears in stdout — still true because `print_run_summary` prints it on the first line. No change needed.

### New tests to add

```
# print_run_summary — replace mode
test_run_summary_replace_prints_entity_stats
test_run_summary_replace_zero_entities
test_run_summary_with_failures_prints_failed_block
test_run_summary_no_failures_omits_failed_block
test_run_summary_trace_line_appears

# print_run_summary — rewrite mode
test_run_summary_rewrite_prints_metrics
test_run_summary_rewrite_missing_columns_no_crash

# print_preview
test_preview_replace_shows_record_blocks
test_preview_rewrite_shows_metrics_per_record
test_preview_truncates_long_text
test_preview_failed_records_shown
test_preview_empty_dataframe_no_crash

# write_trace
test_write_trace_csv
test_write_trace_parquet

# CLI integration
test_run_with_trace_flag_writes_trace_file
```

All new tests construct result objects directly (using an extended `_make_result` helper) — no real pipeline or LLM calls.

---

## Implementation Order

1. Implement `_output.py` (new helpers + display functions)
2. Update `main.py` (add `--trace`, wire display calls, update imports)
3. Update + extend `test_cli_output.py`
4. Run `make check` (format, lint, copyright)
5. Run `make test`
Loading