Skip to content

fix(todo): stop json.loads corrupting bare numeric-looking todo ids#663

Open
sean-kim05 wants to merge 2 commits into
usestrix:mainfrom
sean-kim05:fix/todo-id-json-scalar-corruption
Open

fix(todo): stop json.loads corrupting bare numeric-looking todo ids#663
sean-kim05 wants to merge 2 commits into
usestrix:mainfrom
sean-kim05:fix/todo-id-json-scalar-corruption

Conversation

@sean-kim05

@sean-kim05 sean-kim05 commented Jul 3, 2026

Copy link
Copy Markdown

Closes #670

What

_normalize_todo_ids (in strix/tools/todo/tools.py) accepts a todo id argument that may be a JSON array ('["a","b"]'), a comma-separated string, or a single bare id. It runs the input through json.loads and, when the parsed value isn't a list, returns str(parsed):

try:
    data = json.loads(stripped)
except json.JSONDecodeError:
    data = stripped.split(",") if "," in stripped else [stripped]
if isinstance(data, list):
    return [str(item).strip() for item in data if str(item).strip()]
return [str(data).strip()]        # <-- str() of a parsed JSON scalar

Todo ids are generated as str(uuid.uuid4())[:6] — 6-char hex slugs. Slugs like 1e5230, 2363e0, or 0e4440 are valid JSON numbers, so json.loads parses them:

  • json.loads("1e5230")float('inf')"inf"
  • json.loads("2363e0")2363.0"2363.0"

So the id is silently corrupted. mark_todo_done, mark_todo_pending, and delete_todo then look up the wrong id and return Todo with ID 'inf' not found — the operation quietly fails on the real todo.

This only affects the bare single-id input path, but that path is deliberately supported (the else [stripped] / comma-split fallback exists precisely to tolerate non-array input, which an LLM caller regularly produces despite the "pass a list" guidance).

Fix

Use json.loads only to detect and unpack the JSON array form. Any non-array input is treated as a literal id (optionally comma-separated) and is never passed through the parsed scalar — so numeric-looking ids survive verbatim. The JSON-array and Python-list paths are unchanged.

Tests

New tests/test_todo.py:

  • Normalization paths: bare numeric-looking ids preserved (1e5230, 2363e0, 0e4440), plain hex/digit ids, JSON array, comma-separated, list input, empty/None.
  • End-to-end: _mark resolves a bare 1e5230 id and marks the real todo done (rather than erroring on 'inf').

Verified RED→GREEN: without the fix, the two bug-targeting tests fail (the _mark test returns success: False because it searched for 'inf'); with the fix all pass. Full suite: 87 passed (the 2 pre-existing Windows-only failures in test_config_loader/test_local_sources are unrelated). ruff format, ruff check, and mypy strix/tools/todo/tools.py all clean.

`_normalize_todo_ids` runs a bare id string through `json.loads` and, when
the result isn't a list, returns `str(parsed)`. Todo ids are
`str(uuid.uuid4())[:6]` hex slugs, and ones like "1e5230" or "2363e0" are
valid JSON numbers — so json.loads turns them into `inf` / `2363.0`.
`mark_todo_done`, `mark_todo_pending`, and `delete_todo` then operate on a
non-existent id and report "Todo with ID 'inf' not found", silently
failing whenever such an id is passed as a bare string — a form the
comma-split fallback explicitly supports.

Use json only to detect the JSON *array* form; treat any non-array input
as a literal id (optionally comma-separated). Adds tests/test_todo.py
covering the normalization paths plus an end-to-end `_mark` resolution.
@greptile-apps

greptile-apps Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes todo id normalization for bare numeric-looking ids. The main changes are:

  • Uses JSON parsing only to unpack JSON-array todo id inputs.
  • Preserves non-array string inputs as literal ids or comma-separated ids.
  • Adds tests for normalization paths and an end-to-end mark operation.

Confidence Score: 4/5

The changed normalization flow looks mergeable after preserving the JSON string scalar fallback.

  • Bare numeric-looking todo ids are now preserved correctly.
  • JSON-array, comma-separated, list, empty, and in-memory mark paths have direct test coverage.
  • A single-id JSON string scalar can now be looked up with quotes included, which can make an existing fallback input report the todo as missing.

strix/tools/todo/tools.py

Important Files Changed

Filename Overview
strix/tools/todo/tools.py Updates _normalize_todo_ids so bare numeric-looking ids are no longer converted through JSON numeric scalar parsing.
tests/test_todo.py Adds focused tests for todo id normalization and in-memory marking of a numeric-looking id.
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
strix/tools/todo/tools.py:144-147
**JSON Scalar Quotes Become ID Characters**

When a caller passes a single todo id as a JSON string scalar like `"a3f9c2"`, `json.loads` still succeeds but the parsed scalar is now ignored, so the lookup uses the literal value including quotes. This is an old-vs-new regression for the single-id fallback path: the todo stored as `a3f9c2` is reported as not found under `"a3f9c2"`.

```suggestion
        if isinstance(parsed, list):
            return [str(item).strip() for item in parsed if str(item).strip()]
        if isinstance(parsed, str):
            return [parsed.strip()] if parsed.strip() else []
        parts = stripped.split(",") if "," in stripped else [stripped]
        return [part.strip() for part in parts if part.strip()]
```

Reviews (1): Last reviewed commit: "fix(todo): stop json.loads corrupting ba..." | Re-trigger Greptile

Comment thread strix/tools/todo/tools.py Outdated
Comment on lines +144 to +147
if isinstance(parsed, list):
return [str(item).strip() for item in parsed if str(item).strip()]
parts = stripped.split(",") if "," in stripped else [stripped]
return [part.strip() for part in parts if part.strip()]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 JSON Scalar Quotes Become ID Characters

When a caller passes a single todo id as a JSON string scalar like "a3f9c2", json.loads still succeeds but the parsed scalar is now ignored, so the lookup uses the literal value including quotes. This is an old-vs-new regression for the single-id fallback path: the todo stored as a3f9c2 is reported as not found under "a3f9c2".

Suggested change
if isinstance(parsed, list):
return [str(item).strip() for item in parsed if str(item).strip()]
parts = stripped.split(",") if "," in stripped else [stripped]
return [part.strip() for part in parts if part.strip()]
if isinstance(parsed, list):
return [str(item).strip() for item in parsed if str(item).strip()]
if isinstance(parsed, str):
return [parsed.strip()] if parsed.strip() else []
parts = stripped.split(",") if "," in stripped else [stripped]
return [part.strip() for part in parts if part.strip()]
Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/tools/todo/tools.py
Line: 144-147

Comment:
**JSON Scalar Quotes Become ID Characters**

When a caller passes a single todo id as a JSON string scalar like `"a3f9c2"`, `json.loads` still succeeds but the parsed scalar is now ignored, so the lookup uses the literal value including quotes. This is an old-vs-new regression for the single-id fallback path: the todo stored as `a3f9c2` is reported as not found under `"a3f9c2"`.

```suggestion
        if isinstance(parsed, list):
            return [str(item).strip() for item in parsed if str(item).strip()]
        if isinstance(parsed, str):
            return [parsed.strip()] if parsed.strip() else []
        parts = stripped.split(",") if "," in stripped else [stripped]
        return [part.strip() for part in parts if part.strip()]
```

How can I resolve this? If you propose a fix, please make it concise.

The first pass routed every non-array input through the literal-token
path, which dropped the surrounding quotes of a JSON string scalar — so a
caller passing '"a3f9c2"' looked the id up *with* quotes and missed the
real todo (a regression from the prior json.loads behaviour, flagged by
the Greptile review on usestrix#663).

Handle a parsed `str` by unwrapping it, while still keeping numeric-looking
bare ids literal (json.loads yields a float/inf for those, not a str).
The string / comma / single-token cases resolve into one token list, which
also keeps the function within the return-count lint limit. Adds a
regression test for the quoted-scalar and empty-string inputs.
@sean-kim05

Copy link
Copy Markdown
Author

Thanks for the catch — confirmed as a real regression on my part. Routing every non-array input through the literal-token path dropped the quote-unwrapping that json.loads previously did for a JSON string scalar, so '"a3f9c2"' was looked up with the quotes and missed the stored todo.

Fixed in 370e945:

  • a parsed str is now unwrapped to its value;
  • numeric-looking bare ids stay literal, since json.loads yields a float/inf (not a str) for those, so '1e5230' is unaffected;
  • added a regression test covering '"a3f9c2"', '"1e5230"', and '""' (empty → no ids).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Todo tools silently fail on numeric-looking ids (e.g. "1e5230" parsed to inf)

1 participant