Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 44 minutes and 42 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a mutation-testing framework for C functions under Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/util/test_c_mutator.py`:
- Around line 27-29: Replace the ambiguous EN DASH characters (U+2013) used in
section comment headers with standard ASCII hyphen-minus (`-`) to satisfy Ruff
RUF003; specifically edit the comment lines that read patterns like "AOR –
Arithmetic Operator Replacement" (and the other section headers found at the
same file: the occurrences flagged on lines 28, 61, 95, 141, and 190) and change
the dash character to `-` so the headers become "AOR - Arithmetic Operator
Replacement" (and the analogous fixes for the other section titles).
- Around line 32-34: The test name test_aor_finds_multiplication_in_factorial is
misleading because it calls _get_function("data/qsort.c", "partition") and
verifies AOR mutants for + and -; rename the test and its docstring to
accurately reflect the behavior (e.g., test_aor_finds_add_and_sub_in_partition
or test_aor_finds_plus_minus_in_partition) and update any references to the old
name so the test clearly indicates it targets the partition function and
plus/minus arithmetic operators.
- Around line 19-24: The helper function _get_function is missing a return type
annotation; update its signature to specify it returns CFunction (not Optional)
because ParsecProject.get_function_or_none(name) yields CFunction | None and the
assert guarantees non-None. Import or reference the CFunction type if needed and
change def _get_function(file: str, name: str): to include -> CFunction so
static analysis (ruff ANN202) is satisfied, keeping the body unchanged and still
calling ParsecProject.get_function_or_none.
In `@util/c_mutator.py`:
- Around line 181-196: get_mutants() and get_mutants_by_operator() recompute
mutations on every call; add a lazy cache to avoid repeated tree traversals by
storing results once computed and returning cached values on subsequent calls.
Implement a private cache attribute (e.g., _mutants_cache and/or
_mutants_by_operator_cache) that get_mutants() and get_mutants_by_operator()
populate on first call using the existing helper methods (_apply_aor,
_apply_ror, _apply_lcr, _apply_crp, _apply_rvr) and return thereafter; ensure
any mutating operation on the AST or any public method that changes state
clears/invalidate the cache (e.g., in methods that modify the tree or in a new
invalidate_mutant_cache() called from those mutators). Keep existing public
signatures (get_mutants, get_mutants_by_operator) and ensure thread-safety if
necessary for your environment.
- Around line 311-313: The current check compares original_text == "0" and
misses other zero literal forms; update the check in the method using
self._node_text(expr_node) to detect any literal that represents numeric zero by
attempting to parse the token (e.g., call int(original_text, 0) or otherwise
normalize numeric literals while catching ValueError) and skip the mutation if
the parsed value is 0; fall back to the original string comparison only if
parsing fails so non-numeric expressions are still handled.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 788961dc-c286-4da1-8e6c-b19eb9caaac5
📒 Files selected for processing (3)
test/util/test_c_mutator.pyutil/__init__.pyutil/c_mutator.py
mernst
left a comment
There was a problem hiding this comment.
Please address the CodeRabbit comments.
A standard mutation is to replace a binary operation by each of its operands.
Two mutants for
a = b + c;
are
a = b;
a = c;
Is there a reason not to include those mutations?
Also, please document where you got the list of mutations that are included.
| """ | ||
|
|
||
| # Maps each arithmetic operator to its candidate replacements. | ||
| _AOR_REPLACEMENTS: Mapping[str, list[str]] = MappingProxyType( |
There was a problem hiding this comment.
This should be a field of MutationOperator.AOR, and likewise for other variables like this one.
| self._c_function = c_function | ||
| self._parser: Parser = Parser(_C_LANGUAGE) | ||
|
|
||
| source = c_function.source_code or c_function.get_original_source_code() |
There was a problem hiding this comment.
I think this functionality should be provided by a method in CFunction. Clients should always use that method (which should side-effect the field if it's not set). Clients should not have to check in two places.
| """Return all first-order mutants of this function. | ||
|
|
||
| Mutants are produced by applying each enabled mutation operator in | ||
| order: AOR → ROR → LCR → CRP → RVR. |
There was a problem hiding this comment.
Is "in order" needed in the spec? That prevents future optimizations, and I don't see why clients care about it.
| list[Mutant]: All AOR mutants. | ||
| """ | ||
| return self._apply_binary_operator_replacements( | ||
| MutationOperator.AOR, self._AOR_REPLACEMENTS |
There was a problem hiding this comment.
It should be enough to pass MutationOperator.AOR, because AOR it should contain the replacements as a field.
…ication into vikram/c_mutator
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (3)
util/c_mutator.py (3)
395-412: 🛠️ Refactor suggestion | 🟠 Major
_get_operator_nodeis overly coupled to_ALL_OPERATOR_REPLACEMENTS.This helper currently returns
Nonefor binary-expression operators outside that global set, which can silently block future replacement-map extensions. Prefer returning the operator token independent of mutation categories and filtering at the replacement-map lookup site.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/c_mutator.py` around lines 395 - 412, The helper _get_operator_node currently filters anonymous children by checking their text against _ALL_OPERATOR_REPLACEMENTS which couples it to mutation categories; instead, have _get_operator_node simply return the first unnamed child node (i.e., the operator token) without consulting _ALL_OPERATOR_REPLACEMENTS so that caller code (where replacement lookups occur) can decide whether the token is supported—remove the text-based check in _get_operator_node and keep any filtering/lookup logic at the replacement-map lookup site that uses _node_text and _ALL_OPERATOR_REPLACEMENTS.
173-196: 🧹 Nitpick | 🔵 TrivialAvoid recomputing all mutants on repeated calls.
get_mutants()andget_mutants_by_operator()both regenerate full mutation sets every call, which is avoidable repeated traversal work.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/c_mutator.py` around lines 173 - 196, get_mutants() and get_mutants_by_operator() recompute the same mutants on every call; cache the results instead: compute once by calling the per-operator helpers (_apply_aor, _apply_ror, _apply_lcr, _apply_crp, _apply_rvr) once, store the combined list in a private attribute (e.g. self._mutants) and the mapping in another private attribute (e.g. self._mutants_by_operator), and have get_mutants() and get_mutants_by_operator() return those cached values; ensure the cache is initialized lazily (compute on first call) and document or provide a single clear/invalidate point if the underlying source can change so subsequent calls recompute.
311-313:⚠️ Potential issue | 🟡 MinorSkip semantically-zero return literals, not only
"0".Line 312 only checks the exact string
"0", so forms like0x0/00still generate no-op RVR mutants.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/c_mutator.py` around lines 311 - 313, The current check compares the raw text from self._node_text(expr_node) to the string "0", which misses other zero literals like 0x0, 00, or suffixed forms; instead, parse the literal text returned by _node_text (strip C integer suffixes like U, L, UL, etc., and underscores), then attempt to interpret it as an integer with base 0 (so hex/octal are handled) and treat it as semantically zero if int(...) == 0; keep a try/except ValueError around the parse and only fall through to mutation when parsing fails or value != 0 to preserve existing behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@util/c_mutator.py`:
- Around line 266-270: The candidate generation currently skips adding literal
(original_value - 1) when original_value == 0; update the logic in
util/c_mutator.py where candidate_set and replacements are built (look for
variables candidate_set, original_value, replacements) to always include
original_value - 1 (i.e., remove the guard that prevents adding it for zero) so
that the literal -1 mutant is produced for 0; keep using a set to avoid
duplicates and then sort into replacements as before.
- Around line 257-261: The int(original_text, 0) call in util/c_mutator.py skips
C-style octal literals like "077"; update the parsing around original_value =
int(original_text, 0) to detect and handle traditional C octal syntax: if
original_text is all digits, starts with '0', and contains only digits 0-7 (and
not just "0"), parse with base 8; otherwise fall back to int(original_text, 0);
keep the existing except ValueError handling for non-integer/floating literals.
Target the original_text parsing logic in util/c_mutator.py (the int(...) and
its except block).
---
Duplicate comments:
In `@util/c_mutator.py`:
- Around line 395-412: The helper _get_operator_node currently filters anonymous
children by checking their text against _ALL_OPERATOR_REPLACEMENTS which couples
it to mutation categories; instead, have _get_operator_node simply return the
first unnamed child node (i.e., the operator token) without consulting
_ALL_OPERATOR_REPLACEMENTS so that caller code (where replacement lookups occur)
can decide whether the token is supported—remove the text-based check in
_get_operator_node and keep any filtering/lookup logic at the replacement-map
lookup site that uses _node_text and _ALL_OPERATOR_REPLACEMENTS.
- Around line 173-196: get_mutants() and get_mutants_by_operator() recompute the
same mutants on every call; cache the results instead: compute once by calling
the per-operator helpers (_apply_aor, _apply_ror, _apply_lcr, _apply_crp,
_apply_rvr) once, store the combined list in a private attribute (e.g.
self._mutants) and the mapping in another private attribute (e.g.
self._mutants_by_operator), and have get_mutants() and get_mutants_by_operator()
return those cached values; ensure the cache is initialized lazily (compute on
first call) and document or provide a single clear/invalidate point if the
underlying source can change so subsequent calls recompute.
- Around line 311-313: The current check compares the raw text from
self._node_text(expr_node) to the string "0", which misses other zero literals
like 0x0, 00, or suffixed forms; instead, parse the literal text returned by
_node_text (strip C integer suffixes like U, L, UL, etc., and underscores), then
attempt to interpret it as an integer with base 0 (so hex/octal are handled) and
treat it as semantically zero if int(...) == 0; keep a try/except ValueError
around the parse and only fall through to mutation when parsing fails or value
!= 0 to preserve existing behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 86d02cec-3cae-4c9e-9fa0-313583e78faa
📒 Files selected for processing (1)
util/c_mutator.py
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@util/mutant/__init__.py`:
- Around line 12-21: The __all__ list is unsorted (RUF022); sort the exported
names alphabetically in the __all__ list so linting passes—reorder the entries
including "ArithmeticOperatorReplacement", "ConstantReplacement", "CMutator",
"LogicalConnectorReplacement", "Mutant", "MutationOperator",
"RelationalOperatorReplacement", and "ReturnValueReplacement" into canonical
alphabetical order.
In `@util/mutant/c_mutator.py`:
- Around line 41-49: Update the CMutator.__init__ docstring to remove references
to non-existent CFunction.set_source_code, a public source_code attribute, and
get_original_source_code(); instead state that the mutator obtains source via
c_function.get_source_code(), which encapsulates caching and reading from the
original file as needed (mention CMutator.__init__ and CFunction.get_source_code
to locate the change).
In `@util/mutant/mutant.py`:
- Around line 47-52: The docstring for the create function incorrectly types
`operator` as MutationOperator while the implementation and usage treat it as a
string; update the create() docstring to state `operator (str)` instead of
`operator (MutationOperator)` so it matches the annotated/used type (refer to
the create function and its operator parameter in mutant.py which uses string
keys like "CRP"/"RVR").
In `@util/mutant/mutation_operator.py`:
- Around line 289-291: The literal-text check comparing original_text == "0"
misses equivalent zero forms; update the check in the mutation loop to detect
number literals whose numeric value is zero instead of relying on exact text:
when expr_node is a number_literal (or its node kind/type), obtain the numeric
value (parse the literal after stripping surrounding parentheses and integer
suffixes/hex prefixes) and skip creating the mutant if the parsed integer equals
0; fall back to trimming whitespace/parentheses for non-number_literal nodes
using node_text(source_bytes, expr_node) to catch simple parenthesized cases
before continuing. Ensure you reference the existing node_text and expr_node
logic and only change the condition that currently uses original_text == "0".
- Around line 215-221: The comment is misleading: int(original_text, 0) can
raise ValueError for both floating-point literals and C integer-suffixed
literals (e.g., 1U, 10ULL) produced by tree-sitter-c's number_literal nodes.
Update the inline comment on the except branch and any docstring referencing
this behavior (around
collect_nodes_by_type/number_literal/node_text/int(original_text, 0)) to state
that both floats and integer-suffixed literals are skipped, and if you want to
support suffixed integers instead, strip and preserve C integer suffixes before
calling int() and reattach them when generating replacements (handle parsing in
the logic around original_value and replacement generation).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b716e863-9a57-4f19-aa14-726ae5f2be8d
📒 Files selected for processing (5)
util/__init__.pyutil/mutant/__init__.pyutil/mutant/c_mutator.pyutil/mutant/mutant.pyutil/mutant/mutation_operator.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
util/mutant/mutation_operator.py (1)
223-227:⚠️ Potential issue | 🟠 MajorNormalize C integer literals before deciding which mutants to emit.
int(original_text, 0)uses Python literal syntax, so CRP skips valid C integers like010,1U, and0x10ULL; RVR can still emit equivalent mutants forreturn 0L;orreturn (0);. Reuse a small C-integer parser for both paths.🐛 Proposed fix
+import re from abc import ABC, abstractmethod from collections.abc import Mapping from types import MappingProxyType @@ from .mutant import Mutant +_C_INTEGER_SUFFIX_RE = re.compile(r"(?i)(?:u(?:ll|l)?|(?:ll|l)u?)$") + + +def _parse_c_integer_literal(text: str) -> int | None: + """Return the value of a C integer literal, or `None` for non-integers.""" + stripped = text.strip() + while stripped.startswith("(") and stripped.endswith(")"): + stripped = stripped[1:-1].strip() + + literal = _C_INTEGER_SUFFIX_RE.sub("", stripped) + try: + if re.fullmatch(r"0[0-7]+", literal): + return int(literal, 8) + return int(literal, 0) + except ValueError: + return None + + class MutationOperator(ABC): @@ original_text = node_text(source_bytes, node) - try: - # base=0 handles hex (0x...), octal (0...), and binary (0b...) literals. - original_value = int(original_text, 0) - except ValueError: - continue # Skip floating-point literals (e.g. 1.0, 2.5f). + original_value = _parse_c_integer_literal(original_text) + if original_value is None: + continue # Skip non-integer literals, such as floating-point values. @@ original_text = node_text(source_bytes, expr_node) - if expr_node.type == "number_literal": - try: - if int(original_text, 0) == 0: - continue # already returns 0; no interesting mutation - except ValueError: - pass # non-integer literal (e.g. 0.0f) — don't skip - elif original_text == "0": + if _parse_c_integer_literal(original_text) == 0: continue # already returns 0; no interesting mutationVerification:
#!/bin/bash # Description: Show C integer forms that Python's int(text, 0) does not parse. python - <<'PY' cases = ["0", "010", "1U", "0L", "0x10ULL", "1.0"] for text in cases: try: print(f"{text}: {int(text, 0)}") except ValueError as exc: print(f"{text}: ValueError ({exc})") PYAlso applies to: 297-304
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/mutant/mutation_operator.py` around lines 223 - 227, The code currently calls int(original_text, 0) (the variables original_text and original_value) which fails to recognize valid C integer suffixes/prefixes (e.g., 010, 1U, 0L, 0x10ULL); replace this parsing with a small C integer normalizer/parser used by both the CRP path and the RVR/mutant-emission path so literals are normalized (strip underscores, parse base prefixes, remove integer suffixes like U/L/ULL) before converting to an integer; update the code that currently does "original_value = int(original_text, 0)" and the corresponding except ValueError: continue logic to call the new parse_normalize_c_int(original_text) helper (or inline equivalent) and handle non-integer literals consistently (e.g., skip floats) — apply the same change to the similar block around the later occurrence mentioned in the comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@util/mutant/mutation_operator.py`:
- Around line 223-227: The code currently calls int(original_text, 0) (the
variables original_text and original_value) which fails to recognize valid C
integer suffixes/prefixes (e.g., 010, 1U, 0L, 0x10ULL); replace this parsing
with a small C integer normalizer/parser used by both the CRP path and the
RVR/mutant-emission path so literals are normalized (strip underscores, parse
base prefixes, remove integer suffixes like U/L/ULL) before converting to an
integer; update the code that currently does "original_value =
int(original_text, 0)" and the corresponding except ValueError: continue logic
to call the new parse_normalize_c_int(original_text) helper (or inline
equivalent) and handle non-integer literals consistently (e.g., skip floats) —
apply the same change to the similar block around the later occurrence mentioned
in the comment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a050e7c8-c16c-494e-b5da-1b77fd4e2cd2
📒 Files selected for processing (3)
util/mutant/c_mutator.pyutil/mutant/mutant.pyutil/mutant/mutation_operator.py
| """Return the names of all directly called functions within a function body. | ||
|
|
||
| Only direct calls (``identifier`` as the function node of a ``call_expression``) are | ||
| Only direct calls (`identifier` as the function node of a `call_expression`) are |
There was a problem hiding this comment.
Minor: these small formatting changes can be committed directly (or, if you prefer, put in a different pull request).
| list[str]: Deduplicated list of callee names in order of first appearance. | ||
| """ | ||
| callee_names: set[str] = set() | ||
| callee_names: list[str] = [] |
There was a problem hiding this comment.
This fix can be put in a separate pull request.
mernst
left a comment
There was a problem hiding this comment.
It would be useful to briefly document why this mutator was created rather than using an existing tool such as:
https://github.com/mull-project/mull
https://github.com/joakim-brannstrom/dextool/tree/master/plugin/mutate
https://github.com/arun-babu/mutate.py
| class MutationOperator(ABC): | ||
| """Mutation operator categories applied during mutation testing. | ||
|
|
||
| Each subtype corresponds to a classical first-order mutation operator: |
There was a problem hiding this comment.
Not every subtype. Not BinaryMutationOperator, for example.
|
|
||
| @property | ||
| def name(self) -> str: | ||
| """Return the name of this mutation operator. |
There was a problem hiding this comment.
The many identical implementations lead me to think you could put this in a superclass and use a field like _REPLACEMENTS to hold the value.
But, I wonder whether the subclasses are needed at all. I think the implementation could contain just BinaryMutationOperator which has one huge _REPLACEMENTS field that contains all replacements, plus a map from operator to the type of replacement (or _REPLACEMENTS could map to a pair of a name and a list of replacement operators). I don't see any use that needs the finer-grained subtypes to exist.
|
|
||
|
|
||
| class ReturnValueReplacement(MutationOperator): | ||
| """RVR: replaces each non-zero return expression with `0`.""" |
There was a problem hiding this comment.
I suggest that RVR should skip any return statement whose expression is a constant (not just those that are zero). Otherwise, both RVR and CRP would replace return 22 by return 0, leading to a duplicate.
| for node in collect_nodes_by_type(tree.root_node, "return_statement"): | ||
| # A return_statement's named children exclude the "return" keyword and ";". | ||
| expr_node: Node | None = next(iter(node.named_children), None) | ||
| if expr_node is None: |
There was a problem hiding this comment.
How can this happen, given the test for "void" return type above?
| *self._apply_lcr(), | ||
| ] | ||
|
|
||
| def _apply_aor(self) -> list[Mutant]: |
There was a problem hiding this comment.
As noted elsewhere, I don't see the need for all the subclasses of BinaryMutationOperator.
|
@mernst I think @vikramnitin9 is best equipped to answer your question regarding why this mutator was built rather than using an off-the-shelf solution |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
util/mutant/mutation_operator.py (1)
237-238:⚠️ Potential issue | 🔴 CriticalCritical:
set.remove(original_value)raisesKeyErrorfor every non-zero literal, breaking CRP.The candidate set
{original_value + 1, 0, original_value - 1}only containsoriginal_valuein the single caseoriginal_value == 0. For every other integer literal (the vast majority in real C code),set.remove(original_value)raisesKeyError, soConstantReplacement.apply()will abort with an unhandled exception andCMutator.get_mutants()will fail whenever the function contains a non-zero integer literal.Quick trace:
orig = 5→{6, 0, 4}.remove(5)→KeyErrororig = 1→{2, 0}.remove(1)→KeyErrororig = 0→{1, 0, -1}.remove(0)→{1, -1}✓ (only this case works)Use
discard, which is a no-op when the element is absent and still strips the0replacement when the literal is already0.🛠️ Proposed fix
# Use a set to avoid emitting duplicate replacement values. # Example: for literal '1', both "replace with 0" and # "replace with original_value - 1" would otherwise add 0 twice. candidate_set: set[int] = {original_value + 1, 0, original_value - 1} - candidate_set.remove(original_value) + # Skip the `0` replacement when the literal is already 0 (the only case + # where original_value appears in the candidate set). discard() is a + # no-op when the value is absent. + candidate_set.discard(original_value)Also recommend adding a CRP regression test over a non-zero literal (e.g., a function returning
42) so this regression is caught in CI.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/mutant/mutation_operator.py` around lines 237 - 238, The candidate set removal in ConstantReplacement.apply() uses set.remove(original_value) which raises KeyError for any non-zero literal; change that call to set.discard(original_value) so missing values are ignored and the replacement set is still computed correctly; update ConstantReplacement.apply() (and any similar logic in the mutation_operator module) accordingly and add a regression test in the CRP test suite that exercises a non-zero literal (e.g., a function returning 42) to ensure CMutator.get_mutants() no longer fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@util/mutant/mutation_operator.py`:
- Around line 237-238: The candidate set removal in ConstantReplacement.apply()
uses set.remove(original_value) which raises KeyError for any non-zero literal;
change that call to set.discard(original_value) so missing values are ignored
and the replacement set is still computed correctly; update
ConstantReplacement.apply() (and any similar logic in the mutation_operator
module) accordingly and add a regression test in the CRP test suite that
exercises a non-zero literal (e.g., a function returning 42) to ensure
CMutator.get_mutants() no longer fails.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: bd56af9f-6b82-460e-b691-46a849fc8b77
📒 Files selected for processing (4)
util/mutant/c_mutator.pyutil/mutant/mutant.pyutil/mutant/mutation_operator.pyutil/tree_sitter_util.py
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
util/mutant/mutation_operator.py (1)
300-305:⚠️ Potential issue | 🟡 MinorReuse CRP’s C-integer normalization before skipping zero returns.
return 0U;/return 0L;currently falls throughValueErrorand emits an equivalentreplacement_expr="0"RVR mutant. Reuse the suffix-stripping and octal-normalization logic from CRP here.Suggested fix
+def _parse_c_integer_literal(text: str) -> int: + """Parse a C integer literal after removing suffixes and normalizing octal syntax.""" + stripped = text.rstrip("uUlL") + if len(stripped) > 1 and stripped[0] == "0" and stripped[1:].isdigit(): + stripped = "0o" + stripped[1:] + return int(stripped, 0) + @@ - stripped = original_text.rstrip("uUlL") - # Normalize C octal (010) to Python octal (0o10); int() with base=0 requires 0o prefix. - if len(stripped) > 1 and stripped[0] == "0" and stripped[1:].isdigit(): - stripped = "0o" + stripped[1:] - try: - original_value = int(stripped, 0) + original_value = _parse_c_integer_literal(original_text) @@ if expr_node.type == "number_literal": try: - if int(original_text, 0) == 0: + if _parse_c_integer_literal(original_text) == 0: continue # already returns 0; no interesting mutation🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/mutant/mutation_operator.py` around lines 300 - 305, The skip-zero check inside the expr_node.type == "number_literal" branch currently just does int(original_text, 0) and falls through on ValueError, causing suffixed integers like "0U"/"0L" to be treated as non-zero; replace that with the same C-integer normalization used by CRP: strip trailing integer suffixes ([uUlL]+, combinations like UL, LU, etc.) from original_text, normalize legacy octal literals by converting a leading zero followed by digits to a Python-parsable "0o" form, then call int(normalized_text, 0) and skip mutation if the result == 0; apply this logic where original_text and replacement_expr are used for RVR mutants so suffix-only zero returns are correctly skipped.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/util/test_c_mutator.py`:
- Around line 24-26: The type annotation for _get_mutants_grouped_by_operator is
incorrect: Mutant.operator is a str and the function returns keys of type str,
so change the return annotation from dict[MutationOperator, list[Mutant]] to
dict[str, list[Mutant]] (update the signature of
_get_mutants_grouped_by_operator accordingly and adjust any imports/aliases if
necessary to satisfy typing checks).
- Around line 152-165: The test test_crp_zero_literal_only_generates_increment
is incorrect about the CRP contract: update the test (still operating on
CMutator and using
_get_mutants_grouped_by_operator(mutator)[MutationOperator.CRP]) to reflect that
CRP produces both "1" and "-1" replacements for original "0"; rename the test to
something like test_crp_zero_literal_generates_increment_and_decrement and add
an assertion that "-1" is present in replacements_for_zero while keeping the
assertion that "0" is not present.
In `@util/mutant/mutation_operator.py`:
- Line 108: Replace tuple-index access to tree-sitter points with the attribute
API: change uses of op_node.start_point[0] and similar tuple-style accesses to
op_node.start_point.row (and use .end_point.row where applicable) in
mutation_operator.py (locations around the code that reference
op_node.start_point and op_node.end_point—e.g., the occurrence setting line =
op_node.start_point[0] and the other occurrences around the reported spots).
Ensure all three instances (the one at the shown spot plus the occurrences near
the other reported lines) use .row consistently to match
util/tree_sitter_util.py and avoid tuple-style indexing.
In `@util/tree_sitter_util.py`:
- Around line 156-160: The nested helper traverse lacks a docstring; add a
concise docstring above the traverse function describing its purpose
(recursively collect AST nodes matching node_type), its parameters
(current_node: Node), its behavior (depth-first recursion over
current_node.children), its side-effects (appends matches to the outer-scope
result list and relies on the enclosing node_type and result variables), and
what it returns (None). Reference the nested function name traverse and the
outer symbols node_type and result so future readers know this helper uses
closure variables rather than returning a value.
---
Duplicate comments:
In `@util/mutant/mutation_operator.py`:
- Around line 300-305: The skip-zero check inside the expr_node.type ==
"number_literal" branch currently just does int(original_text, 0) and falls
through on ValueError, causing suffixed integers like "0U"/"0L" to be treated as
non-zero; replace that with the same C-integer normalization used by CRP: strip
trailing integer suffixes ([uUlL]+, combinations like UL, LU, etc.) from
original_text, normalize legacy octal literals by converting a leading zero
followed by digits to a Python-parsable "0o" form, then call
int(normalized_text, 0) and skip mutation if the result == 0; apply this logic
where original_text and replacement_expr are used for RVR mutants so suffix-only
zero returns are correctly skipped.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e3f608e3-96da-4677-b647-e1901967e159
📒 Files selected for processing (4)
test/util/test_c_mutator.pytranslation/normalization.pyutil/mutant/mutation_operator.pyutil/tree_sitter_util.py
| def _get_mutants_grouped_by_operator(mutator: CMutator) -> dict[MutationOperator, list[Mutant]]: | ||
| mutants = sorted(mutator.get_mutants(), key=lambda m: m.operator) | ||
| return {op: list(group) for op, group in groupby(mutants, lambda m: m.operator)} |
There was a problem hiding this comment.
Annotate grouped mutant keys as str, not MutationOperator.
Mutant.operator is a str, and MutationOperator.AOR/CRP/etc. are string constants, so this helper returns dict[str, list[Mutant]]. The current annotation risks failing make checks.
Suggested fix
-def _get_mutants_grouped_by_operator(mutator: CMutator) -> dict[MutationOperator, list[Mutant]]:
+def _get_mutants_grouped_by_operator(mutator: CMutator) -> dict[str, list[Mutant]]:
mutants = sorted(mutator.get_mutants(), key=lambda m: m.operator)
return {op: list(group) for op, group in groupby(mutants, lambda m: m.operator)}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/util/test_c_mutator.py` around lines 24 - 26, The type annotation for
_get_mutants_grouped_by_operator is incorrect: Mutant.operator is a str and the
function returns keys of type str, so change the return annotation from
dict[MutationOperator, list[Mutant]] to dict[str, list[Mutant]] (update the
signature of _get_mutants_grouped_by_operator accordingly and adjust any
imports/aliases if necessary to satisfy typing checks).
| def test_crp_zero_literal_only_generates_increment() -> None: | ||
| fn = _get_function("data/factorial_iterative.c", "factorial_iter") | ||
| src_with_zero = dedent("""\ | ||
| int f() { | ||
| int x = 0; | ||
| return x; | ||
| } | ||
| """) | ||
| fn.set_source_code(src_with_zero) | ||
| mutator = CMutator(fn) | ||
| crp_mutants = _get_mutants_grouped_by_operator(mutator)[MutationOperator.CRP] | ||
| replacements_for_zero = [m.replacement_expr for m in crp_mutants if m.original_expr == "0"] | ||
| assert "1" in replacements_for_zero | ||
| assert "0" not in replacements_for_zero |
There was a problem hiding this comment.
Align the zero-literal CRP test with the CRP contract.
CRP currently generates both 0 → 1 and 0 → -1 after excluding only the original 0; the test name says “only generates increment” and does not cover -1. Rename it and assert the decrement replacement too.
Suggested fix
-def test_crp_zero_literal_only_generates_increment() -> None:
+def test_crp_zero_literal_generates_non_noop_neighbors() -> None:
@@
replacements_for_zero = [m.replacement_expr for m in crp_mutants if m.original_expr == "0"]
assert "1" in replacements_for_zero
+ assert "-1" in replacements_for_zero
assert "0" not in replacements_for_zero🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/util/test_c_mutator.py` around lines 152 - 165, The test
test_crp_zero_literal_only_generates_increment is incorrect about the CRP
contract: update the test (still operating on CMutator and using
_get_mutants_grouped_by_operator(mutator)[MutationOperator.CRP]) to reflect that
CRP produces both "1" and "-1" replacements for original "0"; rename the test to
something like test_crp_zero_literal_generates_increment_and_decrement and add
an assertion that "-1" is present in replacements_for_zero while keeping the
assertion that "0" is not present.
| op_text = node_text(source_bytes, op_node) | ||
| for replacement in self._REPLACEMENTS.get(op_text, []): | ||
| mutated_src = replace_node(source_bytes, op_node, replacement) | ||
| line = op_node.start_point[0] + 1 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether the installed tree_sitter Point supports tuple indexing,
# and list the remaining tuple-style start_point usages.
python - <<'PY'
try:
from tree_sitter import Point
p = Point(row=1, column=2)
print("Point:", p)
try:
print("p[0] =", p[0])
except Exception as exc:
print("p[0] failed:", type(exc).__name__, exc)
print("p.row =", p.row)
except Exception as exc:
print("tree_sitter import/Point construction failed:", type(exc).__name__, exc)
PY
rg -n 'start_point\[[0-9]+\]' util/mutant/mutation_operator.py util/tree_sitter_util.pyRepository: vikramnitin9/rust_verification
Length of output: 425
🏁 Script executed:
# Check util/tree_sitter_util.py for start_point usage pattern
cat -n util/tree_sitter_util.py | grep -i "start_point"Repository: vikramnitin9/rust_verification
Length of output: 173
🏁 Script executed:
# Search for all start_point usages across the entire codebase
rg -n 'start_point' --type pyRepository: vikramnitin9/rust_verification
Length of output: 473
Use .row for tree-sitter point access.
util/tree_sitter_util.py already uses node.start_point.row; these tuple-style accesses create inconsistency and can break if the installed tree_sitter exposes Point as an attribute-only object. Use the same typed API everywhere.
Suggested change
- line = op_node.start_point[0] + 1
+ line = op_node.start_point.row + 1
@@
- line = node.start_point[0] + 1
+ line = node.start_point.row + 1
@@
- line = expr_node.start_point[0] + 1
+ line = expr_node.start_point.row + 1Also applies to: 243-243, 309-309
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@util/mutant/mutation_operator.py` at line 108, Replace tuple-index access to
tree-sitter points with the attribute API: change uses of op_node.start_point[0]
and similar tuple-style accesses to op_node.start_point.row (and use
.end_point.row where applicable) in mutation_operator.py (locations around the
code that reference op_node.start_point and op_node.end_point—e.g., the
occurrence setting line = op_node.start_point[0] and the other occurrences
around the reported spots). Ensure all three instances (the one at the shown
spot plus the occurrences near the other reported lines) use .row consistently
to match util/tree_sitter_util.py and avoid tuple-style indexing.
| def traverse(current_node: Node) -> None: | ||
| if current_node.type == node_type: | ||
| result.append(current_node) | ||
| for child in current_node.children: | ||
| traverse(child) |
There was a problem hiding this comment.
Document the nested traversal helper.
This new non-test helper contains AST traversal logic but has no docstring. As per coding guidelines, **/!(test)/**/*.py: “All business logic should be documented, even helper code. Test code (i.e., any code under test/) need not be documented.”
Suggested fix
result: list[Node] = []
def traverse(current_node: Node) -> None:
+ """Collect matching nodes under `current_node` in pre-order."""
if current_node.type == node_type:
result.append(current_node)
for child in current_node.children:
traverse(child)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def traverse(current_node: Node) -> None: | |
| if current_node.type == node_type: | |
| result.append(current_node) | |
| for child in current_node.children: | |
| traverse(child) | |
| def traverse(current_node: Node) -> None: | |
| """Collect matching nodes under `current_node` in pre-order.""" | |
| if current_node.type == node_type: | |
| result.append(current_node) | |
| for child in current_node.children: | |
| traverse(child) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@util/tree_sitter_util.py` around lines 156 - 160, The nested helper traverse
lacks a docstring; add a concise docstring above the traverse function
describing its purpose (recursively collect AST nodes matching node_type), its
parameters (current_node: Node), its behavior (depth-first recursion over
current_node.children), its side-effects (appends matches to the outer-scope
result list and relies on the enclosing node_type and result variables), and
what it returns (None). Reference the nested function name traverse and the
outer symbols node_type and result so future readers know this helper uses
closure variables rather than returning a value.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
util/c_function.py (1)
155-161:⚠️ Potential issue | 🟡 MinorDocstring references a non-existent field; also revisit stale
source_code_with_specs.Two concerns on this new
set_source_code:
- Docstring mismatch: the summary says "Set the
source_codefield", but there is nosource_codeattribute onCFunction— the method actually writes_cached_source_code. The description should reflect that this updates the cached source (the same cache populated byget_source_code()at Line 136).- As previously noted, this setter leaves
source_code_with_specsuntouched. If a caller usesset_source_code(...)to override the source after specs were previously cached,get_source_code_with_specs()will return stale content. Giveninvalidate_source_code_fields()already exists at Line 171, consider either invalidatingsource_code_with_specshere or documenting that callers must callinvalidate_source_code_fields()first.✏️ Proposed docstring tweak
def set_source_code(self, source_code: str) -> None: - """Set the `source_code` field. - - Args: - source_code (str): The source code. - """ + """Cache the given source code as this function's source. + + This populates the same cache consulted by `get_source_code()` when no + formatting options are requested. Note: this does not invalidate + `source_code_with_specs`; call `invalidate_source_code_fields()` if a + previously cached spec-augmented source must also be discarded. + + Args: + source_code (str): The source code to cache. + """ self._cached_source_code = source_code🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@util/c_function.py` around lines 155 - 161, The set_source_code method documentation and behavior are inconsistent: it writes to _cached_source_code (used by get_source_code()) but the docstring mentions a non-existent source_code field and it does not clear source_code_with_specs, which can leave get_source_code_with_specs() stale; update the docstring to say it updates the cached source (_cached_source_code) and either call invalidate_source_code_fields() inside set_source_code (to clear source_code_with_specs) or explicitly document in the docstring that callers must call invalidate_source_code_fields() before setting new source; reference set_source_code, _cached_source_code, get_source_code(), get_source_code_with_specs(), source_code_with_specs, and invalidate_source_code_fields() when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@util/c_function.py`:
- Around line 155-161: The set_source_code method documentation and behavior are
inconsistent: it writes to _cached_source_code (used by get_source_code()) but
the docstring mentions a non-existent source_code field and it does not clear
source_code_with_specs, which can leave get_source_code_with_specs() stale;
update the docstring to say it updates the cached source (_cached_source_code)
and either call invalidate_source_code_fields() inside set_source_code (to clear
source_code_with_specs) or explicitly document in the docstring that callers
must call invalidate_source_code_fields() before setting new source; reference
set_source_code, _cached_source_code, get_source_code(),
get_source_code_with_specs(), source_code_with_specs, and
invalidate_source_code_fields() when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 490a4f25-48f3-4bdb-8f56-8d78fa233d54
📒 Files selected for processing (1)
util/c_function.py
Addresses #127.