Skip to content

Add f-string parser/printer support and unused-variable awareness#1465

Open
nmedveditsko-nvidia wants to merge 7 commits into
bazelbuild:mainfrom
nmedveditsko-nvidia:feat/f-string-starlark
Open

Add f-string parser/printer support and unused-variable awareness#1465
nmedveditsko-nvidia wants to merge 7 commits into
bazelbuild:mainfrom
nmedveditsko-nvidia:feat/f-string-starlark

Conversation

@nmedveditsko-nvidia
Copy link
Copy Markdown

@nmedveditsko-nvidia nmedveditsko-nvidia commented May 18, 2026

Description

Implements parser-level support for Python-style f-string literals (f"..."), addressing #1278. F-string fields are treated as opaque text: the lexer accepts the "f" prefix, the parser stores the literal value with braces intact, and the printer round-trips the prefix. Field contents are not parsed as Starlark expressions.

Parser/AST:

  • Lexer recognizes "f" as a string prefix alongside "r".
  • StringExpr gains a Prefix field, populated by the lexer for both "r" and "f". This gives consumers a single semantic field to query; the printer continues to consult Token for round-trip preservation. The two checks are now symmetric.
  • Unquote handles the "f" prefix the same as a non-raw string (escapes are interpreted; {...} fields ride through verbatim).

Printer:

  • f-string literals are preserved via Token when possible, mirroring raw-string handling. When Token cannot be reused, the printer regenerates the literal preserving the "f" prefix.
  • Single-quoted f-strings without inner double quotes are canonicalized to double quotes, matching the existing raw-string rule.

Rewriter:

  • deduplicateStringExprs, sortStringExprs, uniq, and isUniq factor StringExpr.Prefix into comparison so that f"x" and "x" are not treated as duplicates.

Linter (warn/warn_control_flow.go):

  • The unused-variable check scans {...} fields of f-strings with a permissive regex and marks the identifiers it finds as used. Comprehension and lambda scopes shadow correctly. Intentionally permissive: false positives only suppress warnings; false negatives would resurface the original regression.

Tests: f-string parsing, printing (including programmatic fallback), unquoting, identifier resolution under shadowing, and known acceptable false-positives (format-spec characters) are all covered.

Refs #1278

Buildtools PR checklist

  • The code in this PR is covered by unit/integration tests.
  • I have tested these changes and provide testing instructions below.
  • I have either responded to, or resolved all Gemini comments on the PR.
  • I have read Google Eng Practices on Small Changes, this PR either follows these guidelines or the description provides reasoning for why they can not be followed.

These changes were tested using the following steps

  • formatted existing .star files with f-strings in them
  • run the tests

Implements parser-level support for Python-style f-string literals
(f"..."), addressing bazelbuild#1278. F-string fields are treated as opaque
text: the lexer accepts the "f" prefix, the parser stores the
literal value with braces intact, and the printer round-trips the
prefix. Field contents are not parsed as Starlark expressions.

Parser/AST:
- Lexer recognizes "f" as a string prefix alongside "r".
- StringExpr gains a Prefix field, populated by the lexer for both
  "r" and "f". This gives consumers a single semantic field to
  query; the printer continues to consult Token for round-trip
  preservation. The two checks are now symmetric.
- Unquote handles the "f" prefix the same as a non-raw string
  (escapes are interpreted; {...} fields ride through verbatim).

Printer:
- f-string literals are preserved via Token when possible, mirroring
  raw-string handling. When Token cannot be reused, the printer
  regenerates the literal preserving the "f" prefix.
- Single-quoted f-strings without inner double quotes are
  canonicalized to double quotes, matching the existing raw-string
  rule.

Rewriter:
- deduplicateStringExprs, sortStringExprs, uniq, and isUniq factor
  StringExpr.Prefix into comparison so that f"x" and "x" are not
  treated as duplicates.

Linter (warn/warn_control_flow.go):
- The unused-variable check scans {...} fields of f-strings with a
  permissive regex and marks the identifiers it finds as used.
  Comprehension and lambda scopes shadow correctly. Intentionally
  permissive: false positives only suppress warnings; false negatives
  would resurface the original regression.

Tests: f-string parsing, printing (including programmatic fallback),
unquoting, identifier resolution under shadowing, and known
acceptable false-positives (format-spec characters) are all covered.

Refs bazelbuild#1278
@nmedveditsko-nvidia nmedveditsko-nvidia requested a review from a team as a code owner May 18, 2026 22:39
@nmedveditsko-nvidia nmedveditsko-nvidia requested review from vladmos and removed request for a team May 18, 2026 22:39
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements support for f-strings across the build tools, including updates to the lexer, parser, printer, and string utilities. It also enhances the unused-variable warning logic to detect identifiers referenced within f-string fields. Review feedback highlights a potential backward compatibility issue in the printer for existing ASTs, a change in string deduplication behavior for raw strings, and the need for the f-string identifier regex to support non-ASCII characters and better handle attributes.

Comment thread build/print.go Outdated
Comment thread build/rewrite.go Outdated
Comment thread warn/warn_control_flow.go Outdated
The addition of v.Prefix == "r" to this condition introduces a backward compatibility issue for the library. Existing ASTs constructed manually or by older versions of the library may have a valid raw string in the Token field but an empty Prefix field. These strings will now lose their r prefix during formatting and fall back to standard escaped strings. Similarly, f-strings with an empty Prefix but a valid f-prefixed Token will lose their f prefix, which is a semantic change.

Consider determining an effective prefix that defaults to the one found in v.Token if v.Prefix is empty to maintain compatibility with existing in-memory ASTs.
+ attribute names after a dot are not variable references
@nmedveditsko-nvidia
Copy link
Copy Markdown
Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements support for f-strings across the build tool's pipeline, including lexing, parsing, printing, and AST manipulation. It ensures f-strings are correctly distinguished from regular strings during deduplication and sorting, and adds logic to the static analysis pass to identify variables used within f-string interpolations. The review feedback suggests enhancing the f-string identifier extraction logic by making the brace-depth tracking aware of quotes, which would prevent potential issues when string literals inside f-strings contain braces.

Comment thread warn/warn_control_flow.go
@nmedveditsko-nvidia
Copy link
Copy Markdown
Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for f-strings (formatted strings) across the build tool's core components, including the lexer, parser, and printer. It adds a Prefix field to the StringExpr syntax node and updates the 'unused-variable' linting logic to correctly identify variables referenced within f-string interpolations. Furthermore, the deduplication and sorting logic for string expressions has been refined to distinguish f-strings from regular strings. I have no feedback to provide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant