Scalar-aware analyzer mode for undefined/out-of-scope analysis #149

@jbearak

Context

Follow-up to the out-of-scope diagnostic cleanup spec (2026-04-22). That spec intentionally keeps the undefined-variable rewrite path narrow: a variable-kind reference only matches an out-of-scope `variable`, never a `scalar`, `matrix`, or `program`.

This issue tracks an optional future enhancement: make the analyzer / out-of-scope matcher scalar-aware.

The idea

In real Stata code, a bare identifier in an expression may refer to a scalar rather than a dataset variable. For example, in `display x + 1`, `x` may have been defined via `scalar x = 42`. The current analyzer only emits `UNDEFINED_VARIABLE` for varlist positions, so expression-position bare identifiers are not diagnosed today. A scalar-aware mode could:

  • Detect bare identifiers in expression positions where the context makes them clearly scalar-like (e.g., the arguments of `display` or its abbreviation `di`, the arguments of `assert`, and the RHS of `generate`/`replace`).
  • Treat such references as scalar references for the purpose of undefined-symbol and out-of-scope matching, which is closer in spirit to global-macro analysis than to variable analysis.
  • Ship behind a new user-facing setting (e.g., `sight.diagnostics.scalarAwareness: 'off' | 'strict'`) so users can opt into the stricter signal.
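As a rough illustration of the first and third bullets, the sketch below shows one way expression-position identifiers might be collected when the setting is `'strict'`. It is not the analyzer's actual design: the function name `scalar_like_identifiers`, the `EXPRESSION_COMMANDS` set, and the regex-based scanning are all assumptions made for this example. It only handles `display`/`di` and `assert`, and it does not understand `display` directives such as `as text` or the RHS of `generate`/`replace`.

```python
import re

# Hypothetical: commands whose arguments are expressions rather than varlists.
EXPRESSION_COMMANDS = {"display", "di", "assert"}

_STRING = re.compile(r'"[^"]*"')                 # double-quoted string literals
_MACRO = re.compile(r"`[^']*'|\$\{?\w+\}?")      # local and global macro references
_STORED = re.compile(r"\b(?:r|e|s|c)\([^)]*\)")  # stored results such as r(mean)
# A bare identifier: not part of a number or dotted name, not a function call.
_IDENT = re.compile(r"(?<![\w.])([A-Za-z_]\w*)\b(?!\s*\()")


def scalar_like_identifiers(line: str, mode: str = "off") -> list[str]:
    """Return bare identifiers in expression position, in order of appearance.

    Returns an empty list unless mode is 'strict' and the line starts with
    one of the commands in EXPRESSION_COMMANDS.
    """
    if mode != "strict":
        return []
    parts = line.strip().split(None, 1)
    if len(parts) < 2 or parts[0] not in EXPRESSION_COMMANDS:
        return []
    expr = parts[1]
    # Blank out spans that can never be bare scalar references.
    for pattern in (_STRING, _MACRO, _STORED):
        expr = pattern.sub(" ", expr)
    return _IDENT.findall(expr)
```

With the default `'off'`, the function reports nothing, which matches today's behavior of not diagnosing expression-position identifiers.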

Why deferred

  • Requires new analyzer work (identifying scalar-like positions) that is out of scope for the current cleanup.
  • The scope-resolver already carries scalar out-of-scope entries, but enabling the matcher for them today would produce false positives: the ref-kind classifier has no way to distinguish variable-kind references from scalar-kind references.
  • Bare-identifier disambiguation has corner cases. For example, a user could shadow a scalar with an identically named variable, which is bad practice but legal. We do not need to resolve every pathological case before shipping a useful strictness option.

Scope when revisited

  • Analyzer: detect scalar-like reference positions and emit a new diagnostic code (e.g., `UNDEFINED_SCALAR`) or reuse the existing code with a side-band kind.
  • Provider: extend `classify_reference_kind` to return `'scalar'`; extend `out_of_scope_type_matches_reference` to allow `'scalar'` references to match `'scalar'` out-of-scope entries. No change to variable-kind matching.
  • Config: add the new severity / opt-in setting with a conservative default.
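The Provider change above can be sketched as follows. This is a minimal illustration under stated assumptions, not the existing code: the real signatures of `classify_reference_kind` and `out_of_scope_type_matches_reference` are not shown in this issue, so the parameters here (a diagnostic code string and an out-of-scope entry type) are hypothetical, as is the choice of a separate `UNDEFINED_SCALAR` code over a side-band kind.

```python
from typing import Literal, Optional

RefKind = Literal["variable", "scalar"]
OutOfScopeType = Literal["variable", "scalar", "matrix", "program"]


def classify_reference_kind(diagnostic_code: str) -> Optional[RefKind]:
    """Map a diagnostic code to a reference kind (hypothetical signature)."""
    if diagnostic_code == "UNDEFINED_VARIABLE":
        return "variable"
    if diagnostic_code == "UNDEFINED_SCALAR":  # assumed new code from the Analyzer bullet
        return "scalar"
    return None


def out_of_scope_type_matches_reference(
    entry_type: OutOfScopeType, ref_kind: Optional[RefKind]
) -> bool:
    """A reference only matches an out-of-scope entry of the same kind.

    Variable-kind matching is unchanged: 'variable' still never matches
    'scalar', 'matrix', or 'program' entries.
    """
    if ref_kind is None:
        return False
    return entry_type == ref_kind
```

Because the match is a strict same-kind comparison, adding `'scalar'` cannot widen what a variable-kind reference matches, which keeps the narrow behavior the cleanup spec asked for.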

Labels: enhancement
