Restore per-context layer-output shape via configurable selective k-CFA by khatchad · Pull Request #371 · ponder-lab/ML

khatchad · 2026-06-09T21:32:34Z

Summary

A user tf.keras.Model subclass invoked from multiple sites (e.g. a training loop and a test-set evaluation) merged into one context-insensitive call-graph node, collapsing its per-caller layer-output allocations and losing per-context tensor shape. Downstream, tf.argmax of a collapsed layer output leaked one caller's shape into another context and threw a NonBroadcastableShapesException in tf.equal, which is why Argmax.getDefaultShapes returned ⊤ to sidestep the regression.

This PR resolves that by making the targeted k-CFA depth configurable (wala#379) and applying deeper context to user model forward methods (wala#530), then restoring the precise tf.argmax/tf.argmin output shape.

Changes

Configurable depth (Make k-CFA depth configurable for framework API context sensitivity wala/ML#379): the hard-coded 2 on the targeted context selector becomes DEFAULT_TARGETED_CFA_DEPTH (default 2, fully back-compat), exposed as a PythonTensorAnalysisEngine constructor parameter. Clients supply a deeper value when allocations merge across call sites; only the targeted selector (framework APIs plus model methods) uses it, never the whole program.
Model-aware context (Restore precise output shape on per-context layer-output allocations (testNeuralNetwork style) wala/ML#530): the targeted predicate is extended to user tf.keras.Model forward methods so a model called from multiple sites is analyzed per caller, keeping its layer-output allocations distinct.
Precise tf.argmax/tf.argmin shape: Argmax.getDefaultShapes now delegates to ReduceMean's keepdims=false reduction (input shape with the axis dimension removed), since the per-context separation removes the broadcast regression that previously forced ⊤.
Test harness: depth-parameterized makeEngine/test overloads (mirroring the analogous knob in the Java streams tooling), so testNeuralNetwork1-4 opt into depth 4 while every other test stays at the default.

Detection: Heuristic vs CHA-subtype (Dormant Path)

The clean way to recognize a user model method is a class-hierarchy subtype check against tf.keras.Model. That check is wired but intentionally dormant: the front-end does not yet record class X(Model) inheritance, so a user model class resolves its superclass to Lobject rather than the Keras Model type (filed as wala#571, an existing front-end TODO). Until that lands, a structural heuristic over user call/__call__/predict methods drives the predicate. When wala#571 is resolved, the heuristic is deleted and isModelSubclassMethod takes over—both are present so the switch is a one-line change.

Testing

Full suite green (all modules, mvn test).
testNeuralNetwork1-4 run at depth 4: accuracy's y_pred is now the correct per-context union {(256, 10) float32, ? float32}, and the per-context multiplicity for testNeuralNetwork rises to 24.
The six tf.argmax/tf.argmin shape tests tighten from ⊤ to the precise (3,) the fixtures assert at runtime.

Closes wala#379.
Closes wala#530.

Lift the hard-coded 2-CFA depth on the targeted context selector to a configurable `DEFAULT_TARGETED_CFA_DEPTH` (default 2, back-compat), exposed as a `PythonTensorAnalysisEngine` constructor parameter so clients can request deeper context when allocations merge across call sites. Extend the targeted predicate beyond framework APIs to user `tf.keras.Model` forward methods: a model invoked from multiple sites (train vs. test) otherwise merges into one context-insensitive node, collapsing its per-caller layer-output allocations and losing per-context shape. The CHA-subtype check is wired but dormant because the front-end does not yet record `class X(Model)` inheritance (superclass resolves to `Lobject`, wala#571); a structural heuristic over user `call`/`__call__`/`predict` methods is active until that lands. The test harness gains depth-parameterized `makeEngine`/`test` overloads, mirroring the Java streams tooling, so `testNeuralNetwork1-4` opt into depth 4 while every other test stays at the default. Closes wala#379. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

With per-context layer-output allocations now separated by the targeted k-CFA depth, un-nulling `Argmax.getDefaultShapes` no longer regresses `testNeuralNetwork*` via `ElementWiseOperation`'s cross-context cartesian pair. `Argmax`/`Argmin` delegate to `ReduceMean`'s keepdims=false reduction, emitting the input shape with the `axis` dimension removed. The six argmax/argmin shape tests tighten from ⊤ to the precise `(3,)` shape the fixtures assert at runtime, resolving the captured-imprecise TODOs that were gated on this fix. Closes wala#530. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR makes the call-graph context sensitivity for targeted framework/model methods configurable to restore per-caller tensor-shape precision (notably for user tf.keras.Model subclasses invoked from multiple sites), and then tightens tf.argmax/tf.argmin shape inference back to the precise reduced shape.

Changes:

Add PythonTensorAnalysisEngine.DEFAULT_TARGETED_CFA_DEPTH (default 2) and constructor parameters to configure the targeted k-CFA depth.
Extend the “targeted” predicate to include user model forward methods (via CHA-subtype check with a structural heuristic fallback).
Restore precise argmax/argmin output shapes and update tests to use deeper k-CFA (depth 4) for the affected neural-network fixtures.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/PythonTensorAnalysisEngine.java	Introduces configurable targeted k-CFA depth and expands targeted method selection to cover user model forward methods.
com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/Argmax.java	Switches argmax default shape from ⊤/unknown to a reduction-based precise shape.
com.ibm.wala.cast.python.ml.test/source/com/ibm/wala/cast/python/ml/test/TestTensorflow2Model.java	Updates neural-network tests to run with depth 4 and tightens argmax/argmin shape assertions to precise shapes.
com.ibm.wala.cast.python.ml.test/source/com/ibm/wala/cast/python/ml/test/TestPythonMLCallGraphShape.java	Adds depth-parameterized `makeEngine` helper to let specific tests opt into deeper targeted context sensitivity.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

codecov · 2026-06-09T21:51:46Z

Codecov Report

❌ Patch coverage is 87.91209% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.67%. Comparing base (6381dec) to head (be43d95).

Files with missing lines	Patch %	Lines
...t/python/ml/client/PythonTensorAnalysisEngine.java	78.94%	2 Missing and 6 partials ⚠️
...com/ibm/wala/cast/python/ml/client/ReduceMean.java	86.66%	2 Missing ⚠️
...wala/cast/python/ml/test/TestTensorflow2Model.java	96.77%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff            @@
##             master     #371   +/-   ##
=========================================
  Coverage     71.66%   71.67%           
- Complexity     2712     2731   +19     
=========================================
  Files           272      272           
  Lines         20292    20347   +55     
  Branches       3273     3283   +10     
=========================================
+ Hits          14542    14583   +41     
- Misses         4459     4470   +11     
- Partials       1291     1294    +3

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Address PR #371 review feedback. - Reject a negative `targetedCfaDepth` at construction with an `IllegalArgumentException` via a shared `checkTargetedCfaDepth` guard, document the non-negative range, and cover it with a test. - Document that `Argmax`/`Argmin` shape inference does not honor the deprecated TF 1.x `dimension` axis alias (it reduces as `axis=None`); honoring it precisely is tracked by wala#572. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Routing every user `call`/`__call__`/`predict` method through the targeted k-CFA selector at the default depth added context-sensitivity suite-wide without retaining enough caller frames to fix the chained-layer collapse (which needs a depth above the default), inflating analysis time and call graphs for every model test. Apply the model-method predicate only when the caller opts into a depth greater than `DEFAULT_TARGETED_CFA_DEPTH`, so the default-depth behavior matches the prior context-insensitive analysis while `testNeuralNetwork*` (depth 4) keep the per-context layer-output fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Several `findOrCreatePointsToSet` call sites in `PythonTensorAnalysisEngine` (`getShapeSourceCalls`, `getSetShapeCallsSyntactic`, `getKeysDefinedByCall`) and `TensorGeneratorFactory.getPointsToSetVariable` materialized points-to-set variables for keys that can be implicitly represented. On an implicit key, WALA's `PropagationSystem.findOrCreatePointsToSet` prints the entire call graph's IR to stderr (a no-op-assertion debug path) and then continues. With the deeper context sensitivity added by the preceding commits, a single `testNeuralNetwork` hit this path ~911 times, dumping ~8.46M lines and dominating CI log I/O (the `verify` step ran ~4.5x longer). Guard these sites with `PropagationSystem.isImplicit`, matching the existing guards elsewhere in the analysis; an implicit key has no explicit dataflow variable to pin or track, so skipping it preserves behavior (full suite unchanged, and `testNeuralNetwork`'s output drops from ~15.3M lines to ~43K). Closes wala#573. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

This reverts commit fc20a93. The gate was added on the hypothesis that the model-method predicate caused the CI slowdown. It did not: the slowdown was WALA dumping the entire call graph's IR on implicitly-represented PointerKeys, fixed in the preceding commit (wala#573). The gate is perf-neutral (measured), so drop it and let the model-method predicate apply unconditionally for simpler code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

`Argmax` delegates shape inference to `ReduceMean`, which reads its `keepdims` argument at positional index 2. For `tf.math.argmax`, index 2 is `output_type`, so a positional call like `tf.argmax(x, 0, tf.int32)` had its `output_type` misread as `keepdims`: a dtype constant is neither boolean nor numeric, so `ReduceMean` widened `keepdims` to `{false, true}` and unioned a spurious `keepdims=true` shape (e.g. `(1, 3)`). Extract `ReduceMean.getKeepDimsValues` and override it in `Argmax` to force `{false}` — argmax/argmin have no `keepdims` parameter and always remove the scanned axis. Add `testArgmaxOutputTypePositional` as a regression guard. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…wardMethod` Replace the `"call"` and `"__call__"` string literals with the existing `PythonTypes.CALLABLE_METHOD_NAME_FOR_KERAS_MODELS` and `PythonTypes.CALLABLE_METHOD_NAME` constants so the heuristic and the front-end share a single source of truth for the method names. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

…s in `ReduceMean` `getKeepDimsValues`'s `ConstantKey` branch handled only `Boolean`, `Number`, and `null`, silently dropping any other constant and collapsing to `{false}`, contrary to the method's Javadoc. Add the missing `else` so such a constant widens to `{false, true}` for soundness, matching the non-constant case. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

`Argmax.getDefaultShapes` delegates to `ReduceMean`, which resolved the input argument by the keyword name `input_tensor`. Argmax's parameter is named `input`, so a keyword call like `tf.math.argmax(input=x, axis=0)` failed argument resolution and threw `IllegalStateException` during shape inference. Make the input-parameter name and position overridable in `ReduceMean` and override them in `Argmax`, keeping the reduction logic in one place. Add `tf2_test_argmax_input_keyword.py` and `testArgmaxInputKeyword` as a regression guard. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

khatchad and others added 2 commits June 9, 2026 17:31

Copilot AI review requested due to automatic review settings June 9, 2026 21:32

Copilot started reviewing on behalf of khatchad June 9, 2026 21:32 View session

Copilot AI reviewed Jun 9, 2026

View reviewed changes

Comment thread ...ala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/PythonTensorAnalysisEngine.java

Comment thread com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/Argmax.java

khatchad and others added 2 commits June 9, 2026 18:00

Copilot AI review requested due to automatic review settings June 9, 2026 23:53

Copilot started reviewing on behalf of khatchad June 9, 2026 23:54 View session

Copilot AI reviewed Jun 9, 2026

View reviewed changes

Comment thread com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/Argmax.java

khatchad and others added 2 commits June 9, 2026 21:02

Copilot AI review requested due to automatic review settings June 10, 2026 01:03

Copilot started reviewing on behalf of khatchad June 10, 2026 01:04 View session

Copilot AI reviewed Jun 10, 2026

View reviewed changes

Comment thread com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/Argmax.java

khatchad and others added 2 commits June 10, 2026 10:06

Copilot AI review requested due to automatic review settings June 10, 2026 18:37

Copilot started reviewing on behalf of khatchad June 10, 2026 18:37 View session

khatchad mentioned this pull request Jun 10, 2026

Forward a targeted k-CFA depth to the Ariadne engine for per-context input-signature precision ponder-lab/Hybridize-Functions-Refactoring#600

Open

Copilot AI reviewed Jun 10, 2026

View reviewed changes

Comment thread com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/ReduceMean.java

Comment thread com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/Argmax.java

This was referenced Jun 10, 2026

Release the next version with the configurable targeted k-CFA depth wala/ML#575

Open

Incorporate the next Ariadne release ponder-lab/Hybridize-Functions-Refactoring#603

Open

khatchad and others added 2 commits June 10, 2026 15:44

khatchad added this pull request to the merge queue Jun 10, 2026

Merged via the queue into master with commit 51d93bb Jun 10, 2026
13 checks passed

khatchad deleted the ml-530-layer-output-shape branch June 10, 2026 19:57

khatchad mentioned this pull request Jun 10, 2026

Correct the implicit-key skip comments: the debug dump is unconditional, not a no-op assertion #372

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Restore per-context layer-output shape via configurable selective k-CFA#371

Restore per-context layer-output shape via configurable selective k-CFA#371
khatchad merged 10 commits into
masterfrom
ml-530-layer-output-shape

khatchad commented Jun 9, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

khatchad commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Detection: Heuristic vs CHA-subtype (Dormant Path)

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

khatchad commented Jun 9, 2026 •

edited

Loading

codecov Bot commented Jun 9, 2026 •

edited

Loading