Restore per-context layer-output shape via configurable selective k-CFA#371
Conversation
Lift the hard-coded 2-CFA depth on the targeted context selector to a configurable `DEFAULT_TARGETED_CFA_DEPTH` (default 2, back-compat), exposed as a `PythonTensorAnalysisEngine` constructor parameter so clients can request deeper context when allocations merge across call sites. Extend the targeted predicate beyond framework APIs to user `tf.keras.Model` forward methods: a model invoked from multiple sites (train vs. test) otherwise merges into one context-insensitive node, collapsing its per-caller layer-output allocations and losing per-context shape. The CHA-subtype check is wired but dormant because the front-end does not yet record `class X(Model)` inheritance (superclass resolves to `Lobject`, wala#571); a structural heuristic over user `call`/`__call__`/`predict` methods is active until that lands. The test harness gains depth-parameterized `makeEngine`/`test` overloads, mirroring the Java streams tooling, so `testNeuralNetwork1-4` opt into depth 4 while every other test stays at the default. Closes wala#379. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
With per-context layer-output allocations now separated by the targeted k-CFA depth, un-nulling `Argmax.getDefaultShapes` no longer regresses `testNeuralNetwork*` via `ElementWiseOperation`'s cross-context cartesian pair. `Argmax`/`Argmin` delegate to `ReduceMean`'s keepdims=false reduction, emitting the input shape with the `axis` dimension removed. The six argmax/argmin shape tests tighten from ⊤ to the precise `(3,)` shape the fixtures assert at runtime, resolving the captured-imprecise TODOs that were gated on this fix. Closes wala#530. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR makes the call-graph context sensitivity for targeted framework/model methods configurable to restore per-caller tensor-shape precision (notably for user tf.keras.Model subclasses invoked from multiple sites), and then tightens tf.argmax/tf.argmin shape inference back to the precise reduced shape.
Changes:
- Add
PythonTensorAnalysisEngine.DEFAULT_TARGETED_CFA_DEPTH(default 2) and constructor parameters to configure the targeted k-CFA depth. - Extend the “targeted” predicate to include user model forward methods (via CHA-subtype check with a structural heuristic fallback).
- Restore precise
argmax/argminoutput shapes and update tests to use deeper k-CFA (depth 4) for the affected neural-network fixtures.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/PythonTensorAnalysisEngine.java | Introduces configurable targeted k-CFA depth and expands targeted method selection to cover user model forward methods. |
| com.ibm.wala.cast.python.ml/source/com/ibm/wala/cast/python/ml/client/Argmax.java | Switches argmax default shape from ⊤/unknown to a reduction-based precise shape. |
| com.ibm.wala.cast.python.ml.test/source/com/ibm/wala/cast/python/ml/test/TestTensorflow2Model.java | Updates neural-network tests to run with depth 4 and tightens argmax/argmin shape assertions to precise shapes. |
| com.ibm.wala.cast.python.ml.test/source/com/ibm/wala/cast/python/ml/test/TestPythonMLCallGraphShape.java | Adds depth-parameterized makeEngine helper to let specific tests opt into deeper targeted context sensitivity. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #371 +/- ##
=========================================
Coverage 71.66% 71.67%
- Complexity 2712 2731 +19
=========================================
Files 272 272
Lines 20292 20347 +55
Branches 3273 3283 +10
=========================================
+ Hits 14542 14583 +41
- Misses 4459 4470 +11
- Partials 1291 1294 +3 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Address PR #371 review feedback. - Reject a negative `targetedCfaDepth` at construction with an `IllegalArgumentException` via a shared `checkTargetedCfaDepth` guard, document the non-negative range, and cover it with a test. - Document that `Argmax`/`Argmin` shape inference does not honor the deprecated TF 1.x `dimension` axis alias (it reduces as `axis=None`); honoring it precisely is tracked by wala#572. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Routing every user `call`/`__call__`/`predict` method through the targeted k-CFA selector at the default depth added context-sensitivity suite-wide without retaining enough caller frames to fix the chained-layer collapse (which needs a depth above the default), inflating analysis time and call graphs for every model test. Apply the model-method predicate only when the caller opts into a depth greater than `DEFAULT_TARGETED_CFA_DEPTH`, so the default-depth behavior matches the prior context-insensitive analysis while `testNeuralNetwork*` (depth 4) keep the per-context layer-output fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Several `findOrCreatePointsToSet` call sites in `PythonTensorAnalysisEngine` (`getShapeSourceCalls`, `getSetShapeCallsSyntactic`, `getKeysDefinedByCall`) and `TensorGeneratorFactory.getPointsToSetVariable` materialized points-to-set variables for keys that can be implicitly represented. On an implicit key, WALA's `PropagationSystem.findOrCreatePointsToSet` prints the entire call graph's IR to stderr (a no-op-assertion debug path) and then continues. With the deeper context sensitivity added by the preceding commits, a single `testNeuralNetwork` hit this path ~911 times, dumping ~8.46M lines and dominating CI log I/O (the `verify` step ran ~4.5x longer). Guard these sites with `PropagationSystem.isImplicit`, matching the existing guards elsewhere in the analysis; an implicit key has no explicit dataflow variable to pin or track, so skipping it preserves behavior (full suite unchanged, and `testNeuralNetwork`'s output drops from ~15.3M lines to ~43K). Closes wala#573. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This reverts commit fc20a93. The gate was added on the hypothesis that the model-method predicate caused the CI slowdown. It did not: the slowdown was WALA dumping the entire call graph's IR on implicitly-represented PointerKeys, fixed in the preceding commit (wala#573). The gate is perf-neutral (measured), so drop it and let the model-method predicate apply unconditionally for simpler code. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`Argmax` delegates shape inference to `ReduceMean`, which reads its `keepdims`
argument at positional index 2. For `tf.math.argmax`, index 2 is `output_type`,
so a positional call like `tf.argmax(x, 0, tf.int32)` had its `output_type`
misread as `keepdims`: a dtype constant is neither boolean nor numeric, so
`ReduceMean` widened `keepdims` to `{false, true}` and unioned a spurious
`keepdims=true` shape (e.g. `(1, 3)`).
Extract `ReduceMean.getKeepDimsValues` and override it in `Argmax` to force
`{false}` — argmax/argmin have no `keepdims` parameter and always remove the
scanned axis. Add `testArgmaxOutputTypePositional` as a regression guard.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…wardMethod` Replace the `"call"` and `"__call__"` string literals with the existing `PythonTypes.CALLABLE_METHOD_NAME_FOR_KERAS_MODELS` and `PythonTypes.CALLABLE_METHOD_NAME` constants so the heuristic and the front-end share a single source of truth for the method names. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s in `ReduceMean`
`getKeepDimsValues`'s `ConstantKey` branch handled only `Boolean`, `Number`, and `null`, silently dropping any other constant and collapsing to `{false}`, contrary to the method's Javadoc. Add the missing `else` so such a constant widens to `{false, true}` for soundness, matching the non-constant case.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`Argmax.getDefaultShapes` delegates to `ReduceMean`, which resolved the input argument by the keyword name `input_tensor`. Argmax's parameter is named `input`, so a keyword call like `tf.math.argmax(input=x, axis=0)` failed argument resolution and threw `IllegalStateException` during shape inference. Make the input-parameter name and position overridable in `ReduceMean` and override them in `Argmax`, keeping the reduction logic in one place. Add `tf2_test_argmax_input_keyword.py` and `testArgmaxInputKeyword` as a regression guard. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
A user
tf.keras.Modelsubclass invoked from multiple sites (e.g. a training loop and a test-set evaluation) merged into one context-insensitive call-graph node, collapsing its per-caller layer-output allocations and losing per-context tensor shape. Downstream,tf.argmaxof a collapsed layer output leaked one caller's shape into another context and threw aNonBroadcastableShapesExceptionintf.equal, which is whyArgmax.getDefaultShapesreturned ⊤ to sidestep the regression.This PR resolves that by making the targeted k-CFA depth configurable (wala#379) and applying deeper context to user model forward methods (wala#530), then restoring the precise
tf.argmax/tf.argminoutput shape.Changes
2on the targeted context selector becomesDEFAULT_TARGETED_CFA_DEPTH(default 2, fully back-compat), exposed as aPythonTensorAnalysisEngineconstructor parameter. Clients supply a deeper value when allocations merge across call sites; only the targeted selector (framework APIs plus model methods) uses it, never the whole program.testNeuralNetworkstyle) wala/ML#530): the targeted predicate is extended to usertf.keras.Modelforward methods so a model called from multiple sites is analyzed per caller, keeping its layer-output allocations distinct.tf.argmax/tf.argminshape:Argmax.getDefaultShapesnow delegates toReduceMean's keepdims=false reduction (input shape with theaxisdimension removed), since the per-context separation removes the broadcast regression that previously forced ⊤.makeEngine/testoverloads (mirroring the analogous knob in the Java streams tooling), sotestNeuralNetwork1-4opt into depth 4 while every other test stays at the default.Detection: Heuristic vs CHA-subtype (Dormant Path)
The clean way to recognize a user model method is a class-hierarchy subtype check against
tf.keras.Model. That check is wired but intentionally dormant: the front-end does not yet recordclass X(Model)inheritance, so a user model class resolves its superclass toLobjectrather than the KerasModeltype (filed as wala#571, an existing front-end TODO). Until that lands, a structural heuristic over usercall/__call__/predictmethods drives the predicate. When wala#571 is resolved, the heuristic is deleted andisModelSubclassMethodtakes over—both are present so the switch is a one-line change.Testing
mvn test).testNeuralNetwork1-4run at depth 4:accuracy'sy_predis now the correct per-context union{(256, 10) float32, ? float32}, and the per-context multiplicity fortestNeuralNetworkrises to 24.tf.argmax/tf.argminshape tests tighten from ⊤ to the precise(3,)the fixtures assert at runtime.Closes wala#379.
Closes wala#530.