Migrate set_shape recognition from Java IR-scan to <method> on Tensor classes in tensorflow.xml #550

@khatchad

Motivation

PythonTensorAnalysisEngine.getSetShapeCallsSyntactic (introduced in ponder-lab/ML#333 as a near-term fix for #509) recognizes x.set_shape(s) call sites by scanning every CGNode's IR for the PythonPropertyRead("set_shape") + invoke pattern. This works, but the recognition logic lives in Java, separate from the existing TF API modeling layer (tensorflow.xml).

The natural home for the recognition is tensorflow.xml, in line with how Dataset already models its instance methods.

Investigation Findings (2026-05-26)

A draft PR (ponder-lab/ML#338, now closed) explored declaring <method name="set_shape"> directly on each Tensor/SparseTensor <class> block. Two structural problems surfaced:

  1. The trampoline indirection. Declaring <method> on a <class> triggers WALA's PythonInstanceMethodTrampolineTargetSelector. User code's tensor.set_shape(shape) then invokes a TRAMPOLINE, L$<class>/set_shape.trampoline<N>()LRoot;, rather than the method itself. The $ prefix is added at PythonInstanceMethodTrampolineTargetSelector.java:239, and the trampoline<numTotalParameters> selector comes from PythonMethodTrampolineTargetSelector.java:50. The trampoline in turn dispatches to the underlying do(). The legacy getShapeSourceCalls machinery fails either way: targeting do() walks in the wrong direction (it finds the trampoline, not user code), and targeting the trampoline can't recover the receiver, because the trampoline's auto-generated body doesn't preserve the def-to-receiver aliasing that the legacy callable-as-attribute pattern relied on.

  2. Receiver vs. def semantics. set_shape's purpose is to MUTATE the receiver's tensor classification, whereas getShapeSourceCalls pins call.getDef() (the call's return value). The LEGACY mechanism worked because <class name="set_shape">.do() had <return value="self"/>: in the callable-as-attribute context, self referred to the callable instance, which aliased to the receiver through the pointer analysis (PA). With the trampoline indirection, that aliasing path is broken.
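
The two declaration styles contrasted above can be sketched roughly as follows. This is an illustrative fragment, not a copy of tensorflow.xml: only the element names and the name and value attributes quoted in this issue come from the source. The class name Tensor is assumed shorthand for the real Tensor/SparseTensor class blocks, and any further attributes that real entries carry are omitted.

```xml
<!-- Draft approach (ponder-lab/ML#338): <method> declared on the Tensor class.
     Per the findings above, calls to it are routed through a generated
     trampoline. The class name here is an assumed placeholder. -->
<class name="Tensor">
  <method name="set_shape">
    <!-- body omitted in this sketch -->
  </method>
</class>

<!-- Legacy approach: set_shape is a standalone callable class whose do()
     returns self, the callable instance. Other attributes of the real
     entry are omitted here. -->
<class name="set_shape">
  <method name="do">
    <return value="self"/>
  </method>
</class>
```

In the second form there is no method on the Tensor class at all, which is why no instance-method trampoline is generated for it.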

Architecturally Consistent Alternative: Dataset Pattern

The Dataset class doesn't declare its instance methods (batch, map, shuffle, etc.) as <method> blocks inside <class name="Dataset">. Instead, each is a STANDALONE callable class (<class name="batch">, <class name="shuffle">, etc.), and Dataset instances have these callables attached via <putfield> at every Dataset-producing endpoint (from_tensor_slices.do, shuffle.do, etc.).

This bypasses the trampoline entirely: PropertyRead-based attribute access on a putfield-attached callable invokes the callable's do() directly, no trampoline.

The legacy set_shape already uses this pattern on FixedLenFeature (tensorflow.xml:2129: <putfield field="set_shape" ref="x" value="set_shape_callable"/>). Extending the pattern to every Tensor allocation site would resolve #550 without touching the trampoline mechanism.

Proposed Migration

  1. Keep the existing <class name="set_shape"> callable at tensorflow.xml:1660.
  2. Add <putfield ref="x" field="set_shape" value="set_shape_callable"/> to every <new def="x" class="Ltensorflow/.../Tensor"/> (and SparseTensor) site in tensorflow.xml. Estimated count: ~130 Tensor allocations.
  3. Update PythonTensorAnalysisEngine to remove getSetShapeCallsSyntactic and restore the legacy getShapeSourceCalls(set_shape, ...) call now that every Tensor allocation has the attribute attached.
  4. Remove the cast pass_through alias (already done in ponder-lab/ML#333, "Decouple set_shape from cast pass_through alias chain").
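
Step 2 could look roughly like the fragment below at each allocation site. It is a sketch built only from the two elements quoted in this issue. The class path is left elided exactly as written above, and the fragment assumes a value named set_shape_callable is already defined in the enclosing method, as it is at the FixedLenFeature site; how that value is created is not shown here.

```xml
<!-- Before: a Tensor allocation site, as quoted in step 2. -->
<new def="x" class="Ltensorflow/.../Tensor"/>

<!-- After: the same site with the set_shape callable attached as a field.
     Assumption: set_shape_callable is already in scope at this point. -->
<new def="x" class="Ltensorflow/.../Tensor"/>
<putfield ref="x" field="set_shape" value="set_shape_callable"/>
```

With the attribute attached at allocation, a property read of set_shape on x is expected to reach the callable's do() directly, following the Dataset pattern described earlier.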

Cost / Trade-Off

Related
