Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
136 changes: 136 additions & 0 deletions EFFECT_INFERENCE_PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
# Effect Inference & Enforcement Plan

## Goal

Infer ownership effects (Borrow/Store/Send/Freeze) for user-defined functions and enforce them at call sites. This is the foundation for arena-based allocation and actor-safe value transfer.

## Current State

- Effect vocabulary: COMPLETE (Borrow, Store, Send, Freeze, BorrowAll, PassThrough, FlowInto)
- Builtin annotations: COMPLETE (rawset→Store, table.insert→Store, print→BorrowAll, etc.)
- Propagation engine: COMPLETE (fixpoint loop unions callee effects into caller)
- Serialization: COMPLETE (all labels round-trip through module manifests)
- Inference for user code: MISSING
- Enforcement: MISSING
- Diagnostics: MISSING

## Escape Sites (Where inference must happen)

### Table store: `t[k] = v` / `t.field = v`
- **File**: `compiler/check/flowbuild/assign/emit.go`
- **Location**: `TargetField` at line 515, `TargetIndex` at line 562
- **Effect**: If `v` is a parameter → `Store{Param: paramIdx(v), Into: paramIdx(t)}`
- **Note**: Only when target base symbol is a parameter or upvalue

### Upvalue capture
- **File**: `compiler/check/infer/captured/captured.go`, `FromParentFacts` line 11
- **Binding**: `compiler/bind/table.go`, `CapturedSymbols` line 428
- **Effect**: If captured symbol is a parameter AND the closure escapes → `Store{Param: paramIdx}`
- **Note**: Connect to `CaptureInfo.Escapes` (types/typ/capture.go line 28)

### Return escape
- **File**: `compiler/cfg/graph.go`, `EachReturn` line 647
- **Data**: `ReturnInfo.Symbols` in `compiler/cfg/types.go` line 349
- **Effect**: If returned symbol is a parameter → parameter escapes via return. Model as PassThrough (already exists) or Store{Into: -1}

### Yield escape
- **File**: `compiler/stdlib/coroutine.go` line 20
- **Fix**: Annotate `coroutine.yield` with `Send{FromParam: 0}`
- **Note**: No new label needed, Send covers this

### Global store
- **File**: `compiler/check/flowbuild/assign/emit.go`, `TargetIdent` at line 280
- **Condition**: `bindings.Kind(sym) == cfg.SymbolGlobal`
- **Effect**: If RHS is a parameter → `Store{Param: paramIdx, Into: -1}`

## No New Labels Needed

| Pattern | Label | Rationale |
|---------|-------|-----------|
| Table store | Store{Param, Into} | Value stored into structure |
| Closure capture + escape | Store{Param, Into: -1} | Capture is opaque store |
| Return | PassThrough / FlowInto | Already exist |
| Yield | Send{FromParam} | Cross-boundary transfer |
| Global store | Store{Param, Into: -1} | Long-lived destination |
| Freeze | Freeze{Param} | Already exists |

## Implementation

### Step 1: Annotate stdlib gaps
- File: `compiler/stdlib/coroutine.go`
- Add Send to yield, BorrowAll to pure coroutine functions
- Review channel/process stubs for missing Send annotations

### Step 2: Create `InferOwnershipEffects`
- New file: `compiler/check/effects/ownership.go`
- Signature: `func InferOwnershipEffects(graph *cfg.Graph, bindings *bind.Table, params []ParamInfo) effect.Row`
- Walks: EachAssign (field/index targets), EachReturn, captured symbols
- Produces: Store/Borrow labels relative to function parameters
- For parameters not observed to escape → Borrow{Param: i}
- For parameters that escape → Store{Param: i, Into: ...}

### Step 3: Integrate into effects.Propagate
- File: `compiler/check/effects/propagate.go` line 20
- After computing fnEffect.Row, call InferOwnershipEffects and union result
- Ownership labels enter the fixpoint alongside Throw/IO/Diverge

### Step 4: Translate callee ownership at call sites
- The critical piece: callee Store labels reference callee params, not caller params
- New function: `TranslateCalleeOwnership(calleeRow, callInfo, callerParams) effect.Row`
- For each callee Store{Param: i}, find which caller expression maps to callee param i
- If that expression is a caller param → Store{Param: callerParamIdx} on caller
- If expression is not a param (literal, local) → no ownership effect on caller
- Integrate into the EachCallSite callback in Propagate

### Step 5: Enforcement hooks
- File: `compiler/check/hooks/call_check.go`, `checkSingleCall` line 64
- Check: callee has Store + arg is frozen → diagnostic
- Check: callee has Send + arg is mutable → diagnostic or auto-freeze
- Check: callee has Freeze → mark arg as frozen in subsequent flow

### Step 6: Testing
- Unit tests: `compiler/check/effects/ownership_test.go`
- Fixtures: `testdata/fixtures/ownership/{table-store,closure-capture,return-escape,global-store,frozen-write}/`
- Cross-module: multi-file fixtures verifying exported function effects
- Effect assertions: extend fixture harness with `-- expect-effect: store(param[0])` or check via `sess.Store.InterprocPrev.Refinements`

## Performance

- InferOwnershipEffects runs once per function per fixpoint iteration (cheap)
- Body scan is O(assignments + returns + closures) — already iterated
- Effect rows use deduplication — adding same label twice is no-op
- Union is O(n*m) where n,m are label counts — typically 1-5 per function
- No new flow domain needed — ownership is function-level summary, not per-point lattice

## CaptureInfo relationship

- `CaptureInfo.Escapes` (types/typ/capture.go) → runtime codegen concern (heap vs stack upvalue)
- Effect `Store` label → type-level ownership concern (lifetime extends beyond call)
- Complementary, not redundant
- When inferring ownership: consult CapturedSymbols to identify captures, use CaptureInfo.Escapes to determine if closure escapes

## Dependency Order

```
Step 1 (stdlib annotations) → independent
Step 2 (InferOwnershipEffects) → needs graph API understanding
Step 3 (integrate Propagate) → depends on Step 2
Step 4 (callee translation) → depends on Step 3, most complex
Step 5 (enforcement) → depends on Steps 3-4
Step 6 (testing) → parallel with each step
```

## Critical Files

| Purpose | File |
|---------|------|
| Inference integration | `compiler/check/effects/propagate.go` |
| Escape site detection | `compiler/check/flowbuild/assign/emit.go` |
| Enforcement | `compiler/check/hooks/call_check.go` |
| Label definitions | `types/effect/label.go` |
| Fixpoint loop | `compiler/check/pipeline/driver.go` |
| Capture binding | `compiler/bind/table.go` |
| Capture types | `compiler/check/infer/captured/captured.go` |
| Return info | `compiler/cfg/graph.go` (EachReturn) |
| Effect export | `compiler/check/effects/export.go` |
| Stdlib annotations | `compiler/stdlib/coroutine.go` |
139 changes: 139 additions & 0 deletions PERF_FIX_PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,139 @@
# Lint Deadlock Fix: Route Through Salsa Query Layer

## Problem

`wippy lint` deadlocks on large projects (300-900 files). Root cause: expensive type operations called thousands of times without caching, bypassing the existing Salsa-style query system (`types/db/query.go`).

### Hot paths (from profiling keeper, docker-demo, be-common-components)

1. **`PruneSoftUnionMembers`** — 49% CPU. Called from `isSubtype` on every subtype check. Walks entire type tree looking for soft union members. Per-call memo discarded after each invocation.

2. **`ExpandInstantiated`** — was 42% CPU before per-call memo fix. 16 direct calls bypass `Engine.ExpandInstantiated` query.

3. **`applyInferSubst`** — was 75% CPU in constraint solver. No Salsa analog; internal memo added.

4. **`walkType` (`occursIn`, `containsTypeVar`)** — was 68% CPU. No Salsa analog; seen-set added.

5. **`typeContainsNever`** — caused 148GB allocations in docker-demo. No Salsa analog; seen-set added.

## Architecture

The Salsa query layer already exists:

```
types/query/core/Engine
.IsSubtype(ctx, sub, super) -> db.Query "IsSubtype"
.ExpandInstantiated(ctx, t) -> db.Query "ExpandInstantiated"
.Widen(ctx, t) -> db.Query "Widen"
.Field/Method/Index(ctx, ...) -> db.Query with memoization
```

Interface: `types/query/core/TypeOps` — accepted by synthesizer, hooks, etc.

Problem: 80+ call sites use raw `subtype.IsSubtype()` instead of `Engine.IsSubtype()` because they lack `TypeOps` or `*db.QueryContext`.

## Plan

### Phase 1: Thread TypeOps through hot subsystems

The 80 direct `subtype.IsSubtype` calls cluster in these files:

| File | Count | Has TypeOps? |
|------|-------|-------------|
| returns/join.go | 10 | No — needs threading from pipeline |
| synth/ops/check.go | 6 | Yes — has `synth.Engine` with TypeOps |
| returns/widen.go | 5 | No — needs threading from pipeline |
| flow/query.go | 4 | Has QueryContext |
| constraint/solver.go | 4 | No context — leaf-level |
| flowbuild/assign/error_return_policy.go | 4 | Has scope with TypeOps |
| narrow/narrow.go | 3 | No context — leaf-level |
| flow/transfer.go | 3 | Has QueryContext |
| constraint/infer.go | 3 | No context — leaf-level |
| hooks/assign_check.go | 3 | Yes — hooks receive TypeOps |
| Others (2 each) | ~20 | Mixed |

Strategy per group:

**A. Already have TypeOps (synth/ops, hooks, flowbuild):** Change `subtype.IsSubtype(a, b)` to `ops.IsSubtype(ctx, a, b)`. Mechanical replacement.

**B. Pipeline subsystems (returns/join.go, returns/widen.go):** Thread `TypeOps` from `pipeline.Runner` into return inference. The `Runner` already has `types core.TypeOps`. Pass it down to return join/widen functions.

**C. Flow system (flow/query.go, flow/transfer.go):** Already have `*db.QueryContext`. Need access to `Engine` or expose `IsSubtype` as a query they can call.

**D. Leaf-level (constraint/solver.go, narrow/narrow.go, constraint/infer.go):** These are deep in the type system with no infrastructure access. Options:
- Accept: these calls are on small types (constraint variables, narrowed results) and less hot
- Or: pass a subtype checker function as a parameter

### Phase 2: Make PruneSoftUnionMembers a derived query

Current: `isSubtype()` calls `PruneSoftUnionMembers(sub)` and `PruneSoftUnionMembers(super)` every time.

Option A — **Prune-on-construction**: When creating Union types (`typ.NewUnion`), prune soft members immediately. Then `PruneSoftUnionMembers` becomes a no-op. Risk: changes Union construction semantics, may break inference that needs soft members temporarily.

Option B — **Lazy flag**: Add a `hasSoft` bit to Union type. `PruneSoftUnionMembers` checks the flag and returns immediately if false. Set the flag during `NewUnion` construction by checking members. Cost: one bool per Union, one check per member at construction.

Option C — **Query wrapper**: Add `PruneSoft` to `Engine` as a memoized query. Callers going through `Engine.IsSubtype` already benefit. Direct `subtype.IsSubtype` calls still pay the cost.

**Recommendation: Option B (lazy flag)**. It's the most canonical — the type knows about itself, no external caching needed.

### Phase 3: Remove ad-hoc memos that duplicate Salsa

After Phase 1-2, review and remove:

- `subst.go` expand memo pool — **keep**: prevents exponential blowup *within* a single expansion. Salsa caches the final result but not intermediate recursion. This is algorithmic, not caching.
- `constraint/infer.go` `applyInferSubst` memo — **keep**: internal to constraint solver, no Salsa analog. Converts O(n^2) cycle detection to O(n) with result memo.
- `constraint/infer.go` `walkTypeMemo` seen set — **keep**: prevents re-walking same pointer in `occursIn`/`containsTypeVar`. No Salsa analog.
- `returns/join.go` `typeContainsNeverMemo` seen set — **keep**: prevents unbounded recursion. No Salsa analog.

These are not "non-canonical memoization" — they're algorithmic fixes within compute functions. Salsa memoizes query results; these prevent pathological behavior during a single query computation.

### Phase 4: Audit new code for direct bypasses

Add a lint/review rule: new code should use `TypeOps.IsSubtype` not `subtype.IsSubtype` when `TypeOps` is available. The raw function is for the engine's compute function and leaf-level code without infrastructure.

## Files to modify

### Phase 1A — Mechanical (TypeOps already available)
- `compiler/check/synth/ops/check.go` (6 calls)
- `compiler/check/hooks/assign_check.go` (3 calls)
- `compiler/check/hooks/return_check.go` (2 calls)
- `compiler/check/hooks/field_check.go` (2 calls)
- `compiler/check/synth/ops/call.go` (2 calls)
- `compiler/check/synth/ops/typecheck.go` (2 calls)
- `compiler/check/synth/ops/generic.go` (1 call)
- `compiler/check/synth/phase/extract/*.go` (3 calls)
- `compiler/check/synth/phase/resolve/resolver.go` (2 calls)
- `compiler/check/flowbuild/assign/error_return_policy.go` (4 calls)
- `compiler/check/flowbuild/assign/precision.go` (2 calls)

### Phase 1B — Thread TypeOps into return inference
- `compiler/check/returns/join.go` (10 calls)
- `compiler/check/returns/widen.go` (5 calls)
- `compiler/check/pipeline/runner.go` (pass TypeOps to return inference)

### Phase 1C — Flow system
- `types/flow/query.go` (4 calls)
- `types/flow/transfer.go` (3 calls)
- `types/flow/type_facts.go` (1 call)

### Phase 1D — Leaf-level (defer or pass function)
- `types/constraint/solver.go` (4 calls)
- `types/constraint/infer.go` (3 calls)
- `types/narrow/narrow.go` (3 calls)
- `types/query/core/field.go` (2 calls)
- `types/query/core/index.go` (2 calls)
- `types/query/core/instantiate.go` (1 call)
- `ltype.go` (1 call)

### Phase 2 — Union soft flag
- `types/typ/union.go` — add `hasSoft bool` field, set in constructor
- `types/typ/soft.go` — check flag in `PruneSoftUnionMembers`
- `types/typ/rebuild.go` — ensure flag propagation in NewUnion

## Execution order

1. Phase 2 first (Union soft flag) — biggest impact, self-contained
2. Phase 1A — mechanical, low risk
3. Phase 1B — moderate refactor, high impact (returns/join is hottest)
4. Phase 1C — flow system routing
5. Phase 1D — evaluate if needed after 1-3
57 changes: 48 additions & 9 deletions compiler/bind/table.go
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,9 @@ type BindingTable struct {
// names stores the original source name for each symbol
names map[cfg.SymbolID]string

// symbolsByName lazily indexes symbols by source name in ascending symbol order.
symbolsByName map[string][]cfg.SymbolID

// paramSymbols maps functions to their parameter symbol list
paramSymbols map[*ast.FunctionExpr][]cfg.SymbolID

Expand Down Expand Up @@ -167,7 +170,16 @@ func (t *BindingTable) Kind(sym cfg.SymbolID) (cfg.SymbolKind, bool) {

// SetName records the source name associated with a symbol.
func (t *BindingTable) SetName(sym cfg.SymbolID, name string) {
if t == nil || sym == 0 {
return
}
if prev, ok := t.names[sym]; ok {
if prev == name {
return
}
}
t.names[sym] = name
t.symbolsByName = nil
}

// Name returns the source name of a symbol, or empty if unknown.
Expand All @@ -179,21 +191,48 @@ func (t *BindingTable) Name(sym cfg.SymbolID) string {
//
// Results are sorted by symbol ID for deterministic iteration.
func (t *BindingTable) SymbolsByName(name string) []cfg.SymbolID {
syms := t.SymbolsByNameReadOnly(name)
if len(syms) == 0 {
return nil
}
out := make([]cfg.SymbolID, len(syms))
copy(out, syms)
return out
}

// SymbolsByNameReadOnly returns the stored symbols for a source name.
//
// The returned slice is sorted by symbol ID and must be treated as read-only.
func (t *BindingTable) SymbolsByNameReadOnly(name string) []cfg.SymbolID {
if t == nil || name == "" {
return nil
}
result := make([]cfg.SymbolID, 0, 2)
for sym, n := range t.names {
if n == name && sym != 0 {
result = append(result, sym)
if t.symbolsByName == nil {
t.symbolsByName = t.buildSymbolsByNameIndex()
}
return t.symbolsByName[name]
}

func (t *BindingTable) buildSymbolsByNameIndex() map[string][]cfg.SymbolID {
if t == nil || len(t.names) == 0 {
return nil
}
index := make(map[string][]cfg.SymbolID)
for sym, name := range t.names {
if sym == 0 || name == "" {
continue
}
index[name] = append(index[name], sym)
}
if len(result) > 1 {
sort.Slice(result, func(i, j int) bool {
return result[i] < result[j]
})
for name, syms := range index {
if len(syms) > 1 {
sort.Slice(syms, func(i, j int) bool {
return syms[i] < syms[j]
})
}
index[name] = syms
}
return result
return index
}

// SetParamSymbols records the ordered parameter symbols for a function.
Expand Down
31 changes: 31 additions & 0 deletions compiler/bind/table_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -315,6 +315,37 @@ func TestBindingTable_SymbolsByName_UnknownOrEmpty(t *testing.T) {
}
}

func TestBindingTable_SymbolsByNameReadOnly_TracksUpdatesAndCopyIsolation(t *testing.T) {
table := NewBindingTable()
alpha := cfg.NextSymbolID()
beta := cfg.NextSymbolID()

table.SetName(alpha, "collect")
table.SetName(beta, "collect")

got := table.SymbolsByNameReadOnly("collect")
if len(got) != 2 || got[0] != alpha || got[1] != beta {
t.Fatalf("SymbolsByNameReadOnly(\"collect\") = %v, want [%d %d]", got, alpha, beta)
}

copied := table.SymbolsByName("collect")
copied[0] = beta
got = table.SymbolsByNameReadOnly("collect")
if len(got) != 2 || got[0] != alpha || got[1] != beta {
t.Fatalf("SymbolsByName copy should not mutate stored index, got %v", got)
}

table.SetName(beta, "other")
got = table.SymbolsByNameReadOnly("collect")
if len(got) != 1 || got[0] != alpha {
t.Fatalf("SymbolsByNameReadOnly(\"collect\") after rename = %v, want [%d]", got, alpha)
}
other := table.SymbolsByNameReadOnly("other")
if len(other) != 1 || other[0] != beta {
t.Fatalf("SymbolsByNameReadOnly(\"other\") after rename = %v, want [%d]", other, beta)
}
}

func TestBindingTable_Globals(t *testing.T) {
table := NewBindingTable()

Expand Down
Loading