Research spike: compare current graph retrieval vs task-conditioned program slicing #71

Description

Goal

Validate whether graphify-ts should move from the current repo → graph → retrieval model to a task → anchors → program slice → budgeted context pack pipeline.

This is a research/measurement issue before large rewrites.

Why

Recent real usage showed inconsistent behavior: running against only backend/ was slower and noisier, while running against the full GoValidate workspace produced better answers, faster and with fewer tokens. That suggests the problem may be methodological, not just an optimization problem.

Scope

Create a small evaluation harness that compares:

  1. Current graphify-ts retrieval/context-pack behavior
  2. Simple lexical/file retrieval baseline
  3. Prototype task-conditioned slicing strategy
  4. Optional manual/full-context baseline where practical

Use at least 5 (ideally 10) real prompts from a TypeScript/NestJS backend or a comparable large repo.
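
One way to keep the comparison fair is to put each approach behind a common interface so the same prompts and measurements run against all of them. A minimal sketch follows; the names (RetrievalStrategy, ContextPack, buildContext) are illustrative for this spike and are not existing graphify-ts APIs.

```ts
// Sketch of the harness surface, assuming each strategy under comparison
// can be wrapped behind a common interface. All names are hypothetical.

export interface ContextPack {
  files: string[];     // selected file paths
  symbols: string[];   // selected symbol identifiers
  tokenCount: number;  // size of the compiled context
}

export interface RetrievalStrategy {
  name: string;        // e.g. "current-graph", "lexical-baseline", "task-slice"
  buildContext(prompt: string, repoRoot: string): Promise<ContextPack>;
}
```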

Prompts to include

Examples:

  • Explain the auth flow end to end
  • Why is report generation slow?
  • Can this PR break onboarding?
  • What tests should cover this change?
  • What can break if this service changes?
  • Where does this config/env variable affect runtime behavior?
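
To make runs repeatable, the prompts could live in a checked-in fixture with manually curated expected evidence per repo. A minimal sketch, assuming a TypeScript module; ids and field names are made up for illustration.

```ts
// Hypothetical prompt fixture for the evaluation harness.
export interface EvalPrompt {
  id: string;
  text: string;
  // File paths or symbol names a good context pack should contain,
  // curated by hand for the repo under test.
  expectedEvidence: string[];
}

export const prompts: EvalPrompt[] = [
  { id: "auth-flow", text: "Explain the auth flow end to end", expectedEvidence: [] },
  { id: "report-slow", text: "Why is report generation slow?", expectedEvidence: [] },
  { id: "pr-onboarding", text: "Can this PR break onboarding?", expectedEvidence: [] },
  // ...remaining prompts follow the same shape
];
```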

Metrics

Capture, per prompt and per strategy (a possible record shape is sketched after this list):

  • runtime (wall-clock per prompt)
  • output token count / context token count
  • selected files/symbols count
  • missing-context rate (expected evidence that was not selected)
  • irrelevant-context rate (selected evidence that was not needed)
  • whether the selected evidence is sufficient to answer the prompt
  • false-confidence cases (confident answers built on missing or wrong context)
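
Recording every (prompt, strategy) run in a fixed shape keeps the comparison table honest. A possible record type, with field names assumed for this spike rather than taken from an existing schema:

```ts
// One row of the comparison table: a single (prompt, strategy) run.
export interface EvalRecord {
  promptId: string;
  strategy: string;              // which retrieval approach produced the pack
  runtimeMs: number;             // wall-clock time to build the context pack
  contextTokens: number;         // tokens in the compiled context
  outputTokens: number;          // tokens in the model's answer
  selectedFiles: number;
  selectedSymbols: number;
  missingContextRate: number;    // missing expected evidence / expected evidence
  irrelevantContextRate: number; // irrelevant selections / total selections
  evidenceSufficient: boolean;   // manual judgment: could the prompt be answered?
  falseConfidence: boolean;      // confident answer despite missing/wrong context
}
```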

Deliverables

  • A write-up at docs/experiments/task-conditioned-slicing.md
  • A rerunnable script or fixture under examples/ or src/**/__tests__
  • A comparison table of current vs prototype behavior
  • A recommendation: keep the current method, adjust it, or move to the slicing architecture
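
A rough sketch of what the rerunnable script could look like, assuming the types sketched earlier in this issue live in a local eval-types module (a hypothetical path); manual-judgment fields are left for the reviewer to fill in after each run.

```ts
// Minimal rerunnable runner for examples/ or src/**/__tests__; not production code.
// "./eval-types" is a hypothetical module holding the types sketched above.
import { writeFileSync } from "node:fs";
import type { RetrievalStrategy, EvalPrompt, EvalRecord } from "./eval-types";

export async function runEval(
  strategies: RetrievalStrategy[],
  prompts: EvalPrompt[],
  repoRoot: string,
): Promise<EvalRecord[]> {
  const records: EvalRecord[] = [];
  for (const strategy of strategies) {
    for (const prompt of prompts) {
      const start = Date.now();
      const pack = await strategy.buildContext(prompt.text, repoRoot);
      // Expected evidence that never made it into the pack.
      const missing = prompt.expectedEvidence.filter(
        (e) => !pack.files.includes(e) && !pack.symbols.includes(e),
      );
      records.push({
        promptId: prompt.id,
        strategy: strategy.name,
        runtimeMs: Date.now() - start,
        contextTokens: pack.tokenCount,
        outputTokens: 0, // filled in after the model call
        selectedFiles: pack.files.length,
        selectedSymbols: pack.symbols.length,
        missingContextRate:
          prompt.expectedEvidence.length === 0
            ? 0
            : missing.length / prompt.expectedEvidence.length,
        irrelevantContextRate: 0, // manual judgment, filled in during review
        evidenceSufficient: false, // manual judgment
        falseConfidence: false, // manual judgment
      });
    }
  }
  writeFileSync("eval-results.json", JSON.stringify(records, null, 2));
  return records;
}
```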

Acceptance criteria

  • At least 5 real prompts evaluated
  • Current behavior and prototype behavior are compared using the same prompts
  • Results include both quality notes and token/runtime measurements
  • The issue ends with concrete next-step recommendations, not vague notes

Suggested labels

  • enhancement — New feature or request
  • research — Research spike or measurement work
  • performance — Runtime, latency, throughput, or token-cost work
  • context-quality — Quality of the compiled context pack
