Cache TTC pipeline results in a second OpenSearch index #533

@nickclyde

Summary

Add a result cache to the TTC Lambda pipeline so that repeated text candidates skip embedding, the KNN query, and reranking. The cache lives in a new OpenSearch index in the existing cluster, keyed by (normalized_text, data_field), with the final reranked LOINC Code as the value.

Motivation

Common lab test names (e.g. "Hemoglobin A1c", "BUN", "Creatinine") will likely repeat across eICRs. Currently every pipeline run pays for a SentenceTransformer encode, an OpenSearch KNN query, and a CrossEncoder rerank. The pipeline output is deterministic for a given (candidate text, data field) pair, so a content-addressable cache eliminates the redundant work without changing semantics.

Proposed design

New cache index (ttc-result-cache, same OpenSearch domain, no KNN field):

{
  "mappings": {
    "properties": {
      "cache_key":   {"type": "keyword"},
      "text":        {"type": "keyword"},
      "data_field":  {"type": "keyword"},
      "code":        {"type": "object", "enabled": false},
      "score":       {"type": "float"},
      "cached_at":   {"type": "date"}
    }
  }
}

Key: sha256(text.strip().lower() + "|" + data_field.value), used as the OpenSearch doc _id.
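
A minimal sketch of the key derivation described above. It assumes data_field is passed as the enum's string value and that UTF-8 is the encoding; the function name cache_key is illustrative, not an existing helper.

```python
import hashlib


def cache_key(text: str, data_field: str) -> str:
    """Return the hex SHA-256 digest used as the cache document _id."""
    # Normalize so that case and surrounding whitespace do not split the cache.
    normalized = text.strip().lower()
    # hashlib operates on bytes, so the joined string is encoded first (UTF-8 assumed).
    payload = f"{normalized}|{data_field}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

With this normalization, "  BUN " and "bun" map to the same 64-character key for a given data field, while the same text under a different data field gets a different key.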

Value: serialized Code (from shared_models) plus the reranker score and a cached_at timestamp for diagnostics.
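
One way the stored document could be assembled, as a sketch only. It assumes the Code model has already been serialized to a plain dict by the caller (how shared_models does that is not specified here), and build_cache_doc is a hypothetical helper name.

```python
from datetime import datetime, timezone
from typing import Any


def build_cache_doc(
    cache_key: str,
    text: str,
    data_field: str,
    code: dict[str, Any],
    score: float,
) -> dict[str, Any]:
    """Build a document matching the ttc-result-cache mapping."""
    return {
        "cache_key": cache_key,
        "text": text,
        "data_field": data_field,
        # Stored as an opaque object: the mapping sets "enabled": false for it.
        "code": code,
        "score": float(score),
        # ISO 8601 timestamp in UTC, kept for diagnostics only.
        "cached_at": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
    }
```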

Pipeline integration: in packages/text-to-code-lambda/src/text_to_code_lambda/lambda_function.py, between select_relevant_text and embed:

  1. Compute key, attempt GET /ttc-result-cache/_doc/{key}.
  2. Hit → reconstruct Code, build NonstandardCodeInstance, emit cache_hit metric, continue.
  3. Miss → run existing embed/query/rerank, write the result back to the cache index, emit cache_miss metric.
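
The three steps above can be sketched as a get-or-compute helper. This is an illustration under assumptions: the ResultCache protocol only mirrors the get / put methods named in the tasks, InMemoryCache is a test stand-in rather than the OpenSearch-backed service, and resolve, run_pipeline, and emit_metric are hypothetical names.

```python
from typing import Any, Callable, Optional, Protocol


class ResultCache(Protocol):
    """Shape assumed for the cache service: get returns None on a miss."""

    def get(self, key: str) -> Optional[dict[str, Any]]: ...
    def put(self, key: str, doc: dict[str, Any]) -> None: ...


class InMemoryCache:
    """Dict-backed stand-in for the OpenSearch-backed cache (testing only)."""

    def __init__(self) -> None:
        self._docs: dict[str, dict[str, Any]] = {}

    def get(self, key: str) -> Optional[dict[str, Any]]:
        return self._docs.get(key)

    def put(self, key: str, doc: dict[str, Any]) -> None:
        self._docs[key] = doc


def resolve(
    key: str,
    cache: ResultCache,
    run_pipeline: Callable[[], dict[str, Any]],
    emit_metric: Callable[[str], None],
) -> dict[str, Any]:
    """Return the cached result for key, or run the pipeline and cache it."""
    cached = cache.get(key)
    if cached is not None:
        emit_metric("cache_hit")
        return cached
    # Miss: pay for embed / query / rerank once, then store the result.
    result = run_pipeline()
    cache.put(key, result)
    emit_metric("cache_miss")
    return result
```

Passing the expensive path in as a callable keeps the cache logic independent of the embed, query, and rerank steps, so the hit and miss branches can be unit tested without a model or a cluster.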

Encapsulate this in a new ResultCache service under packages/text-to-code/src/text_to_code/services/result_cache.py.

Invalidation: wipe-and-refill on reindex. The ttc-index-lambda is invoked at deploy time; extend it to DELETE /ttc-result-cache and recreate the empty index. When the OSIS pipeline reloads LOINC data, a follow-up invocation of the index Lambda clears the cache so stale entries can't survive a LOINC refresh.
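
A sketch of the drop-and-recreate step, assuming the client exposes opensearch-py style indices.exists, indices.delete, and indices.create methods. reset_cache_index is a hypothetical name, and the recording classes exist only so the flow can be exercised without a cluster.

```python
from typing import Any


def reset_cache_index(client: Any, index_name: str, mapping: dict[str, Any]) -> None:
    """Drop the cache index if it exists, then recreate it empty."""
    if client.indices.exists(index=index_name):
        client.indices.delete(index=index_name)
    client.indices.create(index=index_name, body=mapping)


class RecordingIndices:
    """Stub for client.indices that records calls, for a dry run."""

    def __init__(self, existing: set[str]) -> None:
        self.existing = set(existing)
        self.calls: list[tuple[str, str]] = []

    def exists(self, index: str) -> bool:
        return index in self.existing

    def delete(self, index: str) -> None:
        self.calls.append(("delete", index))
        self.existing.discard(index)

    def create(self, index: str, body: dict[str, Any]) -> None:
        self.calls.append(("create", index))
        self.existing.add(index)


class RecordingClient:
    """Minimal client stand-in that only carries the indices stub."""

    def __init__(self, existing: set[str]) -> None:
        self.indices = RecordingIndices(existing)
```

Checking for existence first keeps the very first deploy, when the index does not exist yet, on the same code path as later invocations.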

Tasks

  • Add ResultCache service in packages/text-to-code/src/text_to_code/services/result_cache.py with get / put methods and unit tests in packages/text-to-code/tests/unit/test_result_cache.py.
  • Extend packages/index-lambda/src/index_lambda/lambda_function.py to create (and on each invocation, drop and recreate) the ttc-result-cache index.
  • Wire the cache into the per-error loop in packages/text-to-code-lambda/src/text_to_code_lambda/lambda_function.py.
  • Emit cache_hit / cache_miss CloudWatch metrics (via embedded metric format) and a cache_lookup_ms timing.
  • Add RESULT_CACHE_INDEX_NAME env var to terraform/main.tf for both the TTC and index Lambdas (default ttc-result-cache).
  • Document the cache and its invalidation behavior in terraform/README.md.
  • Update the existing e2e test (e2e/test_e2e.py).
  • Verify that e2e pipeline output is unchanged hit-vs-miss using aws_e2e.sh.
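
For the metrics task, a possible shape of the embedded metric format record, as a sketch. The namespace TTC/ResultCache, the Service dimension, and its value are placeholders chosen for illustration; in a Lambda, printing the returned JSON line to stdout is what hands it to CloudWatch.

```python
import json
import time


def emf_record(
    hit: bool,
    lookup_ms: float,
    namespace: str = "TTC/ResultCache",  # placeholder namespace
    service: str = "ttc-lambda",  # placeholder dimension value
) -> str:
    """Return one embedded-metric-format log line for a cache lookup."""
    metric = "cache_hit" if hit else "cache_miss"
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [["Service"]],
                    "Metrics": [
                        {"Name": metric, "Unit": "Count"},
                        {"Name": "cache_lookup_ms", "Unit": "Milliseconds"},
                    ],
                }
            ],
        },
        # Dimension and metric values live at the top level of the record.
        "Service": service,
        metric: 1,
        "cache_lookup_ms": lookup_ms,
    }
    return json.dumps(record)
```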

Out of scope / questions

  • Caching of intermediate artifacts (embeddings, raw KNN hits). Open question: should only the final reranked top result be cached, or should the retriever and reranker results be stored as well?
  • Per-key TTL or version-tagging. Wipe-and-refill on reindex is the only invalidation mechanism for v1.

Risks

  • Cold-start penalty after reindex: every entry must be repopulated. Mitigated by the fact that reindex is monthly and warm-up is cheap relative to total volume.
  • Cache index growth: at ~1 KB/entry, 1 M unique texts ≈ 1 GB. That is well within the current 10 GB of EBS storage per node, but worth a CloudWatch alarm on index size.
  • Filter logic drift: if VectorSearchParams.filter_value mapping changes (e.g., new data fields, changed loinc_type semantics), cached results become incorrect for the new mapping. Mitigation: include filter-mapping version in the key, OR force a wipe whenever the mapping changes.
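
The first mitigation for filter logic drift could look like the sketch below, which folds a mapping version into the hashed payload. FILTER_MAPPING_VERSION and versioned_cache_key are hypothetical names; how the version would actually be derived or bumped is left open.

```python
import hashlib

# Hypothetical constant, bumped by hand whenever the filter mapping changes.
FILTER_MAPPING_VERSION = "v1"


def versioned_cache_key(
    text: str,
    data_field: str,
    mapping_version: str = FILTER_MAPPING_VERSION,
) -> str:
    """Return a cache key that changes when the filter mapping version changes."""
    normalized = text.strip().lower()
    payload = f"{mapping_version}|{normalized}|{data_field}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

With this approach, entries written under an old version are never read again but still occupy space until the next wipe, so it complements the wipe-on-reindex mechanism rather than replacing it.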
