feat(store): replace FTS5 default ranking with weighted BM25 #241

@Snakeblack

Pre-flight Checks

  • I have searched existing issues and this is not a duplicate
  • I understand this issue needs status:approved before a PR can be opened

Problem Description

The current FTS5 search in Store.Search orders results by SQLite's default ranking (fts.rank), which is equivalent to bm25() with no weight arguments and therefore treats all columns equally. A match in title (high semantic value) carries the same weight as a match in tool_name (irrelevant for search), so noisy columns dilute the ranking.

Proposed Solution

Replace fts.rank with bm25(observations_fts, 5.0, 1.0, 0.0, 0.0, 0.0, 3.0) as rank in the FTS5 search query. This applies weighted BM25 scoring, with one weight per column in schema order:

  • title = 5.0 (highest discriminant)
  • content = 1.0 (baseline)
  • tool_name = 0.0 (irrelevant)
  • type = 0.0 (already filtered by SearchOptions)
  • project = 0.0 (already filtered by SearchOptions)
  • topic_key = 3.0 (high relational affinity)

The direct topic_key route (Rank = -1000) remains untouched. Only 2 lines of SQL change.
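
To illustrate the effect of the proposed weights, here is a minimal sketch using Python's standard sqlite3 module rather than the project's own Store code. It assumes the observations_fts columns appear in the order title, content, tool_name, type, project, topic_key (inferred from the weight list above), that the local SQLite build includes FTS5, and that the sample rows and the helper names build_demo_db and weighted_search are purely hypothetical. The alias score is used instead of rank only to avoid any ambiguity with FTS5's hidden rank column in this standalone query.

```python
import sqlite3

# Assumed column order, inferred from the weight list in this issue.
COLUMNS = ("title", "content", "tool_name", "type", "project", "topic_key")


def build_demo_db():
    """Create an in-memory FTS5 table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE VIRTUAL TABLE observations_fts USING fts5({', '.join(COLUMNS)})"
    )
    rows = [
        # "auth" appears in title (weight 5.0)
        ("auth flow", "notes on login", "editor", "note", "demo", "security"),
        # "auth" appears in content (weight 1.0)
        ("login notes", "auth token refresh", "editor", "note", "demo", "security"),
        # "auth" appears only in tool_name (weight 0.0)
        ("build log", "compile step output", "auth", "note", "demo", "ci"),
    ]
    # Filler rows that do not match the query, all the same length.
    rows += [
        (f"misc item{i}", "nothing relevant here", "editor", "note", "demo", "other")
        for i in range(5)
    ]
    conn.executemany("INSERT INTO observations_fts VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def weighted_search(conn, query):
    """Return (title, score) pairs ordered by weighted BM25.

    FTS5's bm25() returns lower (more negative) values for better matches,
    so ascending order puts the best match first.
    """
    sql = """
        SELECT title,
               bm25(observations_fts, 5.0, 1.0, 0.0, 0.0, 0.0, 3.0) AS score
        FROM observations_fts
        WHERE observations_fts MATCH ?
        ORDER BY score
    """
    return conn.execute(sql, (query,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for title, score in weighted_search(conn, "auth"):
        print(f"{score:.6f}  {title}")
```

In this sketch the row matching in title sorts ahead of the row matching in content, and the row that matches only in tool_name is still returned but contributes nothing to the score. Because better matches have lower values, the direct topic_key route with Rank = -1000 would still sort ahead of any FTS5 result under this scheme.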

Affected Area

Store (SQLite queries)

Additional Context

  • Zero dependencies, zero schema migration
  • BM25 is native to FTS5 — no custom functions needed
  • Verified against observations_fts 6-column schema
