Suggest categories for uncategorized chats using similarity scores

## Suggest categories for uncategorized chats using similarity scores

### Goal
Help users quickly scan **uncategorized chats** and identify likely topics by showing how similar each chat is to existing user-defined categories.

This is a **discovery + triage tool**, not a validation tool.

---

### UI Behavior

For chats that are **not yet assigned to a category**, display 1–3 similarity indicators:

    Likely: Coding (82%)
    Also: GitHub (74%)


or more compact:

    Coding 82% · GitHub 74%


**Do NOT show this on already categorized chats (v1).**
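
A minimal sketch of how the two layouts above could be rendered from a list of `(category, score)` pairs. The function name `format_suggestions` and its parameters are hypothetical, and scores are assumed to be in the range 0.0 to 1.0.

```python
from typing import List, Tuple


def format_suggestions(
    matches: List[Tuple[str, float]], compact: bool = False, limit: int = 3
) -> str:
    """Render (category, score) pairs, strongest first, as hedged suggestion text.

    Hypothetical helper: the "Likely"/"Also" labels mirror the mock-ups in this issue.
    """
    top = sorted(matches, key=lambda m: m[1], reverse=True)[:limit]
    if compact:
        return " · ".join(f"{name} {round(score * 100)}%" for name, score in top)
    lines = []
    for i, (name, score) in enumerate(top):
        prefix = "Likely" if i == 0 else "Also"
        lines.append(f"{prefix}: {name} ({round(score * 100)}%)")
    return "\n".join(lines)


print(format_suggestions([("GitHub", 0.74), ("Coding", 0.82)]))
print(format_suggestions([("GitHub", 0.74), ("Coding", 0.82)], compact=True))
```

An empty match list renders as an empty string, so the UI can simply skip the indicator when nothing qualifies.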

---

### How It Works (Conceptually)

- User creates categories (e.g. Coding, Writing, Business)
- User assigns a minimum number of chats to each category (e.g. ≥5)
- Each category gets a **text fingerprint/profile**
- Uncategorized chats are compared against these category profiles
- The UI shows the strongest matches as **suggestions**
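
The first three steps could look roughly like the sketch below. The `Chat` dataclass, `chat_text`, `build_profiles`, and the `MIN_CHATS_PER_CATEGORY` constant are all assumptions for illustration, not existing code; the profile here is just the concatenated text of a category's chats, which a later step would turn into a vector.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumption: the issue suggests >=5, but the final value is still an open question.
MIN_CHATS_PER_CATEGORY = 5


@dataclass
class Chat:
    title: str
    user_prompts: List[str] = field(default_factory=list)
    category: Optional[str] = None  # None means the chat is uncategorized


def chat_text(chat: Chat) -> str:
    """Text used for comparison: the chat title plus the user's prompts."""
    return " ".join([chat.title, *chat.user_prompts])


def build_profiles(
    chats: List[Chat], min_chats: int = MIN_CHATS_PER_CATEGORY
) -> Dict[str, str]:
    """Concatenate the text of each category's chats, skipping categories that are too small."""
    grouped: Dict[str, List[str]] = {}
    for chat in chats:
        if chat.category is not None:
            grouped.setdefault(chat.category, []).append(chat_text(chat))
    return {
        name: " ".join(texts)
        for name, texts in grouped.items()
        if len(texts) >= min_chats
    }
```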

---

### Important UX Constraints

- This is **not auto-categorization**
- This is **not a ground truth label**
- Language should reflect uncertainty:
  - “Likely”
  - “Similar to”
  - “Matches pattern”
- The feature should feel **assistive**, not authoritative

---

### MVP Implementation Plan

Start simple and local-first:

- Use **TF-IDF + cosine similarity**
- Compare:
  - Chat title + user prompts (default)
  - (Optional later: full conversation)
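
A dependency-free sketch of TF-IDF plus cosine similarity, to make the scoring concrete. Everything here is an assumption rather than a settled design: the tokenizer, the smoothed IDF formula, the decision to include the chat itself in the IDF corpus, and the names `tokenize`, `tfidf_vectors`, `cosine`, and `rank_categories`.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]  # sparse vector: term -> weight


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: Dict[str, str]) -> Dict[str, Vector]:
    """Return one sparse TF-IDF vector per document, using a smoothed IDF."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    # Number of documents each term appears in.
    doc_freq = Counter(term for c in counts.values() for term in c)
    vectors: Dict[str, Vector] = {}
    for name, c in counts.items():
        total = sum(c.values()) or 1
        vectors[name] = {
            term: (freq / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, freq in c.items()
        }
    return vectors


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_categories(
    chat: str, profiles: Dict[str, str], top_n: int = 3
) -> List[Tuple[str, float]]:
    """Score one chat's text against each category profile and return the best matches."""
    docs = dict(profiles)
    docs["__chat__"] = chat  # assumption: no category is literally named "__chat__"
    vectors = tfidf_vectors(docs)
    chat_vec = vectors.pop("__chat__")
    scored = [(name, cosine(chat_vec, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_n]
```

A category that shares no terms with the chat scores exactly 0.0, which is one reason a display threshold may be worth having.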

---

### Performance Strategy

Avoid expensive comparisons:

Instead of:
> chat → every chat in every category

Do:
> chat → category fingerprint

Where each category fingerprint is:

- A cached vector representation
- Built from all chats in that category
- Recomputed only when category assignments change
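
One way the caching rule could be sketched: keep a fingerprint per category and drop it whenever that category's assignments change, so it is rebuilt lazily on the next lookup. `FingerprintCache` and `term_counts` are hypothetical names, and plain term counts stand in for the real vector builder. The `rebuilds` counter exists only to make the behavior observable.

```python
from collections import Counter
from typing import Callable, Dict, List


class FingerprintCache:
    """Caches one fingerprint per category; invalidated when assignments change."""

    def __init__(self, build: Callable[[List[str]], Counter]) -> None:
        self._build = build
        self._chats: Dict[str, List[str]] = {}
        self._vectors: Dict[str, Counter] = {}
        self.rebuilds = 0  # demonstration only: counts how often a fingerprint is recomputed

    def assign(self, category: str, chat_text: str) -> None:
        self._chats.setdefault(category, []).append(chat_text)
        self._vectors.pop(category, None)  # invalidate only the affected category

    def unassign(self, category: str, chat_text: str) -> None:
        texts = self._chats.get(category, [])
        if chat_text in texts:
            texts.remove(chat_text)
            self._vectors.pop(category, None)

    def fingerprint(self, category: str) -> Counter:
        if category not in self._vectors:
            self._vectors[category] = self._build(self._chats.get(category, []))
            self.rebuilds += 1
        return self._vectors[category]


def term_counts(texts: List[str]) -> Counter:
    """Placeholder fingerprint: raw term counts over all chats in the category."""
    return Counter(word for text in texts for word in text.lower().split())
```

With this shape, scoring one uncategorized chat costs one comparison per category rather than one per categorized chat.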

---

### Acceptance Criteria

- Suggestions only appear when:
  - At least 1 category exists
  - That category has ≥5 assigned chats (smaller categories are not used for matching)
- Shows top 1–3 category matches
- Scores are cached locally (SQLite)
- Suggestions update when:
  - A chat is categorized or uncategorized
- Can be toggled on/off in settings
- Does not noticeably slow down UI
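
A possible shape for the local score cache, using Python's built-in `sqlite3` module. The table name `suggestion_scores`, its columns, and the helper functions are assumptions for illustration; the real schema would need to fit whatever storage the app already has.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical schema: one row per (chat, category) score.
SCHEMA = """
CREATE TABLE IF NOT EXISTS suggestion_scores (
    chat_id  TEXT NOT NULL,
    category TEXT NOT NULL,
    score    REAL NOT NULL,
    PRIMARY KEY (chat_id, category)
)
"""


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_scores(
    conn: sqlite3.Connection, chat_id: str, scores: List[Tuple[str, float]]
) -> None:
    """Replace all cached scores for one chat in a single transaction."""
    with conn:
        conn.execute("DELETE FROM suggestion_scores WHERE chat_id = ?", (chat_id,))
        conn.executemany(
            "INSERT INTO suggestion_scores (chat_id, category, score) VALUES (?, ?, ?)",
            [(chat_id, category, score) for category, score in scores],
        )


def top_matches(
    conn: sqlite3.Connection, chat_id: str, limit: int = 3
) -> List[Tuple[str, float]]:
    """Return the strongest cached matches for a chat, best first."""
    rows = conn.execute(
        "SELECT category, score FROM suggestion_scores "
        "WHERE chat_id = ? ORDER BY score DESC LIMIT ?",
        (chat_id, limit),
    ).fetchall()
    return [(category, score) for category, score in rows]


def invalidate_category(conn: sqlite3.Connection, category: str) -> None:
    """Drop cached scores for a category whose assignments changed."""
    with conn:
        conn.execute("DELETE FROM suggestion_scores WHERE category = ?", (category,))
```

Invalidating by category keeps scores against unaffected categories in place, so only the changed category needs rescoring.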

---

### Future Enhancements (NOT in this issue)

- “Auto-suggest category” button
- Batch categorize similar chats
- “Similar chats” panel
- Embeddings-based similarity (optional)
- Adjustable weighting (title vs prompt vs full text)

---

### Open Questions

- What minimum category size feels right? (3? 5? 10?)
- Should we normalize scores across categories?
- Should we hide scores below a threshold (e.g. <50%)?
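
To make the last two questions concrete, here is one possible reading of each, as a sketch only. `hide_below` drops matches under a cutoff, and `normalize_to_best` rescales scores relative to the strongest match; both names and the 0.5 default are assumptions, and other normalization schemes are equally plausible.

```python
from typing import List, Tuple

Match = Tuple[str, float]  # (category name, similarity score in 0.0-1.0)


def hide_below(matches: List[Match], threshold: float = 0.5) -> List[Match]:
    """Keep only matches at or above the threshold (0.5 mirrors the <50% example)."""
    return [(name, score) for name, score in matches if score >= threshold]


def normalize_to_best(matches: List[Match]) -> List[Match]:
    """Rescale scores so the strongest match becomes 1.0 (one possible normalization)."""
    best = max((score for _, score in matches), default=0.0)
    if best == 0.0:
        return list(matches)
    return [(name, score / best) for name, score in matches]
```

Note that the order matters if both are used: normalizing first makes the top match always pass the threshold, which may overstate weak suggestions.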

---

### Why This Matters

- Reduces manual sorting friction
- Makes large chat histories scannable
- Helps surface patterns the user didn’t explicitly define yet

Suggest categories for uncategorized chats using similarity scores #1