30 changes: 15 additions & 15 deletions README.md
@@ -26,9 +26,9 @@ For example, in a TMAP of pet breed images, following the branch from terriers t
Because the layout is a tree, you get operations that point clouds can't support:

```python
-path = model.path(idx_a, idx_b) # nodes along the tree path
-d = model.distance(idx_a, idx_b # sum of edge weights along the path
-pseudotime = model.distances_from(idx) # tree distance from one point to all others
+path = model.path(idx_a, idx_b) # nodes along the tree path
+d = model.distance(idx_a, idx_b) # sum of edge weights along the path
+pseudotime = model.distances_from(idx) # tree distance from one point to all others
```
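
To make these three operations concrete, here is a minimal, library-independent sketch on a toy weighted tree. The `TREE` data and the helper names `tree_path`, `tree_distance`, and `distances_from` are hypothetical illustrations and are not TMAP's implementation; they only show what "path", "sum of edge weights", and "distance to all others" mean when the layout is a tree.

```python
from collections import deque

# Hypothetical toy tree: node -> {neighbor: edge weight}. Not TMAP's data structure.
TREE = {
    0: {1: 0.5},
    1: {0: 0.5, 2: 0.25, 3: 1.0},
    2: {1: 0.25},
    3: {1: 1.0},
}


def distances_from(tree, source):
    """Return (dist, parent) maps for every node reachable from `source`.

    A plain traversal is enough because a tree has exactly one path
    between any two nodes, so the first visit fixes the distance.
    """
    dist = {source: 0.0}
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor, weight in tree[node].items():
            if neighbor not in dist:
                dist[neighbor] = dist[node] + weight
                parent[neighbor] = node
                queue.append(neighbor)
    return dist, parent


def tree_path(tree, a, b):
    """Nodes along the unique tree path from `a` to `b`, inclusive."""
    _, parent = distances_from(tree, a)
    path = [b]
    while path[-1] != a:
        path.append(parent[path[-1]])
    return path[::-1]


def tree_distance(tree, a, b):
    """Sum of edge weights along the unique path from `a` to `b`."""
    dist, _ = distances_from(tree, a)
    return dist[b]
```

Calling `distances_from(TREE, 0)` gives the tree distance from node 0 to every other node, which is the quantity the README snippet stores as `pseudotime`.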

## Installation
@@ -116,22 +116,22 @@ from tmap.utils.singlecell import from_anndata

| Notebook | Topic |
|----------|-------|
-| [01 Quick Start](notebooks/01_quickstart.ipynb) | End-to-end walkthrough |
-| [02 MinHash Deep Dive](notebooks/02_minhash_deep_dive.ipynb) | Encoding methods and when to use each |
-| [03 Legacy LSH Pipeline](notebooks/03_legacy_lsh_pipeline.ipynb) | Lower-level MinHash + LSHForest + layout workflow |
-| [04 Notebook Widgets](notebooks/04_jscatter_demo.ipynb) | Selection, filtering, zoom, export |
+| [01 Quickstart](notebooks/01_quickstart.ipynb) | Shortest end-to-end walkthrough on a small molecule table |
+| [02 Cheminformatics](notebooks/02_cheminformatics.ipynb) | SMILES → fingerprints → interactive molecular map |
+| [03 Continuous Embeddings](notebooks/03_continuous_embeddings.ipynb) | Cosine and euclidean on MNIST: when to use each |
+| [04 What's New](notebooks/04_new_functionalities.ipynb) | `add_points`, `transform`, tree paths, save/load, external kNN |
| [05 Single-Cell](notebooks/05_single_cell.ipynb) | RNA-seq with PBMC 3k, pseudotime, UMAP comparison |
-| [06 Metric Guide](notebooks/06_metric_guide.ipynb) | Choosing the right metric |
-| [07 FAQ](notebooks/07_faq.ipynb) | Troubleshooting and common questions |
-| [08 Cheminformatics](notebooks/08_cheminformatics.ipynb) | Molecules, fingerprints, SAR |
-| [09 Protein Analysis](notebooks/09_protein_analysis.ipynb) | FASTA, ESM embeddings, AlphaFold |
-| [11 Card Configuration](notebooks/11_card_configuration.ipynb) | Pinned card layout, fields, and links |
-| [11 Default Params Benchmark](notebooks/11_default_params_benchmark.ipynb) | Defaults across dataset sizes and types |
-| [12 USearch Jaccard](notebooks/12_usearch_jaccard.ipynb) | Binary Jaccard with USearch backend |
+| [06 FAQ](notebooks/06_faq.ipynb) | Troubleshooting and common questions |
+| [07 MinHash Deep Dive](notebooks/07_minhash_deep_dive.ipynb) | Encoding methods and when to use each |
+| [08 Notebook Widgets](notebooks/08_jscatter_demo.ipynb) | Coloring, tooltips, lasso selection with jupyter-scatter |
+| [09 Card Configuration](notebooks/09_card_configuration.ipynb) | Pinned card layout, fields, and links |
+| [10 Protein Analysis](notebooks/10_protein_analysis.ipynb) | FASTA, ESM embeddings, AlphaFold |
+| [11 USearch Jaccard](notebooks/11_usearch_jaccard.ipynb) | Native binary Jaccard backend (high recall, low memory) |
+| [12 Legacy LSH Pipeline](notebooks/12_legacy_lsh_pipeline.ipynb) | Lower-level MinHash + LSHForest + layout workflow |

## Lower-Level Pipeline

-For direct control over indexing, hashing, and layout, see the [legacy pipeline notebook](notebooks/03_legacy_lsh_pipeline.ipynb). The main building blocks:
+For direct control over indexing, hashing, and layout, see the [legacy pipeline notebook](notebooks/12_legacy_lsh_pipeline.ipynb). The main building blocks:

```python
from tmap.index import USearchIndex # dense / binary kNN
2 changes: 1 addition & 1 deletion notebooks/01_quickstart.ipynb
@@ -163,7 +163,7 @@
"source": [
"## What Next\n",
"\n",
-"Move to `08_cheminformatics.ipynb` for molecular properties, scaffolds, and richer color layers.\n"
+"Move to `02_cheminformatics.ipynb` for molecular properties, scaffolds, and richer color layers.\n"
]
}
],
2 changes: 1 addition & 1 deletion notebooks/03_continuous_embeddings.ipynb
@@ -220,7 +220,7 @@
"source": [
"## What about Jaccard?\n",
"\n",
-"For binary fingerprints (molecular Morgan, MACCS, ECFP), use `metric=\"jaccard\"`. The estimator auto-routes to USearch with native Jaccard distance on the bits. See `08_cheminformatics.ipynb` for a full chemistry walkthrough.\n",
+"For binary fingerprints (molecular Morgan, MACCS, ECFP), use `metric=\"jaccard\"`. The estimator auto-routes to USearch with native Jaccard distance on the bits. See `02_cheminformatics.ipynb` for a full chemistry walkthrough.\n",
"\n",
"For sparse single-cell data, `metric=\"jaccard\"` with a CSR matrix routes to MinHash and LSHForest. See `05_single_cell.ipynb` for that path.\n"
]
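
For readers unfamiliar with the metric named in this cell, the following is a minimal sketch of Jaccard distance on two binary fingerprints. The function name `jaccard_distance` and the example bit vectors are hypothetical; this is plain Python for illustration, not the USearch or MinHash code path the notebook describes. Returning 0.0 when both vectors are all zeros is a convention chosen here, not something the notebook specifies.

```python
def jaccard_distance(a, b):
    """Jaccard distance 1 - |A ∩ B| / |A ∪ B| for two equal-length 0/1 sequences."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    # Bits set in both fingerprints.
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    # Bits set in at least one fingerprint.
    union = sum(1 for x, y in zip(a, b) if x or y)
    if union == 0:
        return 0.0  # assumed convention: two empty fingerprints are identical
    return 1.0 - intersection / union


# Hypothetical 8-bit fingerprints sharing 2 of 4 set bits overall.
fp_a = [1, 1, 0, 0, 1, 0, 0, 0]
fp_b = [1, 0, 0, 1, 1, 0, 0, 0]
distance = jaccard_distance(fp_a, fp_b)  # 1 - 2/4
```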
8 changes: 4 additions & 4 deletions notebooks/04_new_functionalities.ipynb
@@ -449,11 +449,11 @@
"source": [
"## Where to go next\n",
"\n",
-"- `08_cheminformatics.ipynb`: chemistry workflows with binary fingerprints\n",
-"- `09_protein_analysis.ipynb`: protein sequences and embeddings\n",
+"- `02_cheminformatics.ipynb`: chemistry workflows with binary fingerprints\n",
+"- `10_protein_analysis.ipynb`: protein sequences and embeddings\n",
"- `05_single_cell.ipynb`: large sparse single-cell data\n",
-"- `10_jscatter_demo.ipynb`: interactive notebook widgets\n",
-"- `07_faq.ipynb`: short answers to common questions\n"
+"- `08_jscatter_demo.ipynb`: interactive notebook widgets\n",
+"- `06_faq.ipynb`: short answers to common questions\n"
]
}
],
File renamed without changes.
2 changes: 1 addition & 1 deletion notebooks/05_faq.ipynb → notebooks/06_faq.ipynb
@@ -24,7 +24,7 @@
{
"cell_type": "markdown",
"metadata": {},
-"source": "## My map changes between runs\n\nSet `seed=42`. If you also pass a `LayoutConfig`, set `cfg.deterministic = True` and `cfg.seed = 42`.\n\nThe `seed` controls the OGDF tree layout, which is fully deterministic: same kNN graph + same seed = identical coordinates.\n\nThe kNN step depends on the backend:\n\n- **MinHash + LSHForest** (sets / strings): deterministic for a given seed.\n- **USearch HNSW** (binary matrices, cosine, euclidean): approximate and multi-threaded. Neighbor sets may vary slightly across runs or platforms, but the resulting trees are nearly identical because the MST is robust to small kNN variations.\n\nIf you need bit-exact reproducibility for binary data, use the MinHash + LSHForest pipeline directly (see [03_legacy_lsh_pipeline.ipynb](03_legacy_lsh_pipeline.ipynb))."
+"source": "## My map changes between runs\n\nSet `seed=42`. If you also pass a `LayoutConfig`, set `cfg.deterministic = True` and `cfg.seed = 42`.\n\nThe `seed` controls the OGDF tree layout, which is fully deterministic: same kNN graph + same seed = identical coordinates.\n\nThe kNN step depends on the backend:\n\n- **MinHash + LSHForest** (sets / strings): deterministic for a given seed.\n- **USearch HNSW** (binary matrices, cosine, euclidean): approximate and multi-threaded. Neighbor sets may vary slightly across runs or platforms, but the resulting trees are nearly identical because the MST is robust to small kNN variations.\n\nIf you need bit-exact reproducibility for binary data, use the MinHash + LSHForest pipeline directly (see [12_legacy_lsh_pipeline.ipynb](12_legacy_lsh_pipeline.ipynb))."
},
{
"cell_type": "markdown",
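
The FAQ cell's claim that the MST is robust to small kNN variations can be illustrated with a small sketch. The code below is a generic Kruskal implementation on made-up edge lists, not TMAP's pipeline: `kruskal_mst`, `edges_run_1`, and `edges_run_2` are hypothetical names, and the two edge lists stand in for kNN graphs from two runs that disagree only on one long, low-priority neighbor edge.

```python
def kruskal_mst(num_nodes, edges):
    """Return the minimum spanning tree as a set of (weight, u, v) edges."""
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = set()
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two components, so keep it
            parent[root_u] = root_v
            mst.add((weight, u, v))
    return mst


# Two hypothetical kNN graphs over the same 4 points. They share the short
# edges and differ only in one long edge: (0.8, 0, 2) versus (0.7, 1, 3).
shared = [(0.1, 0, 1), (0.2, 1, 2), (0.3, 2, 3), (0.9, 0, 3)]
edges_run_1 = shared + [(0.8, 0, 2)]
edges_run_2 = shared + [(0.7, 1, 3)]

same_tree = kruskal_mst(4, edges_run_1) == kruskal_mst(4, edges_run_2)
```

In this toy case the differing edges are never selected, because the shorter shared edges already connect all nodes, so both runs yield the same tree. A disagreement on one of the short edges would change the result, which is consistent with the cell's wording of "nearly identical" rather than "identical".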
File renamed without changes.
File renamed without changes.
@@ -63,7 +63,7 @@
"## 2. MinHash\n",
"\n",
"`batch_from_binary_array()` is the usual entry point for dense binary fingerprints.\n",
-"See `02_minhash_deep_dive.ipynb` for the full set of `from_*` and `batch_from_*` methods.\n"
+"See `07_minhash_deep_dive.ipynb` for the full set of `from_*` and `batch_from_*` methods.\n"
]
},
{