
- [Installation](#installation)
- [Usage](#usage)
- [Supported Models](#supported-models)
- [Example](#example)
- [Contextualized Embeddings Example](#contextualized-embeddings-example)
- [Multimodal Embeddings](#multimodal-embeddings)

[Voyage AI](https://voyageai.com/)'s embedding and ranking models are state-of-the-art in retrieval accuracy. The integration supports the following models:
- **`voyage-4-large`**, **`voyage-4`**, and **`voyage-4-lite`** - Latest general-purpose embedding models with a shared embedding space and MoE architecture
- **`voyage-3.5`** and **`voyage-3.5-lite`** - General-purpose embedding models with superior performance
- **`voyage-code-3`** - Optimized for code retrieval
- **`voyage-context-3`** - Contextualized chunk embedding model that preserves document context for improved retrieval accuracy
- **`voyage-multimodal-3.5`** - Multimodal model supporting text, images, and video (preview)

For the complete list of available models, see the [Embeddings Documentation](https://docs.voyageai.com/embeddings/) and [Contextualized Chunk Embeddings](https://docs.voyageai.com/docs/contextualized-chunk-embeddings).

## Supported Models

### Text Embedding Models

| Model | Description | Dimensions |
|-------|-------------|------------|
| `voyage-4-large` | Best general-purpose and multilingual retrieval quality | 256, 512, 1024 (default), 2048 |
| `voyage-4` | Optimized for general-purpose and multilingual retrieval quality | 256, 512, 1024 (default), 2048 |
| `voyage-4-lite` | Optimized for latency and cost | 256, 512, 1024 (default), 2048 |
| `voyage-3.5` | General-purpose embedding model | 1024 |
| `voyage-3.5-lite` | Efficient model with lower latency | 1024 |
| `voyage-code-3` | Optimized for code retrieval | 1024 |

### Multimodal Embedding Models

| Model | Description | Dimensions | Modalities |
|-------|-------------|------------|------------|
| `voyage-multimodal-3` | Multimodal embedding model | 1024 | Text, Images |
| `voyage-multimodal-3.5` | Multimodal embedding model (preview) | 256, 512, 1024, 2048 | Text, Images, Video |

### Reranker Models

| Model | Description |
|-------|-------------|
| `rerank-2` | High-accuracy reranker model |
| `rerank-2-lite` | Efficient reranker with lower latency |

## Installation

```bash
pip install voyage-embedders-haystack
```

## Usage

You can use Voyage models with five components:
- [VoyageTextEmbedder](https://github.com/awinml/voyage-embedders-haystack/blob/main/src/haystack_integrations/components/embedders/voyage_embedders/voyage_text_embedder.py) - For embedding query text
- [VoyageDocumentEmbedder](https://github.com/awinml/voyage-embedders-haystack/blob/main/src/haystack_integrations/components/embedders/voyage_embedders/voyage_document_embedder.py) - For embedding documents
- [VoyageContextualizedDocumentEmbedder](https://github.com/awinml/voyage-embedders-haystack/blob/main/src/haystack_integrations/components/embedders/voyage_embedders/voyage_contextualized_document_embedder.py) - For contextualized chunk embeddings with `voyage-context-3`
- [VoyageMultimodalEmbedder](https://github.com/awinml/voyage-embedders-haystack/blob/main/src/haystack_integrations/components/embedders/voyage_embedders/voyage_multimodal_embedder.py) - For multimodal embeddings with `voyage-multimodal-3.5`
- [VoyageRanker](https://github.com/awinml/voyage-embedders-haystack/blob/main/src/haystack_integrations/components/rankers/voyage/ranker.py) - For reranking documents

### Standard Embeddings
To create semantic embeddings for documents, use `VoyageDocumentEmbedder` in your pipelines. For improved retrieval quality, use `VoyageContextualizedDocumentEmbedder` with the `voyage-context-3` model. This component preserves context between related document chunks by grouping them together during embedding, reducing the context loss that occurs when chunks are embedded independently.

**Important:** You must explicitly specify the `model` parameter when initializing any component. Choose from the available models listed in the [Embeddings Documentation](https://docs.voyageai.com/embeddings/). Recommended choices include:
- `voyage-4-large` - Best general-purpose and multilingual retrieval quality
- `voyage-4` - Balanced general-purpose and multilingual retrieval quality
- `voyage-4-lite` - Optimized for latency and cost
- `voyage-context-3` - Contextualized embeddings for improved retrieval (use with `VoyageContextualizedDocumentEmbedder`)

You can set the environment variable `VOYAGE_API_KEY` instead of passing the API key as an argument. To get an API key, see the [Voyage AI website](https://www.voyageai.com/).
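For example, in a POSIX shell (replace the placeholder with your actual key):

```shell
export VOYAGE_API_KEY="your-api-key"
```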


For more examples, see the [contextualized embedder example](https://github.com/awinml/voyage-embedders-haystack/blob/main/examples/contextualized_embedder_example.py).

## Multimodal Embeddings

Voyage AI's `voyage-multimodal-3.5` model transforms unstructured data from multiple modalities (text, images, video) into a shared vector space. This enables mixed-media document retrieval and cross-modal semantic search.

### Features

- **Multiple modalities**: Supports text, images, and video in a single input
- **Variable dimensions**: Output dimensions of 256, 512, 1024 (default), or 2048
- **Interleaved content**: Mix text, images, and video in single inputs
- **No preprocessing required**: Process documents with embedded images directly

### Limits

- Images: Max 20MB, 16 million pixels
- Video: Max 20MB
- Context: 32,000 tokens
- Token counting: 560 image pixels = 1 token, 1120 video pixels = 1 token
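Under these counting rules, a rough client-side token estimate can be sketched as follows. This is an assumption-laden approximation (the rounding mode, in particular, is a guess); the API's own token accounting is authoritative:

```python
import math

IMAGE_PIXELS_PER_TOKEN = 560
VIDEO_PIXELS_PER_TOKEN = 1120
CONTEXT_LIMIT_TOKENS = 32_000


def estimate_image_tokens(width: int, height: int) -> int:
    """Approximate token cost of an image from its pixel count."""
    return math.ceil(width * height / IMAGE_PIXELS_PER_TOKEN)


def estimate_video_tokens(width: int, height: int, frames: int) -> int:
    """Approximate token cost of a video from its total pixel count."""
    return math.ceil(width * height * frames / VIDEO_PIXELS_PER_TOKEN)


# A 1024x1024 image costs roughly 1873 tokens under this estimate
print(estimate_image_tokens(1024, 1024))

# Check the input fits in the 32,000-token context window
assert estimate_image_tokens(1024, 1024) < CONTEXT_LIMIT_TOKENS
```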

### Basic Multimodal Example

Use the `VoyageMultimodalEmbedder` component for multimodal embeddings. Each input is a list of content items (text, images, or videos):

```python
from haystack.dataclasses import ByteStream
from haystack_integrations.components.embedders.voyage_embedders import VoyageMultimodalEmbedder

# Text-only embedding
embedder = VoyageMultimodalEmbedder(model="voyage-multimodal-3.5")
result = embedder.run(inputs=[["A sunset over the ocean"]])
print(f"Embedding dimensions: {len(result['embeddings'][0])}")

# Mixed text and image embedding
image_bytes = ByteStream.from_file_path("image.jpg")
result = embedder.run(inputs=[["Product photo for online store", image_bytes]])
print(f"Tokens used: {result['meta']['total_tokens']}")
```

### Multimodal Example with Custom Dimensions

```python
from haystack.dataclasses import ByteStream
from haystack_integrations.components.embedders.voyage_embedders import VoyageMultimodalEmbedder

# Configure output dimensions (256, 512, 1024, or 2048)
embedder = VoyageMultimodalEmbedder(
    model="voyage-multimodal-3.5",
    output_dimension=2048,  # Higher dimensions for better accuracy
    input_type="document",  # Optimize for document retrieval
)

# Embed multiple inputs at once
image1 = ByteStream.from_file_path("doc1.jpg")
image2 = ByteStream.from_file_path("doc2.jpg")

result = embedder.run(inputs=[
    ["Document about machine learning", image1],
    ["Technical diagram", image2],
])

print(f"Number of embeddings: {len(result['embeddings'])}")
print(f"Image pixels processed: {result['meta']['image_pixels']}")
```

### Video Embedding Example

Video inputs require the `voyageai.video_utils` module:

```python
from voyageai.video_utils import Video
from haystack_integrations.components.embedders.voyage_embedders import VoyageMultimodalEmbedder

embedder = VoyageMultimodalEmbedder(model="voyage-multimodal-3.5")

# Load video using VoyageAI's Video utility
video = Video.from_path("video.mp4", model="voyage-multimodal-3.5")

# Embed video with optional text context
result = embedder.run(inputs=[["Machine learning tutorial", video]])

print(f"Embedding dimensions: {len(result['embeddings'][0])}")
print(f"Video pixels processed: {result['meta']['video_pixels']}")
print(f"Total tokens: {result['meta']['total_tokens']}")
```

### Use Cases

- Mixed-media document retrieval (PDFs, slides with images)
- Image-text similarity search
- Video content retrieval and search
- Cross-modal semantic search

For more information, see the [Multimodal Embeddings Documentation](https://docs.voyageai.com/docs/multimodal-embeddings).

## License

`voyage-embedders-haystack` is distributed under the terms of the [Apache-2.0 license](https://github.com/awinml/voyage-embedders-haystack/blob/main/LICENSE).