[Bug/Model Request]: LateInteractionTextEmbedding("colbert-ir/colbertv2.0") creates different size of embeddings for large set of documents

### What happened?

A bug happened! 

I am using `embedding_model = LateInteractionTextEmbedding("colbert-ir/colbertv2.0")` with fastembed==0.3.0,
following the code in https://qdrant.github.io/fastembed/examples/ColBERT_with_FastEmbed/#colbert-in-fastembed.

I created embeddings for some documents. However, when running on a large collection of documents, I got an error in this part of the code:

    sorted_indices = compute_relevance_scores(
        np.array(query_embeddings[0]), np.array(document_embeddings), k=3
    )

The call fails because NumPy cannot create an array from `document_embeddings`. Looking into it, I found that the individual embeddings in `document_embeddings` have different shapes. For instance, for 442 documents, the first ~260 documents have embeddings of shape (182, 128), while the remaining ones have shape (164, 128). Could you help me with this? Thanks.
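A possible workaround, sketched below, is to score each document separately instead of stacking all embeddings into one array. One plausible explanation for the differing shapes, not confirmed in this report, is that token padding is applied per batch rather than across the whole collection. The helper names `maxsim_score` and `top_k_ragged` are hypothetical and not part of fastembed, and the random arrays only stand in for real embeddings.

```python
import numpy as np


def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction (MaxSim) score for one query/document pair.

    query_emb has shape (query_tokens, dim); doc_emb has shape (doc_tokens, dim).
    """
    sim = query_emb @ doc_emb.T  # (query_tokens, doc_tokens)
    # Best-matching document token per query token, summed over query tokens.
    return float(sim.max(axis=1).sum())


def top_k_ragged(query_emb, document_embeddings, k=3):
    """Rank documents whose embeddings may have different token counts."""
    scores = np.array(
        [maxsim_score(query_emb, np.asarray(doc)) for doc in document_embeddings]
    )
    return np.argsort(-scores)[:k]  # indices of the k highest scores


# Stand-in data mimicking the shapes from this report (assumption: random values).
rng = np.random.default_rng(0)
docs = [rng.normal(size=(182, 128)) for _ in range(3)]
docs += [rng.normal(size=(164, 128)) for _ in range(2)]
query = docs[3][:32]  # a query built from document 3, so it should rank first

top = top_k_ragged(query, docs, k=3)
```

Because no stacked array is ever built, this avoids the `np.array(document_embeddings)` call that fails above, at the cost of a Python-level loop over documents.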




### What Python version are you on? e.g. python --version

Python 3.12

### Version

0.2.7 (Latest)

### What os are you seeing the problem on?

MacOS

### Relevant stack traces and/or logs

```shell
Traceback (most recent call last):
    np.array(query_embeddings[0]), np.array(document_embeddings), k=3
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (442,) + inhomogeneous part.
```
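The error can be reproduced without fastembed, which suggests it comes from the ragged shapes rather than from the scoring function. The sketch below assumes NumPy 1.24 or newer (older versions emit a deprecation warning instead of raising) and uses zero-filled arrays as stand-ins for the real embeddings; counting the distinct shapes is a quick way to check a real `document_embeddings` list.

```python
from collections import Counter

import numpy as np

# Stand-ins for per-document embeddings with the two shapes seen in this report.
document_embeddings = [np.zeros((182, 128))] * 3 + [np.zeros((164, 128))] * 2

# Count how many documents share each shape.
shape_counts = Counter(emb.shape for emb in document_embeddings)

error_message = None
try:
    # Stacking arrays of different shapes is what triggers the ValueError.
    np.array(document_embeddings)
except ValueError as exc:
    error_message = str(exc)

print(shape_counts)
print(error_message)
```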

