Skip to content

#49 deserialize $vector -> $binary#123

Open
a-random-steve wants to merge 3 commits intodatastax:mainfrom
a-random-steve:mr/49
Open

#49 deserialize $vector -> $binary#123
a-random-steve wants to merge 3 commits intodatastax:mainfrom
a-random-steve:mr/49

Conversation

@a-random-steve
Copy link
Copy Markdown
Contributor

@a-random-steve a-random-steve commented Apr 2, 2026

Completes Pre-GA portion of #49 (and #67). Also might eliminate need for DataAPIVector given serialization options available.

  1. $binary is properly deserialized for float[] properties

  2. User can control how float arrays are serialized via attributes on the properties:

public class FloatArrayWriterObject
{
    [DocumentId]
    public string _id { get; set; }
    [JsonConverter(typeof(FloatArrayWriter))]
    public float[] Vector { get; set; }
}

public class FloatBinaryWriterObject
{
    [DocumentId]
    public string _id { get; set; }
    [JsonConverter(typeof(FloatBinaryWriter))]
    public float[] Vector { get; set; }
}

Fixes #49 .

… for DataAPIVector given serialization options available.
@toptobes
Copy link
Copy Markdown
Collaborator

toptobes commented Apr 3, 2026

For clarity, could we not have [DocumentMapping(DocumentMappingField.Vector)] just make the field automatically use [JsonConverter(typeof(FloatBinaryWriter))] while using [JsonConverter(typeof(FloatArrayWriter))] for float[]s?

@a-random-steve
Copy link
Copy Markdown
Contributor Author

For clarity, could we not have [DocumentMapping(DocumentMappingField.Vector)] just make the field automatically use [JsonConverter(typeof(FloatBinaryWriter))] while using [JsonConverter(typeof(FloatArrayWriter))] for float[]s?

@toptobes we could if $binary is only ever used (by other clients and cql) when inside the $vector structure. I thought that float arrays as a whole could be serialized/deserialized as $binary. The current structure allows the user to control that if desired, but also handles being able to deserialize any float array from $binary structure.

@toptobes
Copy link
Copy Markdown
Collaborator

toptobes commented Apr 3, 2026

@toptobes we could if $binary is only ever used (by other clients and cql) when inside the $vector structure. I thought that float arrays as a whole could be serialized/deserialized as $binary. The current structure allows the user to control that if desired, but also handles being able to deserialize any float array from $binary structure.

That's fair. So to be certain, if you don't provide an explicit type, and it's a vector field (whether by DocumentMappingField.Vector or ColumnVector, it'd use $binary, and otherwise it'll just use the normal float[] writer?

And if you wanted to override it manually you do something like this, or am I misunderstanding?

[JsonConverter(typeof(FloatArrayWriter))]
[DocumentMapping(DocumentMappingField.Vector)]
public float[] Vector { get; set; }

@stephenatsembit
Copy link
Copy Markdown
Contributor

@toptobes we could if $binary is only ever used (by other clients and cql) when inside the $vector structure. I thought that float arrays as a whole could be serialized/deserialized as $binary. The current structure allows the user to control that if desired, but also handles being able to deserialize any float array from $binary structure.

That's fair. So to be certain, if you don't provide an explicit type, and it's a vector field (whether by DocumentMappingField.Vector or ColumnVector, it'd use $binary, and otherwise it'll just use the normal float[] writer?

And if you wanted to override it manually you do something like this, or am I misunderstanding?

[JsonConverter(typeof(FloatArrayWriter))]
[DocumentMapping(DocumentMappingField.Vector)]
public float[] Vector { get; set; }

Yep.

@skedwards88
Copy link
Copy Markdown
Collaborator

we could if $binary is only ever used (by other clients and cql) when inside the $vector structure

I'm not sure I fully grasp the resolution of this thread, but in case it's relevant: python client and curl at least do support $binary outside of $vector: https://docs.datastax.com/en/astra-db-serverless/api-reference/document-methods/insert-one.html

@toptobes toptobes enabled auto-merge April 8, 2026 17:59
@toptobes toptobes added this pull request to the merge queue Apr 8, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Apr 8, 2026
# Conflicts:
#	test/DataStax.AstraDB.DataApi.IntegrationTests/TestObjects.cs
@toptobes toptobes added this pull request to the merge queue Apr 9, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 9, 2026
@sl-at-ibm
Copy link
Copy Markdown
Collaborator

sl-at-ibm commented Apr 9, 2026

I am afraid there is a regression in the form of plain lists of floats (not 'vectors') being serialized as $binary. This is something the Data API does not accept.

Observed and reproduced with the latest commit (aeecc0c) by running test Test_FloatingPoint_StringLiterals. The FloatingPointTest object has a float[] p_list_float field, that results in the following (faulty) payload:

{
    "insertOne": {
        "document": {
            "id": "1",
            "p_float_nan": "NaN",
            "p_float_pinf": "Infinity",
            "p_float_minf": "-Infinity",
            "p_double_nan": "NaN",
            "p_double_pinf": "Infinity",
            "p_double_minf": "-Infinity",
            "p_list_double": [
                -43.21,
                "NaN",
                "Infinity",
                "-Infinity",
                12.34
            ],
            "p_set_double": null,
            "p_list_float": {
                "$binary": "wizXCv/AAAB/gAAA/4AAAEFFcKQ="
            },
            "p_set_float": null
        }
    }
}

Suggestion: if not present, add a similar test that uses typed collections and a float[] object field similarly, just to cover both collections and tables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DataAPIVector does not exist; client doesn't handle $binary vectors

5 participants