fix(csharp/client): DH-19910: Support ColumnSource<IList>#8148
Conversation
No docs changes detected for 07a0a3e |
There was a problem hiding this comment.
Pull request overview
This PR completes C# client support for list-valued table elements (e.g., group-by results) by adding ColumnSource<IList> plumbing and Arrow list conversions, and by extracting TableMaker.ColumnBuilder into a dedicated utility.
Changes:
- Add Arrow
ListTypehandling end-to-end (Arrow →IListColumnSource/ListChunk→ Arrow). - Introduce read-only adapters that present Arrow arrays as
IList/IList<T>(andIList<T?>for value types). - Expand round-trip tests to cover list columns across many scalar element types and add adapter-specific tests.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| csharp/client/Dh_NetClientTests/RoundTripTest.cs | Updates round-trip coverage to include list columns for many scalar element types. |
| csharp/client/Dh_NetClientTests/ReadOnlyListAdapterTest.cs | Adds unit tests for the new read-only list adapter behavior. |
| csharp/client/Dh_NetClient/util/TableMaker.cs | Updates AddColumn logic to support nulls and list-typed elements. |
| csharp/client/Dh_NetClient/util/TableComparer.cs | Minor cleanup of unused usings; comparer already supports nested IList comparison. |
| csharp/client/Dh_NetClient/util/ColumnBuilder.cs | Extracts ColumnBuilder into its own file and adds list-builder support. |
| csharp/client/Dh_NetClient/ReadOnlyListAdapters.cs | Adds adapter hierarchy to expose Arrow arrays as IList/IList<T> in a read-only manner. |
| csharp/client/Dh_NetClient/column/ColumnSource.cs | Adds IListColumnSource alias and IHasElementType for list element type exposure. |
| csharp/client/Dh_NetClient/column/ArrayColumnSource.cs | Adjusts backing storage to nullable-annotated arrays. |
| csharp/client/Dh_NetClient/chunk/ChunkMaker.cs | Adds visitor support to produce ListChunk for list columns. |
| csharp/client/Dh_NetClient/chunk/Chunk.cs | Makes chunk storage nullable-annotated (T?[]) to support list chunks and nullability flow. |
| csharp/client/Dh_NetClient/arrow_util/ArrowUtil.cs | Improves list enumeration conversion by handling null list slices. |
| csharp/client/Dh_NetClient/arrow_util/ArrowColumnSource.cs | Adds list column source creation, list copying into ListChunk, and element-type discovery. |
| csharp/client/Dh_NetClient/arrow_util/ArrowArrayConverter.cs | Adds conversion from IListColumnSource back to Arrow ListArray using ColumnBuilder. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| foreach (var value in values) { | ||
| if (value == null) { | ||
| cb.AppendNull(); | ||
| continue; | ||
| } |
There was a problem hiding this comment.
@copilot please look at this again. Of course it compiles. Your statement "(operator == is not defined for all T)" is false.
| tm.AddColumn<char?[]?>("char", [['a', 'X'], ['a', 'b', null], null, ['c']]); | ||
| tm.AddColumn<DateTimeOffset?[]?>("DateTimeOffset", [[dto1, dto2], [DateTimeOffset.MinValue, DateTimeOffset.MaxValue, null], null, [dto2]]); | ||
| tm.AddColumn<DateOnly?[]?>("DateOnly", [[DateOnly.FromDateTime(dto1.Date), DateOnly.FromDateTime(dto2.Date)], [DateOnly.MinValue, DateOnly.MaxValue, null], null, [DateOnly.FromDateTime(dto2.Date)]]); | ||
| tm.AddColumn<TimeOnly?[]?>("TimeOnly", [[TimeOnly.FromDateTime(dto1.Date), TimeOnly.FromDateTime(dto2.Date)], [TimeOnly.MinValue, TimeOnly.MaxValue, null], null, [TimeOnly.FromDateTime(dto2.Date)]]); |
| /// IList and List<Nullable<Int32>> (via its base class), and | ||
| /// also IList<Int32> via the derived class. | ||
| /// | ||
| /// Background and rationale for this class hierarchy:. The library initially supported a set of |
| /// the IList elements contained in these ColumnSource and Chunk Types will always implement | ||
| /// all ofIList, IList<T> and, for value types, IList<Nullable<T>>. |
| using System.Collections; | ||
| using Deephaven.Dh_NetClient; | ||
| using Microsoft.CSharp.RuntimeBinder; |
| public ReadOnlyListAdapterForReferenceTypes(IReadOnlyList<T> data) : base(data) { } | ||
| } | ||
|
|
||
| public class AdaptorSelector : IArrowArrayVisitor, |
| } | ||
| } | ||
|
|
||
| public class ElementTypeVisitor : |
| } | ||
| } | ||
|
|
||
| public class ListArrowColumnSource(ChunkedArray chunkedArray, Type elementType) : ArrowColumnSource, IListColumnSource, IHasElementType { |
| public abstract class ColumnBuilder { | ||
| public static ColumnBuilder<T> ForType<T>(IArrowArrayBuilder? callerProvidedBuilder) { | ||
| return (ColumnBuilder<T>)ForType(typeof(T), callerProvidedBuilder); |
| for (var i = 0; i != _numRows; ++i) { | ||
| if (!_nulls.Data[i]) { | ||
| arrowBuilder.Append(typedData[i]); | ||
| arrowBuilder.Append(typedData[i]!); |
There was a problem hiding this comment.
Again with null annotations. Sometimes you are in a situation where the compiler (just looking at the type annotations and the control flow) thinks you are in danger of passing a null to a function that is not declared to take a null. ! is your promise that the item is not null. But, for references, nullness is just an annotation, not part of the type system. The consequences of being wrong about your promise are not some kind of VM catastrophe, but merely that you might get a null reference exception when you weren't expecting one.
| sealed class ListCopier(ListChunk typedDest, BooleanChunk? nullFlags) : FillChunkHelper { | ||
| protected override void DoCopy(IArrowArray src, int srcOffset, int destOffset, int count) { | ||
| var typedSrc = (ListArray)src; | ||
| // var srcValues = (Apache.Arrow.Array)typedSrc.Values; |
There was a problem hiding this comment.
Dead code in this function...
| public sealed class Chunk<T> : Chunk { | ||
| public static Chunk<T> Create(int size) { | ||
| return new Chunk<T>(new T[size]); | ||
| return new Chunk<T>(new T?[size]); |
There was a problem hiding this comment.
It's a relatively long and somewhat messy story. Depending on how much detail you want, we might need to take it offline. Briefly:
? applied to a value type T is exactly identical to saying Nullable<T>. And if you're not familiar, Nullable<T> is itself a value type which has "optional" kind of semantics. So when T is known to be a value type, T, and T? are distinct types.
? applied to a reference type T is a compiler type annotation that does not change the type. Basically it makes the compiler try to do nullability checking. When you declare a parameter or variable as T that is your "promise" that that parameter or variable is not null. When you declare it as T? you are saying that it might be null. This is a "promise" in quotes because it is not a hard runtime guarantee. It's more of a compile-time constraint check meant to catch bugs.
When it is not known whether T is value or reference type, as in this generic method where T is unconstrained, the meaning falls back to the second definition. Here, "new T?[size]" means, make an array of T with the given size, and (of course) "new[]" is going to make an array of default-initialized values, which will be null if T is a reference type.
Put another way, ? is a bit of a virus that contaminates everything it touches. Once you start using it, it spreads to your callers etc.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| } | ||
|
|
||
| public interface IMutableColumnSource<T> : IMutableColumnSource, IColumnSource<T> { | ||
| public interface IMutableColumnSource<T> : IMutableColumnSource, IColumnSource<T>; |
There was a problem hiding this comment.
which is not a valid C# interface declaration
I disagree. @claude please provide a citation that proves this point.
| public override void Append(TList list) { | ||
| _listBuilder.Append(); | ||
| foreach (var element in list) { | ||
| _underlyingBuilder.Append(element); | ||
| } | ||
| } |
| var isNull = src.IsNull(srcOffset); | ||
| if (nullFlags != null) { | ||
| nullFlags.Data[destOffset] = isNull; | ||
| } | ||
| if (isNull) { | ||
| typedDest.Data[destOffset] = null; |
| using System.Collections; | ||
| using Deephaven.Dh_NetClient; | ||
| using Microsoft.CSharp.RuntimeBinder; | ||
|
|
| /// or, to avoid boxing, they can cast them down to their actual concrete type. We promise that | ||
| /// the IList elements contained in these ColumnSource and Chunk Types will always implement | ||
| /// all ofIList, IList<T> and, for value types, IList<Nullable<T>>. | ||
| /// |
| } | ||
|
|
||
| public interface IMutableColumnSource<T> : IMutableColumnSource, IColumnSource<T> { | ||
| public interface IMutableColumnSource<T> : IMutableColumnSource, IColumnSource<T>; |
| /// ReadOnlyListAdapterForReferenceTypes<string> which would provide | ||
| /// IList and List<string> to the user. | ||
| /// |
| /// Apache.Arrow.Int32Array is an example of an array containing value types. | ||
| /// It implements IReadOnlyList<Nullable<Int32>>. We would wrap it as | ||
| /// ReadOnlyListAdapterForValueTypes<Int32> which would provide | ||
| /// IList and List<Nullable<Int32>> (via its base class), and | ||
| /// also IList<Int32> via the derived class. |
This should complete our ability to support list types as table elements.
These occur e.g. when someone does a group by in their table.
Note to reviewers: a big part of this change is moving
TableMaker.ColumnBuilderto its own file.