Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 32 additions & 0 deletions standard/arrays.md
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,38 @@ At run-time, a value of an array type can be `null` or a reference to an instanc

> *Note*: Following the rules of [§17.6](arrays.md#176-array-covariance), the value may also be a reference to a covariant array type. *end note*

### §arrays-of-nullable-arrays Arrays of nullable arrays

The nullable annotation `?` may be placed on an array type, as in `T[R]?`. Such an annotated array type may be used as the element type of another array type, as in `T[R]?[R₂]`.

`T[R₁][R₂]?[R₃][R₄]` is not a single *array_type* with four ranks. Rather, it is two *array_type*s, each of which has two ranks. The outer *array_type* has ranks `[R₃][R₄]`, read left to right, and its element type is `T[R₁][R₂]?`. The element type is another *array_type* with a nullable annotation, and this inner *array_type* has ranks `[R₁][R₂]`, read left to right. This is the sole exception to the general rule that the meaning of a program remains the same when nullable reference types annotations are removed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm uneasy about this, because it's so drastic. I think it gets to the right place, but I wonder if there's a way we can be more consistent between nullable and non-nullable array types, such that we end up with the same number of types involved, and you "just" need to parse the text differently to get at them.

As Nigel puts in his comments, if we don't have the shape of arrays-of-arrays adequately specified yet, let's do that now. (I can't work out whether we do or not, but it's certainly pretty terse at the moment.)

Currently it looks like the bulk of the specification of arrays-of-arrays is within the "general" section (17.2.1) - I wonder whether having a new subsection of "arrays of arrays" which bifurcates into "when there are no nullable annotations" and "when there are nullable annotations" parts would be useful. We should make sure it explicitly tackles nullable value types as well.

I'm happy to come up with an alternative PR based on this one if that sounds like a good approach (or at least one worth exploring) to both @jnm2 and @Nigel-Ecma - but I'd like feedback on whether my preference of trying to make the difference in the number of types involved only about how you read it, rather than a "deeper" number of types sounds feasible or not before I try.

My expectation is that in order to make sure I fully understand what I'm doing, I'll write a lot of concrete examples (using string and int, with and without nullable annotations), and then wrap explanations around those examples.

If folks would like me to do that, I won't even get to starting it this week, but I could probably have a very rough version available shortly before the next meeting, so we could at least discuss whether it's worth pursuing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning: I think I’ve always got “array type” and array_type right, but this is of course self-proofed… So if you think I’ve put the wrong one in anywhere I may well have!


This PR has two issues to tackle:

  1. That array elements can now be other arrays – but that was always true…
  2. That it blows away the NRT guarantee that there is no semantic impact of nullable annotations, now there is.

@jskeet you are talking about (1) and reference §17.2.1. This is the (original!) clause where things can get confusing. It starts with:

An array type is written as a non_array_type followed by one or more rank_specifiers.

A non_array_type is any type that is not itself an array_type.

Here there is both “array type” and “array_type”, are they the same?

The usual description of arrays in C# is that the element type is not an array (non_array_type) with one or more ranks, and of which have have multiple dimensions. But then the clause continues:

The element type of an array type is the type that results from deleting the leftmost rank_specifier:

  • An array type of the form T[R] is an array with rank R and a non-array element type T.
  • An array type of the form T[R][R₁]...[Rₓ] is an array with rank R and an element type T[R₁]...[Rₓ].

Which (clearly?) defines that the element type of a multi-rank is in fact an array type… So an array type with x ranks consists of x array types…

To this the PR adds:

T[R₁][R₂]?[R₃][R₄] is not a single array_type with four ranks. Rather, it is two array_types, each of which has two ranks. The outer array_type has ranks [R₃][R₄], read left to right, and its element type is T[R₁][R₂]?. The element type is another array_type with a nullable annotation, and this inner array_type has ranks [R₁][R₂], read left to right. This is the sole exception to the general rule that the meaning of a program remains the same when nullable reference types annotations are removed.

Which is still an array type with a total of four array types just grouped in a different order…

What this PR does with the grammar is to produce one which allows an array_type to occur as part of another array_type – as opposed to an “array type” being part of another array type, which was always true…

Confused?

And that is just issue (1).


Issue (2) is maybe more subjective and has seen two approaches:

  • Don’t talk about it – what the NRT changes originally did; and
  • Minimise the impact – AKA positive thinking “Sure there is a gaping hole below the water line, but most of the hull is intact!” 😉

As I said, subjective 🙂 This PR tends towards the latter, I don’t think it has the balance right, but that is my subjective opinion.


@jskeet – I’m more than happy for you to have a go at it!

Copy link
Contributor Author

@jnm2 jnm2 Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever we land with, I would just want it to be absolutely clear to any implementer what the meaning of T[]?[,] is, since its meaning is not an optional part of the language.

I described it in terms of the spec's existing array_type grammar whose element type is never directly an array_type, but may now be a nullable_reference_type consisting of array_type nullable_annotation. If the definition of the term "array type" does not line up with the array_type grammar, that does complicate things when this section goes to reference the term. However, if I describe things consistently in terms of array_type, this complication seems no worse than it was prior to this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From meeting: let's not say "sole exception," just "an exception"


Every reference type which contains nullable annotations has a corresponding unannotated type with no semantic difference (§8.9.1). The corresponding unannotated type for an array of nullable arrays is a single array type which recursively collects all the ranks of all the nested *array_type*s.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With array types there is not in general a single “corresponding nullable type” (§8.9.1) – there are multiple, e.g.:

static void ArrayEquivalentTypes(int[][,]?[,,]?[,,,]?[,,,,] a1)
{
    int[,,][][,]?[,,,]?[,,,,] a2 = a1;

    int[,,,][,,][][,]?[,,,,] a3_1 = a1;
    int[,,,][,,][][,]?[,,,,] a3_2 = a2;

    int[,,,,][,,,][,,][][,] a4_1 = a1;
    int[,,,,][,,,][,,][][,] a4_2 = a2;
    int[,,,,][,,,][,,][][,] a4_3_1 = a3_1;
    int[,,,,][,,,][,,][][,] a4_3_2 = a3_2;

    int[,,][][,]?[,,,]?[,,,,] b3_1 = a3_1;
    int[,,][][,]?[,,,]?[,,,,] b3_2 = a3_2;
    int[,,][][,]?[,,,]?[,,,,] b4_1 = a4_1;
    int[,,][][,]?[,,,]?[,,,,] b4_2 = a4_2;
    int[,,][][,]?[,,,]?[,,,,] b4_3_1 = a4_3_1;
    int[,,][][,]?[,,,]?[,,,,] b4_3_2 = a4_3_2;

    int[,,,][,,][][,]?[,,,,] c4_1 = a4_1;
    int[,,,][,,][][,]?[,,,,] c4_2 = a4_2;
    int[,,,][,,][][,]?[,,,,] c4_3_1 = a4_3_1;
    int[,,,][,,][][,]?[,,,,] c4_3_2 = a4_3_2;

    // etc...
}

This bundle of joy ;-) has four array types that are semantically equivalent to each other, and that’s not the limit by a long shot.

It needs to be specified that any two types (which need not be distinct) selected from a semantically equivalent set are implicitly convertible to each other. (Depending on implementation some, but not all, of the conversions may elicit a nullable warning.)

And while we are here, consider:

static void ArrayDifferentTypesSameLiteral()
{
    int[]?[,] a1 =
        {
            {   new int[] { 1, 2 },
                new int[] { 3, 4 }
            }
        };
    int[,][] a2 =
        {
            {   new int[] { 1, 2 },
                new int[] { 3, 4 }
            }
        };

Different array types which are semantically equivalent can be init’ed using the same array literal. That will need to be specified.

Copy link
Contributor Author

@jnm2 jnm2 Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that both things are already specified when we say there is no semantic difference between an array of nullable arrays and its corresponding unannotated single-array type.

The spec as it is currently drafted in this PR already defines there to be no semantic difference between all of the array types in the examples you share above. If there is no semantic difference, then it's not even a matter of conversion.

Likewise, the existing spec does not explicitly address conversion between string and string?. It simply relies on "there is no semantic difference." If that is good enough for string and string?, it's good enough for string[]?[,] and string[,][].

The specification only talks about type conversions for the purpose of providing warnings, and so the stuff about type conversions and nullability is conditionally normative. §8.9.5.3 Type conversions:

A compiler that generates diagnostic warnings conforms to these rules.

Note: Differences in top-level or nested nullability annotations in types do not affect whether conversion between the types is permitted, since there is no semantic difference between a non-nullable reference type and its corresponding nullable type (§8.9.1). end note

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't think you can fallback on that – what types are semantically equivalent to each other is not always obvious (think two different arrays of nullable arrays with the same underlying type). And when it comes to array initialisation, starting with the type of the array who does one figure out the shape of the initialiser? Are other shaped initialiser valid?

All you might need is some examples.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've previously had some description of "equivalent types" (not just semantically equivalent) and there's some language around conversions. We should make sure we're using that here. @BillWagner may be able to remember; if not I'll see if I can dig it up. (The intention was to avoid having to state parts about implicit reference conversions all over the place.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this again, I don't think I understand at this point what "which recursively collects all the ranks of all the nested array_types" means - I think this is just a matter of adding "as described below" but I'm not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "as described below" would work. I'm also happy to workshop the wording! The term 'recursion' just describes the outside-in traversal of the array types as they are interspersed with ?s.


The unannotated array type of an array of nullable arrays cannot be found by simply removing the nullable annotations `?` from the grammar and reparsing. This is because array ranks are read left to right while nested *array_type* productions are read outside-in, with outer array type ranks to the right, inner array type ranks to the left. Thus, the type `T[R₁][R₂]?[R₃][R₄]` has an unannotated array type of `T[R₃][R₄][R₁][R₂]`.

To obtain the unannotated array type of an array of nullable arrays, the following steps are followed:

1. Take the ranks on the outermost array type in order from left to right. (From `T[R₁][R₂]?[R₃][R₄]`, take `[R₃]` and then `[R₄]`.)
2. Move to the array type inside the nullable element type and take its ranks in order from left to right. (From `T[R₁][R₂]`, take `[R₁]` and then `[R₂]`.)
3. Repeat in this fashion until the element type is no longer a nullable array type. (`T`)
4. Take this remaining element type and place on it all the collected ranks in order from first to last to obtain the unannotated array type. (On `T`, place `[R₃]`, then `[R₄]`, then `[R₁]`, then `[R₂]`, obtaining the final result of `T[R₃][R₄][R₁][R₂]`.)

> *Example*:
>
> The following table demonstrates the effect on the unannotated array type caused by breaking up array types by inserting nullable annotations:
>
> | Annotated | Unannotated |
> |----------------------------------|-------------------------------------------|
> | `T?[][,][,,]` | `T[][,][,,]` (not intervening, no change) |
> | `T[][,][,,]?` | `T[][,][,,]` (not intervening, no change) |
> | `T[]?[,]?[,,]` | `T[,,][,][]` |
> | `T[]?[,][,,]` | `T[,][,,][]` |
> | `T[][,]?[,,]` | `T[,,][][,]` |
> | `T[][,]?[,,][,,,]?[,,,,][,,,,,]` | `T[,,,,][,,,,,][,,][,,,][][,]` |
>
> *end example*

### 17.2.2 The System.Array type

The type `System.Array` is the abstract base type of all array types. An implicit reference conversion ([§10.2.8](conversions.md#1028-implicit-reference-conversions)) exists from any array type to `System.Array` and to any interface type implemented by `System.Array`. An explicit reference conversion ([§10.3.5](conversions.md#1035-explicit-reference-conversions)) exists from `System.Array` and any interface type implemented by `System.Array` to any array type. `System.Array` is not itself an *array_type*. Rather, it is a *class_type* from which all *array_type*s are derived.
Expand Down
26 changes: 22 additions & 4 deletions standard/types.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,17 +54,22 @@
;

array_type
: non_array_type rank_specifier+
: array_type nullable_type_annotation rank_specifier+
Copy link
Contributor

@jskeet jskeet Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amusingly, coming back to this now, it explains the rule that I figured out independently (trying not to refer to this PR at all)... basically the rank for any array_type is the rank_specifier+ part, which has to be either "after the last ?" or there not be a ? at all...

| non_array_type rank_specifier+
Comment on lines 56 to +58
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the property that when parsing string?[][,]?[,,][,,,]? we end up with two array_type nodes: one being string?[][,]?[,,][,,,], and one being the inner string?[][,].

@Nigel-Ecma It's not mutual left recursion! 😁

;

non_array_type
: value_type
: non_array_non_nullable_type nullable_type_annotation?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely not competent enough to review this grammar change - @Nigel-Ecma you've not commented on the grammar, and I don't know whether that's because you're happy with it, or whether it's because you wanted to get the prose settled first.

| pointer_type // unsafe code support
;

non_array_non_nullable_type
: non_nullable_value_type
| class_type
| interface_type
| delegate_type
| 'dynamic'
| type_parameter
| pointer_type // unsafe code support
;

rank_specifier
Expand Down Expand Up @@ -732,7 +737,7 @@

> *Note:* The types `R` and `R?` are represented by the same underlying type, `R`. A variable of that underlying type can either contain a reference to an object or be the value `null`, which indicates “no reference.” *end note*

The syntactic distinction between a *nullable reference type* and its corresponding *non-nullable reference type* enables a compiler to generate diagnostics. A compiler must allow the *nullable_type_annotation* as defined in [§8.2.1](types.md#821-general). The diagnostics must be limited to warnings. Neither the presence or absence of nullable annotations, nor the state of the nullable context can change the compile time or runtime behavior of a program except for changes in any diagnostic messages generated at compile time.
The syntactic distinction between a *nullable reference type* and its corresponding *non-nullable reference type* enables a compiler to generate diagnostics. A compiler must allow the *nullable_type_annotation* as defined in [§8.2.1](types.md#821-general). The diagnostics must be limited to warnings. Neither the presence or absence of nullable annotations, nor the state of the nullable context can change the compile time or runtime behavior of a program except for changes in any diagnostic messages generated at compile time, with one exception: arrays of nullable arrays are not parsed as a single *array_type*, but rather as multiple nested *array_type*s. The corresponding *non-nullable reference type* of an array of nullable arrays is not the single array type that would be parsed if the nullable annotations were removed; see §arrays-of-nullable-arrays.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will surprise everybody ;-) and say “compiler” -> ”implementation”.

Not changing the compile time behavior is no longer correct, add/remove a nullable annotation on an array type without changing all associated index operations will produce compile-time errors.

I’m not sure you can say changing the number of array_type productions in the parse is an exception per se – the annotations do not change the described array shape in anyway (one might argue that the existing description of arrays is less than clear on the shape, if so adding nullable arrays is the time to fix that).

Copy link
Contributor Author

@jnm2 jnm2 Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know about taking on the change of compiler -> implementation. "Compiler" is used consistently 31 times in the surrounding context, and the sentence you're commenting on precedes this PR; the term is only shown with a green background in the diff because I tacked some text on to the end of the line.

image

Copy link
Contributor Author

@jnm2 jnm2 Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not changing the compile time behavior is no longer correct, add/remove a nullable annotation on an array type without changing all associated index operations will produce compile-time errors.

The "with one exception" part already addresses this, doesn't it? It is correct that the compile time behavior does not change, with that one exception of adding or removing a nullable annotation on an array type... right?

Copy link
Contributor Author

@jnm2 jnm2 Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure you can say changing the number of array_type productions in the parse is an exception per se – the annotations do not change the described array shape in anyway (one might argue that the existing description of arrays is less than clear on the shape, if so adding nullable arrays is the time to fix that).

Here is why I think that it is an exception per se. The general rule is that the presence or absence of nullable annotations does not change the compile time or runtime behavior. Thus, the removal of ? that is present in a C# program would generally not affect compile time or runtime behavior. However, the removal of ? from T[]?[,] does affect compile time and runtime behavior. The causal mechanism by which the presence of the ? has meaning is the parse of the grammar; with the ?, the parse has two array_types, and without the ?, the parse has only one array_type. The meaning of the resulting compile time behavior and runtime type are defined in terms of the nested structure (if any) of those array_type(s) produced by the parse.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The “exception” is that adding/removing nullable annotations to array declarations changes the semantics of the array and the order in which array indices must be given when addressing array elements. An implementation which chooses not to implement nullable warnings, and otherwise ignores nullable annotations, must still implement the semantic changes nullable annotations in array declarations produce. The Standard needs to be absolutely clear on this. See overall review comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree it needs to be really clear - and while I think Joseph's change tries to make it clear (not skirting around it) but that there may be alternative formulations which would work better. I'm happy to try that - potentially multiple times - when we're happy with the rest, but it should probably be the last thing we do.


### 8.9.2 Non-nullable reference types

Expand Down Expand Up @@ -1016,8 +1021,8 @@
> int len = t.DisappearingProperty.Length; // No warning. A compiler can assume property is stateful
> }
> }
> }

Check warning on line 1024 in standard/types.md

View workflow job for this annotation

GitHub Actions / Markdown to Word Converter

standard/types.md#L1024

MDC032::Line length 82 > maximum 81
> ```

Check warning on line 1025 in standard/types.md

View workflow job for this annotation

GitHub Actions / Markdown to Word Converter

standard/types.md#L1025

MDC032::Line length 83 > maximum 81
>
> In the previous example, the backing field for the `DisappearingProperty` is set to null when it is read. However, a compiler may assume that reading a property doesn’t change the null state of that expression. *end example*

Expand Down Expand Up @@ -1098,6 +1103,8 @@

A compiler may follow rules for interface variance ([§18.2.3.3](interfaces.md#18233-variance-conversion)), delegate variance ([§20.4](delegates.md#204-delegate-compatibility)), and array covariance ([§17.6](arrays.md#176-array-covariance)) in determining whether to issue a warning for type conversions.

(See §arrays-of-nullable-arrays for the specification of the corresponding non-nullable array type used in `M7` and `M8`.)

> <!-- Example: {template:"code-in-class-lib", name:"NullVariance", ignoredWarnings:["CS8619"]} -->
> ```csharp
> #nullable enable
Expand Down Expand Up @@ -1135,6 +1142,17 @@
> string[] v1 = p; // Warning
> string[] v2 = p!; // No warning
> }
>
> public void M7(string[][,] p)
> {
> string[,]?[] v1 = p; // No warning
> }
>
> public void M6(string[]?[,] p)
> {
> string[,][] v1 = p; // Warning
> string[,][] v2 = p!; // No warning
> }
> }
> ```
>
Expand Down
Loading