Zend: store the zend_type ptr discriminator in a dedicated kind field#22312
Zend: store the zend_type ptr discriminator in a dedicated kind field#22312azjezz wants to merge 5 commits into
Conversation
zend_type packed the "what does ptr point at" discriminator (_ZEND_TYPE_NAME_BIT / _ZEND_TYPE_LITERAL_NAME_BIT / _ZEND_TYPE_LIST_BIT) into the high bits of type_mask, sharing the word with the MAY_BE_* mask and the arg-info extra flags. Only two bits were left free, which is not enough room for additional type kinds. Move the discriminator into a new `uint32_t kind` field. On 64-bit this occupies the existing 4 bytes of struct padding, so sizeof(zend_type) stays 16 (resolves the long-standing TODO); on 32-bit zend_type grows from 8 to 12 bytes. This frees type_mask bits 22-24 and gives the kind field room for future kinds. Pure refactor, no behaviour change. The bit constants keep their values and are simply read from / written to `kind` now. ZEND_TYPE_IS_SET() additionally checks `kind`, and the JIT's inlined typed-property IS_SET check ORs the kind field in. Signed-off-by: azjezz <azjezz@protonmail.com>
The kind field added in the previous commit was modelled as one-hot bit flags (NAME, LITERAL_NAME, LIST), but exactly one is ever set at a time: it discriminates what type.ptr points at, it is not a bitset like type_mask. Treating it as a bitmask was misleading and forced bit ops where a plain compare suffices. Renumber the values sequentially (NONE=0, NAME=1, LITERAL_NAME=2, LIST=3) and turn the predicates into equality comparisons: IS_COMPLEX kind != NONE HAS_NAME kind == NAME HAS_LITERAL_NAME kind == LITERAL_NAME HAS_LIST kind == LIST Assignment replaces read-modify-write at the few mutation sites (zend_convert_internal_arg_info_type no longer clears then ORs). No behaviour change. Signed-off-by: azjezz <azjezz@protonmail.com>
Address review feedback. Signed-off-by: azjezz <azjezz@protonmail.com>
Co-authored-by: Tim Düsterhus <timwolla@googlemail.com>
Co-authored-by: Tim Düsterhus <timwolla@googlemail.com>
TimWolla
left a comment
There was a problem hiding this comment.
I'm not seeing any more obvious issues, but can't comment on whether this change is desirable.
|
@TimWolla for why i personally wanted this: i was experimenting with a few type additions, literal types, shapes, and negated types. i was able to use the existing 2 spare bits for literals and shapes, but i couldn't implement negated types because there was no space left. Gina also pointed out on Discord that post-9.0 we could take a few bits back as types get removed. removing so best case post-9.0, assuming we don't add any new types until then, we'd have maybe ~5 free bits give or take. but this is just delaying the problem for no reason, since the 8 bits are already going into padding. |
|
Yes, makes sense. I'm just gonna let the engine experts review this and only did a first pass for the clear “code style” problems. |
zend_typepacks two unrelated things into a singleuint32_t type_mask:MAY_BE_*flags (bits 0-17). This is genuinely a set:int|stringis two bits OR'd together, and union handling relies on that.type.ptrpoints at: azend_string*class name, aconst char*literal name, or azend_type_list*. These are mutually exclusive, but they were encoded as three one-hot bits (22-24) carved out of the same word.Overloading one word for both meanings has two costs. It is misleading: the discriminator looks like it composes with the mask when it never does. And it is expensive in space that does not exist. The mask is nearly full. Bits 0-21 are MAY_BE_* plus UNION/INTERSECTION/ARENA/ITERABLE, 25-29 are arg-info extra flags, leaving almost nothing. Any future type-system work that needs an inline flag has nowhere to put it, and the discriminator bits were sitting in the middle of that budget.
What this does
Move the ptr discriminator out of
type_maskinto a dedicateduint32_t kindfield onzend_type, and model it as what it actually is: a small enum where exactly one value applies at a time.The HAS_* / IS_COMPLEX predicates become plain equality compares instead of bit tests, and the mutation sites assign the field instead of read-modify-writing individual bits.
Cost
On 64-bit the new field lands in padding that already existed after type_mask, so sizeof(zend_type) is unchanged (16 bytes). On 32-bit the struct grows by 4 bytes. kind is copied as part of the struct everywhere a zend_type is copied, including opcache SHM persist and the file cache, so no serialization changes were needed.