Skip to content

Zend: store the zend_type ptr discriminator in a dedicated kind field#22312

Open
azjezz wants to merge 5 commits into
php:masterfrom
carthage-software:type-kinda
Open

Zend: store the zend_type ptr discriminator in a dedicated kind field#22312
azjezz wants to merge 5 commits into
php:masterfrom
carthage-software:type-kinda

Conversation

@azjezz

@azjezz azjezz commented Jun 14, 2026

Copy link
Copy Markdown

zend_type packs two unrelated things into a single uint32_t type_mask:

  1. A bitset of MAY_BE_* flags (bits 0-17). This is genuinely a set: int|string is two bits OR'd together, and union handling relies on that.
  2. A discriminator for what type.ptr points at: a zend_string* class name, a const char* literal name, or a zend_type_list*. These are mutually exclusive, but they were encoded as three one-hot bits (22-24) carved out of the same word.

Overloading one word for both meanings has two costs. It is misleading: the discriminator looks like it composes with the mask when it never does. And it is expensive in space that does not exist. The mask is nearly full. Bits 0-21 are MAY_BE_* plus UNION/INTERSECTION/ARENA/ITERABLE, 25-29 are arg-info extra flags, leaving almost nothing. Any future type-system work that needs an inline flag has nowhere to put it, and the discriminator bits were sitting in the middle of that budget.

What this does

Move the ptr discriminator out of type_mask into a dedicated uint32_t kind field on zend_type, and model it as what it actually is: a small enum where exactly one value applies at a time.

typedef struct {
    void *ptr;
    uint32_t type_mask;  /* MAY_BE_* bitset + metadata flags */
    uint32_t kind;       /* discriminates what ptr points at */
} zend_type;

#define _ZEND_TYPE_KIND_NONE         0u  /* ptr is NULL            */
#define _ZEND_TYPE_KIND_NAME         1u  /* zend_string*           */
#define _ZEND_TYPE_KIND_LITERAL_NAME 2u  /* const char*            */
#define _ZEND_TYPE_KIND_LIST         3u  /* zend_type_list*        */

The HAS_* / IS_COMPLEX predicates become plain equality compares instead of bit tests, and the mutation sites assign the field instead of read-modify-writing individual bits.

Cost

On 64-bit the new field lands in padding that already existed after type_mask, so sizeof(zend_type) is unchanged (16 bytes). On 32-bit the struct grows by 4 bytes. kind is copied as part of the struct everywhere a zend_type is copied, including opcache SHM persist and the file cache, so no serialization changes were needed.

azjezz added 2 commits June 14, 2026 16:43
zend_type packed the "what does ptr point at" discriminator
(_ZEND_TYPE_NAME_BIT / _ZEND_TYPE_LITERAL_NAME_BIT / _ZEND_TYPE_LIST_BIT)
into the high bits of type_mask, sharing the word with the MAY_BE_* mask
and the arg-info extra flags. Only two bits were left free, which is not
enough room for additional type kinds.

Move the discriminator into a new `uint32_t kind` field. On 64-bit this
occupies the existing 4 bytes of struct padding, so sizeof(zend_type)
stays 16 (resolves the long-standing TODO); on 32-bit zend_type grows
from 8 to 12 bytes. This frees type_mask bits 22-24 and gives the kind
field room for future kinds.

Pure refactor, no behaviour change. The bit constants keep their values
and are simply read from / written to `kind` now. ZEND_TYPE_IS_SET()
additionally checks `kind`, and the JIT's inlined typed-property
IS_SET check ORs the kind field in.

Signed-off-by: azjezz <azjezz@protonmail.com>
The kind field added in the previous commit was modelled as one-hot bit
flags (NAME, LITERAL_NAME, LIST), but exactly one is ever set at a time:
it discriminates what type.ptr points at, it is not a bitset like
type_mask. Treating it as a bitmask was misleading and forced bit ops
where a plain compare suffices.

Renumber the values sequentially (NONE=0, NAME=1, LITERAL_NAME=2,
LIST=3) and turn the predicates into equality comparisons:

  IS_COMPLEX        kind != NONE
  HAS_NAME          kind == NAME
  HAS_LITERAL_NAME  kind == LITERAL_NAME
  HAS_LIST          kind == LIST

Assignment replaces read-modify-write at the few mutation sites
(zend_convert_internal_arg_info_type no longer clears then ORs).

No behaviour change.

Signed-off-by: azjezz <azjezz@protonmail.com>
Comment thread Zend/zend_types.h Outdated
Address review feedback.

Signed-off-by: azjezz <azjezz@protonmail.com>
Comment thread Zend/zend_types.h Outdated
Comment thread Zend/zend_types.h Outdated
azjezz and others added 2 commits June 14, 2026 20:00
Co-authored-by: Tim Düsterhus <timwolla@googlemail.com>
Co-authored-by: Tim Düsterhus <timwolla@googlemail.com>

@TimWolla TimWolla left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing any more obvious issues, but can't comment on whether this change is desirable.

@azjezz

azjezz commented Jun 14, 2026

Copy link
Copy Markdown
Author

@TimWolla for why i personally wanted this: i was experimenting with a few type additions, literal types, shapes, and negated types. i was able to use the existing 2 spare bits for literals and shapes, but i couldn't implement negated types because there was no space left.

Gina also pointed out on Discord that post-9.0 we could take a few bits back as types get removed. removing resource frees its own bit, and ITERABLE ( which also has a dedicated bit ) is going away too, so that's 2. MIXED already costs nothing today ( it's just all the bits set, MAY_BE_ANY ), and NEVER ( the bottom type, conceptually just 0 ) does still take a dedicated bit, since 0 already means "no type", so that one we could only reclaim with a re-encoding.

so best case post-9.0, assuming we don't add any new types until then, we'd have maybe ~5 free bits give or take. but this is just delaying the problem for no reason, since the 8 bits are already going into padding.

@TimWolla

Copy link
Copy Markdown
Member

Yes, makes sense. I'm just gonna let the engine experts review this and only did a first pass for the clear “code style” problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants