So it seems it's not possible to make ob_item in lists and tuples use _PyHeapRef, or store an untagged int directly, without breaking a whole lot of extensions (see capi-workgroup/decisions#64).
A possible alternative that maintains compatibility is to use an internal representation of a list, perhaps called _PyInternalListObject. This would be exactly the same as PyListObject, except it uses a _PyHeapRef * for ob_item.
When passing the object to external C code, we convert it to a normal PyListObject by swapping the unboxed ints in its ob_item array for boxed ones, then pass it to the external code. This works because _PyInternalListObject has a memory layout compatible with PyListObject. Conversion happens in place and only once per object, so the cost is a single pass over the items.
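To make the in-place conversion concrete, here is a minimal toy sketch in plain C. It does not use CPython's real types: ToyList, ToySlot, ToyBoxedInt, the helper names, and the "low bit set means unboxed int" slot encoding are all assumptions made up for illustration, not the actual _PyHeapRef representation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for a boxed int object (hypothetical, not PyLongObject). */
typedef struct {
    long refcnt;
    long value;
} ToyBoxedInt;

/* Hypothetical slot encoding: low bit 1 = unboxed int, low bit 0 = pointer.
 * A slot is pointer-sized, so the array has the same layout either way. */
typedef uintptr_t ToySlot;

static int slot_is_unboxed(ToySlot s) { return (s & 1) != 0; }

/* Small non-negative ints only, to stay clear of implementation-defined
 * behaviour when shifting signed values. */
static ToySlot slot_from_int(long v) { return ((uintptr_t)v << 1) | 1; }
static long slot_to_int(ToySlot s) { return (long)(s >> 1); }

/* Toy list: ob_item sits where a pointer array would sit in a normal list. */
typedef struct {
    long refcnt;
    size_t size;
    ToySlot *ob_item;
} ToyList;

/* Box every unboxed slot in place. Slots that already hold pointers are
 * left untouched, so calling this twice is harmless.
 * Returns 0 on success, -1 if an allocation fails. */
static int toy_list_make_external(ToyList *list)
{
    for (size_t i = 0; i < list->size; i++) {
        ToySlot s = list->ob_item[i];
        if (!slot_is_unboxed(s)) {
            continue;
        }
        ToyBoxedInt *boxed = malloc(sizeof *boxed);
        if (boxed == NULL) {
            return -1;
        }
        boxed->refcnt = 1;
        boxed->value = slot_to_int(s);
        /* malloc results are suitably aligned, so the low bit is 0. */
        list->ob_item[i] = (ToySlot)(uintptr_t)boxed;
    }
    return 0;
}
```

After toy_list_make_external returns 0, every slot holds an ordinary pointer, so code that only understands the pointer-array layout could read the same memory without further changes.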
We can minimize the number of conversions required by relying on the specializing interpreter to tell us when we're calling out to unknown C functions.
To keep stackref conversion quick, we should reserve one tag value meaning "internal container". It seems 10 is still unused right now, and it's a good fit: x1 represents "skip the refcount", which we don't want (we do want to refcount internal containers!). So the 10 tag can mean "internal container" (lists, tuples, or dicts). After conversion, the tag just becomes 00.
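The tag scheme could look roughly like the sketch below. This is a toy model only: ToyStackRef, the TOY_TAG_* constants, and the helper functions are hypothetical names, and the two-bit layout simply follows the description above rather than CPython's actual stackref definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-bit tag in the low bits of a pointer-sized reference. */
#define TOY_TAG_MASK      ((uintptr_t)0x3)
#define TOY_TAG_PLAIN     ((uintptr_t)0x0)  /* 00: ordinary refcounted reference */
#define TOY_TAG_NO_REFCNT ((uintptr_t)0x1)  /* x1: skip the refcount */
#define TOY_TAG_INTERNAL  ((uintptr_t)0x2)  /* 10: internal container (proposed) */

typedef struct {
    uintptr_t bits;
} ToyStackRef;

/* True only for tag 10; its low bit is clear, so refcounting still applies. */
static int toy_ref_is_internal(ToyStackRef r)
{
    return (r.bits & TOY_TAG_MASK) == TOY_TAG_INTERNAL;
}

/* True for any tag of the form x1. */
static int toy_ref_skips_refcount(ToyStackRef r)
{
    return (r.bits & TOY_TAG_NO_REFCNT) != 0;
}

/* Strip the tag bits to recover the object pointer. */
static void *toy_ref_pointer(ToyStackRef r)
{
    return (void *)(r.bits & ~TOY_TAG_MASK);
}

/* After the container has been converted, retag the reference as 00.
 * References with any other tag are returned unchanged. */
static ToyStackRef toy_ref_mark_external(ToyStackRef r)
{
    if (toy_ref_is_internal(r)) {
        r.bits &= ~TOY_TAG_MASK;
    }
    return r;
}
```

The check for an internal container is a single mask and compare, which is the property that keeps the conversion test cheap on the stackref path.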