Feature or enhancement
Hi,
Creating a tuple with the current C API has multiple issues:
PyTuple_SetItem() and PyTuple_SET_ITEM() modify an immutable tuple.
PyTuple_New() creates an incomplete object: items are set to NULL. This is bad:
PyTuple_New() immediately tracks the tuple in the garbage collector, so gc.get_objects() can expose the incomplete tuple. Using such a tuple, for example by calling repr(tuple), can crash Python.
_PyTuple_Resize() is private, which is surprising for a documented API. It stays private because it has issues:
_PyTuple_Resize() modifies an immutable tuple.
_PyTuple_Resize() must not be used if the refcount is greater than 1: the API is fragile.
I propose adding a new efficient PyTupleWriter API: work on a temporary "writer" object, and then call Finish() on it to get the tuple.
I already proposed a similar API in 2023: _PyTupleBuilder. Since then, the Python C API gained the PyUnicodeWriter API and the PyBytesWriter API, which are efficient writers for str and bytes objects, and the C API Working Group was created. The proposed API is now public and allocates its structure on the heap to hide the implementation details.
Mark Shannon asked if it would be possible to work on a list and then convert the list to a tuple, but it's less efficient. Tuples are commonly used in Python, and so creating a tuple should be efficient.
Mark Shannon also tried to initialize tuple items to None instead of NULL in PyTuple_New(). His attempt failed because of implementation issues. Also, this change only fixes some of the issues that I listed, not all of them.
An alternative is to fill a C array of Python objects, call the new PyTuple_FromArray(), and then call Py_DECREF() on the array items. It requires allocating and deallocating an array and calling Py_DECREF() on the items, so it can be less efficient.
Linked PRs