Modules and imports test panel #94
Audit the 12 flat files plus test_import/, test_importlib/, test_module/ against CPython 3.14.5 under the 1726 zero-skip bridge, and lay out a phased plan: os.altsep + module dir() surface first, then the pure-Python stdlib modules (modulefinder, pyclbr, zipapp), frozen modules, the runpy residual, the directory suites, and finally the PEP 554 interpreters.
The os module published sep/extsep/pathsep but not altsep, so any code doing os.altsep raised AttributeError. test_pkgutil, test_zipimport and test_zipimport_support all reach for it through ntpath/posixpath. Add it to the module constants, matching CPython (None on POSIX, '/' on nt).
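A minimal check of the constant described above. The expected values are the CPython ones the commit cites; `split_on_any_sep` is a hypothetical helper added only to show why callers reach for `os.altsep`.

```python
import os

# Per the commit text: None on POSIX, '/' on nt.
expected_altsep = "/" if os.name == "nt" else None


def split_on_any_sep(path):
    """Hypothetical helper: split on os.sep and, when defined, os.altsep."""
    if os.altsep:
        path = path.replace(os.altsep, os.sep)
    return path.split(os.sep)


parts = split_on_any_sep(os.sep.join(["pkg", "mod.py"]))
```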
All three are pure-Python Lib modules the import panel reaches for, and all three import cleanly under gopy. test_modulefinder, test_pyclbr and test_zipapp now get past the ModuleNotFoundError and surface the real gaps (importlib.machinery.PathFinder, test_importlib package, io __class__) tracked as follow-ups.
A few gaps that surfaced once test_zipapp could import:
- Python subclasses of the io base types are *Instance objects, so their own methods and instance dict have to win over the synthesized native methods. The custom getattr/setattr now route those instances through the generic path, matching PyObject_GenericGetAttr's MRO walk. This is what let _ZipWriteFile override close() and carry _zinfo.
- BytesIO and StringIO keep object's identity hash (they define no __eq__), so they're hashable again.
- SystemExit grows its code member, derived from the constructor args like SystemExit_init and overridable by assignment.
- os.chmod accepts str, bytes, or any os.PathLike via __fspath__.

test_zipapp now matches CPython 3.14 (35 passing).
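The last three items can be exercised from plain Python. This is a sketch of the CPython behaviour being matched, not the gopy implementation; the file name `archive.pyz` is made up for the example.

```python
import io
import os
import pathlib
import tempfile

# SystemExit.code is derived from the constructor args and can be reassigned.
no_args = SystemExit()
one_arg = SystemExit(3)
many_args = SystemExit(1, "two")
one_arg_code_before = one_arg.code
one_arg.code = 7

# BytesIO/StringIO define no __eq__, so they keep identity hashing.
buf = io.BytesIO()
seen = {buf, io.StringIO()}

# os.chmod accepts str, bytes, or an os.PathLike such as pathlib.Path.
with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "archive.pyz"  # hypothetical file name
    target.write_bytes(b"")
    os.chmod(target, 0o644)
    os.chmod(str(target), 0o644)
    os.chmod(os.fsencode(target), 0o644)
    chmod_ok = True
```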
test.test_importlib.util guards itself with import_module("_testmultiphase")
at import time, so every test that pulls in that helper (test_pkgutil,
test_pyclbr, the test_importlib extension suites) was raising SkipTest
under gopy where CPython runs them. Reproduce the PEP 489 extension's main
module Go-side: foo, call_state_registration_func, the Example/error/Str
types, and the int_const/str_const constants the C execfunc installs.
Vendor test_importlib into stdlib/test so test.test_importlib.util resolves
as a support module, and re-export importlib.__import__ to match CPython's
public surface (util.py reaches for source_importlib.__import__).
test_pkgutil and test_pyclbr now run their suites instead of skipping; the
remaining failures need the importlib PathFinder/importer surface, which is
the next batch.
pyclbr.readmodule_ex calls importlib.util._find_spec_from_path to locate a module's source without importing it. Port it on top of the existing directory-scan helper (factored out of find_spec): check sys.modules first, returning the cached __spec__ or raising when it is missing/None, otherwise scan the supplied path. test_pyclbr gets past the import and now only trips on modules that lack __spec__, which is the next gap.
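The lookup order described above, written out as a rough Python sketch. `find_spec_from_path_sketch` is a hypothetical name, and it leans on the public `PathFinder.find_spec` for the path scan rather than gopy's directory-scan helper.

```python
import importlib.machinery
import sys
import types


def find_spec_from_path_sketch(name, path=None):
    """Locate a module's spec without importing it (illustrative only)."""
    if name in sys.modules:
        module = sys.modules[name]
        if module is None:
            return None
        spec = getattr(module, "__spec__", None)
        if spec is None:
            raise ValueError(f"{name}.__spec__ is missing or None")
        return spec
    # Not cached: scan the supplied path (sys.path when path is None).
    return importlib.machinery.PathFinder.find_spec(name, path)


# A cached module with __spec__ = None is the case that raises.
sys.modules["_panel_no_spec"] = types.ModuleType("_panel_no_spec")
try:
    find_spec_from_path_sketch("_panel_no_spec")
    raised_for_missing_spec = False
except ValueError:
    raised_for_missing_spec = True
finally:
    del sys.modules["_panel_no_spec"]
```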
gopy's import runs Go-side, so modules loaded through PathFinder, the inittab, and the script-as-main entry never picked up the ModuleSpec surface CPython's _init_module_attrs fills in. Tools that introspect a module by name (pyclbr._readmodule, runpy, inspect) read __spec__ and broke on the missing/None attribute. Build the spec by calling importlib.util.spec_from_file_location (file modules) or spec_from_loader (built-ins) once the body has run, mirroring what FileFinder/BuiltinImporter produce. Modules imported before importlib.util itself is importable are queued and flushed the moment it becomes available. The vendored test-as-main module gets a file-location spec so test.<name> matches what regrtest's import produces; plain __main__ keeps __spec__ = None like python script.py. Also force submodule imports for __all__ entries in 'from pkg import *', and vendor the sre_parse/sre_constants/sre_compile deprecation shims plus the pyclbr_input fixture.
The machinery.ModuleSpec and util._ModuleSpec stubs diverged from CPython: no parent property, no has_location/cached descriptors, no __repr__/__eq__. runpy._run_code reads mod_spec.parent and tools compare specs, so the stubs broke. Replace both with a faithful port of importlib._bootstrap.ModuleSpec (gopy keeps it in machinery since the bootstrap is Go-side) and have importlib.util import it. spec_from_file_location now follows _bootstrap_external: abspath the location, set _set_fileattr, derive submodule_search_locations from the loader. Add _get_cached to _bootstrap_external for the cached property. The abspath call tolerates posixpath still being mid-import during the bootstrap spec flush.
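For reference, the surface the stubs lacked looks like this on CPython. The module names `pkg` and `pkg.mod` are placeholders, and no loader is attached.

```python
from importlib.machinery import ModuleSpec

# A plain submodule: parent is everything before the last dot.
submodule = ModuleSpec("pkg.mod", loader=None)

# A package: is_package=True seeds submodule_search_locations,
# and parent is the package's own name.
package = ModuleSpec("pkg", loader=None, is_package=True)

# Specs compare by value, which is what tools comparing specs rely on.
twin = ModuleSpec("pkg.mod", loader=None)
```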
Drive test_runpy to parity with CPython 3.14:
- sys.dont_write_bytecode/path_hooks/path_importer_cache top-level attrs
- ModuleNotFoundError carries name= through the import miss path so runpy._get_module_details can keep searching dotted names
- subprocess cwd accepts path-like (PyUnicode_FSConverter parity)
- propagate GOPY_STDLIB to child interpreters so subprocess.run with a changed cwd still bootstraps encodings
- exit via SIG_DFL SIGINT on unhandled KeyboardInterrupt (bpo-1054041)
- PEP 420 namespace packages on the Go and Python find paths
- _testinternalcapi.get_recursion_depth
Split exitSigint into unix/windows build-tagged files so the windows runner stops failing on syscall.Kill. Drop the nilerr lint hit in the ImportError member getter by treating a dict miss as None rather than an error. Accept os.PathLike argv members in _posixsubprocess.fork_exec the way fsconvert_strdup does, so subprocess calls passing pathlib.Path args no longer raise TypeError.
…tartup gopy reported sys.flags.no_site as 0, claiming the site module had run, while exit/quit/help/copyright/credits/license were missing. Vendor site.py and _sitebuiltins.py unchanged from CPython 3.14 and import site during bootstrap (after encodings) the way init_import_site does, so site.main() runs setquit/setcopyright/sethelper and the builtins land. Also fall back to GenericGetAttr in _io.File getattr so dunders like __class__ resolve through the MRO; abc.__instancecheck__ probes sys.stdout and was raising AttributeError on the missing __class__.
…d a stale slot A value-replacement on an at-capacity dict triggers dictResize before the replace-vs-insert branch in dictInsert. The resize rebuilds the table and renumbers every slot, but the replace path returns without touching the keys version, so LOAD_ATTR_INSTANCE_VALUE kept reading the cached (now wrong) slot index. CPython hands a resized dict a fresh keys object whose dk_version is 0, which drops every stamped inline cache; mirror that by resetting the version inside dictResize. Also route sys.excepthook through the live sys.stderr and format the full traceback via errors.FormatException, the way _PyErr_Display does, so a test that mocks sys.stderr captures the output.
ModuleType.__repr__ now forwards to importlib._bootstrap._module_repr the same way CPython's module_repr goes through _PyImport_ImportlibModuleRepr, so the __spec__/__loader__/__file__ variants (namespace packages, the '?' name fallback, bare/full loader reprs) all render identically. Wired the _bootstrap_external module global that _install_external_importers would normally set, vendored NamespaceLoader/_NamespacePath, and re-exported NamespaceLoader from importlib.machinery. Modules are now GC-tracked with a tp_traverse over md_dict. A module whose __dict__ holds functions closing over that same dict is a reference cycle; without the traverse edge the collector treated md_dict as rooted and never ran __del__ on cyclic objects defined in the module body. test_module: 39 tests, all green.
zipimport plugs into sys.path_hooks and leans on _bootstrap_external for the loader machinery. CPython freezes _bootstrap/_bootstrap_external and runs _setup()/_install_external_importers() at startup to inject sys, _imp and cross-link the two modules; gopy imports them like ordinary modules and never runs that startup, so the bindings have to happen at import time. Bind sys/_imp into _bootstrap, point _bootstrap_external back at _bootstrap, and have _bootstrap_external register itself as _bootstrap._bootstrap_external at the end of its module body. Also fill in the _bootstrap_external pieces zipimport reaches for: _path_stat, _LoaderBasics, _compile_bytecode, SourcelessFileLoader, spec_from_file_location, _get_supported_file_loaders and _fix_up_module.
… path hooks Two fixes that unblock importing modules out of a zip archive on sys.path. zlib.Compress.flush() defaulted to Z_SYNC_FLUSH, but CPython defaults the mode to Z_FINISH. The common compressobj().compress(x) + flush() idiom has to emit a complete deflate stream (final block) or a one-shot decompressor reads back a truncated stream and raises 'unexpected EOF'. zipfile stores compressed members through that idiom and zipimport inflates them with raw deflate, so every compressed-zip import was failing. Add sys.meta_path so import_helper's save/restore around each test stops raising AttributeError, register zipimport.zipimporter on sys.path_hooks ahead of the FileFinder hook, and have the Go path finder consult sys.path_hooks for non-directory sys.path entries: it builds the importer, asks it for the spec, and loads the module via module_from_spec + exec_module, mirroring _bootstrap._load_unlocked.
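The flush-mode difference is easy to see with raw deflate streams. This sketch uses only the standard zlib API; the payload is arbitrary.

```python
import zlib

payload = b"import me from a zip archive\n" * 50

# Raw deflate (wbits=-15) is the framing zip members use.
finished = zlib.compressobj(wbits=-15)
finished_stream = finished.compress(payload) + finished.flush()  # default: Z_FINISH

synced = zlib.compressobj(wbits=-15)
synced_stream = synced.compress(payload) + synced.flush(zlib.Z_SYNC_FLUSH)


def inflate(data):
    """Return (bytes, reached_end_of_stream) for a raw deflate stream."""
    d = zlib.decompressobj(wbits=-15)
    return d.decompress(data), d.eof


finished_result = inflate(finished_stream)
synced_result = inflate(synced_stream)
```

Only the Z_FINISH stream carries the final block, so only it reports end-of-stream to the decompressor.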
PEP 420 namespace packages were dropping portions found in zip path-hook importers, so a package split across two archives ended up with a __path__ of length 2 instead of the merged single entry. PathFinder now accumulates namespace portions from path-hook specs the same way it does for plain directories. A real-filesystem namespace portion that only holds .pyc files (no .py) was also invisible to the directory scan, so submodules under it raised ModuleNotFoundError. Added __init__.pyc and <tail>.pyc handling that loads the marshalled code directly. importlib.util.module_from_spec was a divergent stub that only set __file__ when origin was not None; namespace specs left it unset and mod.__file__ raised AttributeError. Re-export _bootstrap.module_from_spec so namespace modules get __file__ = None like CPython. zlib.crc32/adler32 now accept bytearray. Drops two test_zipimport scratch artifacts that were committed by mistake.
…Error.msg gopy compiles every extension module into the binary, so they behave exactly like statically-linked builtins: they are found before the path finder and cannot be shadowed by a module of the same name on sys.path. sys.builtin_module_names only listed builtins and sys, which left the importlib builtin/extension finder tests with no usable module name and made test_zipimport.testAFakeZlib run (and fail) instead of skipping the way it does on a statically-linked CPython build. Build the tuple from the inittab snapshot, minus the few pure-Python modules gopy keeps there as an import shortcut so 'os' in sys.builtin_module_names stays False. ImportError now exposes the msg member CPython sets from the single positional argument, so exc.msg (read by zipimport's bad-magic test and others) works.
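A short illustration of the two behaviours being matched, as they appear on CPython. The message strings and the `spam` name are made up.

```python
import sys

# msg is set only when exactly one positional argument is given.
single = ImportError("bad magic number")
multi = ImportError("bad", "magic")
named = ImportError("no such module", name="spam", path="/tmp/spam.zip")

# Pure-Python modules such as os must stay out of the builtin list.
os_is_builtin = "os" in sys.builtin_module_names
sys_is_builtin = "sys" in sys.builtin_module_names
```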
func_getattro pulled the attribute straight out of the function __dict__ and returned it without an incref, so the caller's arg-drop could decref a value the dict still held. A list stored on a function (mock wraps its patchings list on the decorated function this way) got emptied by list_dealloc after the first read, so a second read saw an empty list and the shared decorator silently stopped patching across test classes. This matches the Py_XINCREF in Objects/funcobject.c func_getattro.
test_zipimport is fully green now (91 tests, 4 skipped to match CPython). Two things were behind the last failures:
That incref fix is broad, not zipimport-specific, so worth a look. Where the rest of the panel stands:
gopy shipped a trimmed importlib (stub machinery.py, a util.py that imported source_hash from _bootstrap_external instead of defining it, a _bootstrap.py that injected sys/_imp at module top). sys.meta_path was empty and the Python finders were dead code, so anything that introspected the import system or walked meta_path failed. Vendor the unmodified CPython 3.14 files (__init__, _bootstrap, _bootstrap_external, util, machinery, _abc, abc) plus the metadata, resources, readers and simple submodules, then run the two-phase install at startup the way pylifecycle does: __init__ self-bootstraps through its except-ImportError branch (we have no frozen _frozen_importlib), then we call _bootstrap._install / _bootstrap_external._install directly so meta_path ends up as [BuiltinImporter, FrozenImporter, PathFinder] and the FileFinder / zipimport path hooks are registered. Port the _imp C-function surface the full bootstrap drives (find_frozen, get_frozen_object, is_frozen_package, create_builtin, exec_builtin, extension_suffixes, _fix_co_filename) and add sys.pycache_prefix so cache_from_source works. test_runpy goes green (40), test_pkg green, test_pkgutil down to a couple of residuals.
…mpare equal ImportModuleLevel now walks sys.meta_path for finders a program installs, skipping the BuiltinImporter/FrozenImporter/PathFinder entries gopy realizes in Go and driving any spec a custom finder returns through loadFromSpec. This lets test-installed importers (pkgutil's MyTestImporter) satisfy 'import foo'. classmethod_get now stamps methOrigin so two bindings of int.from_bytes compare equal and hash alike, matching meth_richcompare's m_ml pointer test.
Port _PyModule_IsPossiblyShadowing to read the startup-captured leading sys.path entry (config->sys_path_0) instead of live sys.path[0], so a script that mutates sys.path after startup keeps consistent shadowing detection. The leading entry is now prepended to sys.path after site.main runs, matching CPython, so a -c run keeps sys.path[0] == '' rather than letting site.removeduppaths absolutize it. Set spec._initializing around module exec so a self-importing module hits the circular-import and 'consider renaming' hints. Pass the live __name__ object through to PySet_Contains so an unhashable str subclass raises, and guard stdlib_module_names with PyAnySet_Check. Module getattro now formats with %U-style literal quotes; os.__getattr__ miss uses single quotes.
A backward jump computed off an instrumented bytecode position read the live byte (INSTRUMENTED_LINE or an INSTRUMENTED_<X> variant) and looked its cache width up in the per-opcode table, which is keyed by base opcode only. The marker returned a zero cache count, so the jump landed one codeunit short of its target. Under sys.settrace this dropped the loop header by one instruction in inlined comprehensions, leaving the freshly built list on top of the stack instead of the iterator and raising 'list object is not an iterator' on the next FOR_ITER. advance() now resolves the marker (via the line original-opcode table) and de-instruments before reading the cache stride.
CPython binds the builtins module object (not its dict) to __builtins__ in the __main__ namespace; every other module gets the dict. The frame builder already unwraps a module back to its dict for LOAD_GLOBAL, so the only behavioural change is that 'del __builtins__.__import__' now reaches a module attribute and the import machinery raises ImportError afterwards, matching test_import.test_delete_builtins_import.
The integer-fd and path-open FileIO constructors wrap an os.File whose descriptor gopy already owns through FileIO.Close and the closefd flag. Go's runtime also arms its own finalizer on those os.File values, and a GC mid-run could fire it after the descriptor number had been freed and reused by an unrelated open file, closing that file's fd out from under it. Long write loops then failed with a spurious EBADF. Clear the Go finalizer at every borrowed-fd wrap site, and also on os.isatty's throwaway wrapper, so release stays deterministic.
…ystems On macOS the filesystem is case-insensitive but case-preserving, so a plain os.Stat probe lets `import RAnDoM` resolve random.py. CPython's FileFinder guards against this by testing the candidate name against the exact-case set(os.listdir(dir)) unless _relax_case() allows folding. Port that check: confirm each resolved candidate's final component matches a real directory entry with exact case, relaxed only on case-insensitive platforms when PYTHONCASEOK is set.
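The exact-case test can be sketched as follows. `exact_case_match` and its `relax_case` flag are hypothetical stand-ins for the FileFinder check and `_relax_case()`; the real port lives Go-side.

```python
import os
import tempfile


def exact_case_match(directory, filename, relax_case=False):
    """True when filename is a real entry of directory, compared by exact case
    unless relax_case is set (the PYTHONCASEOK situation)."""
    entries = set(os.listdir(directory))
    if relax_case:
        return filename.lower() in {entry.lower() for entry in entries}
    return filename in entries


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "random.py"), "w").close()
    exact_hit = exact_case_match(tmp, "random.py")
    wrong_case = exact_case_match(tmp, "RAnDoM.py")
    relaxed = exact_case_match(tmp, "RAnDoM.py", relax_case=True)
```

Comparing against the directory listing rather than calling stat is what makes the result independent of the filesystem's own case folding.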
…on Windows importlib._bootstrap_external does 'import winreg' at module top level when sys.platform == 'win32', so on Windows the bootstrap failed with ModuleNotFoundError: No module named 'winreg'. Register a constant-only winreg module on every platform, the same way _winapi is wired: the HKEY_* predefined handles, the KEY_*/REG_* access and value-type constants, and the error alias of OSError. Those are all that WindowsRegistryFinder reads at find_spec time, and that finder is deprecated and not on the default meta_path, so the unported registry functions are never reached. Also drop the temporary import-trace prints from pathfinder and eval_import that pinned this down.
importlib._bootstrap_external reimplements os.path.join and os.path.isabs on Windows with nt._path_splitroot, so the path-based finder cannot start without it. Port the posixmodule.c accelerator following ntpath.splitroot and register it only on Windows, matching the #ifdef MS_WINDOWS gate. Also consume an EXTENDED_ARG prefix that surfaces from the dispatch loop when an instrumented line resolves back to a prefixed instruction, so traced code with wide args no longer trips the not-implemented path.
Keeps golangci-lint happy: the EXTENDED_ARG handling moved into applyInstrumentation / resolveExtendedArgPrefix so dispatch stays under the cyclomatic-complexity limit, and the ntpath.splitroot port uses package-level helpers instead of nested closures.
site._get_path reads sys.winver under os.name == 'nt' to assemble the per-user site-packages directory, and the preload of site runs before any user code, so the attribute has to exist on Windows. Set it to the major.minor DLL id string the way sysmodule does.
…x elsewhere) Lib/os.py picks ntpath vs posixpath by testing which name is in sys.builtin_module_names. gopy registered 'posix' unconditionally and also 'nt' on Windows, so os.path resolved to posixpath on Windows and mangled drive-absolute paths (site.removeduppaths turned the stdlib entry into cwd + '/' + path, dropping it from sys.path). Match CPython, which compiles posixmodule.c under a single MODNAME per platform.
On Windows, Go's syscall.Errno carries the raw WinAPI error code (e.g. ERROR_ALREADY_EXISTS 183), not a POSIX errno. The OSError synthesizer keyed exc.errno on that raw value, so errnomap never promoted to the right subclass and `except FileExistsError` slipped past, e.g. in py_compile.makedirs over an existing __pycache__. Port PC/errmap.h winerror_to_errno and route buildOSErrorFromGo and promoteOSErrorByErrno through it. No-op on non-Windows.
Go's syscall package fabricates its E* constants on Windows as 1<<29+iota, which neither matches the Universal CRT errno values CPython exposes (EEXIST is 17) nor the small codes the winerror translation produces. That left errnomap keyed on the fabricated values, so a translated EEXIST (17) found no entry and OSError never promoted to FileExistsError, e.g. py_compile's makedirs over an existing __pycache__. Key errnomap and the errno module table on the ucrt values directly and keep the POSIX syscall values on every other platform.
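The two-step promotion (WinAPI code to errno, then errno to OSError subclass) can be modelled in Python. The table below is a small illustrative subset, not the full PC/errmap.h mapping, and the EINVAL fallback is an assumption of this sketch.

```python
import errno

# Illustrative subset only; the names in comments are the WinAPI constants.
WINERROR_TO_ERRNO_SKETCH = {
    2: errno.ENOENT,    # ERROR_FILE_NOT_FOUND
    3: errno.ENOENT,    # ERROR_PATH_NOT_FOUND
    5: errno.EACCES,    # ERROR_ACCESS_DENIED
    80: errno.EEXIST,   # ERROR_FILE_EXISTS
    183: errno.EEXIST,  # ERROR_ALREADY_EXISTS
}


def os_error_from_winerror(code, message):
    """Translate a raw WinAPI code first, then let OSError pick the subclass."""
    translated = WINERROR_TO_ERRNO_SKETCH.get(code, errno.EINVAL)  # assumed fallback
    return OSError(translated, message)


already_exists = os_error_from_winerror(183, "__pycache__ exists")
not_found = os_error_from_winerror(2, "missing file")
unmapped = os_error_from_winerror(99999, "unknown")
```

Keying the exception on the raw 183 instead of the translated value is exactly what let `except FileExistsError` slip past.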
The test fed int(syscall.E*) inputs, which on Windows are the fabricated 1<<29+iota codes that no longer key errnomap. Move the test internal so it uses the same errEEXIST-style codes errnomap is built from, matching the values that reach ErrnoSubclass at runtime on every platform.
Windows is green now. Two distinct things were biting us there. First, Second, errno. Go's
…verride test_frozen imports the toy frozen modules CPython compiles into the interpreter (__hello__, __phello__ and the alias entries). Wire them up:
- imp.FrozenModule grows Source/OrigName so an entry can carry its .py text and compile lazily through FrozenCompiler, plus an alias origin for find_frozen.
- _imp honors the frozen-modules override that import_helper toggles via _override_frozen_modules_for_tests, so the frozen/disk split the tests exercise actually flips.
- sys._stdlib_dir is exposed so FrozenImporter._resolve_filename can find the on-disk copy of a frozen module, letting an unfrozen submodule load when its parent was imported frozen.
- vendor Lib/__hello__.py and Lib/__phello__/ for the disk path.

test_frozen now passes 3/3, matching CPython.
test_frozen is green now (3/3, matching CPython). This one needed the toy frozen modules CPython bakes into the interpreter ( What I wired up:
Where the rest of the panel stands after this:
BufferedWriter.write only matched *Bytes and *ByteArray, so writing a memoryview (what pathlib.Path.write_bytes hands it) raised TypeError. CPython runs the data through PyObject_GetBuffer, accepting any contiguous buffer. Route through objects.AsBytesLike to match.
A class with a class-level attribute plus an __init__ that writes through vars(self).update(kwargs) was reading the class default instead of the per-instance value on the second and later instances built in a loop. objectGetDict materialized and exposed the instance dict but left inlineValid set, so LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES kept serving the cached class attribute even after a direct dict store shadowed it. Once Python code holds the dict it can store without routing through instanceSetAttr, so the cached-keys tracking can't stay in sync. Flip inlineValid off on exposure, matching make_dict_from_instance_attributes clearing values->valid, so the WITH_VALUES arms deopt and re-read the dict. Also route defaultdict.__getitem__ through a type-level __missing__ lookup so a subclass that overrides __missing__ (FreezableDefaultDict) is honored instead of always inserting via the default factory.
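Both patterns reduce to a few lines of Python. `Option` and `FreezableSketch` are hypothetical classes written for this example; the second is modelled loosely on the FreezableDefaultDict the commit mentions.

```python
from collections import defaultdict


class Option:
    flag = "class-default"  # class-level attribute

    def __init__(self, **kwargs):
        # Writes straight into the instance dict, bypassing setattr.
        vars(self).update(kwargs)


# Each instance built in the loop must see its own value, not the class default.
flags = [Option(flag=f"instance-{i}").flag for i in range(3)]


class FreezableSketch(defaultdict):
    def __missing__(self, key):
        # Use the frozen handler once set, else defaultdict's inserting one.
        return getattr(self, "_frozen", super().__missing__)(key)

    def freeze(self):
        self._frozen = lambda key: self.default_factory()


table = FreezableSketch(list)
table["a"].append(1)
table.freeze()
missing_value = table["b"]  # served by the override, without inserting "b"
```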
…t generically A ModuleType subclass (importlib.util._LazyModule) reaches its namespace via object.__getattribute__(self, '__dict__'). type_new skips installing the __dict__ getset on the subclass because the dict slot is inherited, and ModuleType never carried one, so the generic getattr path found no descriptor and raised AttributeError. Install the getset on ModuleType itself and let objectGetDict hand back md_dict for a *Module.
rlock_acquire held the GIL while blocking on the gate mutex, so a thread waiting to acquire an RLock owned by another thread would pin the GIL and deadlock every other Python thread (intermittent hang across test_importlib). Wrap the blocking gate.Lock and the timed poll in AllowThreads, matching the non-reentrant lock path and CPython's Py_BEGIN_ALLOW_THREADS around the lock wait. compile() also now accepts an os.PathLike filename (PyUnicode_FSDecoder), so importlib source loaders that pass a pathlib.Path compile instead of raising TypeError.
…ilder import_helper.import_fresh_module blocks _frozen_importlib and re-imports importlib from scratch. Two things broke during that fresh load:
- The eager Go-side spec builder force-imported importlib.util while the importlib package was still mid-bootstrap, so spec_from_file_location ran against an unwired _bootstrap_external (_bootstrap=None) and crashed. Defer spec attachment until the package finishes initializing, then drain the queue post-exec, mirroring importlib's own do-not-import-until-bootstrapped rule.
- A blocked None entry surfaced as a bare, name-less ImportError, so importlib/abc.py's 'except ImportError as exc: if exc.name != ...' guard re-raised. Raise the proper ModuleNotFoundError carrying name= instead.
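The blocked-entry case can be reproduced directly. This sketch shows the CPython behaviour being matched; the module name is invented so the example cannot collide with a real module.

```python
import sys

blocked_name = "_panel_blocked_module_sketch"  # hypothetical module name

# A None entry in sys.modules is how import_helper blocks an import.
sys.modules[blocked_name] = None
try:
    __import__(blocked_name)
except ImportError as exc:
    caught = exc
finally:
    del sys.modules[blocked_name]
```

Callers such as importlib/abc.py inspect `caught.name`, which is why a name-less ImportError was not enough.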
…form gate object() attribute assignment now raises AttributeError (object inherits PyObject_GenericSetAttr / GenericGetAttr like PyBaseObject_Type) instead of a bare TypeError, so importlib's legacy-attribute fallbacks catch it. Built-in modules carry __loader__ = BuiltinImporter again. _setup wires the right spec during bootstrap; the deferred spec flush no longer clobbers it, and the builtin-spec builder falls back to importlib._bootstrap for the importer class when importlib.machinery is not imported yet. winreg is only registered on Windows, matching CPython, so import_module skips test_windows on other platforms.
The FileFinder caches directory listings keyed on the directory's st_mtime and only refills when the mtime changes. gopy's os.stat truncated every timestamp to whole seconds, so a directory entry created in the same second as the cache fill was never noticed and find_spec kept returning None for a freshly written submodule. Thread full nanosecond precision through newStatResult and the per-platform stat field extractors: the integer slot floors to seconds, the float slot carries the fraction, and the *_ns slot keeps the raw nanoseconds, matching _pystat_fromstructstat's fill_time. _pyio is vendored unmodified so test_importlib.test_util's atomic-write tests can swap io.FileIO for the pure-Python implementation.
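The three timestamp slots relate to each other as follows. This is a sketch against the standard os.stat result; the directory is a throwaway temp dir.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    st = os.stat(tmp)
    integer_slot = st[stat.ST_MTIME]  # tuple-style access: whole seconds
    float_slot = st.st_mtime          # seconds with the fractional part
    ns_slot = st.st_mtime_ns          # raw nanoseconds
```

A cache keyed on `float_slot` only notices a same-second change if the fraction survives, which is the precision the commit restores.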
The builtin __import__ was flattening fromlist into a []string and rejecting any non-str entry up front with its own message. CPython never type-checks fromlist there: builtin___import___impl hands the object straight to _handle_fromlist, which iterates it and raises the "Item in ``from list'' must be str" TypeError itself, and happily walks an arbitrary iterable. Pass the object through unchanged so the error text and the iterable handling match. This is what test_fromlist's test_invalid_type checks (both the bytes-in-a-list and the iter([bytes]) cases). Also routes the builtin's dotted-head / fromlist selection through the PyImport_ImportModuleLevelObject port instead of stringifying.
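The two cases from test_invalid_type, plus a valid iterator for contrast, look roughly like this. The `json` package is used here only because it is a package with a `decoder` submodule; `import_error_for` is a hypothetical helper.

```python
def import_error_for(fromlist):
    """Return the TypeError raised for this fromlist, or None on success."""
    try:
        __import__("json", fromlist=fromlist)
    except TypeError as exc:
        return exc
    return None


list_case = import_error_for([b"attr"])        # bytes inside a list
iter_case = import_error_for(iter([b"attr"]))  # bytes inside an arbitrary iterable
ok_case = import_error_for(iter(["decoder"]))  # str items in an iterator are fine
```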
… open TextIOWrapper's incremental decoders hard-coded strict error handling and ignored the errors argument, so open(path, errors='ignore').read() raised UnicodeDecodeError instead of applying the handler. Thread errors through every decoder and delegate the byte decode to the codecs package, which applies the named handler the same way bytes.decode does. Also reject opening a directory: open() succeeds on a directory on Unix, but a file object must never wrap one. fstat the descriptor and raise IsADirectoryError at construction time the way CPython does, rather than deferring the failure to the first read.
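Both fixes are observable from plain Python. The sketch below writes one invalid UTF-8 byte and reads it back under two handlers, then tries to open a directory. The exact exception subclass for the directory case is platform-dependent, so only OSError is caught.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mixed.txt")
    with open(path, "wb") as f:
        f.write(b"ok\xff\n")  # \xff is not valid UTF-8

    # The errors argument must reach the incremental decoder.
    with open(path, encoding="utf-8", errors="ignore") as f:
        ignored = f.read()
    with open(path, encoding="utf-8", errors="replace") as f:
        replaced = f.read()

    # Opening a directory must fail at construction, not at first read.
    try:
        open(tmp)
    except OSError as exc:
        directory_error = exc
```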
…enter time.sleep parked the interpreter with the GIL held, so a thread that slept starved every other Python thread. Wrap the sleep in AllowThreads, the gopy spelling of Py_BEGIN_ALLOW_THREADS that time_sleep uses, so siblings run while one thread waits. This is what unblocked test_importlib/test_locks, which spins up nine threads that each sleep between contended _ModuleLock acquires. start_new_thread also handed the child its ident only after enter() took the GIL. The parent still held the GIL and blocked waiting on that ident, so the child could never make progress. Publish the ident before enter() so the parent gets to its next allow-threads point. Vendor lock_tests and test_py_compile so test_locks and the source loader tests import their helpers.
CPython freezes importlib._bootstrap and importlib._bootstrap_external, so their code objects keep the synthetic <frozen importlib._bootstrap[_external]> co_filename for the whole process. gopy loads them from source and was caching a .pyc adjacent to them; on the next import readBytecodeCache ran fixCoFilename and rewrote co_filename to the real disk path. That left the import-machinery frames carrying a real path, so remove_importlib_frames could no longer strip them and the ImportTracebackTests saw extra _bootstrap_external frames. Skip the bytecode cache for these two files, matching CPython where they are never byte-compiled. The source compiler stamps the frozen name on every load.
Next slice of the spec 1700 vendored-test work: the Modules / imports panel. That's the 12 flat files plus the test_import/, test_importlib/ and test_module/ directory suites, driven to CPython 3.14.5 parity under the 1726 zero-skip bridge (we run what CPython runs and skip what it skips).
Baseline audit against CPython 3.14.5 (all of these are green on CPython):
Plan, smallest blast radius first:
Spec: website/docs/specs/1700/1731. Opening as a draft; will fill in as each phase lands and keep CI green.