Replace autoconf configure.ac with a Python-based configure system #148112

@nascheme

Feature or enhancement

Proposal:

This will be submitted as a PEP. The draft of the PEP appears below.

PEP: XXX — Replace autoconf configure with Python-based configuration

Field           Value
Title           Replace autoconf configure with Python-based configuration
Author          Neil Schemenauer <nas@arctrix.com>
Status          Draft
Type            Standards Track
Created         04-Apr-2026
Python-Version  3.16
Post-History    Pending
Discussions-To  Pending

Abstract

CPython's current configure script is a ~1 MB shell file generated from an 8 000-line configure.ac written in M4. This PEP proposes to replace that autoconf layer with an equivalent system written in Python, together with an automatically generated POSIX sh + AWK version that runs without a Python interpreter. The new script is a drop-in replacement: it accepts the same options, honours the same environment variables, and produces identical output files (Makefile.pre, pyconfig.h …). These benefits are obtained without changing the rest of the build (Makefiles, Clinic output, etc.).

Motivation

Autoconf served CPython well for decades, but it now imposes four significant costs that are best addressed independently of any larger build-system overhaul.

  1. Maintainability: M4 macro rules, quoting, and ad-hoc shell make configure.ac difficult to read. Few core developers are fluent in M4; everyone is fluent in Python.
  2. Developer experience: M4 has no IDE, linter, type checker or language-server support, whereas Python works with ruff, type checkers, and language servers.
  3. Bootstrapping friction: Regenerating configure today requires a specific autoconf version inside a container. The proposed replacement can be regenerated with nothing but the minimum Python version the repository already requires.
  4. Incremental evolution: A Python code base can be refactored and unit-tested gradually. Contributors routinely improve Python code; almost nobody touches configure.ac.

Specification

The new configuration system consists of three cooperating parts.

  • configure.py – Python driver that imports a set of conf_*.py modules mirroring the logical sections of configure.ac. The modules use a pyconf runtime that re-implements the familiar AC_CHECK_* family (check_header, check_func, …).

  • configure – tiny shell wrapper that runs the transpiled AWK version by default. If the user sets the environment variable PYTHON_FOR_CONFIGURE to a command (for example python3 or /usr/bin/python3), the wrapper will execute that command with configure.py as the program to run. If PYTHON_FOR_CONFIGURE is unset, the AWK path is used. Making the AWK path the default ensures it is exercised in normal builds while allowing integrators to opt into a specific Python interpreter when desired.

  • Transpiler (Tools/configure/transpile.py) – converts the Python modules into POSIX AWK so that the bootstrap path requires only sh, awk and a C compiler.
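To make the idea of an AC_CHECK_HEADER-style helper concrete, here is a minimal sketch of what a check_header function could look like. It is not taken from the reference implementation: the signature, the conftest_source helper, the `cc` default and the injectable `run` parameter are all assumptions made for illustration.

```python
import os
import subprocess
import tempfile


def conftest_source(header):
    """Return a tiny C program that includes `header` (hypothetical helper)."""
    return f"#include <{header}>\nint main(void) {{ return 0; }}\n"


def check_header(header, cc="cc", run=subprocess.run):
    """Return True if a C program including `header` compiles.

    Illustrative only: the real pyconf API may differ. `run` is injectable
    so the probe can be exercised without a compiler installed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "conftest.c")
        obj = os.path.join(tmp, "conftest.o")
        with open(src, "w", encoding="utf-8") as f:
            f.write(conftest_source(header))
        try:
            result = run(
                [cc, "-c", src, "-o", obj],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # The compiler itself could not be started.
            return False
        return result.returncode == 0
```

A conf_*.py module could then call something like `check_header("sys/epoll.h")` and record the result as a define for pyconfig.h, much as AC_CHECK_HEADERS does today.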

The command-line interface, cache handling and output files remain unchanged. Supported platforms are exactly those in PEP 11 tiers 1–3, including cross-compiled WASI / Emscripten targets.

Rationale

Why not "rewrite the whole build" now?

Prior attempts to jump directly to CMake or Meson stalled because the scope is enormous. Replacing only the autoconf layer delivers concrete benefits in one release cycle and creates reusable Python modules that any future build system can import.

Why AWK as the default path instead of requiring Python or emitting pure shell?

AWK is present on every Unix-like host, offers associative arrays and decent performance, and avoids the circular dependency of "build Python, to run Python, to build Python". Making AWK the default ensures this strictly-portable path is continuously tested. Advanced users and integrators can opt into a specific Python interpreter by setting PYTHON_FOR_CONFIGURE (for example PYTHON_FOR_CONFIGURE=python3).
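The selection rule described above is small enough to model directly. The real wrapper is a shell script; the sketch below only mirrors its decision logic in Python. The function name, the way awk is invoked, the location of configure.awk, and the treatment of an empty variable as unset are assumptions, not details from the reference implementation.

```python
import shlex


def select_configure_command(environ, args):
    """Return the argv the wrapper would run (illustrative model only).

    environ: mapping of environment variables.
    args:    the options passed to ./configure.
    """
    python_cmd = environ.get("PYTHON_FOR_CONFIGURE", "")
    if python_cmd:
        # Opt-in path: run configure.py with the requested interpreter.
        # The script path follows the layout in Appendix A.
        return shlex.split(python_cmd) + ["Tools/configure/configure.py"] + list(args)
    # Default path: the transpiled AWK program (invocation shape is hypothetical).
    return ["awk", "-f", "configure.awk", "--"] + list(args)
```

With this model, `select_configure_command({}, ["--prefix=/usr"])` picks the AWK path, while passing `{"PYTHON_FOR_CONFIGURE": "python3"}` picks the Python path with the same options.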

Backwards Compatibility

The new script is intended as a strict drop-in replacement. Any project that drives CPython's build by invoking ./configure with standard flags will continue to work. The cache file's format may differ, so tools that parse the cache rather than delete it might need updates. The only addition is the optional environment variable PYTHON_FOR_CONFIGURE: when set (e.g. PYTHON_FOR_CONFIGURE=python3), the wrapper invokes that command with configure.py; when unset, the AWK default is used.

During the transition, the autoconf-generated script will be kept in the repository as configure-old for one release cycle to allow side-by-side testing.

Reference Implementation

A complete implementation lives at https://github.com/nascheme/cpython/tree/configure_py. It already generates identical outputs on Linux, macOS, the BSDs, and the cross-compile targets exercised in CI.

Security Implications

No new security-sensitive operations are introduced; the replacement compiles the same small C test programs as autoconf currently does.

AI Assistance Disclosure

Large-language-model assistants (Claude 4.6 and GPT-4) were used to help translate configure.ac into Python, write the transpiler, write unit tests, and draft this PEP. All code and prose must be reviewed by human contributors before acceptance.

Appendix A – Implementation Overview (informative)

Directory layout (high level):

Tools/configure/
    configure.py            – driver, arg parsing
    pyconf.py               – autoconf-like check helpers
    conf_*.py               – 20 modules, literal translation
    test_pyconf.py          – unit tests
    transpiler/             – Python → AWK pipeline
        transpile.py        – entry point
        pyconf.awk          – AWK runtime mirror

Transpiler pipeline:

Python AST → pysh AST → AWK AST → text → configure.awk
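As a rough illustration of the first and last stages of this pipeline, the toy below walks a Python AST and emits AWK text directly. It skips the intermediate pysh and AWK ASTs entirely and handles only simple assignments and `if x == constant:` blocks; every function name here is hypothetical and unrelated to the actual transpile.py.

```python
import ast


def awk_expr(node):
    """Translate a tiny subset of Python expressions to AWK source."""
    if isinstance(node, ast.Constant):
        if isinstance(node.value, str):
            escaped = node.value.replace("\\", "\\\\").replace('"', '\\"')
            return f'"{escaped}"'
        if isinstance(node.value, bool):  # check bool before int
            return "1" if node.value else "0"
        if isinstance(node.value, int):
            return str(node.value)
    if isinstance(node, ast.Name):
        return node.id
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Eq)):
        return f"{awk_expr(node.left)} == {awk_expr(node.comparators[0])}"
    raise NotImplementedError(ast.dump(node))


def awk_stmt(node, indent=1):
    """Translate one statement to a list of indented AWK lines."""
    pad = "    " * indent
    if (isinstance(node, ast.Assign) and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)):
        return [f"{pad}{node.targets[0].id} = {awk_expr(node.value)}"]
    if isinstance(node, ast.If) and not node.orelse:
        lines = [f"{pad}if ({awk_expr(node.test)}) {{"]
        for child in node.body:
            lines += awk_stmt(child, indent + 1)
        lines.append(f"{pad}}}")
        return lines
    raise NotImplementedError(ast.dump(node))


def transpile(source):
    """Wrap the translated statements in an AWK BEGIN block."""
    lines = ["BEGIN {"]
    for stmt in ast.parse(source).body:
        lines += awk_stmt(stmt)
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Anything outside the supported subset raises NotImplementedError, which is one plausible way for a transpiler of this kind to keep the Python modules within a translatable dialect.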

The AWK runtime is ≈2 400 lines and the Python runtime ≈3 700. See Appendix C for a quantitative breakdown of how much code in the conf_*.py modules targets each PEP 11 platform tier.

Appendix B – Platform Test Matrix (informative)

Automated scripts test that both Python and AWK paths generate identical outputs on:

  • Host Linux (native)
  • Ubuntu 24.04 (Docker)
  • FreeBSD 14 (QEMU)
  • OpenBSD 7.8 (QEMU)
  • NetBSD 10 (QEMU)
  • macOS 14 (native)
  • WASI (cross)
  • Emscripten 3.x (cross)

More tiers can be exercised by extending the provided Docker/VM files.
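The equivalence check itself can be sketched in a few lines. The helper below compares the files produced by the two paths byte for byte; the function name and the assumption that each path writes its outputs into a separate directory are illustrative and not taken from the project's test scripts.

```python
import filecmp
import os


def differing_outputs(python_dir, awk_dir, names=("Makefile.pre", "pyconfig.h")):
    """Return the output files that are missing or differ between two runs.

    python_dir / awk_dir: directories holding the files generated by the
    Python path and the AWK path respectively (hypothetical layout).
    """
    differing = []
    for name in names:
        a = os.path.join(python_dir, name)
        b = os.path.join(awk_dir, name)
        if not (os.path.isfile(a) and os.path.isfile(b)):
            differing.append(name)
        elif not filecmp.cmp(a, b, shallow=False):  # compare contents, not stat
            differing.append(name)
    return differing
```

An empty list means the two paths agree on every listed file; a CI job could fail whenever the list is non-empty.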

Appendix C – Platform Tier Code Distribution (informative)

Every function in the twenty conf_*.py modules was manually classified by which PEP 11 platform tier its logic primarily serves. The classification used the following rules:

Tier 1 — code whose only purpose is to support an officially Tier 1 platform: x86_64/aarch64 Linux, x86_64/arm64 macOS, x86_64 Windows. Examples: JIT stencil selection for x86_64 and aarch64, LTO/PGO flags for mainstream compilers, macOS universal-binary handling.

Tier 2 — code specific to officially Tier 2 platforms not captured above: aarch64 Windows, Windows 32-bit, additional Linux variants. Examples: i686 JIT stencil, architecture-specific SIMD probes that cover hardware beyond Tier 1.

Tier 3 — code specific to officially Tier 3 platforms: FreeBSD, OpenBSD, NetBSD, AIX, Solaris/illumos, iOS, Android, WASI, Emscripten. Examples: the entire conf_wasm.py module, Android API-level checks, AIX shared-library linker flags, BSD broken-semaphore workarounds.

Unsupported — code specific to platforms not listed in any PEP 11 tier: HP-UX/HPPA, SCO OpenServer, Tru64/OSF1, legacy QNX, AtheOS, and historical Cray references. This code is carried forward verbatim from configure.ac but targets systems Python no longer supports.

Support code — everything else: generic compiler detection, option parsing, path computation, module-build infrastructure, output-file generation, and cross-platform feature probes whose result is used on all platforms equally.

Results (≈6 350 lines total across all conf_*.py modules):

Category                                                                         Lines (approx.)  Share
Support code (generic / cross-platform)                                          2 400            38 %
Tier 3 (FreeBSD, OpenBSD, NetBSD, AIX, Solaris, iOS, Android, WASI, Emscripten)  1 500            24 %
Tier 1 (x86_64/arm64 Linux, macOS, Windows)                                      1 450            23 %
Tier 2 (aarch64 Windows, Win32, other variants)                                  900              14 %
Unsupported platforms (HP-UX, SCO, Tru64, legacy QNX, …)                         100              1.5 %
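The shares follow directly from the approximate line counts. The short calculation below reproduces them; the dictionary keys are shorthand labels chosen for this example, and the counts are the rounded figures from the table rather than exact measurements.

```python
# Approximate line counts per category, as listed in the table above.
LINES = {
    "Support code": 2400,
    "Tier 3": 1500,
    "Tier 1": 1450,
    "Tier 2": 900,
    "Unsupported": 100,
}


def shares(lines):
    """Return (total, {category: percentage of total})."""
    total = sum(lines.values())
    return total, {name: 100 * count / total for name, count in lines.items()}


total, pct = shares(LINES)
for name, value in pct.items():
    print(f"{name:<14}{LINES[name]:>6}{value:>7.1f} %")
print(f"{'Total':<14}{total:>6}")
```

Because both the line counts and the percentages are rounded, the published shares need not add up to exactly 100 %.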

Key takeaway — unsupported-platform code is only 1.5 %. Stripping it would barely reduce the size of the code base. The overwhelming majority of the logic in conf_*.py covers platforms that are actively supported by CPython today.

This has an important implication for future build-system migration: any replacement (CMake, Meson, or otherwise) would need to retain substantially all of this logic. The conf_*.py modules represent a clean, tested, Python-readable distillation of decades of platform-detection knowledge. They are structured to be importable and reusable, so a future build system could consume them directly rather than rediscovering the same platform quirks from scratch.

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal licence, whichever is more permissive.

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response
