Drop Python 2 compatibility code #62
Closed
mattst88 wants to merge 4 commits into
Python 3's open() accepts str paths directly and encodes them with the filesystem encoding, which is typically UTF-8 on modern systems. The _unicode_encode(path, encoding=_encodings["fs"]) pattern was Python 2 compatibility code that converted str paths to bytes before passing them to open(); it is unnecessary in Python 3. Replace open(_unicode_encode(path, encoding=_encodings["fs"]), encoding=_encodings["content"]) with open(path, encoding="utf-8") throughout, and drop the portage imports. Signed-off-by: Matt Turner <mattst88@gentoo.org>
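A minimal sketch of the replacement described in this commit message. `_unicode_encode` and `_encodings` below are simplified stand-ins written for this example, not Portage's real implementations, and `read_old`, `read_new`, and the file name are hypothetical.

```python
import os
import tempfile

# Simplified stand-ins for Portage's helpers (assumptions for illustration only).
_encodings = {"fs": "utf-8", "content": "utf-8"}


def _unicode_encode(s, encoding):
    # Convert a str path to bytes, as the Python 2 era pattern did.
    return s.encode(encoding) if isinstance(s, str) else s


def read_old(path):
    # Old pattern: encode the path to bytes before handing it to open().
    with open(
        _unicode_encode(path, encoding=_encodings["fs"]),
        encoding=_encodings["content"],
    ) as f:
        return f.read()


def read_new(path):
    # New pattern: pass the str path directly and name the content encoding.
    with open(path, encoding="utf-8") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "exclude.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("app-misc/foo\n")
    old_text = read_old(path)
    new_text = read_new(path)
```

Both helpers read the same content here, which is the sense in which the wrapper is redundant for str paths under Python 3.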
portage.os is a re-export of the stdlib os module, kept for Python 2 compatibility. Import os directly. Signed-off-by: Matt Turner <mattst88@gentoo.org>
The try/except block that defined unicode = str when unicode was not a builtin was Python 2 compatibility code. In Python 3, unicode does not exist; str is always correct. Drop the shim and use str directly. Signed-off-by: Matt Turner <mattst88@gentoo.org>
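For reference, a shim of the kind described here usually looks roughly like the sketch below. This is a reconstruction from the commit message, not the removed code itself, and `to_text` is a hypothetical helper.

```python
import builtins

# Python 3 has no "unicode" builtin, so the except branch always runs.
has_unicode_builtin = hasattr(builtins, "unicode")

# Sketch of the Python 2 compatibility shim (reconstructed, not the original).
try:
    unicode
except NameError:
    unicode = str


def to_text(value):
    # After the cleanup, code calls str directly instead of the alias.
    return str(value)
```

Under Python 3 the alias is always bound to `str`, so replacing `unicode(...)` with `str(...)` does not change behavior.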
Without an explicit encoding, open() uses the locale encoding, which may not be UTF-8 on all systems. All files read or written here are UTF-8 (ebuilds, Gentoo profile files). Signed-off-by: Matt Turner <mattst88@gentoo.org>
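A small sketch of why the explicit argument matters: the default text encoding comes from the locale, while `encoding="utf-8"` gives the same bytes on every system. The file name and sample text are made up for this example.

```python
import locale
import os
import tempfile

# What open() would fall back to if no encoding were given; varies by system.
default_encoding = locale.getpreferredencoding(False)

text = "# Maintainer: Jos\u00e9\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "package.mask")

    # Write with an explicit encoding; newline="" avoids newline translation.
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

    # Inspect the raw bytes to confirm they are UTF-8 regardless of locale.
    with open(path, "rb") as f:
        raw = f.read()

    # Read back with the same explicit encoding.
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()
```

With the encoding spelled out, the round trip does not depend on the value of `default_encoding`.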
thesamesam reviewed on Jun 25, 2026
output("Parsing Exclude file: " + filepath)
try:
    file_ = open(
        _unicode_encode(filepath, encoding=_encodings["fs"]),
Member
Right, see https://bugs.gentoo.org/914722 and 4f5f6f571e52af6d2703db760bad4e0ad7439d5a in Portage.
Sadly, we still have a lot of baggage to remove: gentoo/portage#700
Contributor
Author
Yep, I started cleaning things up in this area with gentoo/portage#1601
I'll review the one you linked to see if there is something valuable there.
thesamesam reviewed on Jun 25, 2026
thesamesam left a comment
Member
I think this is okay but I'm always nervous with the Portage unicode stuff (see e.g. ea1c12ac91faeb25b30d364afa1764506b7a1535 in portage.git) :/
gentoo-bot pushed a commit that referenced this pull request on Jun 25, 2026
Python 3's open() accepts str paths directly and encodes them with the filesystem encoding, which is typically UTF-8 on modern systems. The _unicode_encode(path, encoding=_encodings["fs"]) pattern was Python 2 compatibility code that converted str paths to bytes before passing them to open(); it is unnecessary in Python 3. Replace open(_unicode_encode(path, encoding=_encodings["fs"]), encoding=_encodings["content"]) with open(path, encoding="utf-8") throughout, and drop the portage imports. Signed-off-by: Matt Turner <mattst88@gentoo.org> Part-of: #62
gentoo-bot pushed a commit that referenced this pull request on Jun 25, 2026
portage.os is a re-export of the stdlib os module, kept for Python 2 compatibility. Import os directly. Signed-off-by: Matt Turner <mattst88@gentoo.org> Part-of: #62
gentoo-bot pushed a commit that referenced this pull request on Jun 25, 2026
The try/except block that defined unicode = str when unicode was not a builtin was Python 2 compatibility code. In Python 3, unicode does not exist; str is always correct. Drop the shim and use str directly. Signed-off-by: Matt Turner <mattst88@gentoo.org> Part-of: #62
Remove the portage._unicode_encode/_encodings wrappers around open() and os.lstat(). They are unnecessary because Python 3 accepts str paths and encodes them with the filesystem encoding, which is typically UTF-8 on modern systems. Also drop portage.os (a re-export of stdlib os) and the Python 2 unicode/str shim in pprinter, and add an explicit encoding="utf-8" to the remaining text-mode open() calls that lacked it.