
gh-122996: Clarify encoding behavior in dbm module documentation#138030

Open
furkanonder wants to merge 1 commit into python:main from furkanonder:gh-122996

Conversation

@furkanonder
Contributor

@furkanonder furkanonder commented Aug 21, 2025

Member

@serhiy-storchaka serhiy-storchaka left a comment

I thought UTF-8 is always used. What backends use different encoding?

@furkanonder
Contributor Author

> I thought UTF-8 is always used. What backends use different encoding?

Yes, UTF-8 is the default in most cases. In this section, though, I had SQLite databases specifically in mind, because SQLite can use different encodings. For the sake of simplicity, I didn't mention SQLite directly in the documentation.
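
To make the SQLite point concrete, here is a minimal sketch using the standard `sqlite3` module directly (not `dbm.sqlite3`). It assumes a fresh in-memory database, where `PRAGMA encoding` can still be changed, and the helper name `text_as_blob` is hypothetical. It only shows that the bytes obtained by casting text to a BLOB depend on the database encoding; whether the `dbm` SQLite backend is affected in the same way is the open question in this thread.

```python
import sqlite3


def text_as_blob(text, encoding_name):
    """Return (active database encoding, bytes of `text` after CAST to BLOB)."""
    # A fresh in-memory database has no schema yet, so the encoding
    # can still be chosen with PRAGMA encoding.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"PRAGMA encoding = '{encoding_name}'")
        (active,) = conn.execute("PRAGMA encoding").fetchone()
        # CAST(... AS BLOB) reinterprets the text's bytes in the
        # database encoding as a BLOB.
        (blob,) = conn.execute("SELECT CAST(? AS BLOB)", (text,)).fetchone()
        return active, blob
    finally:
        conn.close()


utf8_result = text_as_blob("é", "UTF-8")
utf16_result = text_as_blob("é", "UTF-16le")
print(utf8_result)
print(utf16_result)
```

The same input string yields different byte sequences under the two encodings, which is the kind of backend-dependent result the documentation change was trying to hint at.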

@serhiy-storchaka
Member

If that is the case, then I think it is a bug in the SQLite backend. The main data type is binary. Setting string keys and values is acceptable (for compatibility with Python 2), but you get bytes objects when you read them back, and their values should be predictable and backend-independent.

I think we can live with a difference in handling surrogate characters.
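
The bytes-on-read behavior described above can be sketched with `dbm.dumb`, chosen here only because it is always available. The backend choice and the helper name `roundtrip` are assumptions for illustration, not something this PR specifies. In this backend, string keys and values are encoded as UTF-8 on write and come back as bytes.

```python
import dbm.dumb
import os
import tempfile


def roundtrip(key, value):
    """Store a str key/value pair and return what the database gives back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example")
        with dbm.dumb.open(path, "c") as db:
            db[key] = value  # str is accepted on write
            # Reads return bytes, never str.
            return db[key], list(db.keys())


stored_value, stored_keys = roundtrip("café", "naïve")
print(stored_value, stored_keys)
```

A backend-independent contract would mean any other `dbm` backend returns the same bytes for the same string input.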

cc @erlend-aasland


Labels

awaiting review; docs (Documentation in the Doc dir); extension-modules (C modules in the Modules dir); skip news

Projects

Status: Todo


2 participants