Description
Hi there,
I was looking through the utility functions in _dict_utils.py and noticed a few spots where the code might break or behave unexpectedly when hitting certain edge cases. I think adding some defensive checks could save users from some confusing debugging sessions.
Here are the main points I found:
- Potential `RuntimeError` in `delete_keys`
In the `else` block of `delete_keys`, the code iterates over `selected_keys` and deletes items from the dictionary. If a user passes a live view (like `my_dict.keys()`), Python will throw a `RuntimeError` because the dictionary size changes during iteration.
Fix: It’s safer to wrap `selected_keys` in `list()` or `set()` so that we iterate over a static snapshot.
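To make the failure mode concrete, here is a minimal sketch. `delete_keys_sketch` is a hypothetical stand-in I wrote for illustration, not the actual `delete_keys` from `_dict_utils.py`, and its signature is an assumption.

```python
def delete_keys_sketch(d, selected_keys):
    """Hypothetical stand-in for the deletion loop in delete_keys."""
    # Snapshot first: selected_keys may be a live view such as d.keys().
    for key in list(selected_keys):
        del d[key]
    return d


data = {"a": 1, "b": 2, "c": 3}

# Unsafe pattern: deleting while iterating over a live view of the same dict.
unsafe = dict(data)
try:
    for key in unsafe.keys():
        del unsafe[key]
except RuntimeError as exc:
    print(f"RuntimeError: {exc}")

# Safe pattern: the list() snapshot is unaffected by the deletions.
safe = dict(data)
delete_keys_sketch(safe, safe.keys())
print(safe)
```

The snapshot costs one extra copy of the keys, which seems like a fair trade for not failing on views.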
- Brittle sequence/dict conversions
In seq_of_dict_to_dict_of_seq, the code assumes all dictionaries have the same keys as the first one. If they don't, it'll either raise a KeyError or produce mismatched list lengths.
In dict_of_seq_to_seq_of_dict, passing an empty dictionary causes next(iter(values.keys())) to raise a StopIteration error.
Fix: Adding a quick check for empty inputs and ensuring key consistency would make this much more robust.
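A rough sketch of the checks I have in mind is below. Both functions are hypothetical rewrites with a `_sketch` suffix, not the current implementations, and returning an empty container for empty input is my assumption (raising a `ValueError` would be an equally reasonable choice).

```python
def seq_of_dict_to_dict_of_seq_sketch(seq):
    """Hypothetical variant that validates key consistency up front."""
    seq = list(seq)
    if not seq:
        return {}  # assumption: empty input maps to an empty dict
    expected = set(seq[0])
    for i, item in enumerate(seq):
        if set(item) != expected:
            raise ValueError(
                f"item {i} has keys {set(item)!r}, expected {expected!r}"
            )
    return {key: [item[key] for item in seq] for key in seq[0]}


def dict_of_seq_to_seq_of_dict_sketch(values):
    """Hypothetical variant that handles empty and ragged input."""
    if not values:
        return []  # assumption: empty input maps to an empty list
    lengths = {len(v) for v in values.values()}
    if len(lengths) > 1:
        raise ValueError(f"sequences have differing lengths: {lengths!r}")
    n = lengths.pop()
    return [{key: v[i] for key, v in values.items()} for i in range(n)]
```

With these checks, mismatched keys fail early with a message that names the offending item instead of surfacing as a bare `KeyError` or `StopIteration`.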
- Implicit data loss in `rename_keys`
The logic for handling collisions in `rename_keys` (via the `omit` set) is clever, but it might surprise users by silently dropping data if they aren't careful with their mapping.
Fix: Maybe a simple warning or just a clearer docstring note about how collisions are handled would help.
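If a warning is the preferred route, something along these lines could work. This is a simplified, hypothetical `rename_keys_sketch` that does not reproduce the existing `omit` logic; it only illustrates flagging a collision instead of dropping data silently.

```python
import warnings


def rename_keys_sketch(d, mapping):
    """Hypothetical rename that warns when two source keys land on one target."""
    result = {}
    for key, value in d.items():
        new_key = mapping.get(key, key)  # keys not in the mapping are kept as-is
        if new_key in result:
            warnings.warn(
                f"key {new_key!r} produced more than once; "
                "the earlier value is overwritten",
                stacklevel=2,
            )
        result[new_key] = value
    return result


# "a" is renamed to "b", which already exists, so one value is lost.
print(rename_keys_sketch({"a": 1, "b": 2}, {"a": "b"}))
```

Which value should win on a collision is a design decision for the maintainers; the sketch just keeps whichever comes later in iteration order and makes the loss visible.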
I've already played around with some fixes for these. Would you be open to a PR?
Best, Salim