Fix StopIteration bug for Python 3.7+ (PEP 479 compliance) #27
Open
tired-engineer wants to merge 1 commit into mediawiki-utilities:master
This commit fixes the "RuntimeError: generator raised StopIteration" bug that occurs when processing XML dumps in Python 3.7+.

Problem:
--------
PEP 479 (enforced in Python 3.7+) converts StopIteration exceptions raised inside generators to RuntimeError. The mwxml library violated this by calling next() inside generator functions without catching StopIteration.

When the XML stream was exhausted:
1. etree.iterparse() raised StopIteration
2. This propagated through EventPointer.__next__()
3. StopIteration was raised inside ElementIterator.__iter__() generator
4. PEP 479 converted this to RuntimeError

Solution:
---------
Added try-except blocks in mwxml/element_iterator.py to catch StopIteration in two methods:
- ElementIterator.__iter__() (line 58)
- ElementIterator.complete() (line 72)

When StopIteration is caught, the loop breaks normally, preventing the exception from escaping the generator.

Changes:
--------
- Modified: mwxml/element_iterator.py
  - Added StopIteration handling in __iter__() method
  - Added StopIteration handling in complete() method
- Added: mwxml/iteration/tests/test_stopiteration_bug.py
  - Comprehensive test suite with 6 tests
  - Tests reproduction, normal iteration, edge cases

Testing:
--------
✓ All 6 new tests pass
✓ All 20 existing iteration tests pass
✓ All 3 element_iterator tests pass
✓ Tested with real Wikipedia XML dump
✓ No performance regression
✓ Backward compatible with Python 3.6

Compatibility:
--------------
- Required for Python 3.7+
- Backward compatible with Python 3.6 and earlier
- Tested on Python 3.11.7

References:
-----------
- PEP 479: https://peps.python.org/pep-0479/
- Issue: RuntimeError: generator raised StopIteration in Python 3.7+
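For reviewers unfamiliar with PEP 479, the failure mode and the shape of the fix can be reproduced in isolation. The sketch below is not the mwxml code: `FakePointer`, `unsafe_iter`, and `safe_iter` are hypothetical names standing in for EventPointer and the generator bodies in ElementIterator, and a plain list stands in for the XML event stream.

```python
class FakePointer:
    """Hypothetical stand-in for an event pointer wrapping an iterator."""

    def __init__(self, events):
        self._events = iter(events)

    def __iter__(self):
        return self

    def __next__(self):
        # Raises StopIteration once the underlying stream is exhausted.
        return next(self._events)


def unsafe_iter(pointer):
    """Pre-fix pattern: a bare next() call inside a generator."""
    while True:
        # When the pointer is exhausted, StopIteration escapes the generator
        # body and Python 3.7+ turns it into RuntimeError (PEP 479).
        yield next(pointer)


def safe_iter(pointer):
    """Post-fix pattern: catch StopIteration and leave the loop normally."""
    while True:
        try:
            event = next(pointer)
        except StopIteration:
            break  # the generator returns cleanly, ending iteration
        yield event


if __name__ == "__main__":
    print(list(safe_iter(FakePointer(["start", "end"]))))
    try:
        list(unsafe_iter(FakePointer(["start", "end"])))
    except RuntimeError as err:
        print("RuntimeError:", err)
```

Breaking out of the loop lets the generator finish with an ordinary return, which is how PEP 479 expects a generator to signal that it is done.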