Opening the file here is redundant:
Line 30 in daeaa73
    with open(file, "r"):
because lxml opens the file itself here:
Line 35 in daeaa73
    xml = etree.parse(file)
Cf.:
>>> help(lxml.etree.parse)
parse(source, parser=None, *, base_url=None)
Return an ElementTree object loaded with source elements. If no parser
is provided as second argument, the default parser is used.
The ``source`` can be any of the following:
- a file name/path
- a file object
- a file-like object
- a URL using the HTTP or FTP protocol
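As a minimal sketch of what the docstring implies, the path can be handed straight to `etree.parse()` with no surrounding `open()`. The helper name `load_tree` and the sample file are hypothetical, and the stdlib fallback import is only there so the snippet runs where lxml is not installed:

```python
import os
import tempfile

try:
    from lxml import etree  # the library discussed in this issue
except ImportError:
    # Fallback so the sketch still runs without lxml; the stdlib parser
    # also accepts a file name or a file object.
    import xml.etree.ElementTree as etree


def load_tree(path):
    """Parse an XML file by passing its path directly to etree.parse()."""
    # No `with open(path)` wrapper: the parser opens and closes the file.
    return etree.parse(path)


# Hypothetical sample document, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.xml")
    with open(sample, "w", encoding="utf-8") as fh:
        fh.write("<TEI><text>Grüße</text></TEI>")
    tree = load_tree(sample)
    root_tag = tree.getroot().tag
    text_value = tree.getroot().find("text").text
```

Because the parser reads the raw bytes and handles the XML encoding itself, passing the path also avoids decoding the file with the platform's default text encoding.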
And you should also specify encoding explicitly, especially here:
Line 89 in daeaa73
    with open(outfile, "w") as output:
I'm quoting from the documentation:
In text mode, if encoding is not specified the encoding used is platform dependent:
locale.getpreferredencoding(False) is called to get the current locale encoding.
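AFAIK, Windows does not use UTF-8 here by default; it typically falls back to a legacy code page such as cp1252. Writing non-ASCII characters can then fail with a UnicodeEncodeError or produce a file that is not valid UTF-8.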
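A minimal sketch of the suggested fix, assuming the output is meant to be UTF-8. The helper name `write_text_utf8` and the sample text are hypothetical and not taken from the script:

```python
import locale
import os
import tempfile


def write_text_utf8(outfile, text):
    """Write text with an explicit encoding instead of the locale default."""
    with open(outfile, "w", encoding="utf-8") as output:
        output.write(text)


# What open() would fall back to in text mode without an explicit encoding;
# the value depends on the platform and its configuration.
default_encoding = locale.getpreferredencoding(False)

with tempfile.TemporaryDirectory() as tmp:
    outfile = os.path.join(tmp, "out.txt")
    write_text_utf8(outfile, "naïve café 🙂")
    # Read the raw bytes back to check what was actually written.
    with open(outfile, "rb") as fh:
        raw = fh.read()
```

With `encoding="utf-8"` set, the bytes on disk are the same on every platform, whatever `default_encoding` turns out to be.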
Thanks in any case for the TEI Xpath expressions 🙂