Consider additional dependencies for performance, security #29

@bollwyvl

Thanks again for graphtage!

While I haven't used XML diffing in anger yet, it would be interesting to explore some (optional) dependencies to increase the robustness and performance of that component:

  • lxml offers a largely ElementTree-compatible API with better performance than the stdlib
  • defusedxml helps prevent well-known malicious XML attacks and works with the stdlib or lxml
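
As a rough sketch of how such optional dependencies could be picked up at import time, the snippet below tries a list of ElementTree-compatible modules and falls back to the stdlib. The helper name `first_available` and the preference order are assumptions made for illustration; none of this is graphtage's actual code.

```python
import importlib


def first_available(*module_names):
    """Import and return the first module in module_names that is installed.

    Hypothetical helper for illustration only.
    """
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Optional dependency is not installed; try the next candidate.
            continue
    raise ImportError(f"none of {module_names!r} could be imported")


# Assumed preference order: hardened parser, then fast parser, then stdlib.
etree = first_available(
    "defusedxml.ElementTree",
    "lxml.etree",
    "xml.etree.ElementTree",
)

root = etree.fromstring("<a><b>1</b><b>2</b></a>")
values = [child.text for child in root.findall("b")]
```

Because all three modules expose `fromstring` and return elements that support `findall`, the calling code stays the same whichever backend is selected.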

Similarly, a number of far higher-performance JSON parsers are available, with different ease-of-installation/speed/memory tradeoffs, and it might be hard to anticipate which of them users would prefer.

If there is interest, I could probably take a stab at a PR for this:

  • change the json API to accept an optional parser
    • add extras with sensible lower-bound version pins
  • change the xml API to accept an optional parser
    • add defusedxml in install_requires
    • add lxml in an extras section
      • or install_requires, as "complexity of installation" is no longer really a concern once scipy enters the picture...
  • test against different combinations with tox in CI
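
To make the "optional parser" idea concrete, here is a minimal sketch of what such a hook could look like for JSON. The function name `parse_json` and its `loads` parameter are hypothetical and only illustrate the shape of the proposal; they are not part of graphtage's current API.

```python
import json
from functools import partial
from typing import Any, Callable, Optional

# Any callable that turns a JSON string into Python objects.
JsonLoader = Callable[[str], Any]


def parse_json(text: str, loads: Optional[JsonLoader] = None) -> Any:
    """Parse text with the caller's loader, defaulting to the stdlib.

    Hypothetical entry point for illustration only.
    """
    loader = json.loads if loads is None else loads
    return loader(text)


# Default behaviour: the stdlib parser.
default_result = parse_json('{"a": [1, 2]}')

# A caller-supplied loader; here the stdlib with a custom integer hook,
# but a third-party loader with the same call shape could be passed instead.
custom_result = parse_json('{"a": 1}', loads=partial(json.loads, parse_int=str))
```

Keeping the stdlib as the default means nothing changes for existing users, while a faster parser installed through an extra can be passed in by those who want it.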
