1e2aa7d also fixes a slight bug in the
We could trivially forward the location information to the DOM itself. I think this would be useful for any application that manipulates HTML provided by the user. Whenever the user provides a document that parses but that the application considers invalid, the application will want to tell the user which node caused the problem, and being able to point back into the original file would be quite useful there. However, for applications where this isn't the case (such as programs that generate or modify HTML), it adds complexity that might be inconvenient for the user.

To store the location for each node type, we would probably have to change the

Another, more convenient way would be to add the

A third, slightly more radical way would be to change all instances of

I will not make any choice here, but I might implement one of these approaches in my fork to use in an application I'm developing.
This pull request does two things: it introduces an error enum for possible failures during parsing, and it tracks locations for all tokens so that errors can report where they occurred. A few tests have been added to verify the error tracking.
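To make the shape of such an error type concrete, here is a minimal sketch. The variant names, fields, and messages are hypothetical and chosen for illustration only; they are not taken from the enum in this pull request. Each variant carries the char index at which the failure was detected.

```rust
use std::fmt;

/// Hypothetical parse error; the variants are illustrative assumptions,
/// not the actual enum introduced by this pull request.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ParseError {
    /// A closing tag did not match the most recently opened tag.
    MismatchedClosingTag {
        expected: String,
        found: String,
        at: usize,
    },
    /// The input ended while a construct was still open.
    UnexpectedEof { at: usize },
}

impl ParseError {
    /// Char index in the source at which the error was detected.
    pub fn location(&self) -> usize {
        match self {
            ParseError::MismatchedClosingTag { at, .. } | ParseError::UnexpectedEof { at } => *at,
        }
    }
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::MismatchedClosingTag { expected, found, at } => write!(
                f,
                "expected closing tag </{}> but found </{}> at char {}",
                expected, found, at
            ),
            ParseError::UnexpectedEof { at } => {
                write!(f, "unexpected end of input at char {}", at)
            }
        }
    }
}

impl std::error::Error for ParseError {}
```

Implementing `std::error::Error` lets callers propagate the error with `?` into a `Box<dyn Error>`, while `location()` gives applications a single way to point back into the source regardless of the variant.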
I decided to track locations as char indices rather than byte offsets in the source, mainly because this makes the tracking easier to write: we can simply call `.enumerate()` on the HTML in the `html_to_stack` function. I have tried to ensure that location gathering will never be O(n²), which could happen if you had to "count backwards" to find the length in chars of something you've kept in memory.
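As a rough illustration of the idea, the sketch below records char-index spans in a single pass. It is not the code from `html_to_stack`; the `Located` type and the `tokenize_with_locations` function are hypothetical, and the tokenizer is deliberately simplistic (it only splits `<...>` tags from text runs). The point is that the start index of each token is remembered when the token begins, so its length never has to be recounted afterwards.

```rust
/// Hypothetical token type; `start` and `end` are char indices,
/// not byte offsets, and `end` is exclusive.
#[derive(Debug, PartialEq, Eq)]
pub struct Located {
    pub text: String,
    pub start: usize,
    pub end: usize,
}

/// Splits the input into `<...>` tags and text runs in one pass.
/// This is an illustrative sketch, not a real HTML tokenizer.
pub fn tokenize_with_locations(html: &str) -> Vec<Located> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut start = 0; // char index at which `current` began
    let mut next = 0; // char index one past the last char seen
    let mut in_tag = false;

    for (idx, ch) in html.chars().enumerate() {
        next = idx + 1;
        if ch == '<' && !in_tag {
            // Flush any pending text run; its start was recorded earlier,
            // so nothing needs to be counted backwards.
            if !current.is_empty() {
                let text = std::mem::take(&mut current);
                tokens.push(Located { text, start, end: idx });
            }
            start = idx;
            in_tag = true;
            current.push(ch);
        } else if ch == '>' && in_tag {
            current.push(ch);
            let text = std::mem::take(&mut current);
            tokens.push(Located { text, start, end: idx + 1 });
            in_tag = false;
        } else {
            if current.is_empty() {
                start = idx; // a new text run begins here
            }
            current.push(ch);
        }
    }

    // Flush whatever is left at the end of the input.
    if !current.is_empty() {
        tokens.push(Located { text: current, start, end: next });
    }
    tokens
}
```

For the input `<p>héllo</p>`, the closing tag starts at char index 8, whereas its byte offset would be 9 because `é` occupies two bytes in UTF-8. If byte offsets were wanted instead, the standard `str::char_indices` iterator yields them directly.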