Open
Description
I'm wondering if it would be possible to add an attribute to the result for how many characters were parsed. My use case has me parsing a string input that contains random text interspersed with multiple JSON objects. So I want to do something like this:
from dirtyjson import loads

text = 'example text containing {"foo":0, "bar":1} multiple json objects {"bazz":2, "boo":3} possibly separated by random text [1,2,4,7] and other junk'

while len(text) > 0:
    # skip text ahead to the next object/array
    if not text.startswith('{') and not text.startswith('['):
        index = next((i for i, c in enumerate(text) if c in ('{', '[')), -1)
        if index == -1:
            break  # no more objects to eat
        text = text[index:]

    # parse the current object
    chunk = loads(text)

    # do something with the json object
    print(chunk)

    # strip the object out of the string
    characters_eaten = ...  # placeholder: somehow get the number of characters used for the parse
    text = text[characters_eaten:]

But right now it's not really feasible to do this, because there's no way to measure how many characters were eaten while parsing the current object. I guess technically it would be possible to use the row/column annotation of the last element in the object/list and then find the closing delimiter, but that's super cumbersome. Having the number of characters eaten would be very useful.
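
For comparison, here is a minimal sketch of the behaviour I'm asking for, using the standard library's `json.JSONDecoder.raw_decode`, which returns the index where parsing stopped alongside the parsed value. This is not dirtyjson's API, and it only accepts strictly valid JSON, so it doesn't solve my case with dirty input. The helper name `iter_json_chunks` is made up for illustration.

```python
import json


def iter_json_chunks(text):
    """Yield (value, end_index) for each JSON object/array embedded in text.

    Illustrative helper only: relies on the stdlib decoder, so it handles
    strictly valid JSON and nothing that dirtyjson would additionally accept.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip ahead to the next '{' or '[' at or after pos.
        start = next((i for i in range(pos, len(text)) if text[i] in "{["), -1)
        if start == -1:
            break  # no more objects to eat
        try:
            # raw_decode returns the parsed value and the index just past it.
            value, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON at this delimiter; move past it and keep scanning.
            pos = start + 1
            continue
        yield value, end
        pos = end


text = ('example text containing {"foo":0, "bar":1} multiple json objects '
        '{"bazz":2, "boo":3} possibly separated by random text [1,2,4,7] and other junk')

for value, end in iter_json_chunks(text):
    print(value, end)
```

Something equivalent to that `end` index on the dirtyjson result (or an equivalent of `raw_decode`) would let the loop above advance without re-scanning for the closing delimiter.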