We have had great success so far using https://github.com/datalab-to/marker to extract data from `pdf` into `markdown` components, but it would be interesting to compare it with a couple of newly released tools:

1. https://github.com/Yuliang-Liu/MonkeyOCR