Document with Layout #1

Hello, I have a question about the "Document with Layout" code in your README. The code doesn't run because the `Image` class is not defined. How should I modify it, or am I doing something wrong?

Also, can this approach achieve functionality similar to DeepSeek-OCR? That is, can it not only convert a PDF to markdown but also return the corresponding layout information?

```python
import llava

# Load model
model = llava.load("./easy_deepocr_sam_clip")

prompt = [
    Image("document.pdf"), 
    "<|grounding|>Convert the document to markdown."
]
response = model.generate_content(prompt)
```