Chain-of-Box not found in log #5

@bug-to-share

Hi, thank you very much for your great work!

I have a question regarding the grounding results. In the paper, it is mentioned that "whenever the model references a ROI region in the image, it explicitly appends the corresponding bounding box coordinates [x1, y1, x2, y2] after the region text. This Chain-of-Box approach ensures the visual information is seamlessly integrated into the reasoning context, enabling VLMs to perform multimodal reasoning effectively."

However, I couldn’t find any grounding results (e.g., bounding boxes or coordinate information) in the section of the file eval/logs/rec22_results_cxr_test_qwen2_5vl_7b_instruct_r1_450.json.
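For reference, here is a minimal sketch of how such a search could be done. It assumes only that the log is valid JSON and that boxes appear inside text as four bracketed numbers, as in the paper's [x1, y1, x2, y2] description. The log's actual schema is not known here, so the script walks every string value. The function names and the regular expression are illustrative and not part of the repository.

```python
import json
import re

# Assumed pattern: four comma-separated numbers in square brackets,
# e.g. [12, 34, 56, 78] or [0.1, 0.2, 0.3, 0.4].
BOX_PATTERN = re.compile(
    r"\[\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*,"
    r"\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\]"
)


def iter_strings(node):
    """Yield every string value found anywhere in a parsed JSON structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def find_boxes(data):
    """Return all [x1, y1, x2, y2]-style boxes that occur inside string values."""
    boxes = []
    for text in iter_strings(data):
        for match in BOX_PATTERN.finditer(text):
            boxes.append([float(g) for g in match.groups()])
    return boxes


def find_boxes_in_file(path):
    """Load a JSON file and search all of its string values for boxes."""
    with open(path, encoding="utf-8") as f:
        return find_boxes(json.load(f))
```

Calling `find_boxes_in_file("eval/logs/rec22_results_cxr_test_qwen2_5vl_7b_instruct_r1_450.json")` and getting an empty list would mean no such pattern appears in any text field. Boxes stored as native JSON arrays of numbers rather than inside text would not be detected by this sketch.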

Could you please check whether this is the correct file, or whether the grounding results are stored elsewhere?
Thank you for your time and help!
