Failure to correctly parse Table 1 spanning multiple pages #4

@ainilaha

Description

When Table 1 spans across multiple pages, the current parsing pipeline may lose structural continuity, resulting in incorrect or incomplete extraction.

Observed behavior

  • Table content is split at page boundaries
  • Rows and columns become misaligned after the page break
  • Headers and column definitions are not always propagated to the continuation pages
  • In some cases, the second half of the table is treated as a separate or unrelated structure

Expected behavior

  • Multi-page tables should be detected as a single logical table
  • Structural consistency (headers, column alignment) should be preserved across pages
  • Table reconstruction should maintain row continuity and semantic coherence
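
A minimal sketch of the expected behavior, assuming the pipeline already emits one table fragment per page with an optional header row and a list of cell rows. The names `TableFragment`, `LogicalTable`, `is_continuation`, and `stitch_fragments` are hypothetical and not taken from the current codebase, and the sample values are made up for illustration. The heuristic used here (consecutive page, same column count, header either missing or repeated) is only one possible way to decide that a fragment continues the previous table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableFragment:
    """One table region as extracted from a single page (hypothetical type)."""
    page: int
    header: Optional[List[str]]  # None when the page does not repeat the header
    rows: List[List[str]]


@dataclass
class LogicalTable:
    """A table reconstructed from one or more page fragments (hypothetical type)."""
    header: Optional[List[str]]
    rows: List[List[str]]
    first_page: int
    last_page: int


def is_continuation(table: LogicalTable, frag: TableFragment) -> bool:
    """Heuristic: frag continues table if it sits on the next page,
    has the same column count, and its header is absent or repeated."""
    if frag.page != table.last_page + 1:
        return False
    if table.header is not None:
        width = len(table.header)
    elif table.rows:
        width = len(table.rows[0])
    else:
        return False
    if any(len(row) != width for row in frag.rows):
        return False
    return frag.header is None or frag.header == table.header


def stitch_fragments(fragments: List[TableFragment]) -> List[LogicalTable]:
    """Merge per-page fragments into logical tables, in page order."""
    tables: List[LogicalTable] = []
    for frag in sorted(fragments, key=lambda f: f.page):
        if tables and is_continuation(tables[-1], frag):
            # Append rows only; a repeated header is dropped, not duplicated.
            tables[-1].rows.extend(frag.rows)
            tables[-1].last_page = frag.page
        else:
            tables.append(
                LogicalTable(frag.header, list(frag.rows), frag.page, frag.page)
            )
    return tables


# Illustrative input: a table split over pages 1-3, then an unrelated table.
fragments = [
    TableFragment(1, ["Variable", "Overall"], [["Age", "52"], ["Sex", "F"]]),
    TableFragment(2, None, [["BMI", "27"]]),
    TableFragment(3, ["Variable", "Overall"], [["SBP", "120"]]),
    TableFragment(5, ["A", "B", "C"], [["1", "2", "3"]]),
]
tables = stitch_fragments(fragments)
```

With this input, the fragments on pages 1 to 3 end up in one logical table that keeps the original header, while the fragment on page 5 stays separate because it is not on the following page and has a different column count. A real fix would likely need extra signals, such as bounding-box alignment or caption text like "Table 1 (continued)".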
