Add citation/reference hover preview #5611
Hovering an internal-document link (citation, figure, footnote, etc.) now shows a small popup that renders the destination region of the target page, so the bibliography entry / figure / footnote is visible without leaving the current page. Inspired by PDFRefPreview.
Detection:
- IsCitationLink: any internal kindPageElementDest link.
- DetectEntryBox: per-glyph text+coords scan from the destination anchor, ending at one of: "[N" at the entry's first-line X, indent change back to that X, an X that matches neither first-line nor continuation X, vertical paragraph break, single-line-entry pattern, or column wrap.
- Falls back to a fixed strip when text-based detection finds nothing near destY, and to a taller strip when the detected box is too small (figure / diagram fragment).
- Result is rendered via engine->RenderPage and blitted into a yellow popup, letterboxed to fit max bounds.
Engine API:
- IPageDestination::GetDestPoint2() (default returns GetRect2()).
- PageDestGetDestPoint() helper.
- PageDestinationMupdf caches resolved (destX, destY) from fz_resolve_link at construction.
Plumbing:
- New RefHover.cpp/.h.
- MainWindow gets a refHover member, destroyed in dtor.
- Canvas OnMouseMove schedules / hides the popup; OnTimer renders.
Configurable via the EnableCitationHover advanced setting (default true).
fixes sumatrapdfreader#128
fixes sumatrapdfreader#4221
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
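The "letterboxed to fit max bounds" step can be pictured as a uniform scale that preserves aspect ratio. The following is only a minimal sketch: `SizeF` and `FitLetterbox` are hypothetical names, and the choice to never upscale is an assumption rather than something the commit states.

```cpp
#include <algorithm>

// Hypothetical size type for this sketch (not the SumatraPDF type).
struct SizeF {
    float dx, dy;
};

// Scale `src` uniformly so that it fits inside `maxSize`.
// Assumption: content smaller than the bounds is left at its own size.
SizeF FitLetterbox(SizeF src, SizeF maxSize) {
    if (src.dx <= 0 || src.dy <= 0) {
        return {0, 0};
    }
    float scale = std::min({maxSize.dx / src.dx, maxSize.dy / src.dy, 1.0f});
    return {src.dx * scale, src.dy * scale};
}
```

A wide 800×200 region fitted into 400×400 bounds comes out as 400×100, leaving unused space above and below.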
|
Looks very useful. Do you have a compiled exe for me to play with while testing some other files?
Thank you! I was searching for that feature for a long time - but none of the existing apps convinced me. Sioyek kind of looked nice, but it has other issues (e.g., ahrm/sioyek#1350)
Sure! I put it at https://files.jabref.org/sumatra/SumatraPDF-dll.exe . |
Initial render uses page zoom so popup text height matches visible page text. Wheel on the citation link re-renders a region sized to fill the popup at the new zoom (anchored at detection top-left, clamped to page). Wheeling out brings in new page content; wheeling in crops to detail. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
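One way to read "a region sized to fill the popup at the new zoom (anchored at detection top-left, clamped to page)" is sketched below. `RectF` and `RegionForZoom` are hypothetical names for illustration, and zoom is assumed to be expressed in pixels per page point.

```cpp
#include <algorithm>

// Hypothetical rectangle type for this sketch, in page points.
struct RectF {
    float x, y, dx, dy;
};

// Page region that fills a popup of popupDxPx x popupDyPx pixels at `zoom`
// (assumed: pixels per point, > 0), anchored at the detected box's top-left
// and shifted back if it would run past the page edge.
RectF RegionForZoom(float popupDxPx, float popupDyPx, float zoom,
                    float anchorX, float anchorY, RectF page) {
    RectF r;
    r.dx = std::min(popupDxPx / zoom, page.dx);
    r.dy = std::min(popupDyPx / zoom, page.dy);
    r.x = std::clamp(anchorX, page.x, page.x + page.dx - r.dx);
    r.y = std::clamp(anchorY, page.y, page.y + page.dy - r.dy);
    return r;
}
```

With a fixed popup size, a larger zoom yields a smaller region (cropping to detail) and a smaller zoom yields a larger one (bringing in more page content), which matches the behaviour described in the commit.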
|
Good question — I just added mouse-wheel zoom for the popup. Hover a citation link to bring up the popup, then scroll the wheel while still hovering the link to zoom the preview in/out:
The popup window keeps its initial size; only the rendered content changes. The initial zoom now matches the document's current page zoom, so popup text height is comparable to the visible page text. New build: https://files.jabref.org/sumatra/SumatraPDF-dll-2.exe |
|
@koppor Fantastic improvement, though it took me a while to grasp the concept of zooming the source box. Once tried out, it makes sense as an alternative to trying to zoom the quick viewbox (I have had issues with that when attempting a magnifier box, as mouse focus is then relocated). The last core issue I see is that the "window" varies widely depending on the source's own desired target. Take this silly example, where the target is a wide set of gotos with a topbar target. It would be better if the quickview had a fixed ratio.
Reviewer noted that internal links pointing to non-bibliography destinations (TOC entries, topbar goto targets, page anchors) produced ugly wide-thin popups because the entry-detection fallback returned a strip that ran from destX to the right page margin. Replace those fallback paths with a fixed-pt box anchored on the destination so every non-reference target gets the same popup shape. Bibliography entries that the detector can identify keep their existing (content-shaped) box. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@GitHubRulesOK Good catch — pushed 77b017f to address that. Now the popup shape depends on what the link actually points to:
So your wide-thin "topbar goto" example now gets the same compact box as any other non-reference target, rather than stretching to the right page margin. Build with this fix: https://files.jabref.org/sumatra/SumatraPDF-dll-3.exe |
|
@koppor I am not trying to be a killjoy; this IS a good improvement.
The atIndentX check used a ±5pt tolerance to decide whether a new line was still part of the current bibliography entry. That's too tight when a continuation line switches from roman to italic (or vice versa) — the per-glyph bbox left edge shifts with the new side-bearings, even though both lines actually start at the same pen position. The result was that 3+ line entries with italic continuation (e.g. titles followed by an italicized journal/conference name on a new line) got truncated at the italic line's first glyph, with only the 6pt of padding leaking the top of that line into the popup. Bumping tolerance to ±25pt covers any realistic side-bearing variance while still catching truly external content (footers, author bios) which sit at a meaningfully different X. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Removed the rule that terminated entry detection when a continuation line's first glyph X drifted from the captured indentX. Even with the recently bumped tolerance, font-metric variance on continuation lines (especially when one continuation line starts in italic and another in roman/digit) could still trigger premature termination, cutting off the last line of long bibliography entries.
The remaining rules already cover all the realistic next-entry signals:
- "[N" at firstLineLeftX (numeric bibs)
- new line back at firstLineLeftX after a continuation (hanging-indent)
- vertical paragraph break
- single-line-entry case
If a future PDF has external content following a bibliography entry without any of those signals, we'd over-include, but that's vanishingly rare and clearly preferable to silently truncating real entries.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The thing is that with the AI helper, it is much easier to address user comments and wishes. I only hope that the code style is good enough. 😅
Reviewer noted that the small fixed-ratio popup (360×260pt) used for non-reference targets often shows too little context — e.g. a 'Table N' link surfaces only the caption text, not the table itself; a section- heading link shows only the heading; an image-only PDF link shows a narrow patch of art. Replace it with a landscape view: full page width × half page height, anchored at the destination. Capped by kMaxPopupWidth/Height (the latter already enforced; this commit also wires the width cap into the auto- fit). Also recognise single-line detected entries with no continuation indent as ambiguous (likely caption / heading rather than a real bibliography entry) and route them through the same landscape view. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
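The "full page width × half page height, anchored at the destination" region could look roughly like this. It is a sketch under assumptions: `LandscapeRegion` is a hypothetical stand-in (the real logic lives in LandscapeBox and also applies the popup caps), and the clipping to the page bottom is assumed.

```cpp
#include <algorithm>

// Hypothetical rectangle type for this sketch, in page points.
struct RectF {
    float x, y, dx, dy;
};

// Full page width, half page height, starting at destY.
// Assumption: the region is clipped so it never extends past the page bottom.
RectF LandscapeRegion(float destY, RectF page) {
    float pageBottom = page.y + page.dy;
    float top = std::max(page.y, std::min(destY, pageBottom));
    float height = std::min(page.dy / 2, pageBottom - top);
    return {page.x, top, page.dx, height};
}
```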
Two related fixes:
1. Relax rule (a): a "[" at firstLineLeftX is now treated as a next-entry
marker regardless of what follows. Previously it required a digit
("[1]", "[2]"), which missed alphanumeric description-list styles
like "[Nyg11]", "[Foo+09]", "[KAZ18]".
2. The "single-line + no indent → landscape view" heuristic was firing on
single-line description-list bibliography entries, treating them as
non-references and replacing the fitted entry box with a half-page
landscape slice (which then included the next several entries below).
Skip that redirect when the detected entry starts with "[" — that's a
strong bibliography signal even on a one-line entry.
The destY-at-page-top case (where the citation link's destination has no
specific Y, just page-level) still falls back to the landscape view —
that one needs source-text extraction on the source page to identify the
citation key, which is a larger change.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
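The relaxed rule (a) amounts to a check on a single glyph. The sketch below assumes a hypothetical `Glyph` record and a 5pt X tolerance; neither is taken from the actual detector.

```cpp
#include <cmath>

// Hypothetical per-glyph record for this sketch: character plus left/top
// position in page points.
struct Glyph {
    wchar_t ch;
    float x, y;
};

// Rule (a) after the relaxation: a "[" whose left edge sits at the entry's
// first-line X marks the next entry, whatever follows it ("[1]", "[Nyg11]").
// The tolerance value is an assumption.
bool IsNextEntryMarker(const Glyph& g, float firstLineLeftX, float tolerance = 5.0f) {
    return g.ch == L'[' && std::fabs(g.x - firstLineLeftX) <= tolerance;
}
```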
|
@kjk can the PR be adjusted to your expectation of common user desires? My tests with others suggest they use a fixed window size, so clicking figure1 will show a large area around that goto. |
Reshape the citation-hover popup so it picks the right region for non-
bibliography destinations:
- "Figure / Table / Listing / Algorithm N.M" caption anywhere below the
detected box: the destination is figure / table / listing body. Show
landscape view so the popup includes the caption. Catches code/console
listings whose lines start with "[TAG]" (linter output, etc.) that
would otherwise short-circuit through the bracket-bib check.
- Code-listing detector via brace/semicolon/paren density: figures whose
body is example code route to landscape view.
- Numbered / labeled headings ("6.2 Foo", "Figure 2.2 ...", "Section 7"):
routed to landscape so the popup includes the heading + content below.
- Tabular indent (continuation X far right of firstLineLeftX): routed to
landscape to show the whole table.
Description-list bibliography improvements:
- Track a descListSibling flag on rule (a) "[" at firstLineLeftX, rule
(b) indent-change-back, rule (c) vertical paragraph break with the
next glyph at firstLineLeftX, and rule (d) single-line case. When set,
the post-loop returns the fitted box even for single-line entries with
no continuation indent — fixes abbreviation-list hovers (JVM, AKM,
ADR, OMT, ...) where blank lines separate sibling entries.
- Recover startIdx via leftmost-on-line walk after the y-band scan: a
poorly-authored link destX can land startIdx mid-line on hanging-
indent layouts, dropping the "[KOS06]" / "Philippe Kruchten" leading
portion from the popup. Walking to the line's actual leftmost glyph
recovers it.
- Trim trailing blank page margin in LandscapeBox so the popup ends just
below the last text glyph instead of including the full page footer.
- Cap popup height by monitor work-area (~90%, max 1400px) and clamp
popup position to the work-area edges so it doesn't get clipped at
the top/left when the cursor is near a screen corner.
Resolve page-level link destinations using the source link's text:
- PageDestGetDestPoint returns {0,0,0,0} when the link has no specific
anchor — common for body-text abbreviation / glossary links. Plumb
the source page + source rect through RefHoverSchedule, then on
destY <= 0 extract the longest alphanumeric run from the source rect
(preferring "(XYZ)"-flanked tokens over bracketed citation keys) and
search the destination page for it. The matching glyph's Y becomes
destY, so DetectEntryBox can crop to the matching entry. Hovering
"(AKM)" body text now yields just the AKM entry, not the whole
abbreviations page.
Re-render guard tightened: the popup short-circuit now also compares
destX/destY, so two adjacent links that share a destination page but
land at different positions (e.g. "Section 7" link + a bib ref both on
page 41) each render their own content rather than reuse a stale popup.
A manual-test checklist comment at the top of RefHover.cpp documents the
case classes covered and the remaining known limits.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
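The tightened re-render guard is essentially an equality test over page and position. A minimal sketch, with `HoverTarget` and `SameTarget` as hypothetical names:

```cpp
// Hypothetical description of what a popup shows (or is about to show).
struct HoverTarget {
    int destPage;
    float destX, destY;
};

// The popup may be reused only when page AND position match; comparing the
// page alone would reuse a stale popup for two links into the same page.
bool SameTarget(const HoverTarget& shown, const HoverTarget& wanted) {
    return shown.destPage == wanted.destPage &&
           shown.destX == wanted.destX &&
           shown.destY == wanted.destY;
}
```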
|
Pushed Iterated through several non-bibliography hover targets and edge cases in description-list bibliographies. The detector now handles: Figures / tables / code listings / headings
Description-list bibliographies and abbreviation lists
Page-level link destinations
Popup positioning / re-render
Top-of-file checklist comment in |
|
The new build is available at https://files.jabref.org/sumatra/SumatraPDF-dll-4.exe |
I personally would like to see where the hover comes from - so that my eyes can jump from the source to the target and back. I will try that soon. |
|
@koppor It is not for me to be a director (I script/program bits but am not a programmer), simply an OLD observer on behalf of others (collaborator). But my brief observation, based on the suggested Sioyek, is that they seem to use roughly 80% of the current viewer width (10% to 90%) and perhaps 40-45% height, where you could still see the source if the box is flipped above or below that point. |
Popup positioning & sizing:
- Center popup horizontally on the source page (using the source page's on-screen rect, plumbed through from Canvas.cpp via dm->GetPageInfo) so the popup expands symmetrically into gray margins rather than drifting toward one edge.
- Width capped at ~95% of monitor work area; height capped at 45% of source page on screen (default) or 75% / spaceAbove-based cap for the taller figure-with-caption case so the figure body + caption both fit.
- 30px gap above/below cursor (was 16) so 1-2 lines of context around the hovered word stay visible.
- Popup never overlaps the cursor: prefer below, flip above on overflow; if neither side has room for the full popup, shrink popup height to the larger side rather than clamping into the cursor.
- Vertical bounds clamp the popup to the source page screen rect, so it doesn't escape into the next-page area; horizontal bounds use the monitor work area so the popup can span beyond the page text column.
LandscapeBox region & caption detection (used for figures, tables, listings, sections; anywhere DetectEntryBox falls back to a landscape view):
- Cap base region to 200pt tall (was the destY → bottom-of-page span) so the popup is wide and short rather than narrow and tall.
- Caption-extension: when a "Figure N.M" / "Table N.M" / "Listing N.M" / "Algorithm N.M" line-start is detected on the page (search range extended to bottom of page so tall figures with captions far below the initial cap still match), extend the region downward to include the full caption block. Logic moved out of DetectEntryBox into LandscapeBox itself, so it applies both when DetectEntryBox produces a fitted box AND when it falls through (image-only figures with no text at destY).
- Caption end via line-by-line Y-range scan (stream order is unreliable for figure-float text) with a justification check: walk up to 3 lines from caption start; stop at any line whose right edge reaches within 30pt of the page text margin (= justified body paragraph). Avoids sweeping body text in when caption is followed by a no-parskip body paragraph (Figure 2.1 case) while still extending through the full 3-line caption when caption lines are raggedright (typical LaTeX).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
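The "prefer below, flip above, otherwise shrink to the larger side" rule can be sketched as follows. `Placement` and `PlacePopupY` are hypothetical names; the 30px gap comes from the commit message, and the bounds are assumed to be the source page's on-screen top and bottom.

```cpp
#include <algorithm>

// Hypothetical result type: top edge and final height of the popup, in
// screen pixels.
struct Placement {
    int y;
    int height;
};

// Place the popup so that it never covers the cursor.
Placement PlacePopupY(int cursorY, int popupHeight, int boundsTop, int boundsBottom,
                      int gap = 30) {
    int spaceBelow = boundsBottom - (cursorY + gap);
    int spaceAbove = (cursorY - gap) - boundsTop;
    if (popupHeight <= spaceBelow) {
        return {cursorY + gap, popupHeight};  // preferred: below the cursor
    }
    if (popupHeight <= spaceAbove) {
        return {cursorY - gap - popupHeight, popupHeight};  // flipped above
    }
    // Neither side fits: shrink to whichever side has more room.
    if (spaceBelow >= spaceAbove) {
        return {cursorY + gap, std::max(spaceBelow, 0)};
    }
    int h = std::max(spaceAbove, 0);
    return {cursorY - gap - h, h};
}
```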
|
Pushed Iterated on popup geometry and figure-caption detection so the popup is unobtrusive (preserves context around the hovered link) and tightly cropped to the relevant region for figures/tables/listings. Popup positioning & sizing
LandscapeBox region & caption detection (figures, tables, listings, sections — anywhere
|
Plain wheel over the citation popup now scrolls the rendered region, rolling over to the previous/next page when it crosses a page edge. Zoom moves to Ctrl+wheel. Also persist the popup-fitting region after a zoom so a subsequent scroll keeps filling the popup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
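Scrolling with page rollover might be modelled like this. The sketch makes simplifying assumptions that the commit does not state: all pages share one height, the region is no taller than a page, and the view jumps to the neighbouring page as soon as the region would cross an edge. `ScrollPos` and `ScrollRegion` are hypothetical names.

```cpp
// Hypothetical scroll state: 1-based page number and the top of the rendered
// region in page points.
struct ScrollPos {
    int page;
    float y;
};

// Move the region by deltaY; roll over to the previous/next page when the
// region would cross the top/bottom page edge, and stop at the document ends.
ScrollPos ScrollRegion(ScrollPos pos, float deltaY, float regionDy, float pageDy,
                       int pageCount) {
    float y = pos.y + deltaY;
    if (y < 0) {
        if (pos.page > 1) {
            return {pos.page - 1, pageDy - regionDy};  // bottom of previous page
        }
        return {pos.page, 0};  // first page: stay at the top
    }
    if (y + regionDy > pageDy) {
        if (pos.page < pageCount) {
            return {pos.page + 1, 0};  // top of next page
        }
        return {pos.page, pageDy - regionDy};  // last page: stay at the bottom
    }
    return {pos.page, y};
}
```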
Bracket-style bib entries ("[ZM12]"): compute the bounding box from a
y-range bounded by the next "[" in the label column, independent of
text-array order. Some PDFs draw labels and body text in non-monotonic
order, which made the iterative scan terminate after the first line.
Tightened the y-boundary by 6pt and widened the label-column search to
+30pt so layouts with a page-number/section prefix before the label
still register the next entry.
German labels recognized everywhere caption / heading words are: added
Abbildung, Tabelle, Listing, Abschnitt, Kapitel, Algorithmus.
LandscapeBox extends upward by 250pt and uses a 360pt height cap when
destY anchors on a caption line, so figure / table cross-references whose
target lands at the caption include the figure body above, not just the
caption plus the following paragraph.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
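The order-independent bounding box for bracket-style entries can be illustrated with two passes over the glyph array: one to find the next "[" in the label column, one to union everything in between. This is a sketch only; `Glyph`, `RectF` and `BracketEntryBox` are hypothetical, the symmetric 30pt column tolerance is an assumption, and the 6pt boundary tightening is left out.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical types for this sketch, in page points (y grows downward).
struct Glyph {
    wchar_t ch;
    float x, y, dx, dy;
};
struct RectF {
    float x, y, dx, dy;
};

// Box of the entry that starts at `label` (its "[" glyph). Array order is
// never used, so labels and body text drawn out of order do not matter.
RectF BracketEntryBox(const std::vector<Glyph>& glyphs, const Glyph& label,
                      float pageBottom, float labelColTol = 30.0f) {
    // Pass 1: the nearest "[" below the label, in the label column, ends the entry.
    float yEnd = pageBottom;
    for (const Glyph& g : glyphs) {
        if (g.ch == L'[' && g.y > label.y && std::fabs(g.x - label.x) <= labelColTol) {
            yEnd = std::min(yEnd, g.y);
        }
    }
    // Pass 2: union every glyph whose top lies in [label.y, yEnd).
    float left = label.x;
    float right = label.x + label.dx;
    float bottom = label.y + label.dy;
    for (const Glyph& g : glyphs) {
        if (g.y >= label.y && g.y < yEnd) {
            left = std::min(left, g.x);
            right = std::max(right, g.x + g.dx);
            bottom = std::max(bottom, g.y + g.dy);
        }
    }
    return {left, label.y, right - left, bottom - label.y};
}
```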
Caption-extension: replace justification check with a gap + shortness dual-signal. The old "stop on any line reaching the right margin" rule truncated German captions like "...Verdecchia und Bo-/gner [VB25]..." because hyphenation made every caption line reach the column edge. Now: paragraph-break gap (> 70% line height) ends the caption; a short caption line followed by a justified line also ends it. Both common patterns (parskip-spaced captions and short-caption-then-body) are still covered. Bracket-entry detection: after the same-y walk-leftmost step, if the chosen start glyph still isn't "[", search for "[" within ~one line height of the start glyph at any smaller x. Catches description-list bibs where the "[VB25]" label sits on a slightly different baseline than its body line 1 — previously the popup rendered only a narrow strip from mid-body-text rather than the full entry. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
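The dual signal for the caption end can be sketched over a list of line extents. `Line` and `CaptionEndLine` are hypothetical names; the 70% gap threshold is from the commit message, while the 30pt "reaches the right margin" tolerance is borrowed from the earlier justification check and is an assumption here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical line extent for this sketch: top, bottom and right edge in
// page points (y grows downward).
struct Line {
    float top, bottom, right;
};

// Index of the last caption line, starting the walk at lines[start].
// Signal 1: a vertical gap larger than 70% of the line height (paragraph break).
// Signal 2: a short line followed by a line that reaches the right margin.
size_t CaptionEndLine(const std::vector<Line>& lines, size_t start, float rightMargin,
                      float justifiedTol = 30.0f) {
    auto justified = [&](const Line& l) { return l.right >= rightMargin - justifiedTol; };
    size_t i = start;
    while (i + 1 < lines.size()) {
        const Line& cur = lines[i];
        const Line& next = lines[i + 1];
        float lineHeight = cur.bottom - cur.top;
        float gap = next.top - cur.bottom;
        if (gap > 0.7f * lineHeight) {
            break;  // paragraph break ends the caption
        }
        if (!justified(cur) && justified(next)) {
            break;  // short last caption line, then justified body text
        }
        i++;
    }
    return i;
}
```

A hyphenated caption whose lines all reach the column edge is no longer cut after its first line; it ends only at the paragraph gap or at a short last line.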
|
New test build with two more fixes: https://files.jabref.org/sumatra/SumatraPDF-6.exe Commit Hyphenated multi-line captions (e.g. German "Abbildung 2.1: ...Verdecchia und Bo-/gner [VB25]..."). The previous "stop on any line reaching the right margin" rule truncated captions whose lines all hit the column edge due to hyphenation. Replaced with a dual signal:
Covers both parskip-spaced captions and short-caption-then-body layouts. Bracket-entry label on different baseline ([VB25]-style bibs). After the same-y walk-leftmost step, if the chosen start glyph still isn't "[", search for "[" within ~one line height of the start glyph at any smaller x. Catches description-list bibs where the label sits on a slightly different baseline than body line 1 — previously the popup rendered only a narrow vertical strip from mid-body-text rather than the full entry. |
|
@kjk I think this is also the close out answer to #1085 as OP complaint was having to see-saw to check a target goto.
|
Oh, yes 🙈 My aim is to distinguish scientific references and other things. If other things: Show half of page. If scientific: Show only the relevant target. Seems the current algorithm classifies your first example wrong. |
Three fixes for image-heavy PDFs (e.g. Bluey.pdf "JUMP TO ALL (A-Z)" button, /Dest /allcharacters → /XYZ 0 -2.580017 0): - RefHoverOnTimer: treat destY past page bottom as page-level (destY=0) so the source-text resolver runs. PDF /XYZ y just past page bottom is a common authoring mistake when "top of page" was intended. - ResolveDestYFromSourceText: try every alnum candidate run in priority order (parens-flanked first, then length desc), first dest-page match wins. Previously only the longest run was tried, so "Jump to all (a-z)" picked "Jump" (not on dest) and gave up; "all" now matches "All Characters" on the dest page. - DetectEntryBox: when the dest page has very little text (< 50 chars, i.e. image-only with a heading), return the full page rectangle instead of fitting to the heading line — auto-fit in RefHoverOnTimer then scales the bitmap to popup limits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
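The candidate ordering in ResolveDestYFromSourceText (parens-flanked first, then longest first, first match on the destination page wins) can be sketched on plain strings. `CandidateRuns` and `FirstMatchingRun` are hypothetical helpers; the case-insensitive substring match is an assumption inferred from "all" matching "All Characters".

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

struct Run {
    std::string text;
    bool parenFlanked;  // directly wrapped as "(text)"
};

// All alphanumeric runs of `s`, ordered: paren-flanked first, then by length
// descending (stable, so equal-length runs keep their source order).
std::vector<std::string> CandidateRuns(const std::string& s) {
    std::vector<Run> runs;
    size_t i = 0;
    while (i < s.size()) {
        if (!std::isalnum(static_cast<unsigned char>(s[i]))) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < s.size() && std::isalnum(static_cast<unsigned char>(s[i]))) {
            i++;
        }
        bool flanked = start > 0 && s[start - 1] == '(' && i < s.size() && s[i] == ')';
        runs.push_back({s.substr(start, i - start), flanked});
    }
    std::stable_sort(runs.begin(), runs.end(), [](const Run& a, const Run& b) {
        if (a.parenFlanked != b.parenFlanked) {
            return a.parenFlanked;
        }
        return a.text.size() > b.text.size();
    });
    std::vector<std::string> out;
    for (const Run& r : runs) {
        out.push_back(r.text);
    }
    return out;
}

static std::string Lower(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}

// First candidate found anywhere in the destination page text, or "" if none.
std::string FirstMatchingRun(const std::string& sourceText, const std::string& destText) {
    std::string hay = Lower(destText);
    for (const std::string& c : CandidateRuns(sourceText)) {
        if (hay.find(Lower(c)) != std::string::npos) {
            return c;
        }
    }
    return "";
}
```

For the Bluey.pdf case, "Jump" is tried first and fails, then "all" matches the destination heading.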
|
Fixed in c76ea35 — three issues for malformed /XYZ destinations and image-heavy dest pages (e.g. Bluey.pdf "JUMP TO ALL (A-Z)" button,
Test build: https://files.jabref.org/sumatra/SumatraPDF-7.exe |
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
OH yes much much better for that odd ball and I am writing others :-) with my goto script so I wonder what happens there ah command was below as MuPDF uses that format and that type of Adobe XYZ is simply goto topleft with a 2x magnification. Result is |
AddLink.js-style PDFs overlay a FreeText annotation (the visible label) on top of a Link annotation (the clickable goto-target). The annotation tooltip "Free Text annotation. Ctrl+click to edit." was shown instead of the link preview, hiding the more useful information. Skip the annotation notification when an internal goto-link is at the same position; RefHover already prioritises kindPageElementDest in PickBestElement, so the link is what the user is really targeting. Rename IsCitationLink -> IsInternalLinkDest to match its actual scope (any internal-document link, not just citations).
The hover preview previously ignored the link's /XYZ zoom factor and rendered the destination at the popup's auto-fit zoom. Clicking the same link navigates to the page at the requested zoom, so the preview should match. Plumb the link's zoom through EngineMupdf (via fz_resolve_link_dest) into PageDestinationMupdf::destZoom, expose it via GetZoom2(), and pass it from Canvas to RefHoverSchedule / RefHoverOnTimer. When the link supplies a zoom hint, render the destination region anchored at the link's destY at that zoom (clamped to at least the document's current display zoom, so the preview never feels smaller than the page itself). Skip the DetectEntryBox auto-fit heuristic in that case — the link author already said how much to enlarge. Use the full page width starting at x=0 rather than cropping at destX: strict /XYZ semantics chop the left-most letters of the target lines, which makes for a poor preview.
|
Thanks for the AddLink.js repro — useful test vehicle (although I just used your Two follow-ups landed on this branch since:
New build: https://files.jabref.org/sumatra/SumatraPDF-8.exe
Is it OK that the preview exceeds the bound of the SumatraPDF window?
|
|
@koppor Most impressed by your tenacity, I am, to get it WRITE as you found with your first opening comment 11 years ago "Unfortunately it's too complicated. I don't see us implementing that." so not a "simple ask". But your attempts seem good enough to let loose in the wild. Often KJK may take a while to be able to review such a large change @kjk can this PR be included now and tweaked if needs be ? |
Register for WM_MOUSELEAVE while scheduling a RefHover preview and dismiss the popup on receipt, so the hover doesn't linger when the user moves to a tab strip or another application. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
😅💪 Regarding your math issue, I tried "blindly" to fix it: https://files.jabref.org/sumatra/SumatraPDF-9.exe |
Add DetectEquationBox: when the destination's nearest line carries a trailing "(N)" or "(N.M)" label past the page's right half and nothing sits further right on that line, render only the equation row plus a small vertical padding instead of the 200pt landscape slice (which swept in the following paragraph and the next equation below). Also dedupe the popup region detectors: - Extract IsCaptionLabelAt(text, textLen, idx) from three identical copies in LandscapeBox and DetectEntryBox. - Extract ClipToMediabox(box, mediabox) from three near-identical RectF clamp blocks. - Use str::IsDigit / str::IsWs and std::abs in DetectEquationBox. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
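The textual half of the equation-label test, a line ending in "(N)" or "(N.M)", can be sketched on a plain string. `EndsWithEquationLabel` is a hypothetical helper; the geometric conditions from the commit (label past the page's right half, nothing further right on the line) are not modelled here.

```cpp
#include <cctype>
#include <string>

// True if `line` ends with "(N)" or "(N.M)" where N and M are digit runs,
// ignoring trailing whitespace.
bool EndsWithEquationLabel(const std::string& line) {
    size_t end = line.size();
    while (end > 0 && std::isspace(static_cast<unsigned char>(line[end - 1]))) {
        end--;
    }
    if (end < 3 || line[end - 1] != ')') {
        return false;
    }
    size_t i = end - 1;  // index of ')'
    size_t digits = 0;
    while (i > 0 && std::isdigit(static_cast<unsigned char>(line[i - 1]))) {
        i--;
        digits++;
    }
    if (digits == 0) {
        return false;
    }
    if (i > 0 && line[i - 1] == '.') {
        // "(N.M)": require a digit run before the dot as well
        i--;
        digits = 0;
        while (i > 0 && std::isdigit(static_cast<unsigned char>(line[i - 1]))) {
            i--;
            digits++;
        }
        if (digits == 0) {
            return false;
        }
    }
    return i > 0 && line[i - 1] == '(';
}
```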
Move LandscapeBox / DetectEquationBox / DetectEntryBox out of RefHover.cpp
into a new translation unit RefHoverDetect.{h,cpp}. The detectors now take
the per-glyph text + coords arrays and mediabox directly instead of an
EngineBase pointer, which makes them callable from test_util without
pulling in the engine, HWND, or rendering layers.
Add src/utils/tests/RefHover_ut.cpp with synthetic-glyph regression tests
covering:
- sparse-text page → whole page
- destY < 0 fallback to LandscapeBox
- bracket-style bib fits to one entry, excludes the next
- equation label "(N)" at right column edge → tight band returned
- body-text "(N)" in left half → rejected
- non-line-trailing "(N)" → rejected
- null / empty / negative-destY inputs handled without crash
- LandscapeBox returned shape (full width, anchored, bounded height)
Wire RefHoverTest() into the test_util alphabetical run order.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Hoist the language-specific caption and heading-prefix word lists out of IsCaptionLabelAt / DetectEntryBox and into two file-static tables in RefHoverDetect.cpp. Adding support for a new language is now a one-line addition to the table instead of editing the call sites. Switch the case-fold step from ASCII-only `c + 32` to `towlower` so the match works for capitalised non-ASCII first letters (e.g. "Sección", "Capítulo"). Accented dict entries are stored already-lowercased (NFC), matching the form PDF text extraction produces. Seed the tables with Spanish/Portuguese/Italian (figura, tabla, algoritmo, sección, capítulo) and French (tableau, chapitre, algorithme) in addition to the existing English and German. Add FrenchCaptionDetected test exercising the new dictionary path. Re-sort RefHoverTest call list alphabetically. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
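A table-driven, case-folded prefix match of the kind described might look like the sketch below. The table name, its contents and `StartsWithCaptionWord` are assumptions for illustration (only ASCII entries are listed to keep the example locale-independent), and the real IsCaptionLabelAt presumably checks more than the leading word.

```cpp
#include <cwchar>
#include <cwctype>
#include <string>

// Hypothetical caption-word table; entries are stored lowercased. Supporting
// another language means adding one entry here.
static const wchar_t* const kCaptionWords[] = {
    L"figure",    L"table",   L"listing",     L"algorithm",  // English
    L"abbildung", L"tabelle", L"algorithmus",                // German
    L"figura",    L"tabla",   L"algoritmo",                  // Spanish / Portuguese / Italian
    L"tableau",   L"algorithme",                             // French
};

// True if the text at `idx` starts with one of the table words, comparing
// after towlower so capitalised first letters still match.
bool StartsWithCaptionWord(const std::wstring& text, size_t idx) {
    for (const wchar_t* word : kCaptionWords) {
        size_t n = std::wcslen(word);
        if (idx + n > text.size()) {
            continue;
        }
        bool match = true;
        for (size_t k = 0; k < n && match; k++) {
            match = static_cast<wchar_t>(std::towlower(text[idx + k])) == word[k];
        }
        if (match) {
            return true;
        }
    }
    return false;
}
```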
Twelve fields had pending* / displayed* prefixes carrying their grouping in the name. Nest them under two struct members instead so the grouping is structural — RefHoverSchedule writes s->pending.*, RefHoverOnTimer reads s->pending.* and writes s->displayed.*, and the wheel handlers read s->displayed.* alone. Top-level hwndPopup / bmp stay flat (touched by every code path). No behaviour change; mechanical s/pendingDestPage/pending.destPage/- style rename plus the struct definition. Updates the two call sites in Canvas.cpp. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

#128 (comment)
11 years later, Claude can help.
This update is entirely written by Claude, "only" driven by me in a code-and-fix manner.
Screenshot:
Summary
Hovering an internal-document link (citation, figure reference, footnote marker, etc.) shows a small popup rendering the destination region of the target page, so a [1] citation, a "Figure 2.3" reference, or a footnote can be inspected without leaving the current page. Inspired by PDFRefPreview.
Fixes #128
Fixes #4221
Demo
Hover any in-text reference for ~300 ms. The popup shows just the relevant entry / figure / footnote, cropped to its actual text box. Works for both 1-column and 2-column layouts, hanging-indent and numeric [N] bibliographies, and figure references (where the strip falls back to a generous region around the anchor).
Implementation
Detection (src/Canvas.cpp + src/RefHover.cpp):
- IsCitationLink accepts any internal kindPageElementDest link.
- DetectEntryBox does a per-glyph text+coords scan starting at destY, with several boundary signals: [N at the entry's first-line X, indent change back to that X, an X that matches neither first-line nor continuation X, vertical paragraph break, single-line-entry pattern, and column wrap.
- Falls back to a fixed strip when text-based detection finds nothing near destY, and to a taller strip when the detected box is suspiciously small (figure / diagram fragment).
- Result is rendered via engine->RenderPage and blitted into a popup, letterboxed to fit max bounds.
Engine API change (would appreciate a quick eyeball on this part):
- IPageDestination::GetDestPoint2() in src/EngineBase.h, default returns GetRect2() (so existing implementations are unaffected).
- PageDestGetDestPoint() helper.
- PageDestinationMupdf (in src/EngineMupdf.cpp) now stores destX/destY resolved via fz_resolve_link at construction; previously the resolved (x, y) from ResolveLink was discarded after the page number was extracted.
Plumbing:
- src/RefHover.cpp / src/RefHover.h.
- MainWindow gets a refHover member, destroyed in dtor.
- Canvas::OnMouseMove schedules / hides the popup; OnTimer renders.
- vs2022/*.vcxproj and *.vcxproj.filters updated.
Setting:
- EnableCitationHover advanced setting (default true, version 3.7), defined in cmd/gen-settings.ts. Generated files updated.
Test plan
- bun ./cmd/build.ts clean: 0 warnings, 0 errors.
- [N] citations: tight popup of the right column entry only.
- Bischoff and Küchlin, 2017 → bib entry Daniel Bischoff and Wolfgang Küchlin. Adapting…): single entry, correctly cropped.
- [N] follows): stops before author bios / footers thanks to the indent-change signal.
- ¹url \n ²url \n …): just the hovered footnote.
- Figure 2.2): falls back to a generous strip around the anchor so the diagram is visible.
- EnableCitationHover to false: existing popup is hidden, no further popups scheduled.
Notes
- tests-manual/extract-references/paper1/main.pdf from the JabRef repo. Not included in this PR.
- RefHover.cpp text-scan boundary heuristics are defensive (multiple signals), but they're heuristics; happy to iterate based on PDFs that don't crop cleanly.
🤖 Generated with Claude Code