
Easier extraction assessments by subsample selection & other small improvements #199

Open
alexculealt wants to merge 3 commits into feat/pick-model from feat/extraction-subsample


@alexculealt
Collaborator

This PR implements:

Fix simplified content breaking page UI due to line length

Because the simplified content is rendered in a <pre> element, the extracted page detail screen currently stretches to fit the longest line of text. The resulting horizontal scroll makes it hard to switch tabs and to make sense of the screen. This commit fixes the issue by wrapping the text so that it fits the normal layout width.
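For reviewers unfamiliar with the wrapping behaviour, here is a minimal sketch of one common way to make preformatted text wrap. The PR description does not say which styles the commit actually uses, so `wrappedPreStyle` and `applyWrap` are hypothetical names and the property values are an assumption.

```typescript
// Hypothetical sketch: the actual styles used in this PR are not shown.
type WrapStyle = { whiteSpace: string; overflowWrap: string; maxWidth: string };

const wrappedPreStyle: WrapStyle = {
  whiteSpace: "pre-wrap",   // keep newlines and indentation, but allow lines to wrap
  overflowWrap: "anywhere", // allow breaks inside long unbroken tokens such as URLs
  maxWidth: "100%",         // never grow wider than the containing layout
};

// Accepts anything with a compatible `style` object, e.g. an HTMLPreElement.
function applyWrap(el: { style: WrapStyle }): void {
  Object.assign(el.style, wrappedPreStyle);
}
```

With `pre-wrap`, line breaks in the extracted content are preserved, so the text still reads as preformatted while no longer dictating the page width.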

Add URL based state for the tab navigation of the crawl page detail screen

The tabs on the crawl page detail screen now read the selected tab from URL state. This makes it possible to link directly to a specific tab, and it lays the groundwork for the actions introduced in the next commit.
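To illustrate the idea, here is a minimal sketch of URL-driven tab selection. It assumes the tab is stored in a `tab` query parameter; the parameter name, the tab names, and the helpers `getActiveTab` and `withTab` are all hypothetical and may differ from what the commit does.

```typescript
// Hypothetical tab names; the real screen may expose different ones.
const TABS = ["simplified", "raw", "extracted"] as const;
type Tab = (typeof TABS)[number];

// Read the active tab from the URL, falling back when it is missing or unknown.
function getActiveTab(url: string, fallback: Tab = "simplified"): Tab {
  const value = new URL(url).searchParams.get("tab");
  return (TABS as readonly string[]).includes(value ?? "") ? (value as Tab) : fallback;
}

// Build a shareable link that opens the given tab, keeping other parameters.
function withTab(url: string, tab: Tab): string {
  const next = new URL(url);
  next.searchParams.set("tab", tab);
  return next.toString();
}
```

Validating the parameter against a known list means a stale or mistyped link still lands on a sensible default tab instead of an empty screen.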

Add sample tool to extractions

Adds a sample dialog that allows inspecting a subsample of an extraction. It brings together several views that are currently only available via the database or after an extraction has completed. With the sampling tool, we can assess performance on large catalogues (3000 items) without waiting for the extraction to complete, without running DB queries (which have their own complications), and without manually cross-checking data items (via dataset CSV exports) against their source pages. A short video below demos this feature in action:

sample-extractions-smaller.mov
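As a rough illustration of subsample selection, the sketch below draws `k` distinct items uniformly at random using a partial Fisher-Yates shuffle. This is an assumption about how such a sample could be drawn, not a description of the code in this PR; `sampleItems` and its parameters are hypothetical.

```typescript
// Hypothetical helper: draw up to k distinct items without modifying the input.
// `rng` must return numbers in [0, 1); it is injectable so samples can be reproduced.
function sampleItems<T>(items: readonly T[], k: number, rng: () => number = Math.random): T[] {
  const pool = items.slice();
  const n = Math.min(Math.max(k, 0), pool.length);
  for (let i = 0; i < n; i++) {
    // Pick a random index from the not-yet-chosen tail and swap it into position i.
    const j = i + Math.floor(rng() * (pool.length - i));
    const tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

// Example: inspect 50 items out of a 3000-item catalogue.
const catalogue = Array.from({ length: 3000 }, (_, i) => `item-${i}`);
const subsample = sampleItems(catalogue, 50);
```

Only the first `k` positions are shuffled, so the cost scales with the sample size rather than with the size of the catalogue beyond the initial copy.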

@alexculealt alexculealt requested a review from rsaksida March 23, 2026 11:25
@alexculealt alexculealt self-assigned this Mar 23, 2026
@alexculealt alexculealt force-pushed the feat/extraction-subsample branch from 674440e to 650b617 Compare March 23, 2026 11:29
@alexculealt alexculealt force-pushed the feat/extraction-subsample branch from 650b617 to 8fbf1e9 Compare March 23, 2026 17:50