Commit 60aea08
Article-mode oEmbed extraction for video pages + V1 payload parity
V1's article-mode flow on video pages (YouTube, Vimeo) produced a save
payload with an embedded video iframe, citation, title/author caption,
and `<meta name="AutoPageTagsCodes" content="Article" />` /
`<meta name="AutoPageTags" content="Article" />` tags that OneNote's
page renderer uses to recognize the result as an article-style clip
with playable embeds. V2 shipped without any of that machinery -- the
article mode just ran Readability over the YouTube DOM, which strips
iframes and produces a text-only result with no player and no
description. Users reported the regression on YouTube specifically;
the same gap applied to Vimeo as well.
The oEmbed standard provides exactly the shape we need (iframe `html`,
title, author_name, thumbnail_url, dimensions) without any
provider-specific scraping. Both YouTube and Vimeo publish CORS-enabled
oEmbed endpoints that the chrome-extension origin can fetch directly
under our existing `<all_urls>` host_permissions.
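A minimal sketch of the provider lookup and fetch described above, not the contents of `oembedExtractor.ts`. The names `PROVIDERS`, `buildOembedUrl`, and `fetchOembed` are hypothetical, and the hostname patterns are assumptions; the two endpoint URLs are the ones YouTube and Vimeo publish.

```typescript
// Subset of the oEmbed response fields mentioned in this commit message.
interface OembedResponse {
  type: string;
  html?: string;
  title?: string;
  author_name?: string;
  thumbnail_url?: string;
  width?: number;
  height?: number;
}

// Hypothetical provider table: hostname pattern plus published oEmbed endpoint.
interface OembedProvider {
  name: string;
  hostPattern: RegExp; // tested against URL.hostname (assumed pattern)
  endpoint: string;
}

const PROVIDERS: OembedProvider[] = [
  { name: "YouTube", hostPattern: /(^|\.)youtube\.com$/i, endpoint: "https://www.youtube.com/oembed" },
  { name: "Vimeo", hostPattern: /(^|\.)vimeo\.com$/i, endpoint: "https://vimeo.com/api/oembed.json" },
];

// Returns the oEmbed request URL for a supported page, or undefined on no match.
function buildOembedUrl(pageUrl: string): string | undefined {
  let host: string;
  try {
    host = new URL(pageUrl).hostname;
  } catch (e) {
    return undefined; // not a parseable URL
  }
  const provider = PROVIDERS.find((p) => p.hostPattern.test(host));
  if (!provider) {
    return undefined;
  }
  const query = new URLSearchParams({ url: pageUrl, format: "json" });
  return `${provider.endpoint}?${query.toString()}`;
}

// Resolves to undefined on no match, HTTP error, or network/parse failure,
// so the caller can fall through to another extraction path.
async function fetchOembed(pageUrl: string): Promise<OembedResponse | undefined> {
  const requestUrl = buildOembedUrl(pageUrl);
  if (!requestUrl) {
    return undefined;
  }
  try {
    const res = await fetch(requestUrl);
    if (!res.ok) {
      return undefined;
    }
    return (await res.json()) as OembedResponse;
  } catch (e) {
    return undefined;
  }
}
```

Returning `undefined` for every failure mode keeps the caller's fallback logic to a single check.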
Changes:
- New `src/scripts/contentCapture/oembedExtractor.ts` -- thin module
with a provider table (YouTube + Vimeo only, matching V1's
SupportedVideoDomains), hostname-pattern matching, fetch + JSON
parse, and a small `sanitizeProviderHtml` helper that strips
script-execution surfaces from provider-supplied HTML.
- `extractArticle` in renderer now tries oEmbed first; on no-match
or fetch failure it falls through to the existing Readability
path with zero behavior change.
- Preview vs save are decoupled:
  - Preview shows the `thumbnail_url` at the same 600x338 (16:9)
    box the saved iframe uses, with title / "author . provider"
    attribution, page description (og:description fallback chain
    same as bookmark mode), and a CSS-only play-glyph overlay
    when `type === "video"`. No iframe in preview because the
    renderer's `preview-frame` is sandboxed (allow-same-origin)
    and the YouTube/Vimeo player can't run JS inside it -- which
    is why earlier attempts produced a broken "Unable to execute
    JavaScript" placeholder.
  - Save uses the provider's iframe HTML (sanitized), with
    `data-original-src=<pageUrl>` injected and dimensions
    normalized to 600x338 -- the marker OneNote's renderer uses
    to recognize and render the embedded player on the saved
    page, matching V1's YoutubeVideoExtractor behavior exactly.
- PageMetadata plumbing: renderer threads a `pageMetadata` map
through the save port message; worker's `buildPage` iterates
and emits `<meta name="K" content="V" />` for each entry.
Mirrors V1's `OneNoteApi.OneNotePage.getPageMetadataAsHtml`
behavior. Article mode (both oEmbed and Readability paths)
populates `AutoPageTagsCodes=Article`, `AutoPageTags=Article`,
plus title/author/siteName (oEmbed) or
title/description/author/siteName/publishedTime (Readability,
matching V1 augmentationHelper).
- `buildPage` HTML output realigned to V1
`OneNoteApi.OneNotePage.getEntireOnml` shape: no `<!DOCTYPE>`,
`<html xmlns="http://www.w3.org/1999/xhtml" lang=<locale>>`
(no quotes around lang -- matches V1 output literally), locale
via `chrome.i18n.getUILanguage()`. Same change applied to the
parallel `distHtml` builder for distributed-PDF saves so all
save paths emit the same shape.
- Bookmark thumbnail size fallback restored: `imageToDataUrl`
initial-encode is PNG (good for icons/logos), with iterative
JPEG-quality step-down when the encoded data URL exceeds the
OneNote API per-MIME-part limit (~2MB minus padding). Matches
V1's deleted `DomUtils.adjustImageQualityIfNecessary` behavior
including the 0.1 step size. Surfaced because the user was
hitting "400 Maximum request size exceeded" on bookmark-mode
saves of YouTube pages whose 1280x720 og:image PNG-encoded to
~2.5MB.
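The thumbnail fallback in the last bullet can be sketched roughly as follows. This is an illustration, not the code of `imageToDataUrl`: the `Encoder` callback stands in for a canvas-backed encoder such as `canvas.toDataURL(mime, quality)`, and the exact byte limit and the starting JPEG quality of 1.0 are assumptions. Only the PNG-first encode and the 0.1 step come from the description above.

```typescript
// Stand-in for a canvas-backed encoder; returns a data URL string.
type Encoder = (mime: "image/png" | "image/jpeg", quality?: number) => string;

// Assumed limit: roughly 2MB minus some padding for the rest of the MIME part.
const MAX_DATA_URL_LENGTH = 2 * 1024 * 1024 - 16 * 1024;

function encodeWithinLimit(encode: Encoder, maxLength: number = MAX_DATA_URL_LENGTH): string {
  // First attempt: PNG, which suits icons and logos.
  let dataUrl = encode("image/png");

  // Step JPEG quality down by 0.1 per attempt while the result is too large.
  // Counting in integer tenths avoids floating-point drift (0.9, 0.8, ...).
  for (let tenths = 10; dataUrl.length > maxLength && tenths >= 1; tenths--) {
    dataUrl = encode("image/jpeg", tenths / 10);
  }

  // If even the lowest quality is over the limit, the last attempt is returned;
  // a real caller would need to decide how to handle that case.
  return dataUrl;
}
```

Small images never reach the JPEG loop, so icons keep lossless PNG encoding.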
Provider scope is intentionally narrow (YouTube + Vimeo only) to
match V1's effective surface and avoid accidentally enabling capture
on sites V1 never supported. V1 also handled Khan Academy via regex
scrape for embedded YouTube IDs in lesson-page HTML; that markup
likely no longer matches modern Khan Academy pages and is skipped
here per maintainer direction.
Verified manually: YouTube watch page and Vimeo video page produce
saved OneNote pages with the embedded player, title/author caption,
and og:description text below; non-matching domains fall through to
Readability with no regression; bookmark mode on YouTube saves
successfully without the 400 limit error.
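For reference, the `pageMetadata` emission described under Changes could look something like the sketch below. The function names are hypothetical, and the attribute escaping is an assumption about what a safe implementation would do, not a statement of what `buildPage` does.

```typescript
// Escape characters that would break out of a double-quoted attribute value.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Emit one <meta name="K" content="V" /> line per metadata entry,
// in the insertion order of the map's keys.
function metadataToMetaTags(pageMetadata: Record<string, string>): string {
  return Object.keys(pageMetadata)
    .map((name) => `<meta name="${escapeAttr(name)}" content="${escapeAttr(pageMetadata[name])}" />`)
    .join("\n");
}

// Example: the two article-mode tags named in this commit.
const articleTags = metadataToMetaTags({
  AutoPageTagsCodes: "Article",
  AutoPageTags: "Article",
});
```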
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 9cb7fa7 commit 60aea08
3 files changed
Lines changed: 380 additions & 17 deletions
File tree
- src/scripts
- contentCapture
- extensions/webExtensionBase
Lines changed: 23 additions & 5 deletions