Code review: CSP/XSS hardening, init race fix, i18n content extraction#23
Conversation
Modal: replace inline onclick/onerror by addEventListener, escape all technique/page-derived strings via escapeHtml() before innerHTML, harden the gradient color with sanitizeHexColor(). UI button: drop the buttonCreated early-return that prevented re-creation after deletion; build the button via DOM nodes instead of innerHTML. content.js: declare all @Keyframes (fadeIn, slideIn, slideInRight) at document-level so the suspicious-site alert and details modal get their animation. Wait for suspiciousSitesManager in checkDependencies so the suspicious-site check is never missed on first page load. contentExtractor: switch cleanText to Unicode property escapes (\p{L}\p{N}\p{M}) — the previous ASCII \w stripped Cyrillic / Chinese / Arabic content, which is exactly what the plugin is meant to analyse. Tighten detectPageType with word/stem boundaries and parse the URL so "businessnews.com" no longer triggers the "news" branch. Suspicioussitesmanager: gate all the verbose console.log calls behind a window.DIMA_DEBUG flag — the extension runs on <all_urls> and was spamming every host page's console. keywords: drop the duplicated alternative `(cette|cette|this)` in the TE0153 pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens the extension’s UI against CSP/XSS issues, improves initialization robustness (suspicious-sites race + shared animations), and fixes content extraction so non‑Latin scripts are preserved for analysis.
Changes:
- Reworked UI rendering to be more CSP-safe (no inline handlers) and added HTML escaping for analysis-derived values.
- Improved init robustness by defining shared keyframes at document scope and waiting for the suspicious-sites API before booting.
- Updated content extraction/page-type detection and fixed a minor keyword-regex typo.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
modules/uiManager.js |
CSP-safe event binding, DOM-built button, and escapeHtml() usage for modal content. |
modules/Suspicioussitesmanager.js |
Gates verbose logs behind window.DIMA_DEBUG. |
modules/contentExtractor.js |
Preserves non-Latin scripts in cleanText() and tightens detectPageType() classification. |
data/keywords.js |
Fixes a duplicated alternation token in a regex pattern. |
content.js |
Defines animation keyframes globally and makes init wait for checkSuspiciousSite. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| <div style="background: white; padding: 16px; border-radius: 10px; border-left: 4px solid #e67e22; box-shadow: 0 2px 8px rgba(0,0,0,0.08);"> | ||
| <div style="display: flex; justify-content: between; align-items: start; margin-bottom: 8px;"> | ||
| <div style="flex: 1;"> |
There was a problem hiding this comment.
Bonne pioche, merci. justify-content: between est en effet rejeté → remplacé par space-between aux deux occurrences (uiManager.js:814,827). J'en ai profité pour normaliser align-items: start → flex-start partout (les keywords nus du box-alignment module sont moins largement supportés). Corrigé dans 80c09a2.
| matchWord(pathAndQuery, "article") || | ||
| matchStem(pathAndQuery, "actualit") | ||
| ) return "news"; | ||
|
|
||
| if (matchWord(hostname, "blog") || matchWord(pathAndQuery, "blog")) return "blog"; |
There was a problem hiding this comment.
Excellente remarque. matchWord("article") exige une frontière non-alphanumérique en fin → /articles/ ne matchait pas. Idem pour /blogs/. J'ai basculé article et blog (et le blog du hostname) sur matchStem dans 80c09a2. Smoke test:\n\n\n/articles/foo => news\n/article/foo => news\n/blogs/post => blog\n/blog/post => blog\nblogs.example.com => blog\nbusinessnews.example.com => general (non-régression)\n
| @keyframes fadeIn { | ||
| from { opacity: 0; } | ||
| to { opacity: 1; } | ||
| } | ||
| @keyframes slideIn { | ||
| from { transform: translateY(30px); opacity: 0; } | ||
| to { transform: translateY(0); opacity: 1; } | ||
| } | ||
| @keyframes slideInRight { |
There was a problem hiding this comment.
Très juste, c'est exactement le genre de collision discrète qui aurait été pénible à diagnostiquer. Les quatre keyframes globaux sont préfixés dima* dans 80c09a2:\n\n- fadeIn → dimaFadeIn\n- slideIn → dimaSlideIn\n- slideInRight → dimaSlideInRight\n- L'ancien dimaFadeIn (zoom du bouton flottant) → dimaFadeInScale pour libérer le nom court au profit du modal.\n\nTous les animation: ... dans uiManager.js sont mis à jour (5 occurrences).
Three issues raised by the Copilot review of PR #23. uiManager — invalid CSS values `justify-content: between` is not a valid value (CSS ignores the declaration). Two occurrences in the techniques list of showModal() changed to `space-between`. Also normalised every `align-items: start` to the canonical `flex-start` for broader browser support — the bare `start` keyword is from the newer box-alignment module and is inconsistently supported. contentExtractor — matchStem for article/blog detectPageType used matchWord("article") and matchWord("blog"), which required a non-alphanumeric boundary after the token. That broke for /articles/foo and /blogs/foo — very common URL segments — and silently classed those pages as "general", shifting their contextual technique weights. Switched both to matchStem (also for the `blog` hostname token, e.g. blogs.example.com). content.js + uiManager — prefix shared keyframes fadeIn / slideIn / slideInRight live in the document-wide @Keyframes namespace. On a host page that defines its own animations with the same names, ours would clobber theirs (or vice-versa) and produce visible regressions outside the extension. Renamed to dimaFadeIn / dimaSlideIn / dimaSlideInRight (and the existing badge animation dimaFadeIn -> dimaFadeInScale to free the simpler name for the modal fade). Updated every `animation: ...` reference in uiManager. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| const button = document.createElement('div'); | ||
| button.id = 'dima-btn'; |
There was a problem hiding this comment.
Bien vu. Le <div> est gardé (pour ne pas avoir à neutraliser les styles UA d'un <button> natif) mais doté d'une vraie sémantique de bouton dans 63a7651:
role="button"+tabindex="0"→ focusable + reconnu par les lecteurs d'écranaria-labelqui annonce le score, le niveau et l'action (Activer pour ouvrir le rapport détaillé)- handler
keydownqui matcheEnter,(Space) etSpacebar(legacy) avecpreventDefaultsur Space pour éviter que la page hôte ne scrolle - handlers
focus/blurqui dessinent un outline blanc (le user-agent ne stylise pas lesdiv[role=button]par défaut) - le 🧠 décoratif passe en
aria-hidden="true"pour ne pas être verbalisé deux fois
Couvert par 6 tests dans tests/accessibility.test.js côté PR #24.
| window.CONTEXT_PATTERNS && | ||
| window.ContentExtractor && | ||
| window.TechniqueAnalyzer && | ||
| window.UIManager | ||
| window.UIManager && | ||
| typeof window.checkSuspiciousSite === "function" |
There was a problem hiding this comment.
Touché, et c'est la cause racine plus que le symptôme. Le setTimeout(100) dans Suspicioussitesmanager.js était lui-même inutile : en MV3 avec run_at: document_end et un seul bloc content_scripts listé en ordre dans manifest.json, les fichiers s'exécutent en séquence — à l'exécution de ce module les bases (copycopDomains, …) sont déjà déclarées.
Dans 63a7651 :
Suspicioussitesmanager.js: suppression dusetTimeout(100).window.checkSuspiciousSiteest publié de manière synchrone, donccontent.jsne touche jamais la boucle de retry en fonctionnement normal.content.js: les troisconsole.loginfo-level (Script chargé, message de retry,Analyseur initialisé) passent derrièrewindow.DIMA_DEBUG. Lesconsole.error(échec de dépendances, exception d'init) restent inconditionnels.
Verrouillé par 2 tests dans tests/syncInit.test.js côté PR #24 : pas de setTimeout( dans le bootstrap, et window.checkSuspiciousSite est une fonction immédiatement après le chargement du module.
contentExtractor.test.js — add three cases that exercise the matchStem switch on `article` and `blog`: - /articles/foo -> news - /blogs/post -> blog - blogs.example.com -> blog animations.test.js — new file. Scans content.js for @Keyframes declarations and uiManager.js for `animation:` references, asserting that every name begins with `dima`. Prevents a future change from re-introducing a generic keyframe name (fadeIn, slideIn, ...) that would collide with the host page's CSS namespace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
content.js / Suspicioussitesmanager.js — kill the artificial init delay
The previous fix expanded checkDependencies() to wait for
window.checkSuspiciousSite, but that symbol was published inside a
setTimeout(100ms) callback. Net effect: every page hit the retry loop
for at least one tick and emitted a "Attente du chargement…" log,
which the extension should not be doing on <all_urls>.
Root cause was the setTimeout itself: in MV3 with `run_at:
document_end` and a single content_scripts entry, JS files run in
manifest order. By the time Suspicioussitesmanager.js executes, every
database global is already defined — the 100 ms delay was just
inherited from an older loading model. Removed; the manager now
initialises synchronously, content.js's first checkDependencies()
call passes, and the retry loop is back to being a pure safety net.
Also gated the two remaining info-level console.log calls in
content.js ("Script chargé", "Analyseur initialisé", retry message)
behind window.DIMA_DEBUG. Errors stay visible unconditionally.
uiManager.js — keyboard-accessible floating badge
The score badge was a <div> with a click handler — not focusable,
not Enter/Space-activable, invisible to screen readers. Added
role="button", tabindex="0", aria-label that reads the score and
level and explains what activation does. Wired a keydown handler
that responds to Enter and Space (preventDefault on Space to avoid
scrolling the host page) and visible focus/blur outline so keyboard
users see the focus ring (UA does not style div[role=button]). The
brain emoji gets aria-hidden so screen readers don't speak it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
contentExtractor.test.js — add three cases that exercise the matchStem switch on `article` and `blog`: - /articles/foo -> news - /blogs/post -> blog - blogs.example.com -> blog animations.test.js — new file. Scans content.js for @Keyframes declarations and uiManager.js for `animation:` references, asserting that every name begins with `dima`. Prevents a future change from re-introducing a generic keyframe name (fadeIn, slideIn, ...) that would collide with the host page's CSS namespace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
contentExtractor.test.js — add three cases that exercise the matchStem switch on `article` and `blog`: - /articles/foo -> news - /blogs/post -> blog - blogs.example.com -> blog animations.test.js — new file. Scans content.js for @Keyframes declarations and uiManager.js for `animation:` references, asserting that every name begins with `dima`. Prevents a future change from re-introducing a generic keyframe name (fadeIn, slideIn, ...) that would collide with the host page's CSS namespace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Code review pass on the DIMA plugin. Five files touched, no behavior change in the happy path — fixes target robustness, CSP compatibility on host pages, and the fact that
cleanTextwas silently stripping the very scripts (Cyrillic, Chinese, Arabic) the plugin is built to analyse.UI / Security
onclick=/onerror=handlers were blocked by strict CSPs (GitHub, Twitter, …) — the "Fermer" button could become inert on those sites. Replaced byaddEventListenerbound to explicit IDs.escapeHtml()and used it for everytechnique.{nom,index,description,tactic,phase,confidence},matchedKeywords[].keyword,logoUrland page-derived value injected throughinnerHTML. The technique DB is project-controlled but the previous template-literal interpolation made any future<in a description an XSS.createButtonnow assembles the floating badge from DOM nodes rather thaninnerHTML, and appliessanitizeHexColortoriskColorso an invalid hex can't produce a NaN gradient.buttonCreatedre-creation bug: removed the early-return that prevented re-creation after the button had been deleted (first call worked, all later calls left an empty corner).Init / race
@keyframesdeclared globally (content.js):fadeIn,slideIn,slideInRightare now defined once at document-level. They were previously inside the modal's inline<style>, so the suspicious-site alert and details modal — which can open before the analysis modal — animated with no keyframes available.checkDependencieswaits for the suspicious-sites manager: that module initialises via a 100 mssetTimeout, so on the very first page the suspicious-site lookup was silently falling back to{isSuspicious:false}.content.jsnow waits forwindow.checkSuspiciousSitebefore booting the analyser.Content extraction
cleanTextkeeps non-Latin scripts (modules/contentExtractor.js): the old regex used\w(ASCII) and stripped Cyrillic / Chinese / Arabic content. Switched to Unicode property escapes (\p{L}\p{N}\p{M}) — the plugin targets Russian/Chinese disinformation networks, so this was a functional gap, not a nit.detectPageTypetightened: now parses the URL and uses word boundaries (businessnews.comis no longer "news") with stems where useful (productsstill counts as commerce,actualitésstill as news, social sites matched by hostname rather than substring).Noise / housekeeping
Suspicioussitesmanager.jslogs gated behindwindow.DIMA_DEBUG. The extension runs on<all_urls>and was printing ~20 lines into every host page's console.data/keywords.js:(?:cette|cette|this)→(?:cette|this)in the TE0153 pattern.Test plan
node --checkon all modified filesContentExtractor.cleanText('Hello мир 你好 العالم <script>')returnsHello мир 你好 العالم script(Latin/non-Latin preserved, HTML chars stripped)detectPageTypetable-test across 14 URL shapes (news/blog/social/commerce/general edge cases) — all passhttps://github.com) — confirm modal buttons still respond🤖 Generated with Claude Code