Automated website checker for Google AdSense review requirements. Detects "low value content" — the #1 rejection reason. Supports content sites, tool sites, game sites, video sites, and reference sites with AI-powered topic analysis, 5-dimension content quality scoring, and stratified sampling.
npm install -g @cloudcreate/adsense-check
Or run directly without installing:
npx @cloudcreate/adsense-check https://example.com
Check your current version:
adsense-check --version
Update to the latest version:
npm update -g @cloudcreate/adsense-check
Or with npx, always use the latest version:
npx @cloudcreate/adsense-check@latest https://example.com
Option 1: Config file
Run npx adsense-check init to generate .adsense-check.yaml with built-in defaults. CLI flags override config values.
# .adsense-check.yaml
maxCrawl: 50
maxPages: 50
maxContent: 20
sampleMin: 20
sampleRatio: 0.2
concurrency: 5
timeout: 30000
lang: "en"
output: ~/.adsense-check/reports
ai: true
expert: false
fastModel:
  apiKey: ""
  apiBase: ""
  model: ""
expertModel:
  apiKey: ""
  apiBase: ""
  model: ""
For global defaults, use --global: npx adsense-check init --global (writes to ~/.adsense-check.yaml).
Option 2: Environment file (AI only)
cp .env.example .env
# Edit .env:
# AI_API_KEY=sk-xxx
# AI_API_BASE=https://api.deepseek.com
# AI_MODEL=deepseek-chat
Option 3: Command-line flag
adsense-check https://example.com --ai --api-key sk-xxx...
Priority: CLI flags > .adsense-check.yaml > ~/.adsense-check.yaml > built-in defaults.
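The priority chain can be pictured as a layered lookup in which the first source that defines a key wins. The following is a minimal Python sketch, assuming each source has already been parsed into a dict. The function name `resolve_config` and the sample values are hypothetical and are not part of the tool.

```python
from collections import ChainMap

# A few built-in defaults, copied from the sample .adsense-check.yaml.
DEFAULTS = {"maxCrawl": 50, "concurrency": 5, "timeout": 30000, "lang": "en"}

def resolve_config(cli_flags, project_cfg, global_cfg, defaults=DEFAULTS):
    """Merge settings so earlier sources win: CLI > project > global > defaults."""
    # Drop unset (None) entries so an absent CLI flag does not mask a lower layer.
    layers = [
        {key: value for key, value in layer.items() if value is not None}
        for layer in (cli_flags, project_cfg, global_cfg, defaults)
    ]
    # ChainMap looks keys up in order, so the first layer that has a key wins.
    return dict(ChainMap(*layers))

resolved = resolve_config(
    cli_flags={"lang": "zh", "timeout": None},        # e.g. only -l zh was passed
    project_cfg={"timeout": 60000},                   # ./.adsense-check.yaml
    global_cfg={"timeout": 10000, "concurrency": 2},  # ~/.adsense-check.yaml
)
print(resolved)
```

Here `lang` comes from the CLI, `timeout` from the project file, `concurrency` from the global file, and `maxCrawl` from the defaults.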
# Full check with AI analysis (AI enabled by default)
adsense-check https://example.com
# Full check with expert AI assessment (auto-enabled when configured)
adsense-check https://example.com --expert
# Disable AI for mechanical checks only
adsense-check https://example.com --no-ai
# JSON output (for programmatic use)
adsense-check https://example.com --json
# Localized output
adsense-check https://example.com -l zh
# Only detect site type and topic
adsense-check https://example.com --detect-only
# Single-page value analysis (legacy, requires site URL)
adsense-check https://example.com --page https://example.com/some-page
Analyze a single page with AI five-dimension scoring:
# Single-page AI value scoring (no site context)
adsense-check page https://example.com/some-page
# Check relevance against site topic (auto-extracts origin from page URL)
adsense-check page https://example.com/some-page -r
# Override site URL for topic detection (cross-site check, local dev)
adsense-check page https://example.com/some-page -r --site http://localhost:3000/
# With Chinese output
adsense-check page https://example.com/some-page -l zh
# JSON output
adsense-check page https://example.com/some-page --json
Without -r/--relevance, the page is scored in isolation and Relevance will always be high (the page defines its own topic). With -r, the site homepage is auto-extracted from the page URL (https://example.com/blog/post → https://example.com) and crawled first to detect the site topic, then the page's Relevance score reflects alignment with that theme. Use --site to override the auto-extracted origin for cross-site checks or local development.
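The origin extraction described above amounts to keeping the scheme and host of the page URL. A small Python sketch of that step follows. The helper name `site_origin` is hypothetical, and the tool's own implementation may differ.

```python
from urllib.parse import urlsplit

def site_origin(page_url: str) -> str:
    """Return scheme://host[:port] for a page URL, dropping path, query, and fragment."""
    parts = urlsplit(page_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    return f"{parts.scheme}://{parts.netloc}"

print(site_origin("https://example.com/blog/post"))
```

A port is kept as part of the origin, which is why a local development server such as http://localhost:3000/ can be passed to --site.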
Detect site type and topic from the homepage:
# Detect site type and topic
adsense-check topic https://example.com
# Force site type, skip AI detection
adsense-check topic https://example.com --type game
# JSON output
adsense-check topic https://example.com --json
Reports are auto-saved to ~/.adsense-check/reports/<domain>-<timestamp>.json and ~/.adsense-check/reports/<domain>-<timestamp>.md.
Quick check of site-wide hard requirements — no content page crawl, no AI. Completes in ~10 seconds.
# Check hard requirements (required pages, robots.txt, sitemap, ads.txt, policy keywords)
adsense-check site https://example.com
# JSON output
adsense-check site https://example.com --json
# Chinese output
adsense-check site https://example.com -l zh
Quick check of homepage quality: H1, internal links, load speed, viewport, mobile UX. Requires Playwright for DOM measurements.
# Check homepage quality
adsense-check home https://example.com
# JSON output
adsense-check home https://example.com --json
# Chinese output
adsense-check home https://example.com -l zh
Automatically classifies websites into supported types:
| Type | Description | Examples |
|---|---|---|
| Content | News, blogs, educational articles, guides | theexceltranslator.com |
| Tool | Online calculators, converters, generators | ishowspeedsaid.com |
| Game | Online games, game portals | popstone2.com |
| Video | Video sharing, video blogs, YouTube-style sites | — |
| Reference | Wiki, encyclopedia, glossary, knowledge base | ishowspeedsaid.com |
| Unsupported | Other types (e-commerce, social, etc.) | — |
AI analysis classifies the site type and topic, falling back to DOM signal detection when AI is unavailable. Use --type to override the detected type.
With AI enabled (default), the tool analyzes the homepage to determine:
- Topic: What the site is about (e.g., "online match-3 puzzle games")
- Description: One-line summary of the site's purpose
- Type: content / tool / game / video / reference / unsupported
Use --no-ai to skip AI analysis. The expert model is auto-enabled when AI_EXPERT_MODEL or AI_EXPERT_API_KEY is configured with a different model than the fast model; pass --no-expert to disable it.
AI requests automatically retry up to 3 times with exponential backoff on failure.
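For readers unfamiliar with the pattern, retry with exponential backoff can be sketched in a few lines of Python. This is an illustration only: the function name `call_with_retry`, the 1-second base delay, and the doubling factor are assumptions, since the text states only the retry count and the backoff strategy.

```python
import time

def call_with_retry(request, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call request(); on failure, retry up to `retries` times, doubling the wait each time."""
    for attempt in range(retries + 1):  # one initial attempt plus `retries` retries
        try:
            return request()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            # Assumed schedule: 1s, 2s, 4s with the default base_delay.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the schedule can be checked in a test without actually waiting.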
Each page is evaluated by AI on five dimensions (0-10):
| Dimension | Description |
|---|---|
| Value | Does the page provide real, substantive information? |
| Originality | Is the content original (not scraped/AI-generated/copied)? |
| Relevance | How relevant is the page to the site's topic? |
| Compliance | Does the content comply with AdSense policies? |
| Translation | How well is the content translated into its declared language? |
Page score = geometric mean of all five dimensions. Any dimension at 0 drives the overall score to 0, and a single low dimension pulls the score down more than it would in an arithmetic average.
Site score = page-type weighted average across all analyzed pages (homepage and content pages have the highest weight).
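The effect of the geometric mean is easiest to see with numbers. A short Python sketch, where the function name `page_score` is illustrative rather than taken from the tool:

```python
import math

def page_score(value, originality, relevance, compliance, translation):
    """Geometric mean of the five 0-10 dimension scores."""
    dims = [value, originality, relevance, compliance, translation]
    return math.prod(dims) ** (1 / len(dims))

balanced = page_score(8, 8, 8, 8, 8)
one_weak = page_score(10, 10, 10, 10, 2)  # the arithmetic mean would be 8.4
one_zero = page_score(10, 10, 10, 10, 0)
print(round(balanced, 2), round(one_weak, 2), round(one_zero, 2))
```

A page with four perfect scores and a single 2 lands at about 7.2 rather than 8.4, and a single 0 zeroes the page regardless of the other dimensions.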
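A page-type weighted average might look like the following Python sketch. The weight values and the name `site_score` are hypothetical: the text says only that homepage and content pages carry the highest weight, not what the actual weights are.

```python
# Hypothetical weights, chosen only so that homepage and content pages weigh most.
PAGE_TYPE_WEIGHTS = {"homepage": 3.0, "content": 3.0, "listing": 1.0, "required": 0.5}

def site_score(pages, weights=PAGE_TYPE_WEIGHTS, default_weight=1.0):
    """Weighted average of page scores; `pages` is a list of (page_type, score) pairs."""
    total_weight = sum(weights.get(page_type, default_weight) for page_type, _ in pages)
    if total_weight == 0:
        return 0.0  # no pages analyzed
    weighted_sum = sum(weights.get(page_type, default_weight) * score
                       for page_type, score in pages)
    return weighted_sum / total_weight

pages = [("homepage", 9.0), ("content", 8.0), ("content", 7.0), ("required", 4.0)]
print(round(site_score(pages), 2))
```

With these assumed weights, the weak required page lowers the site score far less than it would in a plain average of the four pages.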
Language is extracted from <html lang> and <meta http-equiv="content-language"> attributes. The translation dimension checks whether page content matches its declared language, flagging mixed-language content and machine-translation artifacts. English pages are auto-scored 10.
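Reading the declared language from those two places can be sketched with Python's standard HTML parser. This is an illustration only. The class and function names are hypothetical, and preferring `<html lang>` over the meta tag is an assumption, since the text does not say which one wins.

```python
from html.parser import HTMLParser

class DeclaredLanguage(HTMLParser):
    """Collect the language declared by <html lang> and the content-language meta tag."""

    def __init__(self):
        super().__init__()
        self.html_lang = None
        self.meta_lang = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values may be None.
        attr = dict(attrs)
        if tag == "html" and attr.get("lang"):
            self.html_lang = attr["lang"].strip().lower()
        elif tag == "meta" and (attr.get("http-equiv") or "").lower() == "content-language":
            self.meta_lang = (attr.get("content") or "").strip().lower() or None

def declared_language(html):
    parser = DeclaredLanguage()
    parser.feed(html)
    # Assumption: <html lang> takes precedence over the meta tag.
    return parser.html_lang or parser.meta_lang

print(declared_language('<html lang="zh-CN"><head></head><body></body></html>'))
```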
The tool discovers URLs from sitemaps (including recursive sitemap indexes and robots.txt fallback) and homepage links, then performs stratified sampling:
- Always-crawl pages: homepage + required pages (about, privacy, contact, terms)
- URL classification: Each URL is classified by path pattern (content, game_detail, listing, reference_detail, etc.)
- Proportional budget allocation: Remaining crawl budget is distributed across page types proportionally to their weight and count
- Freshness sorting: Within each type group, URLs with date patterns in their paths are crawled first
This approach works on any site structure — it doesn't depend on listing pages or BFS discovery.
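The budget allocation and freshness sorting steps above can be sketched as follows. This is a simplified Python illustration, not the tool's code: the weights, the date regex, the newest-first ordering among dated URLs, and the choice to floor each share (leaving any remainder unallocated) are all assumptions.

```python
import re

# Hypothetical date pattern: /YYYY/MM/ or /YYYY/MM/DD/ anywhere in the path.
DATE_IN_PATH = re.compile(r"/(\d{4})/(\d{1,2})(?:/(\d{1,2}))?(?:/|$)")

def allocate_budget(groups, weights, budget):
    """Split `budget` across page types in proportion to weight times URL count."""
    shares = {t: weights.get(t, 1.0) * len(urls) for t, urls in groups.items()}
    total = sum(shares.values())
    if total == 0:
        return {t: 0 for t in groups}
    # Floor each share and never allocate more slots than a group has URLs.
    return {t: min(len(groups[t]), int(budget * shares[t] / total)) for t in groups}

def freshness_key(url):
    """Sort key: dated URLs first (newest first, an assumption), undated URLs last."""
    match = DATE_IN_PATH.search(url)
    if not match:
        return (1, (0, 0, 0))
    year, month, day = int(match.group(1)), int(match.group(2)), int(match.group(3) or 0)
    return (0, (-year, -month, -day))

def sample(groups, weights, budget):
    """Pick the freshest URLs of each type, up to that type's allocated budget."""
    plan = allocate_budget(groups, weights, budget)
    return {t: sorted(groups[t], key=freshness_key)[: plan[t]] for t in groups}

groups = {
    "content": ["/blog/a", "/blog/2024/05/b", "/blog/c",
                "/blog/2025/01/10/d", "/blog/e", "/blog/2023/12/f"],
    "listing": ["/category/x", "/category/y", "/category/z"],
}
weights = {"content": 2.0, "listing": 1.0}  # hypothetical page-type weights
picked = sample(groups, weights, budget=5)
print(picked)
```

With a remaining budget of 5, the content group receives 4 slots and the listing group 1, and the three dated blog URLs are chosen ahead of the undated ones.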
Three independent signals combine into the final score:
Composite = Page Value(VOT) × Site Quality/100 × Landing Page Quality/100
- Page Value (VOT): ∛(Value × Originality × Translation) — the core content quality signal, computed as a geometric mean of AI-evaluated dimensions across all content pages (excluding required/utility pages which don't need editorial content quality)
- Site-wide Quality: Pass rate of all hard requirements + content quality + UX categories. Acts as a multiplier — good infrastructure prevents discounting, but can't make mediocre content good
- Landing Page Quality: Pass rate of landing page–specific checks (H1, internal links, load speed, viewport, mobile overflow, homepage content). Also a multiplier
- Caps: Any page compliance < 6 → max composite 50; avg relevance < 6 → max composite 60
Compliance and relevance are excluded from the VOT mean because they're "safety" dimensions: nearly all pages score 10/10, so including them dilutes the signal from value/originality/translation. When they DO drop below threshold, the cap mechanism kicks in.
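The composite formula and the two caps can be checked with a short Python sketch. The function names are illustrative. Scaling the 0-10 VOT mean to 0-100 and rounding the result to an integer are assumptions inferred from the sample report, and how VOT is aggregated across pages is not specified here.

```python
def page_value(value, originality, translation):
    """VOT: cube root of the three 0-10 scores, scaled to 0-100 (scaling is assumed)."""
    return (value * originality * translation) ** (1 / 3) * 10

def composite(page_value_100, site_quality, landing_quality,
              min_compliance=10.0, avg_relevance=10.0):
    """Page Value x Site Quality/100 x Landing Page Quality/100, then apply the caps."""
    score = page_value_100 * site_quality / 100 * landing_quality / 100
    if min_compliance < 6:   # any page with compliance below 6
        score = min(score, 50)
    if avg_relevance < 6:    # average relevance below 6
        score = min(score, 60)
    return round(score)

# Figures from the sample report: 97 x 94/100 x 90/100.
print(composite(97, 94, 90))
```

Using the sample report's figures, this gives 82. The same inputs with one page at compliance 5 would be capped at 50.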
Each page receives a base score of 100/100, reduced only by AI quality assessment (AI warn → max 70, AI fail → 0). Content quality is assessed entirely through the AI VOT dimensions rather than structural heuristics like character count or content ratio.
Three-tier assessment:
| Method | Description |
|---|---|
| Rule-based | Mechanical estimate from composite score, hard status, AI site score |
| Fast model | AI reviews the full report and gives probability + reasons + actions |
| Expert model | Deeper analysis with --expert flag (uses a more capable model) |
Pages flagged with borderline compliance scores (3-5) receive a second-pass AI review to reduce false positives. The review is context-aware: informational or educational mentions of sensitive topics are not treated as violations.
adsense-check eval report.json
adsense-check eval report.json --expert
adsense-check eval report.json --json
Reads a previously saved JSON report and runs approval estimation without re-crawling.
Defaults come from .adsense-check.yaml or ~/.adsense-check.yaml. CLI flags override config values.
-v, --version               Show version
-j, --json                  Output JSON to stdout
-n, --max-crawl <n>         Total page crawl limit
-m, --page-limit <n>        Max structural pages for sampling pool
-c, --content-limit <n>     Max content pages to crawl
    --sample-min <n>        Min content pages to sample
    --sample-ratio <ratio>  Content page sampling ratio 0-1
    --ai                    Enable AI content quality analysis
    --no-ai                 Disable AI content quality analysis
    --expert                Enable expert AI summary
    --no-expert             Disable expert AI summary
-b, --concurrency <n>       AI batch concurrency
    --page <url>            Analyze single page value (5-dimension scoring)
-t, --timeout <ms>          Page load timeout
    --api-key <key>         AI API key
-o, --output <dir>          Report output dir
    --no-save               Skip auto-saving report
-l, --lang <lang>           Output language: en|zh
    --type <type>           Force site type: content|tool|game|video|reference
    --detect-only           Only detect site type/topic, skip full check
topic [options] <url>       Detect site type and topic from homepage
  -j, --json                Output JSON to stdout
  -t, --timeout <ms>        Page load timeout
      --api-key <key>       AI API key
  -l, --lang <lang>         Output language: en|zh
      --type <type>         Force site type, skip AI detection
site [options] <url>        Quick check: site-wide hard requirements
  -j, --json                Output JSON to stdout
  -t, --timeout <ms>        Page load timeout
  -o, --output <dir>        Report output dir
  -l, --lang <lang>         Output language: en|zh
home [options] <url>        Quick check: homepage quality
  -j, --json                Output JSON to stdout
  -t, --timeout <ms>        Page load timeout
  -o, --output <dir>        Report output dir
  -l, --lang <lang>         Output language: en|zh
init [options]              Generate .adsense-check.yaml config file
      --global              Write to home directory (~/.adsense-check.yaml)
page [options] <url>        Analyze a single page with AI five-dimension scoring
  -j, --json                Output JSON to stdout
  -t, --timeout <ms>        Page load timeout
      --api-key <key>       AI API key
  -l, --lang <lang>         Output language: en|zh
  -r, --relevance           Check relevance against site topic (auto-extracts origin)
      --site <url>          Override site URL for topic detection
eval <report>               Evaluate approval probability from existing JSON report
      --lang <lang>         Output language: en|zh
      --expert              Run expert model assessment
      --no-expert           Disable expert model assessment
      --json                Output JSON comparison
AdSense Review Report
URL: https://example.com
Time: 2026-05-08T15:00:00.000Z
Site type: Content
Topic: Excel translation reference — Provides Excel terminology translations for multiple languages.
Pages: 50, 50 AI-analyzed, confidence: high
Review Result
Composite Score: 82/100
┌─ Site Quality: 94/100
│ Landing Page: 90/100
│ Page Value: 97/100
│
│ 97 × 94/100 × 90/100 = 82
└─
Approval Probability
Rule-based: ~85% (confidence: high)
AI fast model: ~90% (deepseek-v4-flash)
AI expert model: ~88% (deepseek-v4-pro)
Site Quality Breakdown (94/100)
── Hard Requirements PASS
✔ Site Scale Good site size (194 pages)
✔ About Found About page (/about/)
...
Score: READY — All required items met
── Content Quality
✔ Page structure diversity OK (max similarity 42%)
✔ Content originality 41/100
...
── User Experience
✔ Mobile font size
✔ Heading structure
✔ Navigation elements
...
The Markdown report is saved alongside the JSON report with the same timestamp. It contains summary tables, dimension statistics, per-page details with 5-dimension scores, AI assessments, and improvement suggestions.
The JSON report contains the full structured data, including per-page details, AI assessments, topic info, sampling stats, and timing breakdown.
| Code | Meaning |
|---|---|
| 0 | No failures (READY or MOSTLY READY) |
| 1 | Has failures (NOT READY or NEEDS FIXES) |
| 2 | Runtime error |
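In a CI pipeline, the exit code is usually enough to gate a deploy. A minimal Python sketch of reading it follows. The helper names are hypothetical, and `run_check` assumes `adsense-check` is installed and on PATH.

```python
import subprocess

# Meanings taken from the exit code table above.
EXIT_MEANINGS = {
    0: "No failures (READY or MOSTLY READY)",
    1: "Has failures (NOT READY or NEEDS FIXES)",
    2: "Runtime error",
}

def describe_exit(code: int) -> str:
    """Translate an adsense-check exit code into its documented meaning."""
    return EXIT_MEANINGS.get(code, f"Unknown exit code {code}")

def run_check(url: str):
    """Run the CLI without saving a report and return (exit_code, meaning)."""
    proc = subprocess.run(["adsense-check", url, "--no-save"],
                          capture_output=True, text=True)
    return proc.returncode, describe_exit(proc.returncode)
```

A caller would typically treat 1 as a failed quality gate and 2 as an infrastructure problem worth retrying.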
MIT