Automated content publishing pipeline: Feishu docs → QVeris blog #4

@linfangw

Background

QVeris tech blog content is authored in Feishu (飞书) documents, then manually converted and published to this repository. This manual process is error-prone, slow, and doesn't scale as content volume grows.

Goal

Build an automated Claude Code workflow that takes a Feishu document tree (a parent doc and all its child pages) and publishes each page as a blog post to QVerisAI.github.io, so that it appears on the official QVeris website.

Workflow Overview

Feishu Doc (parent + child pages)
  │
  ├─ 1. Fetch & extract content (Feishu API / lark-cli)
  ├─ 2. Classify articles (category, tags)
  ├─ 3. Translate (en ↔ cn bilingual versions)
  ├─ 4. Convert to Markdown + generate frontmatter
  ├─ 5. Download & optimize images
  ├─ 6. Place into correct content directory
  ├─ 7. Create PR with preview
  └─ 8. Auto-merge & deploy to QVeris website

Detailed Requirements

1. Feishu Document Fetching

  • Accept a Feishu document URL or document token as input
  • Recursively fetch the parent document and all child/sub-pages
  • Extract rich content: text, headings, images, tables, code blocks, callouts
  • Handle Feishu-specific elements (mentions, @-references, embedded files)
  • Download all inline images and attachments
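
The recursive fetch above can be sketched as a tree walk. This is only an illustration: `fetch_title` and `fetch_child_tokens` are hypothetical callables standing in for whatever lark-cli or Feishu API calls end up being used, and they are not real API names. The `seen` set is an added safeguard against pages that link back to an ancestor.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, Optional


@dataclass
class DocNode:
    """One Feishu page plus the child pages found beneath it."""
    token: str
    title: str
    children: list["DocNode"] = field(default_factory=list)


def fetch_tree(
    token: str,
    fetch_title: Callable[[str], str],               # hypothetical: token -> page title
    fetch_child_tokens: Callable[[str], list[str]],  # hypothetical: token -> child tokens
    seen: Optional[set[str]] = None,
) -> DocNode:
    """Fetch a page and, recursively, every child page not visited yet."""
    seen = set() if seen is None else seen
    seen.add(token)
    node = DocNode(token=token, title=fetch_title(token))
    for child in fetch_child_tokens(token):
        if child not in seen:  # skip cycles and duplicates
            node.children.append(
                fetch_tree(child, fetch_title, fetch_child_tokens, seen)
            )
    return node


def walk(node: DocNode) -> Iterator[DocNode]:
    """Yield the parent first, then each subtree in document order."""
    yield node
    for child in node.children:
        yield from walk(child)
```

Passing the two fetchers in as arguments keeps the traversal testable without Feishu credentials.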

2. Article Classification

  • Auto-detect article category based on content analysis: Engineering, Research, Product, Announcement
  • Auto-generate relevant tags (e.g., agents, memory, infrastructure, protocol)
  • Detect whether the article should be marked featured: true, based on content significance
  • Set draft: false for production-ready content
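
Since the classification comes from content analysis, it may help to validate the result before it reaches the frontmatter. The sketch below assumes the classifier returns a plain dict with `category`, `tags`, `featured`, and `draft` keys. That shape, and the lowercase kebab-case tag normalization, are assumptions and not part of this issue's requirements.

```python
import re

# The four categories listed in this issue.
ALLOWED_CATEGORIES = ("Engineering", "Research", "Product", "Announcement")


def normalize_classification(raw: dict) -> dict:
    """Validate the category and tidy the tags of a classifier result."""
    category = str(raw.get("category", "")).strip().title()
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {raw.get('category')!r}")

    tags: list[str] = []
    for tag in raw.get("tags", []):
        # Assumed convention: lowercase, hyphen-separated, no duplicates.
        clean = re.sub(r"[^a-z0-9]+", "-", str(tag).lower()).strip("-")
        if clean and clean not in tags:
            tags.append(clean)

    return {
        "category": category,
        "tags": tags,
        "featured": bool(raw.get("featured", False)),
        "draft": bool(raw.get("draft", False)),
    }
```

Rejecting an unknown category early gives a clearer failure than a schema error during pnpm build.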

3. Bilingual Translation

  • If source document is Chinese → auto-translate to English
  • If source document is English → auto-translate to Chinese
  • Preserve technical terms (e.g., "Agent", "QVeris Protocol", "MCP") without translation
  • Maintain markdown formatting, code blocks, and image references across translations
  • Generate matching translationKey for en/cn pair linking
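
One possible way to keep technical terms and code blocks intact is to swap them for placeholders before translation and restore them afterwards. The sketch below is an assumption about how that could work, not a described part of the workflow. The `@@KEEPn@@` placeholder format is hypothetical, and the term list only contains the examples named above.

```python
import re

# Example terms from this issue; a real list would be longer.
PROTECTED_TERMS = ("QVeris Protocol", "Agent", "MCP")

FENCE = re.compile(r"```.*?```", re.DOTALL)
PLACEHOLDER = re.compile(r"@@KEEP(\d+)@@")


def mask(text: str, terms: tuple[str, ...] = PROTECTED_TERMS) -> tuple[str, list[str]]:
    """Replace fenced code blocks and protected terms with placeholders."""
    saved: list[str] = []

    def stash(match: re.Match) -> str:
        saved.append(match.group(0))
        return f"@@KEEP{len(saved) - 1}@@"

    # Code blocks first, so terms inside them are never touched separately.
    text = FENCE.sub(stash, text)
    # Longest term first, so "QVeris Protocol" wins over shorter overlaps.
    pattern = "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))
    text = re.sub(rf"\b(?:{pattern})\b", stash, text)
    return text, saved


def unmask(text: str, saved: list[str]) -> str:
    """Put the original code blocks and terms back after translation."""
    return PLACEHOLDER.sub(lambda m: saved[int(m.group(1))], text)
```

The translation step would then run on the masked text only, and `unmask` would be applied to its output.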

4. Markdown Conversion

  • Convert Feishu block structure to clean Markdown/MDX
  • Generate complete frontmatter:
    ---
    title: '...'
    description: '...'  # Auto-generated summary
    pubDate: '...'       # From Feishu doc creation/modification date
    heroImage: '...'     # First significant image or auto-selected
    category: '...'      # Auto-classified
    author: '...'        # From Feishu doc author
    tags: [...]          # Auto-generated
    featured: false
    draft: false
    translationKey: '...'
    ---
  • Use Callout component for Feishu callout blocks
  • Preserve code block language annotations
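
Frontmatter generation can be sketched with the standard library alone. The field order follows the template above. Serializing each value with `json.dumps` is a design choice made for this sketch: JSON scalars and arrays are also valid YAML, so quoting and escaping are handled without a YAML dependency, at the cost of double quotes instead of the single quotes shown in the template. The authoritative schema remains content.config.ts, which this sketch does not read.

```python
import json

# Field order taken from the frontmatter template in this issue.
FIELD_ORDER = (
    "title", "description", "pubDate", "heroImage", "category",
    "author", "tags", "featured", "draft", "translationKey",
)


def render_frontmatter(meta: dict) -> str:
    """Render a frontmatter block, failing loudly on missing fields."""
    missing = [key for key in FIELD_ORDER if key not in meta]
    if missing:
        raise KeyError(f"missing frontmatter fields: {missing}")

    lines = ["---"]
    for key in FIELD_ORDER:
        # ensure_ascii=False keeps Chinese titles readable in the file.
        lines.append(f"{key}: {json.dumps(meta[key], ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n"
```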

5. Image Processing

  • Download all images from Feishu CDN
  • Rename to descriptive filenames (e.g., blog-hero-{slug}.png)
  • Place in src/assets/ directory
  • Update markdown image references to relative paths
  • Select appropriate hero image for frontmatter
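
The renaming and reference rewriting could look roughly like this. Several details are assumptions: the first image is treated as the hero image, later images get a hypothetical `blog-{slug}-{n}` name, and the `../../../assets/` prefix is the relative path from src/content/blog/en/ or src/content/blog/cn/ to src/assets/. The function only plans the downloads; fetching the files is left out.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Markdown image syntax with a remote URL: ![alt](https://...)
IMAGE_REF = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def rewrite_image_refs(
    markdown: str, slug: str, prefix: str = "../../../assets/"
) -> tuple[str, dict[str, str]]:
    """Return rewritten markdown plus a {remote URL: local filename} map."""
    downloads: dict[str, str] = {}

    def repl(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        if url not in downloads:
            # Fall back to .png when the URL path carries no extension.
            ext = PurePosixPath(urlparse(url).path).suffix or ".png"
            if not downloads:
                name = f"blog-hero-{slug}{ext}"  # assumed: first image is the hero
            else:
                name = f"blog-{slug}-{len(downloads)}{ext}"
            downloads[url] = name
        return f"![{alt}]({prefix}{downloads[url]})"

    return IMAGE_REF.sub(repl, markdown), downloads
```

The returned map doubles as the download list, so each remote image is fetched once even when it is referenced several times.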

6. File Placement

  • English version → src/content/blog/en/{slug}.md
  • Chinese version → src/content/blog/cn/{slug}.md
  • Images → src/assets/
  • Filename slug derived from title (kebab-case, ASCII-safe)
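
A minimal sketch of the slug and path rules above. Stripping non-ASCII characters means a purely Chinese title yields an empty slug, so this sketch assumes the slug is derived from the English title and falls back to a hypothetical `untitled` placeholder otherwise.

```python
import re
import unicodedata
from pathlib import PurePosixPath


def slugify(title: str, fallback: str = "untitled") -> str:
    """Derive a kebab-case, ASCII-only slug from a title."""
    # Decompose accented letters, then drop everything outside ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return slug or fallback


def target_paths(slug: str) -> dict[str, PurePosixPath]:
    """Map each language to its content file, per the layout in this issue."""
    root = PurePosixPath("src/content/blog")
    return {lang: root / lang / f"{slug}.md" for lang in ("en", "cn")}
```

Using the same slug for both language versions also makes it a natural candidate for the shared translationKey, though that pairing is an assumption.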

7. PR Creation

  • Create a feature branch: content/{slug}
  • Commit all files (markdown + images)
  • Create PR with:
    • Summary of articles being published
    • Preview links (if available)
    • Checklist for manual review items
  • Run pnpm build to verify no errors before PR creation
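
The PR step might be scripted along these lines. The command order, the staged paths, and pushing to a remote named `origin` are assumptions. The plan is built as data first so it can be inspected or logged before anything runs.

```python
import subprocess


def plan_pr_commands(slug: str, title: str, body: str) -> list[list[str]]:
    """Build the commands for one content PR, in execution order."""
    branch = f"content/{slug}"
    return [
        ["git", "switch", "-c", branch],
        ["git", "add", "src/content/blog", "src/assets"],
        ["pnpm", "build"],  # verify the site builds before a PR exists
        ["git", "commit", "-m", title],
        ["git", "push", "-u", "origin", branch],  # assumes the remote is "origin"
        ["gh", "pr", "create", "--title", title, "--body", body, "--head", branch],
    ]


def run_plan(commands: list[list[str]]) -> None:
    """Run each command and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Because `check=True` raises on a non-zero exit code, a failing pnpm build stops the plan before the commit and PR steps.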

8. Deployment

  • After PR merge, GitHub Actions automatically deploys to GitHub Pages
  • Content appears on qveris.ai and qveris.cn

Implementation Approach

Option A: Claude Code Scheduled Task

  • Use mcp__scheduled-tasks to create a recurring check
  • Monitor a designated Feishu document for new/updated child pages
  • Auto-publish on detection
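
Detecting new or updated child pages could come down to comparing edit timestamps between two runs. The sketch assumes each page exposes a last-edited timestamp and that the previous run's snapshot is stored somewhere. Both are assumptions, and the dict shape is hypothetical.

```python
def pages_to_publish(previous: dict[str, int], current: dict[str, int]) -> list[str]:
    """Return tokens of pages that are new or edited since the last run.

    Both arguments map a page token to its last-edited timestamp.
    """
    return sorted(
        token
        for token, edited in current.items()
        if token not in previous or edited > previous[token]
    )
```

For example, with `previous = {"a": 100, "b": 200}` and `current = {"a": 100, "b": 250, "c": 50}`, the edited page `b` and the new page `c` are selected.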

Option B: Claude Code Skill

  • Create a /publish-blog skill that accepts a Feishu doc URL
  • One-shot execution: fetch → process → PR
  • Manual trigger with full automation

Option C: Hybrid

  • Skill for on-demand publishing (/publish-blog <feishu-url>)
  • Scheduled task for periodic sync of a content hub document

Technical Dependencies

  • lark-cli / Feishu API access (via existing lark-doc skill)
  • Claude Code for content analysis, classification, and translation
  • gh CLI for PR creation
  • Astro build toolchain (pnpm build) for validation

Acceptance Criteria

  • Given a Feishu document URL, the workflow fetches all child pages
  • Each article is auto-classified with category and tags
  • Bilingual (en + cn) markdown files are generated with matching translationKey
  • All images are downloaded, renamed, and correctly referenced
  • Frontmatter is complete and valid per content.config.ts schema
  • pnpm build succeeds with the new content
  • PR is created with descriptive title and summary
  • Published content renders correctly on qveris.ai blog listing and detail pages

Priority

High — This is a key enabler for scaling QVeris content operations.

Labels

enhancement (New feature or request)
