video-maker is a Codex skill for end-to-end Bilibili explainer-video production on Windows.
- `SKILL.md`: entrypoint and operating rules
- `scripts/bootstrap_project.py`: full new-project scaffold plus runtime-helper generation
- `scripts/bootstrap_video_project.py`: low-level content scaffold used by the full bootstrap wrapper
- `scripts/upgrade_project.py`: upgrade existing projects and rewrite runtime helpers
- `references/agent-orchestration.md`: simplified content-strategist -> script-writer -> narration-polisher content pipeline
- `references/bilibili-tech-explainer-workflow.md`: beat-driven explainer workflow
- `references/narration-polisher.md`: responsibilities and recommended prompts for the narration-polisher subagent
- remotion-best-practices skill: default engine for video composition, animation, subtitles, audio mounting, timeline, and rendering
- web-design-engineer skill: optional visual-design support for layout and motion taste; production output stays in Remotion
- `references/quiet-glass-lab-v3.md`: iOS 18-inspired frosted-glass brand prompt pack
- `references/quiet-glass-lab/base.css`: neutral render foundation only, not a visual theme or module template
- `references/chinese-voice-rules.md`: narration consistency rules
- `references/video-acceptance-rubric.md`: final QA gates, including meaning-gain checks
- `publish/cover_prompt.md` in generated projects: imagegen-ready Bilibili cover prompt template
- `content/visual_qa_report.json` in generated projects: Remotion frame-sample and cover visual-repair report
- `references/imagegen-2-visual-playbook.md`: GPT Image 2 / imagegen production rules for direct-rendered visuals, with visual QA and regeneration
- `scripts/doctor.py`: environment checker
Default flow:
- coordinator creates the project, runs simple commands, assigns subagent task packets, and makes final go / no-go decisions.
- content-strategist writes the problem, audience, opening, meaning, outline, depth, detail, and evidence contracts.
- script-writer writes `content/script_draft.json`.
- narration-polisher polishes the finished draft into natural Bilibili narration in `content/narration_polish.json`.
- coordinator compiles shot intents and render-ready segments from the approved narration.
- visual-architect designs Remotion scenes and uses imagegen for visual assets, exploded diagrams, animation assets, and the cover base.
- visual-qa-fixer inspects Remotion frame samples and `publish/cover.png`, then fixes Remotion code, assets, or cover text, or regenerates imagegen outputs until visual blockers are gone.
- production-engineer owns Qwen master-track audio, Remotion props, timeline, render, and export.
- acceptance-reviewer checks many real screenshots, reads key Remotion code and content contracts, and listens to required audio samples.
- Publish handoff runs only after video, cover, metadata, visual QA, and acceptance all pass.
Stable defaults:
- new projects do not use HTML slide rendering; Remotion scene code and props appear after content and shot intents are locked
- the main agent is a coordinator, not a maker; substantial content, visuals, voice, assembly, and review are delegated
- video composition, animation, subtitles, audio mounting, timeline, and rendering default to Remotion; the visual system is a brand prompt pack, not fixed scene templates
- visual assets and Bilibili covers default to visual-architect + imagegen, with final rendered-pixel repair owned by visual-qa-fixer
- visual production records the benchmark and key visual inside `visual_asset_plan`, so inspiration, cover promise, reusable key visual, imagegen assets, and QA stay connected without extra contract files
- Chinese narration prefers local Qwen first
- narration uses a single master-track path with a hard timeout and a `voice_jobs/qwen_master_status.json` status manifest
- full-audio approval requires `voice_profile.full_audio_review_status = passed`; opening or midpoint spot checks are only supporting notes
- the Chinese pace target is calibrated to the 2026-04-25 CUDA video: about 260 CJK chars/min, with duration mismatch treated as non-blocking when content and QA pass
- content planning now starts from `problem_contract`, then writes evidence-backed contracts before any render-ready segment exists
- finished content drafts are always sent through narration-polisher to remove AI machine-tone and lower ordinary viewers' comprehension cost
- Bilibili cover generation happens after title lock and local QA, not during early content phases
- `prepare_publish_job.py` now emits `cover_path` and refuses publish handoff when the cover file is missing
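The pace target above (about 260 CJK chars/min) converts directly into a duration estimate for a narration script. This is an illustrative sketch, not a helper shipped with the skill; the function name and the CJK range used are assumptions:

```python
import re

# Basic CJK Unified Ideographs plus Extension A
CJK_RE = re.compile(r"[\u4e00-\u9fff\u3400-\u4dbf]")
PACE_CJK_PER_MIN = 260  # calibrated to the 2026-04-25 CUDA video

def estimate_duration_seconds(narration: str) -> float:
    """Estimate spoken duration from the CJK character count."""
    cjk_chars = len(CJK_RE.findall(narration))
    return cjk_chars * 60.0 / PACE_CJK_PER_MIN
```

Under this estimate a 3900-character Chinese script runs about 15 minutes; per the default above, a mismatch against the planned duration is advisory, not a blocker.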
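The audio and cover gates listed above can be read as one pre-publish check. The sketch below is hypothetical: the file paths follow the defaults named here, but the `check_publish_gates` helper and the `voice_profile.json` location are assumptions, not part of the skill's actual scripts:

```python
import json
from pathlib import Path

def check_publish_gates(project: Path) -> list[str]:
    """Return a list of blockers; an empty list means publish handoff may proceed."""
    blockers = []

    # Master-track status manifest written by the Qwen audio step
    status_file = project / "voice_jobs" / "qwen_master_status.json"
    if not status_file.is_file():
        blockers.append("missing voice_jobs/qwen_master_status.json")

    # Full-audio approval flag; spot checks alone are not sufficient
    profile_file = project / "voice_profile.json"  # hypothetical location
    profile = json.loads(profile_file.read_text(encoding="utf-8")) if profile_file.is_file() else {}
    if profile.get("full_audio_review_status") != "passed":
        blockers.append("full_audio_review_status is not 'passed'")

    # Cover file must exist before publish handoff
    cover = project / "publish" / "cover.png"
    if not cover.is_file():
        blockers.append("missing publish/cover.png")

    return blockers
```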
For real Bilibili publishing, pair this skill with desktop-control-for-windows.
See SKILL.md for usage, quiet-glass-lab-v3.md for brand visual rules, and chinese-voice-rules.md for voice rules. Remotion execution follows the companion remotion-best-practices skill.
MIT