Agent Long Task Skill

🌐 English README | 中文文档

Stop AI coding agents from saying “done” while leaving half the task unfinished.

Agent Long Task Skill is a transparent, Markdown-first long-task execution protocol for AI coding agents. It helps agents refine messy requirements, create structured task plans, track progress, recover from failures, verify completion, and generate handoff-ready reports.

The Problem

AI coding agents are useful, but long tasks expose predictable failure modes:

They miss requirements during multi-step work.
They say "done" while leaving bugs, skipped items, or unfinished work.
Long prompts are messy, token-heavy, repetitive, and easy to misunderstand.
Context loss makes continuation difficult after interruptions.
A simple checklist is not enough when tasks require investigation, recovery, verification, and handoff.

The Solution

This Skill defines a full lifecycle:

messy user request
→ requirement refinement and compression
→ structured execution document
→ task list
→ step-by-step execution
→ failure/blocker recovery attempts
→ final verification pass
→ final user-facing report

The agent should first compress a vague or multi-requirement request into a concise execution document. It should preserve the user's intent while removing repetition, ambiguity, and token-heavy wording. Then it maps every requirement to one or more task IDs, executes tasks one by one, records failures and blockers honestly, performs a final verification audit, and reports the real result.

Requirement Refinement

The first output should be a concise execution document, not a long restatement of the user's entire prompt.

Do not copy the user's full messy prompt repeatedly.
Convert vague language into concise executable requirements.
Keep each task short and testable.
Keep acceptance criteria specific but not verbose.
Preserve traceability from original requirement to task ID.
Prefer concise summaries over long explanations.
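The traceability rule above can be checked mechanically. The following is a minimal sketch, not the Skill's actual format: the `Task` record, its field names, and the `R1`/`T1` identifiers are hypothetical, and the real schema in `schemas/agent-task-plan.schema.json` may look different.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative only."""
    task_id: str
    title: str
    requirement_ids: list[str] = field(default_factory=list)


def untraced_requirements(requirements: dict[str, str], tasks: list[Task]) -> list[str]:
    """Return the IDs of requirements that no task claims to cover."""
    covered = {rid for task in tasks for rid in task.requirement_ids}
    return [rid for rid in requirements if rid not in covered]


# Refined requirements, keyed by a short ID (assumed naming scheme).
requirements = {
    "R1": "Fix the settings page",
    "R2": "Clean up the upload flow",
    "R3": "Verify the build",
}
tasks = [
    Task("T1", "Fix settings page", ["R1"]),
    Task("T2", "Clean up upload flow", ["R2"]),
]

print(untraced_requirements(requirements, tasks))  # ['R3']
```

An empty result means every requirement maps to at least one task ID. A non-empty result points at requirements that would otherwise be dropped silently.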

Core Capabilities

Requirement refinement before implementation.
A structured task plan with stable task IDs.
Status tracking for every item.
Acceptance criteria for each task.
Failure and blocker recovery attempts within task scope.
Final verification before claiming completion.
Handoff notes when work is interrupted, blocked, or too large for one pass.
Chinese-friendly and multilingual user-facing task output.

Final Verification

Final verification is an audit pass, not a second full execution pass.

During final verification, the agent reviews the task list, changed files, test/build results, acceptance criteria, and visible outcomes. It should not redo every task from the beginning. If verification finds a specific missing, incomplete, or broken item, the agent should perform a targeted fix and update the relevant task status.

Only after every required task is verified may the agent claim the long task is complete. If any task remains pending, in_progress, failed, blocked, or unverified, the agent must not claim full completion.
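The completion rule can be expressed as a small gate. This is a sketch under stated assumptions: tasks are plain dictionaries, and the `required` flag is a hypothetical field (defaulting to `True`) that this README does not define. How a required but skipped task should be treated is also an assumption here: it blocks full completion.

```python
def unverified_tasks(tasks):
    """Return required tasks whose status is anything other than 'verified'."""
    return [
        task for task in tasks
        if task.get("required", True) and task["status"] != "verified"
    ]


def may_claim_completion(tasks):
    """Full completion may be claimed only when no required task is unverified."""
    return not unverified_tasks(tasks)


tasks = [
    {"id": "T1", "status": "verified"},
    {"id": "T2", "status": "done"},      # implemented, but not yet verified
    {"id": "T3", "status": "blocked"},
]

print(may_claim_completion(tasks))                 # False
print([t["id"] for t in unverified_tasks(tasks)])  # ['T2', 'T3']
```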

Who It Is For

Agent Long Task Skill is for:

Claude Code users.
Codex users.
OpenCode users.
AI coding agent users.
Developers doing multi-file fixes, refactors, productization, bug batches, UI redesigns, continuous delivery cleanup, or long implementation tasks.

The default language is English, but the workflow is useful for Chinese and multilingual teams because user-facing task titles, notes, handoff reports, and final reports can follow the user's language.

Chinese-friendly Task Output

This repository is Markdown-first and does not include a graphical UI. Here, "task UI support" means that the generated task plan, status summary, handoff report, and final report can use localized user-facing labels.

If the user prompt is in Chinese, the agent should produce user-facing task titles, notes, summaries, handoff reports, and final reports in Chinese by default. If the prompt is in English, it should use English by default.

Machine-readable task status values stay stable in English:

pending
in_progress
done
failed
blocked
verified
skipped

English display labels:

{
  "pending": "⚪ Pending",
  "in_progress": "🟡 In Progress",
  "done": "✅ Done",
  "failed": "🔴 Failed",
  "blocked": "🟠 Blocked",
  "verified": "🔵 Verified",
  "skipped": "🟣 Skipped"
}

Chinese display labels:

{
  "pending": "⚪ 待处理",
  "in_progress": "🟡 进行中",
  "done": "✅ 已完成",
  "failed": "🔴 失败",
  "blocked": "🟠 阻塞",
  "verified": "🔵 已验收",
  "skipped": "🟣 已跳过"
}
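One way to keep machine-readable statuses stable while localizing what the user sees is a lookup keyed by language. The sketch below reuses the two label tables above. The `prompt_language` heuristic (any CJK ideograph means Chinese) and the `"en"`/`"zh"` keys are assumptions for illustration, not part of the Skill.

```python
EN_LABELS = {
    "pending": "⚪ Pending",
    "in_progress": "🟡 In Progress",
    "done": "✅ Done",
    "failed": "🔴 Failed",
    "blocked": "🟠 Blocked",
    "verified": "🔵 Verified",
    "skipped": "🟣 Skipped",
}
ZH_LABELS = {
    "pending": "⚪ 待处理",
    "in_progress": "🟡 进行中",
    "done": "✅ 已完成",
    "failed": "🔴 失败",
    "blocked": "🟠 阻塞",
    "verified": "🔵 已验收",
    "skipped": "🟣 已跳过",
}
LABELS = {"en": EN_LABELS, "zh": ZH_LABELS}


def prompt_language(prompt):
    """Crude heuristic (assumption): treat any CJK ideograph as a Chinese prompt."""
    has_cjk = any("\u4e00" <= ch <= "\u9fff" for ch in prompt)
    return "zh" if has_cjk else "en"


def display_label(status, language="en"):
    """Map a stable status value to a user-facing label, defaulting to English."""
    return LABELS.get(language, EN_LABELS)[status]


language = prompt_language("这是一个多需求长任务。")
print(display_label("verified", language))  # 🔵 已验收
print(display_label("verified"))            # 🔵 Verified
```

The stored status never changes with the language; only the label lookup does.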

Installation

Install as a user-level Claude skill:

Copy this folder to:
~/.claude/skills/agent-long-task-skill/

Install at the project level:

Copy this folder into your project at:
.claude/skills/agent-long-task-skill/

No package manager is required. There is no backend, web app, telemetry, network behavior, install script, or hidden automation.

Usage

Ask your coding agent to use the Skill before a long or messy implementation task:

Use this skill for this long task: fix the settings page, clean up the upload flow, verify the build, and write a final report.

You can also be more explicit:

This is a multi-requirement task. Refine it into a task plan first, then execute and verify item by item.

For Chinese tasks:

这是一个多需求长任务。先把需求压缩整理成任务计划，再按任务 ID 逐项执行、验收并输出最终报告。

Generated Output Files

When used during an agent run, the Skill may generate:

.agent/task-plan.json
.agent/status.json
.agent/handoff.md
.agent/final-report.md

These files are execution artifacts. They are intentionally plain text or JSON so humans can inspect them.

Status Definitions

Use these machine-readable statuses consistently:

pending
in_progress
done
failed
blocked
verified
skipped

pending: not started.
in_progress: currently being worked on.
done: implementation work appears completed, but final verification may still be pending.
verified: checked against acceptance criteria and confirmed complete.
failed: attempted and failed after reasonable recovery attempts.
blocked: cannot proceed due to missing information, dependency, permission, or technical blocker.
skipped: intentionally skipped with a stated reason.

Important distinction: done is not the same as verified. A task should only become verified after the final verification pass or after explicit acceptance checking. The agent must not claim full completion while important tasks are only done but not verified.
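The done/verified distinction can be enforced with a single promotion step. This is a minimal sketch: the dictionary task shape, the `unmet_criteria` field, and the idea of passing acceptance results as a name-to-boolean mapping are all assumptions made for illustration.

```python
def verify_task(task, checks):
    """Promote a 'done' task to 'verified' only if every acceptance check passed.

    `checks` maps an acceptance criterion to True/False. A task with no
    recorded checks is never promoted.
    """
    if task["status"] != "done":
        raise ValueError(f"only 'done' tasks can be verified, got {task['status']!r}")
    unmet = [name for name, passed in checks.items() if not passed]
    if not checks or unmet:
        # Stay at 'done' and record what still needs a targeted fix.
        return {**task, "unmet_criteria": unmet}
    return {**task, "status": "verified"}


task = {"id": "T2", "status": "done"}

print(verify_task(task, {"build passes": True, "back button works": False}))
# {'id': 'T2', 'status': 'done', 'unmet_criteria': ['back button works']}

print(verify_task(task, {"build passes": True, "back button works": True}))
# {'id': 'T2', 'status': 'verified'}
```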

Failure and Blocker Handling

When a task fails, the agent should:

Diagnose the likely cause.
Try reasonable safe recovery methods within scope.
Avoid infinite retries.
Avoid unrelated large rewrites unless necessary.
Record what was tried.
If unresolved, mark failed and provide a recovery path.

When a task is blocked, the agent should:

Mark blocked immediately.
Explain the blocker.
State what information, dependency, permission, API key, design decision, or user input is needed.
Continue with other independent tasks when possible.
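The failure-handling steps above amount to a bounded retry loop that records every attempt. The sketch below is one possible shape, assuming each recovery method is a callable that returns `True` on success. The function name, the `attempts` log format, and the default limit of three are hypothetical.

```python
def run_with_recovery(task, recovery_methods, max_attempts=3):
    """Try recovery methods in order, stop at the limit, and record what was tried."""
    log = []
    for description, action in recovery_methods[:max_attempts]:
        try:
            ok, error = bool(action()), None
        except Exception as exc:  # a crashing attempt is recorded, not hidden
            ok, error = False, str(exc)
        log.append({"tried": description, "ok": ok, "error": error})
        if ok:
            # Success means 'done'; verification still happens in the final pass.
            return {**task, "status": "done", "attempts": log}
    return {**task, "status": "failed", "attempts": log}


methods = [
    ("reinstall dependencies", lambda: False),
    ("clear build cache", lambda: True),
    ("rewrite build config", lambda: True),  # never reached
]

result = run_with_recovery({"id": "T4", "status": "in_progress"}, methods)
print(result["status"], len(result["attempts"]))  # done 2
```

A task that ends as `failed` keeps its attempt log, so the handoff or final report can state what was tried and suggest a recovery path.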

Live Task Progress

During execution, the agent keeps a full ordered task board visible in progress updates. This makes long tasks easier to follow because the user can see all tasks, current status, remaining work, failures, blockers, and verification progress at a glance.

This is a Markdown progress board, not a graphical dashboard.

The live task board is different from the final report. It is shown during execution, should stay near the bottom of progress updates when practical, and should be generated from the current task plan state rather than a separate progress list.

Example:

## Live Task Progress

| # | Task | Status |
|---|---|---|
| 1 | Refine requirements and create task plan | 🔵 Verified |
| 2 | Fix homepage course entry points | 🟡 In Progress |
| 3 | Fix mobile back button behavior | ⚪ Pending |
| 4 | Run build verification | ⚪ Pending |

Remaining: 2 | Done: 1 | Verified: 1 | Failed: 0 | Blocked: 0
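Because the board should be generated from the current task plan state, it can be rendered by a small function instead of being maintained by hand. This is a sketch only: the task dictionaries are an assumed shape, and the summary line here simply counts tasks per status value, since this README does not define exactly how the totals in the example above are computed.

```python
from collections import Counter

LABELS = {
    "pending": "⚪ Pending",
    "in_progress": "🟡 In Progress",
    "done": "✅ Done",
    "failed": "🔴 Failed",
    "blocked": "🟠 Blocked",
    "verified": "🔵 Verified",
    "skipped": "🟣 Skipped",
}


def render_board(tasks):
    """Render an ordered Markdown task board from the current task states."""
    lines = ["## Live Task Progress", "", "| # | Task | Status |", "|---|---|---|"]
    for number, task in enumerate(tasks, start=1):
        lines.append(f"| {number} | {task['title']} | {LABELS[task['status']]} |")
    counts = Counter(task["status"] for task in tasks)
    # Per-status totals (assumed format); statuses with no tasks show 0.
    summary = " | ".join(f"{status}: {counts[status]}" for status in LABELS)
    return "\n".join(lines + ["", summary])


tasks = [
    {"title": "Refine requirements and create task plan", "status": "verified"},
    {"title": "Fix homepage course entry points", "status": "in_progress"},
    {"title": "Fix mobile back button behavior", "status": "pending"},
    {"title": "Run build verification", "status": "pending"},
]

print(render_board(tasks))
```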

Screenshots

Live Task Progress

The Skill keeps a full ordered task board visible during long-task execution, so users can quickly see what is done, what is in progress, what is pending, and what is blocked.

Final Report

After final verification, the agent returns a compact completion report with task status, key changes, verification performed, and recommended local checks.

Final Report Behavior

The final response must include:

Completed and verified tasks.
Failed tasks.
Blocked tasks.
Skipped tasks.
Files changed.
Verification performed.
Remaining risks.
Next recommended actions.
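The task-related items in this list can be derived directly from task statuses, which makes it hard to leave a failed or blocked item out of the report. The sketch below assumes dictionary tasks with `title` and `status` fields. The section headings and the extra "Pending or unverified tasks" bucket are illustrative choices, not a format defined by the Skill.

```python
SECTIONS = {
    "verified": "Completed and verified tasks",
    "failed": "Failed tasks",
    "blocked": "Blocked tasks",
    "skipped": "Skipped tasks",
}
LEFTOVER = "Pending or unverified tasks"  # pending, in_progress, or done


def group_for_report(tasks):
    """Group task titles under report sections so that no status is hidden."""
    grouped = {heading: [] for heading in SECTIONS.values()}
    grouped[LEFTOVER] = []
    for task in tasks:
        heading = SECTIONS.get(task["status"], LEFTOVER)
        grouped[heading].append(task["title"])
    return grouped


tasks = [
    {"title": "Fix homepage and course entry points", "status": "verified"},
    {"title": "Fix mobile back button behavior", "status": "done"},
    {"title": "Configure upload API key", "status": "blocked"},
]

for heading, titles in group_for_report(tasks).items():
    print(f"{heading}: {titles}")
```

Empty sections are kept in the result on purpose, so the report can state explicitly that there were no failed or skipped tasks.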

If all required tasks are verified, the agent may say:

All required tasks have been completed and verified.

If not all required tasks are verified, the agent must say:

The task is not fully complete yet. The following items remain failed, blocked, pending, or unverified.

Final Report Style

After all tasks have been attempted and final verification is complete, the agent should return a compact report that is easy to scan. Simply saying "done" is not enough.

The final report should include:

Task status table.
Key changes.
Verification performed.
Failed or blocked items.
Recommended local checks.

The status table is required. It should show at least the task number, task title, and user-facing status label:

## Final Report

| # | Task | Status |
|---|---|---|
| 1 | Fix homepage and course entry points | 🔵 Verified |
| 2 | Run build verification | 🔵 Verified |

Final verification must be reflected in the final report. The report should show verified completion status, not just implementation status. Failed, blocked, skipped, pending, or unverified tasks must not be hidden.

Include recommended local checks when the user may need to verify UI, device behavior, build output, runtime behavior, deployment, or account-specific behavior themselves.

Why Markdown-first?

Markdown-first keeps the protocol:

Transparent.
Auditable.
Free of hidden runtime behavior.
Free of network access.
Easy to adapt.
Compatible with multiple agents.

The Skill is designed to be understood by humans first and automated later only if that becomes useful.

Repository Layout

agent-long-task-skill/
├─ README.md
├─ README.zh-CN.md
├─ SKILL.md
├─ LICENSE
├─ docs/
│  └─ images/
│     ├─ live-task-board.png
│     └─ final-report.png
├─ templates/
│  ├─ task-plan.md
│  ├─ acceptance-checklist.md
│  ├─ handoff.md
│  ├─ final-report.md
│  ├─ live-task-board.md
│  └─ zh-CN/
│     ├─ task-plan.md
│     ├─ acceptance-checklist.md
│     ├─ handoff.md
│     ├─ final-report.md
│     └─ live-task-board.md
├─ schemas/
│  └─ agent-task-plan.schema.json
└─ examples/
   ├─ messy-user-request.md
   ├─ messy-user-request.zh-CN.md
   ├─ refined-task-plan.json
   ├─ refined-task-plan.zh-CN.json
   ├─ handoff-example.md
   ├─ handoff-example.zh-CN.md
   ├─ final-report-example.md
   ├─ final-report-example.zh-CN.md
   ├─ live-task-board-example.md
   └─ live-task-board-example.zh-CN.md

Roadmap

v0.1

SKILL.md
Templates
JSON schema
Examples

v0.2

Stronger schemas
More examples
AGENTS.md, CLAUDE.md, and Codex instruction templates

v0.3

Optional CLI or dashboard integration
Status viewer
code-hub compatibility

License

MIT License.
