A portable AI coding skill for evidence-based code quality evaluation.
Code Quality Evaluator helps an AI coding agent review a repository across five practical dimensions: readability, architecture, change health, engineering practices, and code smells. It produces a weighted scorecard, score rationale, concrete findings, and a remediation roadmap. It is designed to work as a Codex skill, Claude Code/ClawCode skill or command, OpenClaw rule, Cursor rule, or generic Markdown prompt.
Current version: 0.4.0
| File | Purpose |
|---|---|
| SKILL.md | Mini-style active skill body: workflow, decision rules, trigger rules, scoring model, checklist. |
| scripts/collect_metrics.py | Dependency-free project inventory script. |
| scripts/render_html_report.py | Optional renderer for standalone HTML reports from JSON summaries. |
| scripts/validate_skill.py | Skill pack validator for versions, required files, encoding, and report artifacts. |
| evals/run_evals.py | Lightweight regression evals for report quality and localization. |
| references/rubric.md | Detailed scoring calibration for the five dimensions. |
| references/nano.md | Compact fallback rules for tight context. |
| references/report-template.md | Formal report structure with score rationale and roadmap. |
| references/html-report.md | HTML report schema, render command, and fallback guidance. |
| references/traceability.md | Mapping from evaluation dimensions to software engineering references. |
| references/platforms.md | Cross-platform installation and degradation guidance. |
| references/versioning.md | Versioning and release policy. |
| references/evolution.md | Feedback-driven evolution process and anti-regression rules. |
| references/dev-loop.md | Local development, validation, and release loop. |
| agents/openai.yaml | Optional UI metadata for Codex skill lists. |
| VERSION | Current release version. |
| CHANGELOG.md | Release history. |
| Host | Recommended Use |
|---|---|
| Codex | Copy the whole folder into the Codex skills directory. |
| Claude Code / ClawCode | Use as a skill folder when supported, or copy SKILL.md into a command file. |
| OpenClaw | Use as a skill/rule folder when supported, or use SKILL.md as a prompt-injection rule. |
| Cursor | Use as a Manual or Agent Requested rule; use references/nano.md for compact always-on context. |
| Generic agent | Paste SKILL.md into project instructions or a reusable prompt. |
| Dimension | Weight |
|---|---|
| Readability and Style | 25% |
| Architecture Design | 25% |
| Change and Refactoring Health | 20% |
| Engineering Practices | 15% |
| Code Smell Inventory | 15% |
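
As a rough illustration of how the weights above could combine into one composite score, here is a minimal Python sketch. It assumes each dimension is scored on a 0-100 scale and that the composite is a plain weighted mean; the authoritative scoring rules live in SKILL.md and references/rubric.md. The names `WEIGHTS`, `composite_score`, and the example scores are hypothetical and are not part of the skill.

```python
# Dimension weights from the table above, in percent (they sum to 100).
WEIGHTS = {
    "Readability and Style": 25,
    "Architecture Design": 25,
    "Change and Refactoring Health": 20,
    "Engineering Practices": 15,
    "Code Smell Inventory": 15,
}


def composite_score(dimension_scores):
    """Weighted mean of per-dimension scores.

    Assumption: every dimension score is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(dimension_scores)
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    weighted_sum = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    return weighted_sum / 100


# Hypothetical example scores, not taken from a real evaluation.
example = {
    "Readability and Style": 80,
    "Architecture Design": 70,
    "Change and Refactoring Health": 60,
    "Engineering Practices": 90,
    "Code Smell Inventory": 50,
}

print(composite_score(example))  # 70.5
```

Keeping the weights as integer percentages avoids floating-point drift when checking that they sum to 100.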
The skill is designed around a small default set of references:
- Clean Code
- A Philosophy of Software Design
- Refactoring
- Working Effectively with Legacy Code
- The Pragmatic Programmer
Additional references can be used by project type:
| Project Type | Useful Extra References |
|---|---|
| Backend service | Clean Architecture; Release It! |
| Data-intensive system | Designing Data-Intensive Applications |
| Domain-heavy product | Domain-Driven Design; Domain-Driven Design Distilled; Implementing Domain-Driven Design |
| Enterprise application | Patterns of Enterprise Application Architecture |
| Legacy or migration-heavy code | Working Effectively with Legacy Code; Refactoring |
| Grade | Score | Meaning |
|---|---|---|
| S | 90-100 | Excellent |
| A | 80-89 | Strong |
| B | 70-79 | Solid |
| C | 60-69 | Mixed |
| D | 40-59 | Risky |
| F | 0-39 | Critical |
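
The grade bands above can be read as a lookup by lower bound. The sketch below shows one possible mapping in Python. It assumes fractional scores fall into the band whose lower bound they meet (so 89.5 is an A, not an S); the table itself only lists whole-number ranges, so that rounding choice is an assumption. `GRADE_BANDS` and `grade_for` are hypothetical names.

```python
# (lower bound, grade, meaning), highest band first, taken from the table above.
GRADE_BANDS = [
    (90, "S", "Excellent"),
    (80, "A", "Strong"),
    (70, "B", "Solid"),
    (60, "C", "Mixed"),
    (40, "D", "Risky"),
    (0, "F", "Critical"),
]


def grade_for(score):
    """Return (grade, meaning) for a composite score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, grade, meaning in GRADE_BANDS:
        if score >= lower_bound:
            return grade, meaning
    # Unreachable for in-range numbers; reached only by values like NaN.
    raise ValueError(f"no grade band for score: {score}")


print(grade_for(70.5))  # ('B', 'Solid')
```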
Clone the repository and copy the skill folder into your Codex skills directory:
git clone https://github.com/Johnny-zbb/code-quality-evaluator.git
mkdir -p ~/.codex/skills
cp -r code-quality-evaluator ~/.codex/skills/code-quality-evaluator

Restart Codex so the skill metadata is reloaded.
Use the host's skill directory when available:
mkdir -p ~/.claude/skills
cp -r code-quality-evaluator ~/.claude/skills/code-quality-evaluator

If only command files are supported, copy SKILL.md into the command location and keep the
references/ files nearby for manual loading.
Copy the whole folder into the host's skill or rule directory when possible. If the host only
supports a single instruction file, use SKILL.md and treat scripts/ and references/ as
optional supporting material.
See references/platforms.md for host-specific fallback guidance.
If your tool does not support skill folders, you can still copy the content of SKILL.md into
that tool's instruction or command system. This works as a prompt template, but it will not have
the same bundled-resource discovery behavior as a real skill folder.
Ask naturally:
- "Evaluate this repo's code quality."
- "Score this project."
- "Analyze maintainability and technical debt."
- "Review the architecture and code smells."
- "Prepare a handoff quality report."
- "Critique this quality report."
The skill asks the agent to produce:
- Project overview and inventory.
- Weighted composite score.
- Score rationale for each dimension.
- Per-dimension findings with file and line evidence.
- Top improvement suggestions.
- Remediation roadmap.
- Real project strengths.
- Confidence level and inspection gaps.
| Mode | Default | Use When |
|---|---|---|
| Markdown | Yes | Any host, normal evaluation, maximum portability. |
| HTML | No | The user asks for a visual, printable, dashboard-like, or shareable report. |
| JSON | No | The user wants machine-readable output or an input file for HTML rendering. |
HTML reports are optional enhancements. When Python and file writing are available, generate a JSON summary and render it with:
python scripts/render_html_report.py reports/code-quality-report.json reports/code-quality-report.html

Set language in the report JSON to localize HTML labels, for example zh for Chinese or en
for English. Human-facing report text should match the user's language.
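
To make the JSON-then-render flow concrete, here is a minimal Python sketch. Only the `language` key and the render command come from this README; every other field (`composite_score`, `grade`) is a hypothetical placeholder, and the real schema is defined in references/html-report.md. The helper names `build_summary` and `render_command` are also assumptions.

```python
import json


def build_summary(language, composite_score=70.5, grade="B"):
    """Build a report summary dict.

    Only "language" is documented in this README. The other keys are
    hypothetical placeholders; see references/html-report.md for the schema.
    """
    return {
        "language": language,  # "zh" for Chinese labels, "en" for English
        "composite_score": composite_score,  # hypothetical field
        "grade": grade,  # hypothetical field
    }


def render_command(
    json_path="reports/code-quality-report.json",
    html_path="reports/code-quality-report.html",
):
    """Argument list for the render command shown above."""
    return ["python", "scripts/render_html_report.py", json_path, html_path]


summary = build_summary("zh")
# ensure_ascii=False keeps any non-ASCII report text readable in the file.
print(json.dumps(summary, ensure_ascii=False, indent=2))
print(" ".join(render_command()))
```

In practice the agent would write the summary to the JSON path and then run the printed command from the skill folder.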
This is a skill, not a standalone static analyzer. The included scripts/collect_metrics.py script collects useful inventory metrics, but the final evaluation still depends on agent inspection and reasoning.
Before publishing changes, run:
python scripts/validate_skill.py .
python evals/run_evals.py .