71 changes: 46 additions & 25 deletions CONTRIBUTING.md
@@ -1,6 +1,6 @@
# Contributing Skills to COMSES
# Contributing Skills to CoMSES

Thank you for contributing to this skills repository! This guide walks you through the process of creating, testing, and submitting skills for computational modelers.
Thank you for contributing to this skills repository! This guide walks you through the process of creating, testing, and submitting skills for our community.

## Table of Contents

@@ -15,16 +15,15 @@ Thank you for contributing to this skills repository! This guide walks you throu
## Before You Start

- Familiarize yourself with the [Agent Skills specification](https://agentskills.io)
- Review existing skills in `skills/` to understand the pattern
- Copy [docs/SKILL-TEMPLATE.md](docs/SKILL-TEMPLATE.md) as your starting point
- Ensure your skill addresses a concrete pain point for computational modelers
- Confirm your skill does NOT substantially overlap with existing skills
- Read [docs/agent-skills-creation-reference.md](docs/agent-skills-creation-reference.md). This is the canonical authoring guide for this repository.
- Review existing skills in `skills/` to check for overlap and assess fit / appropriateness
- Use `/create-skill` if your coding agent provides it, or manually copy [docs/SKILL-TEMPLATE.md](docs/SKILL-TEMPLATE.md) into a new skill directory

## Skill Creation Workflow

### 1. Plan Your Skill

Answer these questions before writing:
Answer these questions:

- **What problem does it solve?** (e.g., "Modelers struggle to document ODD+2 protocols manually")
- **When should the coding agent use it?** (e.g., "When user has model code and needs narrative documentation")
@@ -34,21 +33,24 @@ Answer these questions before writing:

### 2. Create Your Skill Folder

Run `/create-skill <name> — <one-sentence description>` in your coding agent. This scaffolds `skills/<name>/SKILL.md` from [docs/SKILL-TEMPLATE.md](docs/SKILL-TEMPLATE.md) with placeholders filled in, and generates a starter `evals.json`.
Run `/create-skill <name> — <one-sentence description>` in your coding agent if that command is available. It should scaffold `skills/<name>/SKILL.md` from [docs/SKILL-TEMPLATE.md](docs/SKILL-TEMPLATE.md) and create a starter `skills/<name>/evals.json`.

Alternatively, copy manually:
```bash
mkdir -p skills/your-skill-name
cp docs/SKILL-TEMPLATE.md skills/your-skill-name/SKILL.md
cp skills/document/evals.json skills/your-skill-name/evals.json
```

Then immediately rename `skill_name`, replace the copied prompts, and make sure the frontmatter `name:` matches the folder exactly.
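
As a rough illustration of the rename step above, the following Python sketch checks that the frontmatter `name:` and the `skill_name` in `evals.json` both match the folder name. The helper names (`frontmatter_name`, `scaffold_mismatches`) are hypothetical and are not part of the repository's validators. The frontmatter parsing is deliberately minimal and assumes a top-level `name:` key between the first two `---` lines.

```python
import re
from pathlib import Path
from typing import Optional


def frontmatter_name(skill_md_text: str) -> Optional[str]:
    """Return the top-level `name:` value from leading YAML frontmatter, if any."""
    # Frontmatter is taken to be the block between the first two `---` lines.
    match = re.match(r"\A---\s*\n(.*?)\n---\s*(\n|\Z)", skill_md_text, re.DOTALL)
    if match is None:
        return None
    for line in match.group(1).splitlines():
        # Indented keys (e.g. under `metadata:`) are ignored on purpose.
        if line.startswith("name:"):
            return line[len("name:"):].strip().strip("'\"")
    return None


def scaffold_mismatches(skill_dir: Path, skill_md_text: str, evals_doc: dict) -> list:
    """List places where a copied scaffold still disagrees with the folder name."""
    expected = skill_dir.name
    problems = []
    found_name = frontmatter_name(skill_md_text)
    if found_name != expected:
        problems.append(f"frontmatter name {found_name!r} != folder {expected!r}")
    eval_name = evals_doc.get("skill_name")
    if eval_name != expected:
        problems.append(f"evals skill_name {eval_name!r} != folder {expected!r}")
    return problems
```

A check like this would catch the most likely copy mistake, namely leaving `"skill_name": "document"` in an `evals.json` copied from `skills/document/`.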

### 3. Write SKILL.md

See [Frontmatter Specification](#frontmatter-specification) and [Writing Guidelines](#writing-guidelines) below.

### 4. Add Optional Resources

As your skill grows, add supporting files:
As your skill grows, you might find supporting files useful:

```
your-skill-name/
@@ -71,6 +73,14 @@ your-skill-name/

See [Testing Your Skill](#testing-your-skill).

Before opening a PR, also run the repository validators:

```bash
python scripts/validate_individual_skills.py
python scripts/validate_evals_schema.py
python scripts/validate_cross_skills.py evals/cross-skills.json
```

### 6. Submit a Pull Request

Include:
@@ -124,21 +134,21 @@ A typical SKILL.md body includes:

## Key Inputs

- Model source files (Python/R/C++)
- Model source code files
- Parameter descriptions or config files
- Optional: docstrings with metadata

## Step-by-Step Instructions

1. Read the model code
2. Extract metadata using scripts/extract.py
2. Extract metadata (scicodes/somef-core, google/langextract)
3. Generate narrative following references/TEMPLATE.md
4. Validate against references/CHECKLIST.md

## ⚠️ Gotchas

- **Stochastic models:** If your model uses randomness, document any fixed random seeds
- **Large codebases:** Summarize into entity/subsystem abstractions first
- **Large codebases:** Summarize into entity/subsystem/component abstractions first
- **Missing documentation:** Skill will ask clarifying questions rather than guess

## Templates & Resources
@@ -173,10 +183,11 @@ A typical SKILL.md body includes:
name: your-skill-name
description: |
A complete description of what this skill does.
Use when: you have model code and need...
When to trigger: mention [keywords like ODD, documentation, publication]

Use this skill when you have model code and need...
Triggers: "odd", "documentation", "publication"
Expected output: [specific deliverables]
license: MIT
---
```

@@ -186,23 +197,25 @@ description: |
---
name: your-skill-name
description: ...
license: MIT (default) | Apache-2.0 | Proprietary
license: MIT | Apache-2.0 | Proprietary
compatibility: Python 3.10+, git, Docker (optional)
metadata:
domain: computational-modeling | documentation | publication | execution
maturity: alpha | beta | stable
audience: modelers | researchers | data scientists
audience: modelers | researchers | data-scientists
category: documentation | quality-assurance | execution | publication
---
```

### Guidancefor `description`
### Guidance for `description`

The description is your **primary triggering mechanism**. Make it:

- **Task-specific:** "ODD+2 narrative for agent-based models" not just "model documentation"
- **Keyword-rich:** Include trigger phrases users would naturally type
- **Outcome-focused:** Mention specific deliverables (e.g., "checklist", "narrative sections", "validation report")
- **Slightly pushy:** Coding agents tend to under-trigger skills. Emphasize when to use: "Use whenever you mention ODD, ABM documentation, or model publication preparation"
- **Use the repository-preferred trigger phrase:** Start with `Use this skill when ...` so your description aligns with the validator heuristics and the existing skills.
- **Slightly pushy:** Coding agents tend to under-trigger skills. Emphasize when to use: "Use this skill when you mention ODD, ABM documentation, or model publication preparation"

## Testing Your Skill

@@ -226,37 +239,45 @@ The description is your **primary triggering mechanism**. Make it:

### Creating an Evaluation Strategy

For each skill, document 3–5 concrete test cases in a file `evals/evals.json`:
For each skill, include concrete test cases in `skills/<name>/evals.json`:

```json
{
"skill_name": "document",
"description": "Evaluation cases for ODD+2 narrative documentation skill",
"evals": [
{
"id": 1,
"type": "core",
"prompt": "I have a Python ABM with Agent and Environment classes. Generate an ODD narrative.",
"should_trigger": true,
"expected_output": "ODD sections covering entities, state variables, and processes",
"files": ["evals/files/minimal_abm.py"]
"expected_output": "ODD sections covering entities, state variables, and processes"
}
]
}
```

Notes:

- Individual skill evals live next to the skill, for example `skills/document/evals.json`.
- The repository schema accepts fields such as `type`, `should_trigger`, `expected_output`, `expected_behavior`, `success_criteria`, `skills_expected`, `failure_modes`, and `notes`.
- Do not add ad hoc fields unless you also update the schema in `evals/schema/schema.json`.
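
To make the "no ad hoc fields" note concrete, here is a hedged Python sketch that flags eval-case keys outside an assumed field set. `ASSUMED_FIELDS` is built from the fields named in the notes plus `id` and `prompt` from the example. The authoritative list lives in `evals/schema/schema.json`, which is not reproduced here, so treat this set and the `unexpected_fields` helper as hypothetical.

```python
# Assumed field set: taken from the notes above plus `id` and `prompt`
# from the example. The real schema file may allow more or fewer fields.
ASSUMED_FIELDS = {
    "id", "prompt", "type", "should_trigger", "expected_output",
    "expected_behavior", "success_criteria", "skills_expected",
    "failure_modes", "notes",
}


def unexpected_fields(evals_doc: dict) -> dict:
    """Map each eval id (or 1-based position) to its sorted unknown keys."""
    report = {}
    for index, case in enumerate(evals_doc.get("evals", []), start=1):
        extras = sorted(set(case) - ASSUMED_FIELDS)
        if extras:
            report[case.get("id", index)] = extras
    return report
```

Under this assumed set, an eval case that still carries a `files` key would be reported, which is the kind of field the notes ask authors to add to the schema first.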

## Submission Checklist

Before submitting, verify:

- [ ] Skill folder name matches `name:` field in frontmatter
- [ ] Frontmatter includes `name` and `description` (and optionally `license`, `compatibility`, `metadata`)
- [ ] Description includes triggers ("Use when you...") and expected outputs
- [ ] Frontmatter includes `name`, `description`, and `license` (plus optional `compatibility` and `metadata`)
- [ ] Description includes triggers (`Use this skill when ...`) and expected outputs
- [ ] All script references use relative paths: `scripts/name.py` (not `./scripts/name.py`)
- [ ] README/CONTRIBUTING sections are consistent with repository guidelines
- [ ] `skills/<name>/evals.json` exists and validates against `evals/schema/schema.json`
- [ ] Tested skill against ≥5 should-trigger and ≥3 should-not-trigger prompts
- [ ] No hardcoded paths or user-specific settings
- [ ] Scripts have clear usage documentation (docstrings, help text, or references/SCRIPT.md)
- [ ] No credentials, API keys, or personal data in examples
- [ ] License field in frontmatter (defaults to MIT if omitted)
- [ ] License field is present in frontmatter
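
Two of the checklist items lend themselves to a quick automated check. The sketch below assumes that should-trigger and should-not-trigger prompts are recorded as `should_trigger` booleans in `evals.json` (the checklist itself only says the prompts must be tested), and that `./`-prefixed references to `scripts/`, `references/`, or `assets/` are the ones to flag. The function names are hypothetical.

```python
import re

# Matches "./scripts/...", "./references/...", "./assets/..." but not "../scripts/...".
DOT_SLASH_REF = re.compile(r"(?<![\w./])\./(?:scripts|references|assets)/[\w./-]+")


def trigger_counts(evals_doc: dict) -> tuple:
    """Count eval cases with should_trigger true and false, respectively."""
    cases = evals_doc.get("evals", [])
    positive = sum(1 for case in cases if case.get("should_trigger") is True)
    negative = sum(1 for case in cases if case.get("should_trigger") is False)
    return positive, negative


def meets_thresholds(evals_doc: dict, min_trigger: int = 5, min_no_trigger: int = 3) -> bool:
    """Check the >=5 should-trigger and >=3 should-not-trigger guideline."""
    positive, negative = trigger_counts(evals_doc)
    return positive >= min_trigger and negative >= min_no_trigger


def dot_slash_references(text: str) -> list:
    """Return './'-prefixed resource paths that the checklist asks to avoid."""
    return DOT_SLASH_REF.findall(text)
```

Running `dot_slash_references` over a `SKILL.md` body gives a list to rewrite as plain relative paths such as `scripts/name.py`.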

## Questions?

50 changes: 50 additions & 0 deletions Makefile
@@ -0,0 +1,50 @@
# ---- config ----
PYTHON ?= python
SCRIPTS := scripts
EVALS := evals

CROSS_EVAL := $(EVALS)/cross-skills.json

# ---- default ----
.PHONY: all
all: validate-evals cross

# ---- schema validation ----
.PHONY: validate-evals
validate-evals:
	$(PYTHON) $(SCRIPTS)/validate_evals_schema.py

# ---- cross-skill evals ----
.PHONY: cross
cross:
	$(PYTHON) $(SCRIPTS)/validate_cross_skills.py $(CROSS_EVAL)

# ---- per-skill evals (placeholder) ----
# assumes future runner like: run_skill_evals.py <skill>
SKILLS := document fair4rs hpc ospool peer-review

.PHONY: skills
skills: $(SKILLS)

.PHONY: $(SKILLS)
$(SKILLS):
	@echo "Running evals for $@"
	$(PYTHON) $(SCRIPTS)/run_skill_evals.py $@

# ---- aggregate report ----
.PHONY: report
report:
	$(PYTHON) $(SCRIPTS)/aggregate_failures.py

# ---- full pipeline ----
.PHONY: full
full: validate-evals cross report

# ---- CI Pipeline ----
.PHONY: ci
ci:
	@echo "=== Running CI pipeline ==="
	$(MAKE) validate-evals
	$(MAKE) cross
	$(MAKE) report
	@echo "=== CI completed ==="
71 changes: 49 additions & 22 deletions README.md
@@ -147,7 +147,7 @@ Use cases:
## Repository Structure

```
skills/
.
├── .github/
│ └── skills/
│ └── update-skill/ (repository-local maintainer skill)
@@ -156,43 +156,66 @@ skills/
│ │ └── REFRESH-WORKFLOW.md
│ └── assets/
│ └── REFRESH-PR-NOTE-TEMPLATE.md
├── AGENTS.md (repository-specific agent instructions)
├── README.md (this file)
├── CONTRIBUTING.md (contribution guidelines)
├── LICENSE (MIT)
├── .gitignore
├── Makefile (validation shortcuts)
├── docs/ (repository-level documentation)
│ ├── agent-skills-creation-reference.md
│ ├── roadmap.md
│ └── SKILL-TEMPLATE.md (copy/fill template for new skills)
├── evals/ (cross-skill evals and schema)
├── scripts/ (validation and reporting helpers)
└── skills/ (all skill folders)
├── document/
│ └── SKILL.md
│ ├── SKILL.md
│ └── evals.json
├── fair4rs/
│ └── SKILL.md
│ ├── SKILL.md
│ └── evals.json
├── ospool/
│ └── SKILL.md
│ ├── SKILL.md
│ └── evals.json
├── hpc/
│ └── SKILL.md
│ ├── SKILL.md
│ └── evals.json
└── peer-review/
└── SKILL.md
├── SKILL.md
└── evals.json
```

## For Skill Authors

### Adding a New Skill

1. **Read** [CONTRIBUTING.md](CONTRIBUTING.md) for submission guidelines and naming conventions.
1. **Read** [AGENTS.md](AGENTS.md), [CONTRIBUTING.md](CONTRIBUTING.md), and [docs/agent-skills-creation-reference.md](docs/agent-skills-creation-reference.md) before drafting.
2. **Review** [Agent Skills best practices](https://agentskills.io/skill-creation/best-practices) before drafting.
3. **Ground from real expertise**: start from real task runs, corrections, and project artifacts (not generic advice).
4. **Scope coherently**: define one composable unit of work; avoid overly broad or ultra-narrow skills.
5. **Design for context efficiency**: keep `SKILL.md` concise, move deep details to `references/`, and load references only when needed.
6. **Prefer defaults over menus**: choose one default tool/approach and list alternatives only as fallbacks.
7. **Include reusable control patterns**: gotchas, output templates, and validation loops/checklists where relevant.
8. **Refine with real execution**: test should-trigger and should-not-trigger prompts, review execution traces, then iterate.
9. **Copy** an existing skill folder as a starting point: `cp -r skills/hpc skills/your-skill-name`.
10. **Fill in** the YAML frontmatter (`name`, `description`) and markdown instructions following the progressive disclosure pattern.
11. **Include optional resources** (scripts, references, assets) as your skill grows.
12. **Test** against should-trigger and should-not-trigger prompts before submitting a PR.
13. **Submit** a pull request with your skill and evaluation strategy (see CONTRIBUTING.md).
3. **Ground from real expertise**: start from real task runs, corrections, and project artifacts, not generic advice.
4. **Scope coherently**: define one composable unit of work and keep the boundary clear.
5. **Design for context efficiency**: keep `SKILL.md` concise, move deep detail into `references/`, and add explicit load conditions.
6. **Prefer defaults over menus**: choose one default tool or approach and use alternatives only as fallbacks.
7. **Create the skill folder** with `/create-skill` if your agent supports it, or scaffold manually:

```bash
mkdir -p skills/your-skill-name
cp docs/SKILL-TEMPLATE.md skills/your-skill-name/SKILL.md
cp skills/document/evals.json skills/your-skill-name/evals.json
```

8. **Fill in** the YAML frontmatter and markdown instructions, then immediately rename `skill_name`, replace the copied prompts, and ensure `name:` matches the folder exactly.
9. **Include optional resources** (`assets/`, `references/`, `scripts/`) as the workflow needs them.
10. **Refine with real execution**: test should-trigger and should-not-trigger prompts, review execution traces, and iterate.
11. **Run the repository validators** before opening a PR:

```bash
python scripts/validate_individual_skills.py
python scripts/validate_evals_schema.py
python scripts/validate_cross_skills.py evals/cross-skills.json
```

12. **Submit** a pull request with the skill folder, its `evals.json`, and the prompts or checks you used to validate it.

### Skill Anatomy

@@ -223,22 +246,26 @@ Authoring guidance:
```yaml
---
name: your-skill-name
description: Brief description of when and why to use this skill
description: |
Use this skill when...
Triggers: "phrase 1", "phrase 2"
Expected output: ...
license: MIT
---
```

**Optional fields:**
```yaml
license: MIT (default) | Apache-2.0 | GPL-3.0-or-later
compatibility: Tool/version requirements
metadata:
domain: computational-modeling | documentation | publication | execution
maturity: alpha | beta | stable
audience: modelers | researchers | data scientists
audience: modelers | researchers | data-scientists
category: documentation | quality-assurance | execution | publication
---
```

See [CONTRIBUTING.md](CONTRIBUTING.md) and [AGENTS.md](AGENTS.md) for full guidance.
See [CONTRIBUTING.md](CONTRIBUTING.md), [AGENTS.md](AGENTS.md), and [docs/VALIDATION.md](docs/VALIDATION.md) for full guidance.

## Roadmap
