dcm2bids — DICOM to BIDS Conversion Skill

A production-grade agent skill that converts raw neuroimaging DICOM datasets into BIDS format. Battle-tested across 20+ real cohorts including UKB, HCPA, HCPYA, ADNI, NACC, and EBDS — on both Linux and Windows.

The skill wraps the standard tools (dcm2bids, dcm2niix, bids-validator) in a structured 8-step workflow that an LLM agent can execute end-to-end: inspecting your dataset, classifying series, writing a config, running the conversion, validating the output, and producing a durable, reproducible report.

Platform-agnostic. The skill is a self-contained folder of Markdown instructions plus tool helpers — no platform-specific runtime dependency. Ships with first-class support for Claude Code and OpenCode (via the bundled .opencode/ agent and command definitions), and adapts to any LLM-agent runtime that can read a SKILL.md-style folder of instructions.


Why this skill

Converting DICOM to BIDS by hand is mechanical but error-prone. The typical pain points:

  • Discovering the dataset structure — raw DICOMs, zips, NRRD, mixed bags
  • Writing the dcm2bids config — every scanner uses different protocol names
  • Looping over subjects — one bad subject shouldn't kill the whole batch
  • Reproducibility — six months later, what config did you actually use?
  • Validation hygiene — was bids-validator actually run? Were warnings ignored?

This skill takes the typical "I'll just write a quick script" task and turns it into a workflow that is reproducible, auditable, failure-tolerant, and cross-platform out of the box.


Key features

  • Cross-platform: every step has tested Linux/macOS (bash) and Windows (PowerShell) implementations.
  • Handles real-world messes: compressed archives (.zip/.tgz/.tar.gz), extensionless DICOM files (auto-detected via DICM magic bytes), mixed directories with NIfTI / NRRD / derived data.
  • Multi-subject loops with failure tolerance: try/catch around each subject's dcm2bids call; one bad subject doesn't abort the batch; failed subjects are surfaced in the final report.
  • Multi-session support: detects ses-* / visit* / date-stamped session folders and maps them to BIDS ses-<label>.
  • Protocol variation detection: for multi-subject cohorts, samples multiple subjects' metadata and compares SeriesDescription sets so a single-subject config doesn't silently drop unmatched series.
  • Vendor-aware classification: built-in protocol-name → BIDS modality table covering common Siemens, GE, and Philips sequences (MPRAGE, SPACE, FLAIR, DTI, BOLD, field maps, etc.).
  • BIDS-compliant participant table: auto-populates participants.tsv with one row per converted subject.
  • Reproducible: persists the dcm2bids config and conversion logs to <output>/code/ alongside the data.
  • Durable report: writes conversion_report.md to the output directory, including verbatim config, validation results, and a list of any failed subjects.
  • Privacy-conscious: explicit guidance on PHI handling before conversion.
  • Validator-aware: runs bids-validator (Node CLI) and surfaces every warning with concrete suggested fixes.
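
The magic-bytes detection mentioned above can be sketched as follows. This is a minimal illustration, not the skill's own code: the function names are hypothetical, and it assumes DICOM Part 10 files, which carry the four bytes `DICM` directly after a 128-byte preamble.

```python
from pathlib import Path

DICOM_MAGIC = b"DICM"
PREAMBLE_LEN = 128  # DICOM Part 10 files start with a 128-byte preamble


def looks_like_dicom(path):
    """Return True if the file carries the DICM marker at byte offset 128."""
    try:
        with open(path, "rb") as fh:
            fh.seek(PREAMBLE_LEN)
            return fh.read(4) == DICOM_MAGIC
    except OSError:
        return False  # unreadable files are treated as non-DICOM


def find_extensionless_dicoms(root):
    """List files under root that have no extension but look like DICOM."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix == "" and looks_like_dicom(p)
    )
```

Files named after a UID contain dots, so `p.suffix` is not empty for them; a real probe would likely need a looser filename filter than this sketch uses.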

Battle-tested on real cohorts

This skill has been used to convert raw DICOM data from the following neuroimaging studies. All conversions verified by bids-validator and inspected for series-level correctness.

| Cohort | Type | Notes |
| --- | --- | --- |
| UK Biobank (UKB) | Population imaging, large-scale | Multi-modal (T1, T2, dMRI, rfMRI, tfMRI, SWI) |
| HCP Young Adult (HCPYA) | Connectome, 22–35 yr healthy adults | Multi-shell dMRI, multiband fMRI |
| HCP Aging (HCPA) | Connectome, 36–100 yr aging cohort | Multi-modal lifespan imaging |
| ADNI | Longitudinal Alzheimer's Disease | Multiple phases (ADNI 1/2/3/GO), longitudinal sessions |
| NACC | National Alzheimer's Coordinating Center | Clinical AD cohort, multi-site |
| EBDS | Early Brain Development Study (UNC) | Neonatal / infant imaging |
| ... | ... | ... |

Verified on:

  • Operating systems: Linux (UNC Longleaf HPC, Ubuntu 20.04/22.04), Windows 10/11
  • Python: 3.9 – 3.11
  • Cohort sizes: 1 subject (pilot) up to 1000+ subjects (population)
  • Session structures: single-session, multi-session (longitudinal), multi-site

If you successfully convert a new cohort with this skill, please open a PR to add it to this list.


Quick start

# 1. Install the prerequisites (one-time)
pip install "dcm2bids>=3.2,<4.0"
conda install -c conda-forge dcm2niix
npm install -g bids-validator

# 2. Drop this skill folder into your Claude Code skills directory
#    Linux/Mac:   ~/.claude/skills/dcm2bids/
#    Windows:     %USERPROFILE%\.claude\skills\dcm2bids\

# 3. In Claude Code, ask:
#    "Use dcm2bids to convert /path/to/dicoms to BIDS at /path/to/output"

That's it. The agent will walk through the 8-step workflow and produce a validated BIDS dataset plus conversion_report.md in the output directory.


Installation

The skill is a self-contained folder. Drop it wherever your agent platform discovers skills/rules. We support multiple runtimes:

Quickest path — install scripts

After cloning the repo, run:

./install.sh        # Linux / macOS
.\install.ps1       # Windows

The scripts:

  • Detect your OS and available package managers (apt / dnf / pacman / brew / conda / choco / scoop / winget)
  • Install dcm2bids, dcm2niix, Node.js, and bids-validator automatically wherever possible
  • Copy the skill into Claude Code's default skills directory
  • Report exactly what's left to install manually (if anything)

If you're using a runtime other than Claude Code, adapt the skill-copy step manually.

Conda one-shot (alternative)

If you prefer to set up everything in a fresh conda environment, use the bundled environment.yml:

conda env create -f environment.yml
conda activate dcm2bids-skill
npm install -g bids-validator      # not covered by environment.yml
./install.sh                        # copies the skill into place

This is the most reproducible path — your collaborators get the same Python / Node.js / dcm2niix versions you used.

Claude Code

Manual install — place the folder under:

  • Linux / macOS: ~/.claude/skills/dcm2bids/
  • Windows: %USERPROFILE%\.claude\skills\dcm2bids\

Restart Claude Code. The skill auto-triggers on prompts mentioning DICOM, BIDS, dcm2bids, dcm2niix, or "convert MRI scans".

Plugin install (when supported by your Claude Code version):

/plugin marketplace add github.com/acmlab/dcm2bids-skill
/plugin install dcm2bids

OpenCode

The .opencode/ subdirectory contains an agent (dicom2bids) and a command (convert-bids) compatible with OpenCode. Place the folder where OpenCode discovers skills and invoke the dicom2bids agent.

Other agent runtimes

The workflow itself is platform-neutral: SKILL.md is the entry point, with 8 step-by-step reference files under references/. Any LLM-agent runtime that can read a Markdown folder of instructions can execute it:

  1. Point your platform's skill / rule discovery mechanism at this folder
  2. Or copy SKILL.md + references/ contents into your platform's native instruction format (Cursor rules, Continue commands, custom system prompts, etc.)
  3. Or call the workflow procedurally — the references/ files are written as concrete bash / PowerShell command sequences that an agent can execute step by step

If you adapt this skill to a new runtime, please open a PR adding an "As a <runtime> skill" section here so others can benefit.


The 8-step workflow

Each step is documented in detail under references/.

| Step | What it does | Reference |
| --- | --- | --- |
| 1 | Verify CLI tools are installed and on PATH | 01_check_tools.md |
| 2 | Inspect dataset structure, probe for extensionless DICOM, identify subject/session folders, confirm labels with user | 02_analyze_dataset_structure.md |
| 2b | Extract compressed DICOM archives to a temporary working directory | 02b_handle_archives.md |
| 3 | Run dcm2bids_helper on a sample (or every subject for small cohorts); detect protocol variation | 03_read_dicom_metadata.md |
| 4 | Classify each series into BIDS modality (T1w/T2w/FLAIR/dwi/bold/...) using a vendor-aware protocol-name table | 04_classify_series.md |
| 5 | Write the dcm2bids_config.json | 05_write_config.md |
| 6 | Run dcm2bids (failure-tolerant loop), populate participants.tsv, persist config to code/ | 06_run_dcm2bids.md |
| 7 | Run bids-validator, preserve conversion logs, clean up working artifacts, write conversion_report.md | 07_validate_and_report.md |
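
The failure-tolerant loop of Step 6 might look roughly like the sketch below. It is an illustration only: `convert_subjects` and its `run` parameter are hypothetical names, and the skill's reference files describe the loop as bash / PowerShell rather than Python. The `-d`, `-p`, `-c`, and `-o` flags are the standard dcm2bids options for DICOM directory, participant label, config file, and output directory.

```python
import subprocess


def convert_subjects(subjects, config, output_dir, run=subprocess.run):
    """Run dcm2bids once per subject; collect failures instead of aborting.

    subjects maps a BIDS participant label to that subject's DICOM directory.
    run is injectable so the loop can be exercised without dcm2bids installed.
    """
    converted, failed = [], []
    for label, dicom_dir in subjects.items():
        cmd = ["dcm2bids", "-d", str(dicom_dir), "-p", label,
               "-c", str(config), "-o", str(output_dir)]
        try:
            run(cmd, check=True, capture_output=True, text=True)
            converted.append(label)
        except (subprocess.CalledProcessError, OSError) as exc:
            # Record the failure and move on to the next subject.
            failed.append((label, str(exc)))
    return converted, failed
```

The `failed` list is what a final report would surface; a multi-session variant would add `-s <session>` to the command.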

Usage scenarios

Single subject

Use dcm2bids to convert /data/sub01_dicom to BIDS at /data/bids_out.

Multi-subject (single session)

Use dcm2bids to convert all subjects under /data/cohort_dicom to BIDS at /data/cohort_bids. Each subdirectory is one subject.

Multi-subject, multi-session

Convert the longitudinal dataset at /data/longitudinal to BIDS at /data/longitudinal_bids. Each subject has a baseline and follow-up visit folder.

Compressed archives

The DICOMs are inside .zip archives, one per series. Convert /data/zipped_dicoms to BIDS at /data/bids_out.
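
For orientation, extracting archives into a separate working directory (Step 2b) could be sketched like this. The function name and the numbered target folders are assumptions made for illustration; the sketch relies on `shutil.unpack_archive`, which infers the format from the file name, and it should only be pointed at archives you trust.

```python
import shutil
from pathlib import Path

# Suffixes listed under Key features; matched case-sensitively because
# shutil.unpack_archive infers the format from the exact file name.
ARCHIVE_SUFFIXES = (".zip", ".tgz", ".tar.gz")


def extract_archives(source_dir, work_dir):
    """Unpack every supported archive under source_dir into its own folder."""
    work = Path(work_dir)
    archives = sorted(
        p for p in Path(source_dir).rglob("*")
        if p.is_file() and p.name.endswith(ARCHIVE_SUFFIXES)
    )
    extracted = []
    for index, archive in enumerate(archives):
        # A numbered prefix keeps same-named archives from colliding.
        target = work / f"{index:04d}_{archive.name}"
        target.mkdir(parents=True, exist_ok=True)
        shutil.unpack_archive(str(archive), str(target))
        extracted.append(target)
    return extracted
```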

Extensionless DICOM

The files in /data/legacy_dicoms have no extension but are DICOM internally. Convert to BIDS at /data/bids_out.

In every scenario the agent will confirm derived subject/session labels with you via an interactive prompt before running dcm2bids.
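
Deriving a label before that confirmation step could be sketched as below. The helper names are hypothetical; the only behaviour taken from this README is that non-alphanumeric characters are stripped from labels. The collision check is an added assumption, since two folder names can collapse to the same label.

```python
import re


def to_bids_label(raw):
    """Drop every character that is not a letter or digit."""
    label = re.sub(r"[^A-Za-z0-9]", "", raw)
    if not label:
        raise ValueError(f"no alphanumeric characters in {raw!r}")
    return label


def propose_labels(folder_names):
    """Map folder names to labels and flag collisions for user confirmation."""
    proposed = {name: to_bids_label(name) for name in folder_names}
    seen, collisions = {}, []
    for name, label in proposed.items():
        if label in seen:
            # (first folder, clashing folder, shared label)
            collisions.append((seen[label], name, label))
        else:
            seen[label] = name
    return proposed, collisions
```

For example, `propose_labels(["S-01", "S_01"])` maps both folders to `S01` and reports the clash, which is exactly the kind of case worth resolving at the prompt.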


Output structure

After a successful conversion:

<output_dir>/
├── dataset_description.json
├── participants.tsv             # populated with one row per converted subject
├── participants.json
├── README, CHANGES, .bidsignore
├── conversion_report.md         # durable, markdown-formatted run summary
├── code/                        # reproducibility artifacts
│   ├── dcm2bids_config.json     # the exact config used
│   └── dcm2bids_logs/           # one log file per subject
│       ├── scaffold_<ts>.log
│       └── sub-<label>_<ts>.log
└── sub-<label>/
    ├── anat/
    │   ├── sub-<label>_T1w.nii.gz       (+ .json)
    │   ├── sub-<label>_run-01_T2w.nii.gz (+ .json)
    │   └── ...
    ├── dwi/
    │   ├── sub-<label>_dwi.nii.gz
    │   ├── sub-<label>_dwi.bval
    │   ├── sub-<label>_dwi.bvec
    │   └── sub-<label>_dwi.json
    └── func/...

For multi-session datasets, each sub-<label>/ folder contains ses-<label>/ folders, and the datatype folders (anat/, dwi/, func/) sit inside them.
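
As an illustration of how participants.tsv can be derived from the tree above, here is a minimal sketch. The function name is hypothetical and the skill may populate the file differently; the sketch writes only the participant_id column, one row per sub-* folder.

```python
import csv
from pathlib import Path


def write_participants_tsv(bids_root):
    """Write participants.tsv with one participant_id row per sub-* folder."""
    root = Path(bids_root)
    ids = sorted(p.name for p in root.glob("sub-*") if p.is_dir())
    with open(root / "participants.tsv", "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["participant_id"])
        writer.writerows([pid] for pid in ids)
    return ids
```

Columns such as age or sex would be merged in afterwards from a separate demographics file.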


What this skill does NOT do

To keep the scope tight and the conversion correct:

  • Non-MRI modalities: PET, EEG, MEG, iEEG, and microscopy data are not currently supported. Use BIDS converters specific to those modalities.
  • De-identification: this skill does not strip PHI from DICOM headers. See Privacy and PHI below.
  • Manual editing of NIfTI: defacing, brain extraction, motion correction, and other preprocessing belong in derivatives/ and downstream pipelines (fMRIPrep, QSIPrep, ANTs, etc.).
  • Cohort metadata enrichment: participants.tsv is populated with participant_id only. Filling in age, sex, group, scanner, and other columns is the user's responsibility (typically from a separate demographics file).
  • dcm2bids 4.x: currently tested with dcm2bids 3.2.x. The dcm2bids 4.x series is not yet supported.

Privacy and PHI

WARNING: DICOM files often contain Protected Health Information (PHI): PatientName, PatientID, PatientBirthDate, StudyDate, InstitutionName, ReferringPhysicianName, and other identifying fields.

This skill does not de-identify DICOM data. If your dataset contains PHI and you intend to share, archive, or upload the resulting BIDS dataset:

  1. De-identify the source DICOM before running this skill, OR
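  2. Pass -ba y to dcm2niix (via dcm2niixOptions in the dcm2bids config) so that the JSON sidecars are anonymized; this does not alter the source DICOM or the image data, OR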
  3. Run a downstream tool like pydicom's de-identification or BIDS-defacing utilities on the BIDS output before sharing.

For datasets used under IRB/HIPAA/GDPR governance, confirm with your institution's privacy officer before publishing or sharing the output.
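
A quick pre-sharing check of the JSON sidecars might look like the sketch below. It is not a de-identification tool and not part of the skill: the function name is hypothetical, the key list is simply the set of fields named in the warning above, and only top-level keys are inspected.

```python
import json
from pathlib import Path

# Field names taken from the warning above; extend as your governance requires.
PHI_KEYS = {"PatientName", "PatientID", "PatientBirthDate", "StudyDate",
            "InstitutionName", "ReferringPhysicianName"}


def find_phi_in_sidecars(bids_root):
    """Return {sidecar path: sorted PHI-like keys found at the top level}."""
    hits = {}
    for sidecar in sorted(Path(bids_root).rglob("*.json")):
        try:
            data = json.loads(sidecar.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed JSON is skipped, not flagged
        if isinstance(data, dict):
            found = sorted(PHI_KEYS.intersection(data))
            if found:
                hits[str(sidecar)] = found
    return hits
```

An empty result only means none of these keys appear in the sidecars; it says nothing about faces in the images or identifiers in file names.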


Configuration reference

The dcm2bids_config.json written in Step 5 maps DICOM series metadata to BIDS output names. Minimal example:

{
  "descriptions": [
    {
      "datatype": "anat",
      "suffix": "T1w",
      "criteria": { "SeriesDescription": "mp_rage*" }
    },
    {
      "datatype": "anat",
      "suffix": "T2w",
      "criteria": { "SeriesDescription": "t2_spc_1mm*" }
    },
    {
      "datatype": "dwi",
      "suffix": "dwi",
      "criteria": { "SeriesDescription": "ep2d_diff*" }
    }
  ]
}

The agent writes a config automatically. You can inspect and hand-edit <output_dir>/code/dcm2bids_config.json between runs. See assets/example_config.json and references/05_write_config.md for the full schema and vendor-specific patterns.
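
When hand-editing the config, it can help to check which SeriesDescription values no entry would claim. The sketch below is a rough approximation under stated assumptions: `unmatched_series` is a hypothetical helper, it looks only at string-valued SeriesDescription criteria, and it uses shell-style wildcard matching via `fnmatchcase`; dcm2bids' own matcher supports more than this and may differ in case handling.

```python
from fnmatch import fnmatchcase


def unmatched_series(config, series_descriptions):
    """Return SeriesDescription values that no config entry would claim.

    Only string-valued SeriesDescription criteria are considered.
    """
    patterns = [
        d["criteria"]["SeriesDescription"]
        for d in config.get("descriptions", [])
        if isinstance(d.get("criteria", {}).get("SeriesDescription"), str)
    ]
    return sorted(
        s for s in set(series_descriptions)
        if not any(fnmatchcase(s, pat) for pat in patterns)
    )
```

Feeding it the union of series descriptions sampled from several subjects gives a cheap view of protocol variation: anything returned would be left out of the BIDS output under the current config.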


Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| bids-validator: command not found | Installed the PyPI bids-validator (Python lib only) | npm install -g bids-validator |
| 0 series detected | DICOMs have unusual extensions, or path points at NRRD/NIfTI directory | Check Step 2c magic-bytes probe; verify path |
| FAILED: sub-<label> mid-loop | One subject's DICOM is corrupt | Check <output>/code/dcm2bids_logs/sub-<label>_*.log for dcm2niix error |
| Subject ID has hyphens / underscores | BIDS spec forbids non-alphanumerics in labels | Skill strips them automatically; confirm derived label at the prompt |
| Two series produce the same BIDS name | Same protocol scanned twice | Expected: dcm2bids auto-assigns run-01 / run-02 |
| Unmatched series in tmp_dcm2bids/ | Localizer / scout / calibration scans | Expected: these are intentionally dropped from BIDS |
| tmp_dcm2bids/ still present after run | Validation failed; cleanup skipped to preserve diagnostics | Inspect logs; fix issue; re-run |
| Authors should contain an array warning | Single-author dataset | Soft warning; either add co-authors to dataset_description.json or ignore (validator allows it) |

Requirements

The install scripts (install.sh / install.ps1) handle most of these automatically — you only need to install manually if you skip the scripts.

| Component | Minimum | Recommended | Auto-installed? | How to install manually |
| --- | --- | --- | --- | --- |
| Python | 3.8 | 3.10 | No (must exist) | system / conda / pyenv |
| dcm2bids | 3.2.0 | 3.2.x | Yes (pip) | pip install "dcm2bids>=3.2,<4.0" |
| dcm2niix | v1.0.20220720 | v1.0.20240202+ | Yes (conda / apt / dnf / brew / choco / scoop) | conda install -c conda-forge dcm2niix or binaries |
| Node.js | 18 | 20+ | Yes (conda / apt / dnf / brew / choco / scoop / winget) | OS package or nodejs.org |
| bids-validator | 1.14 | 1.15 | Yes (npm, after Node.js) | npm install -g bids-validator |
| OS | Linux / macOS / Windows | Linux | | |
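
A check that the three command-line tools are reachable on PATH (Step 1) can be sketched in a few lines. The function name and the injectable `which` parameter are illustrative assumptions; the sketch does not verify versions against the table above.

```python
import shutil

REQUIRED_TOOLS = ("dcm2bids", "dcm2niix", "bids-validator")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools()` with no arguments returns an empty list when everything is installed, and otherwise names what still needs to be installed manually.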

Limitations and known issues

  • dcm2bids 4.x not yet supported. Pinned to 3.2.x because the 4.x CLI is changing.
  • bids-validator@1.x is deprecated. Still works; migration to Deno-based @bids/validator 2.x is planned once that CLI stabilizes.
  • Classification table is not exhaustive. The Siemens/GE/Philips patterns cover most academic scanners but not every sequence. Hand-edit the config for unusual sequences (e.g. multi-echo, MP2RAGE, ASL).
  • participants.tsv columns minimal. Only participant_id is auto-filled.
  • Source DICOM not archived. The skill writes to <output>/code/ but does not copy source DICOM to <output>/sourcedata/. Add that manually if your IRB requires source preservation.

Contributing

Bug reports, feature requests, and PRs welcome — especially:

  • New cohort test reports (add a row to Battle-tested on real cohorts)
  • Additional vendor protocol-name patterns
  • Translations of references/ to other languages
  • CI / smoke-test contributions

See CONTRIBUTING.md for setup, code style, and review process.


Citation

If this skill contributed to a published study, please cite:

ACMLab. (2026). dcm2bids — DICOM to BIDS Conversion Skill for Claude Code.
ACMLab, University of North Carolina at Chapel Hill.
https://github.com/acmlab/dcm2bids-skill

A formal DOI release is planned via Zenodo for v1.0.0.


License

See LICENSE in the repository root.


Authors and credits

Developed by ACMLab at the University of North Carolina at Chapel Hill.

Built on top of:

