Runabilly

Runabilly spins up a disposable Docker container, clones an open-source project into it, and uses Claude to automatically explore the repository, install its dependencies, build it, and report the results. It was created to support software evaluation workflows for BOSC (the Bioinformatics Open Source Conference).

Prerequisites

Docker (version 20.10 or later)
Claude Code (for the /runabilly slash command)

The script runs preflight checks automatically. It verifies that Docker is installed and running, checks the minimum version, warns if Docker has less than 4 GB of memory available (common on Docker Desktop for macOS/Windows), and warns if the Docker root directory has less than 20 GB of free disk space (which matters for LFS-heavy and conda-heavy bioinformatics repos). It works on both Linux and macOS.
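
The minimum-version check described above could be sketched roughly as follows. This is an illustration, not the code in runabilly.sh: the helper name `version_at_least` is hypothetical, and it compares only the major and minor components. The version string is hard-coded so the sketch runs without Docker; on a real host it could come from `docker version --format '{{.Server.Version}}'`.

```shell
# Hypothetical helper (not taken from runabilly.sh): succeeds when the
# version in $1 is at least the minimum in $2, comparing major.minor only.
version_at_least() {
  local have_major have_minor want_major want_minor rest
  have_major=${1%%.*}
  rest=${1#*.}
  have_minor=${rest%%.*}
  want_major=${2%%.*}
  rest=${2#*.}
  want_minor=${rest%%.*}
  if [ "$have_major" -gt "$want_major" ]; then return 0; fi
  if [ "$have_major" -lt "$want_major" ]; then return 1; fi
  [ "$have_minor" -ge "$want_minor" ]
}

# Example value, hard-coded so the sketch runs without Docker.
docker_version="24.0.7"
if version_at_least "$docker_version" "20.10"; then
  echo "Docker $docker_version meets the 20.10 minimum"
else
  echo "Docker $docker_version is older than 20.10" >&2
fi
```

Both arguments are assumed to contain at least a major and a minor component separated by a dot.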

Quick start

Using the Claude Code slash command (recommended)

These instructions assume you are already running Claude Code from this project directory. From the Claude Code prompt, run:

/runabilly https://github.com/jqlang/jq

Claude will automatically:

Build the Docker base image (if needed) and create a container
Clone the repo and explore its structure
Detect the build system and install the required toolchain
Attempt to build the project (up to 3 retries)
Print a structured report with the results
Clean up the container
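
The build-with-retries step can be pictured as a bounded loop. The sketch below is illustrative only: `retry` is a hypothetical helper, and it assumes "up to 3 retries" means three attempts in total, which the list above does not specify. A real evaluation would likely change something between attempts (install a missing package, try another build target); this loop simply reruns the same command.

```shell
# Hypothetical helper: run a command up to $1 times, stopping at the
# first success. Returns non-zero if every attempt fails.
retry() {
  local max_attempts attempt
  max_attempts=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt of $max_attempts failed" >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# `true` stands in for a real build command such as `make`.
retry 3 true && echo "build succeeded"
```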

To keep the container running after the build for manual exploration:

/runabilly --keep https://github.com/jqlang/jq

Claude will skip cleanup and print instructions for entering the container.

Using the shell script directly

# Create a container and clone a project into it
./runabilly.sh https://github.com/jqlang/jq

# Output:
#   RUNABILLY_CONTAINER=runa-jq-a1b2c3d4
#   RUNABILLY_WORKDIR=/workspace/project

# Run commands inside the container
docker exec runa-jq-a1b2c3d4 bash -c 'cd /workspace/project && ls'

# Clean up when done
./runabilly.sh --cleanup runa-jq-a1b2c3d4

# Or use --keep to get an interactive container with entry instructions
./runabilly.sh --keep https://github.com/jqlang/jq

# Then enter it with:
docker exec -it runa-jq-a1b2c3d4 bash
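
Because the script reports the container name and working directory as KEY=value lines, a wrapper can capture them instead of copying them by hand. The sketch below assumes each line starts with the key exactly as shown in the sample output; `get_value` is a hypothetical helper, and the output is hard-coded here so the example runs without Docker.

```shell
# Sample output copied from the example above. In practice this would be:
#   output=$(./runabilly.sh https://github.com/jqlang/jq)
output='RUNABILLY_CONTAINER=runa-jq-a1b2c3d4
RUNABILLY_WORKDIR=/workspace/project'

# Hypothetical helper: print the value of the first KEY=value line on
# stdin whose key matches $1.
get_value() {
  sed -n "s/^$1=//p" | head -n 1
}

container=$(printf '%s\n' "$output" | get_value RUNABILLY_CONTAINER)
workdir=$(printf '%s\n' "$output" | get_value RUNABILLY_WORKDIR)
echo "container=$container workdir=$workdir"

# The captured values can then drive later steps, for example:
#   docker exec "$container" bash -c "cd $workdir && ls"
#   ./runabilly.sh --cleanup "$container"
```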

How it works

Runabilly uses a minimal Ubuntu 24.04 base image with only basic tools (git, git-lfs, curl, build-essential, etc.). No language-specific toolchains are pre-installed — they get added as needed for each project. This keeps the base image small and avoids version conflicts.

Each project gets its own isolated container named runa-<reponame>-<hash>, capped at 4 GB of memory. Everything runs inside the container via docker exec, so nothing is installed on your host machine.
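
To make the naming scheme concrete, here is a small sketch that derives the runa-<reponame>-<hash> name from a repository URL. The helper names are hypothetical, and because this README does not say how the hash is computed, the sketch takes it as an argument rather than guessing.

```shell
# Hypothetical helper: extract the repository name from a clone URL.
repo_name() {
  local url name
  url=${1%/}        # drop a trailing slash, if any
  name=${url##*/}   # keep the last path segment
  echo "${name%.git}"
}

# Hypothetical helper: build the container name from a URL ($1) and a
# hash suffix ($2). How runabilly.sh derives the hash is not documented
# here, so it is passed in.
container_name() {
  echo "runa-$(repo_name "$1")-$2"
}

container_name https://github.com/jqlang/jq a1b2c3d4
```

The 4 GB cap corresponds to the kind of limit that `docker run --memory 4g` sets, though the exact flags the script passes are not shown in this README.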

Environment policies

Network access: containers have full outbound internet access during the entire evaluation. Builds can apt-get install, pip install, cargo fetch, R install.packages, conda install, git clone submodules, etc. Inbound network is not configured.
GPU: no GPU is exposed to the container. CUDA/ROCm/Metal-only projects are reported as WARNING with the GPU requirement noted as the hurdle.
Git LFS: git-lfs is installed in the base image, but clones run with GIT_LFS_SKIP_SMUDGE=1 so LFS-tracked files remain pointer stubs by default. This prevents surprise multi-gigabyte pulls on data-heavy bioinformatics repos. When the build or tests actually need the LFS data, Claude opts in per repo with git lfs install --local && git lfs pull inside the container.
Disposability: containers are torn down at the end of each run unless --keep is passed. Nothing persists between evaluations.
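
For the Git LFS policy, it can help to tell whether a checked-out file is still a pointer stub before deciding to pull. Git LFS pointer files are small text files whose first line is `version https://git-lfs.github.com/spec/v1`, so a check can be sketched as below. `is_lfs_pointer` is a hypothetical helper, not part of Runabilly, and the sample file contents are made up for the demonstration.

```shell
# Hypothetical helper: succeeds when the file in $1 looks like a Git LFS
# pointer stub (its first line is the LFS spec version line).
is_lfs_pointer() {
  head -n 1 "$1" 2>/dev/null |
    grep -q '^version https://git-lfs\.github\.com/spec/v1$'
}

# Demonstration on a throwaway file with made-up pointer contents.
sample=$(mktemp)
printf '%s\n' \
  'version https://git-lfs.github.com/spec/v1' \
  'oid sha256:0000000000000000000000000000000000000000000000000000000000000000' \
  'size 12345' > "$sample"
if is_lfs_pointer "$sample"; then
  echo "pointer stub: run 'git lfs install --local && git lfs pull' if the build needs the data"
fi
rm -f "$sample"
```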

Report output

Each evaluation produces a structured report with:

Build result

SUCCESS — build completes AND the project's tests run and pass (or no test infrastructure is present). Tests are required when present: a passing build with failing tests is not a SUCCESS.
WARNING — build completes but full test validation is blocked by an unavoidable environmental hurdle (e.g. Docker-in-Docker, large external databases, paid API keys, GPU-only). The code looks healthy; the environment can't validate it. The specific hurdle is named in the report.
FAILURE — build fails after retries, OR tests run but fail, OR the 1-hour timeout is exceeded
UNDEFINED — URL isn't a buildable repo (e.g. Kaggle homepage, documentation site, dataset collection)
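
The four outcomes above amount to a small decision procedure. The sketch below encodes one reading of those rules; the function name and its argument vocabulary are hypothetical, and in practice the verdict comes from Claude's evaluation rather than from a script like this.

```shell
# Arguments (all names and values are hypothetical):
#   $1 buildable: yes|no   whether the URL points at a buildable repo
#   $2 build:     ok|fail  build outcome after retries
#   $3 tests:     passed|failed|absent|blocked
#   $4 timed_out: yes|no   whether the 1-hour cap was hit
classify_result() {
  local buildable build tests timed_out
  buildable=$1; build=$2; tests=$3; timed_out=$4
  if [ "$buildable" = no ]; then
    echo UNDEFINED; return 0
  fi
  if [ "$timed_out" = yes ] || [ "$build" = fail ] || [ "$tests" = failed ]; then
    echo FAILURE; return 0
  fi
  if [ "$tests" = blocked ]; then
    echo WARNING; return 0   # build is fine, environment blocks validation
  fi
  echo SUCCESS               # tests passed, or no test infrastructure
}

classify_result yes ok passed no
classify_result yes ok blocked no
```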

Difficulty rating

A composite rating based on four sub-scores (each LOW / MEDIUM / HIGH):

| Factor | LOW | MEDIUM | HIGH |
| --- | --- | --- | --- |
| Time | < 60s | 60s–300s | > 300s |
| Dependencies | < 10 packages | 10–50 | > 50 or multiple toolchains |
| Exoticness | Standard build system, no workarounds | Less common build system or minor workarounds | Custom scripts, multi-stage setup, Docker-in-Docker, etc. |
| Divergence | Documented build path worked on first try | Minor adjustments needed | Documented path failed; alternate route required, or no docs |

Roll-up: EASY (all LOW), MODERATE (any MEDIUM, no HIGH), HARD (any HIGH), IMPRACTICAL (can't complete in a disposable container).
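
The time thresholds and the roll-up rule can be written out as a short sketch. The function names are hypothetical. IMPRACTICAL is not derivable from the four sub-scores (it depends on whether the evaluation could complete at all), so the roll-up here covers only EASY, MODERATE, and HARD.

```shell
# Hypothetical helper: map a build time in whole seconds to a sub-score,
# using the Time thresholds above.
time_score() {
  if [ "$1" -lt 60 ]; then
    echo LOW
  elif [ "$1" -le 300 ]; then
    echo MEDIUM
  else
    echo HIGH
  fi
}

# Hypothetical helper: combine any number of sub-scores into the
# composite rating (any HIGH wins, then any MEDIUM, else EASY).
rollup() {
  local result score
  result=EASY
  for score in "$@"; do
    case $score in
      HIGH) echo HARD; return 0 ;;
      MEDIUM) result=MODERATE ;;
    esac
  done
  echo "$result"
}

# Example: a 45-second build with one MEDIUM factor.
rollup "$(time_score 45)" LOW MEDIUM LOW
```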

Timeout

Evaluations are capped at 1 hour. If the build hasn't completed by then, the container is cleaned up and the result is reported as FAILURE.
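
One simple way to picture the cap is an elapsed-time check against a start timestamp. This is only a sketch under assumptions: `deadline_exceeded` and `LIMIT_SECONDS` are hypothetical names, and this README does not describe how Runabilly actually enforces the limit.

```shell
LIMIT_SECONDS=3600  # the 1-hour cap

# Hypothetical helper: succeeds once $2 (now) is at least LIMIT_SECONDS
# past $1 (start). Both arguments are Unix epoch seconds.
deadline_exceeded() {
  [ $(( $2 - $1 )) -ge "$LIMIT_SECONDS" ]
}

start=$(date +%s)
# ... build steps would run here ...
now=$(date +%s)
if deadline_exceeded "$start" "$now"; then
  echo "FAILURE: 1-hour timeout exceeded"
else
  echo "within time budget"
fi
```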

File layout

| File | Purpose |
| --- | --- |
| `Dockerfile` | Base Ubuntu 24.04 image definition |
| `runabilly.sh` | Container lifecycle script (create, clone, cleanup) |
| `.claude/skills/runabilly/SKILL.md` | Claude Code `/runabilly` skill definition |
| `.claude/settings.local.json` | Pre-approved Docker permission patterns |
| `CLAUDE.md` | Project conventions for Claude Code |
