
ENABLED_SKILLS path mismatch between start_local.sh scan and skills_manager.py runtime resolution #465

@taka-yayoi

Description

Summary

start_local.sh Step 7 writes skill names to ENABLED_SKILLS in .env.local by scanning both .claude/skills/ and databricks-skills/. However, at runtime skills_manager.py resolves skill paths exclusively under databricks-skills/. This mismatch causes SkillNotFoundError on startup for any skill that exists in .claude/skills/ but not in databricks-skills/ (e.g., MLflow skills like agent-evaluation, instrumenting-with-mlflow-tracing).

Steps to Reproduce

  1. Clone the repo and cd ai-dev-kit/databricks-builder-app
  2. Run ./scripts/start_local.sh --profile <profile> --lakebase-id <id>
  3. If .claude/skills/ already contains skills from a prior install_skills.sh run (e.g., for Claude Code setup), Step 7 picks up all of them
  4. Backend fails to start with:
server.services.skills_manager.SkillNotFoundError: Skill 'agent-evaluation' not found.
Directory does not exist: .../databricks-skills/agent-evaluation.
Check ENABLED_SKILLS in your .env file.

Root Cause

Step 7 in start_local.sh scans two directories:

for skills_root in "$PROJECT_DIR/.claude/skills" "$REPO_ROOT/databricks-skills"; do

It collects the name of every subdirectory containing a SKILL.md file and writes the combined list to ENABLED_SKILLS.

At runtime, skills_manager.py (line 226) looks for the skill directory under databricks-skills/<name> only. Skills installed to .claude/skills/ (such as MLflow and APX skills fetched from external repos) are not found, causing the application to crash on startup.
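The mismatch can be reproduced in isolation with the sketch below. The fixture paths are throwaway temp directories, and `example-skill` is an illustrative name; `agent-evaluation` and the two-root scan loop come from the issue itself. The final loop stands in for the `databricks-skills/<name>`-only check that skills_manager.py performs at runtime:

```shell
# Hypothetical fixture: one skill installed only under .claude/skills/
# (as install_skills.sh would leave it) and one under databricks-skills/.
PROJECT_DIR="$(mktemp -d)"
REPO_ROOT="$(mktemp -d)"
mkdir -p "$PROJECT_DIR/.claude/skills/agent-evaluation"
touch "$PROJECT_DIR/.claude/skills/agent-evaluation/SKILL.md"
mkdir -p "$REPO_ROOT/databricks-skills/example-skill"
touch "$REPO_ROOT/databricks-skills/example-skill/SKILL.md"

# Step 7's scan: both roots contribute names to ENABLED_SKILLS.
enabled=()
for skills_root in "$PROJECT_DIR/.claude/skills" "$REPO_ROOT/databricks-skills"; do
  for skill_md in "$skills_root"/*/SKILL.md; do
    [ -f "$skill_md" ] || continue
    enabled+=("$(basename "$(dirname "$skill_md")")")
  done
done
echo "ENABLED_SKILLS=$(IFS=,; echo "${enabled[*]}")"

# Runtime resolution only checks databricks-skills/<name>, so the
# .claude/skills-only skill fails here -> SkillNotFoundError on startup.
for name in "${enabled[@]}"; do
  if [ ! -d "$REPO_ROOT/databricks-skills/$name" ]; then
    echo "would crash at startup: '$name' is not under databricks-skills/"
  fi
done
```

Running this prints an ENABLED_SKILLS line containing both names, then flags `agent-evaluation` as unresolvable, which mirrors the crash above.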

Impact

  • The app fails to start entirely — this is not a degraded experience, it's a hard crash.
  • The error message points to ENABLED_SKILLS in .env but doesn't explain the path mismatch, making it difficult to diagnose.
  • The issue recurs on every start_local.sh execution because Step 7 overwrites ENABLED_SKILLS each time.
  • Users who have previously run install_skills.sh for Claude Code in the same repo tree will always hit this bug.

Current Workaround

Manually clear ENABLED_SKILLS after every start_local.sh run:

sed -i '' 's/^ENABLED_SKILLS=.*/ENABLED_SKILLS=/' .env.local
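Note the command above uses BSD sed's mandatory empty suffix after `-i`, so it fails on GNU sed (Linux). A portable variant, shown here against a throwaway copy of `.env.local` with illustrative field values:

```shell
# '-i.bak' (suffix attached to the flag) is accepted by both BSD sed
# (macOS) and GNU sed (Linux); it leaves a .bak backup behind.
env_file="$(mktemp)"
printf 'DATABRICKS_PROFILE=dev\nENABLED_SKILLS=agent-evaluation,example-skill\n' > "$env_file"

sed -i.bak 's/^ENABLED_SKILLS=.*/ENABLED_SKILLS=/' "$env_file"

grep '^ENABLED_SKILLS=' "$env_file"   # prints: ENABLED_SKILLS=
```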

Suggested Fix

Options (not mutually exclusive):

  1. Align the scan scope with the runtime scope: Step 7 should only scan databricks-skills/ (or wherever skills_manager.py actually resolves paths at runtime).
  2. Add a --skip-skills flag: Allow users to skip Step 7 entirely, similar to --skip-lakebase.
  3. Preserve existing ENABLED_SKILLS: When .env.local already exists, don't overwrite ENABLED_SKILLS (the same way the script preserves other fields).
  4. Validate at scan time: Before adding a skill name to ENABLED_SKILLS, verify that the directory exists in the path that skills_manager.py will use at runtime.
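Option 4 could look like the sketch below. The fixture directories and the `RUNTIME_SKILLS_DIR` variable name are hypothetical, but the scan loop mirrors the one quoted from Step 7, with a runtime-path check added before a name is enabled:

```shell
# Hypothetical fixture: a skill left under .claude/skills/ by a prior
# install_skills.sh run, plus one real skill under databricks-skills/.
PROJECT_DIR="$(mktemp -d)"
REPO_ROOT="$(mktemp -d)"
mkdir -p "$PROJECT_DIR/.claude/skills/agent-evaluation" \
         "$REPO_ROOT/databricks-skills/example-skill"
touch "$PROJECT_DIR/.claude/skills/agent-evaluation/SKILL.md" \
      "$REPO_ROOT/databricks-skills/example-skill/SKILL.md"

# The directory skills_manager.py resolves skills from at runtime.
RUNTIME_SKILLS_DIR="$REPO_ROOT/databricks-skills"

enabled_skills=()
for skills_root in "$PROJECT_DIR/.claude/skills" "$REPO_ROOT/databricks-skills"; do
  for skill_md in "$skills_root"/*/SKILL.md; do
    [ -f "$skill_md" ] || continue
    name="$(basename "$(dirname "$skill_md")")"
    # Only enable skills that will actually resolve at runtime.
    if [ -d "$RUNTIME_SKILLS_DIR/$name" ]; then
      enabled_skills+=("$name")
    else
      echo "Skipping '$name': not found under databricks-skills/" >&2
    fi
  done
done
echo "ENABLED_SKILLS=$(IFS=,; echo "${enabled_skills[*]}")"
```

With this check in place, ENABLED_SKILLS contains only `example-skill` and the skipped name is reported on stderr instead of crashing the backend later.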

Environment

  • macOS (Apple Silicon)
  • Databricks CLI v0.287.0+
  • Python 3.12
  • ai-dev-kit main branch (April 2026)
