From aa40610b8abb51e9fe41cf554ce7e9c652b4c94b Mon Sep 17 00:00:00 2001
From: Dmitry
Date: Sun, 5 Apr 2026 13:21:15 -0700
Subject: [PATCH] Add CLAUDE.md files for Claude Code context

Root CLAUDE.md covers project identity, Docker-based testing workflow,
code standards, and git policy. Nested files document adapters, content
fetchers, Rails integration, and testing conventions.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 CLAUDE.md                                     | 81 +++++++++++++++++++
 lib/llm_classifier/adapters/CLAUDE.md         | 22 +++++
 lib/llm_classifier/content_fetchers/CLAUDE.md | 27 +++++++
 lib/llm_classifier/rails/CLAUDE.md            | 25 ++++++
 spec/CLAUDE.md                                | 34 ++++++++
 5 files changed, 189 insertions(+)
 create mode 100644 CLAUDE.md
 create mode 100644 lib/llm_classifier/adapters/CLAUDE.md
 create mode 100644 lib/llm_classifier/content_fetchers/CLAUDE.md
 create mode 100644 lib/llm_classifier/rails/CLAUDE.md
 create mode 100644 spec/CLAUDE.md

diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..369b4ae
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,81 @@
+# CLAUDE.md
+
+LlmClassifier - Ruby gem for building LLM-powered classifiers with a clean DSL. Supports multiple LLM backends (ruby_llm, OpenAI, Anthropic) and optional Rails integration.
+
+- Ruby >= 3.2, RSpec, RuboCop, Zeitwerk autoloading
+- No Rails dependency in core; Rails integration is opt-in via `lib/llm_classifier/rails/`
+- CI tests against Ruby 3.4 and 4.0
+
+## Development with Docker
+
+Ruby is not installed on the host. Use Docker to run tests and linting:
+
+```bash
+# Run tests and rubocop (Ruby 3.4)
+docker.exe run --rm -v "$(wslpath -w "$(pwd)"):/app" -w /app ruby:3.4-slim \
+  bash -c "apt-get update -qq && apt-get install -y -qq build-essential git 2>/dev/null && \
+    gem install bundler --no-document && bundle install --quiet && \
+    bundle exec rspec && bundle exec rubocop"
+
+# Rubocop only
+docker.exe run --rm -v "$(wslpath -w "$(pwd)"):/app" -w /app ruby:3.4-slim \
+  bash -c "apt-get update -qq && apt-get install -y -qq build-essential git 2>/dev/null && \
+    gem install bundler --no-document && bundle install --quiet && \
+    bundle exec rubocop"
+
+# Single spec file
+docker.exe run --rm -v "$(wslpath -w "$(pwd)"):/app" -w /app ruby:3.4-slim \
+  bash -c "apt-get update -qq && apt-get install -y -qq build-essential git 2>/dev/null && \
+    gem install bundler --no-document && bundle install --quiet && \
+    bundle exec rspec spec/llm_classifier/classifier_spec.rb"
+```
+
+Docker Desktop must be running on Windows. The `docker.exe` command is used because Docker runs via WSL2 integration. The `-v` flag bind-mounts the project so edits on host are immediately visible.
+
+A `.devcontainer/` setup also exists for VS Code Dev Containers.
+
+## Quick Commands (inside Docker)
+
+```bash
+bundle exec rspec                                          # all tests
+bundle exec rspec spec/llm_classifier/classifier_spec.rb   # single file
+bundle exec rubocop                                        # all files
+bundle exec rubocop -a                                     # auto-correct
+gem build llm_classifier.gemspec                           # build gem
+```
+
+## Project Structure
+
+All sibling projects are located in `/home/axium/projects/`. The `prospector` gem depends on `llm_classifier`.
+
+## Code Standards
+
+- Double-quoted strings (enforced by RuboCop)
+- Max line length: 120 characters
+- Max method length: 20 lines
+- RSpec example max: 15 lines, max 6 expectations per example
+- `Style/HashExcept` disabled (requires ActiveSupport)
+- `Metrics/ClassLength` exempted for `classifier.rb` and `content_fetchers/web.rb`
+
+## Git Workflow
+
+- Never push directly to main. Always create a feature branch and PR.
+- Run the full test suite and rubocop before creating a PR.
+- Version bumps in `lib/llm_classifier/version.rb` go in the feature PR, not separately.
+
+## Key Classes
+
+- `LlmClassifier::Classifier` - Core DSL and classification pipeline
+- `LlmClassifier::Result` - Value object returned from every classification
+- `LlmClassifier::Knowledge` - Domain knowledge DSL container (`method_missing`-based)
+- `LlmClassifier::Configuration` - Global config (adapter, model, API keys)
+- `LlmClassifier::Adapters::Base` - Abstract adapter interface
+- `LlmClassifier::ContentFetchers::Web` - HTTP fetcher with SSRF protection
+- `LlmClassifier::Rails::Concerns::Classifiable` - ActiveRecord integration
+
+## Component Documentation
+
+- [lib/llm_classifier/adapters/CLAUDE.md](lib/llm_classifier/adapters/CLAUDE.md) - LLM adapter contract and implementations
+- [lib/llm_classifier/content_fetchers/CLAUDE.md](lib/llm_classifier/content_fetchers/CLAUDE.md) - Content fetchers and SSRF protection
+- [lib/llm_classifier/rails/CLAUDE.md](lib/llm_classifier/rails/CLAUDE.md) - Rails integration (Zeitwerk-excluded)
+- [spec/CLAUDE.md](spec/CLAUDE.md) - Testing conventions
diff --git a/lib/llm_classifier/adapters/CLAUDE.md b/lib/llm_classifier/adapters/CLAUDE.md
new file mode 100644
index 0000000..e1f1249
--- /dev/null
+++ b/lib/llm_classifier/adapters/CLAUDE.md
@@ -0,0 +1,22 @@
+# Adapters
+
+LLM provider adapters. All inherit from `Adapters::Base` and implement `#chat(model:, system_prompt:, user_prompt:)`.
+
+## Inventory
+
+- `Base` - Abstract interface. Provides `#config` helper for accessing `LlmClassifier.configuration`
+- `RubyLlm` - Delegates to the `ruby_llm` gem. Returns a Hash with `:content`, `:input_tokens`, `:output_tokens`
+- `OpenAI` - Direct `Net::HTTP` POST to OpenAI API. Returns a String
+- `Anthropic` - Direct `Net::HTTP` POST to Anthropic API. Returns a String
+
+## Conventions
+
+- `#chat` returns either a String (raw content) or a Hash with `:content` plus optional token metadata
+- `Classifier#extract_response_data` handles both return types, so new adapters can use either
+- API keys come from `LlmClassifier.configuration`, not hardcoded or passed as arguments
+- Custom adapters can be a class instance passed directly to `config.adapter`
+
+## Related
+
+- [../content_fetchers/CLAUDE.md](../content_fetchers/CLAUDE.md) - Content fetchers
+- [../../spec/CLAUDE.md](../../spec/CLAUDE.md) - Testing conventions
diff --git a/lib/llm_classifier/content_fetchers/CLAUDE.md b/lib/llm_classifier/content_fetchers/CLAUDE.md
new file mode 100644
index 0000000..cb01a43
--- /dev/null
+++ b/lib/llm_classifier/content_fetchers/CLAUDE.md
@@ -0,0 +1,27 @@
+# Content Fetchers
+
+Utilities for fetching external content to use as classification input. Not wired into `Classifier` automatically -- callers fetch content and pass it in.
+
+## Inventory
+
+- `Base` - Abstract interface. Subclasses implement `#fetch(source)`
+- `Web` - HTTP fetcher with SSRF protection, redirect following, and HTML text extraction
+- `Null` - No-op fetcher, always returns `nil`
+
+## SSRF Protection (`Web`)
+
+- Validates resolved IPs against private/loopback CIDR ranges before connecting
+- Follows up to 3 redirects, re-validating each redirect target
+- `normalize_redirect_url` handles relative and absolute redirect URLs
+- Uses `nil? || empty?` guards (not ActiveSupport `.blank?`) to avoid the dependency
+
+## HTML Processing (`Web`)
+
+- Nokogiri is lazily loaded (`require "nokogiri"` inside the method) since it's an optional dependency
+- Strips `