Image Support

Mo Abualruz edited this page Dec 9, 2025 · 1 revision

Image Support Guide

Include images in your prompts with drag-and-drop support, automatic analysis, and intelligent caching.

Status: ✅ Complete

Phase: Phase 7 - Integration Features

Last Updated: December 9, 2025


Overview

The RiceCoder Image Support module lets you drag and drop images into the terminal UI. Each image is analyzed by the configured AI provider, and the analysis is included in your prompt. Analysis results are cached for performance, and images are displayed in the terminal, with an ASCII fallback for terminals that cannot render them.

Key Concepts

  • Drag-and-drop support: Simply drag images into the terminal to include them
  • Multi-format support: PNG, JPG, GIF, and WebP formats
  • Automatic analysis: Images are analyzed by AI providers (OpenAI, Anthropic, Ollama, etc.)
  • Smart caching: Analysis results are cached with 24-hour TTL and LRU eviction
  • Terminal display: Images are rendered in the terminal with metadata
  • Fallback rendering: ASCII placeholders for unsupported terminals
  • Sequential analysis: Multiple images are analyzed one at a time
  • Large image optimization: Automatic optimization for images over 10 MB

Getting Started

Basic Usage

Drag and drop an image into the terminal:

# Start ricecoder
rice chat

# In the TUI:
# 1. Drag an image file into the terminal window
# 2. Image is validated and added to the prompt
# 3. Image is displayed with metadata
# 4. Image is analyzed by the AI provider
# 5. Analysis is included in the prompt context

Supported Formats

RiceCoder supports the following image formats:

  • PNG - Portable Network Graphics (lossless)
  • JPG/JPEG - Joint Photographic Experts Group (lossy)
  • GIF - Graphics Interchange Format (animated)
  • WebP - Modern web format (efficient)
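File extensions can be wrong, so validation usually looks at the first bytes of the file instead. The following is a minimal sketch of that idea using the well-known signatures of the four formats. The names ImageFormat and detect_format are hypothetical and are not taken from RiceCoder's source.

```rust
/// Image formats listed in this guide.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ImageFormat {
    Png,
    Jpeg,
    Gif,
    WebP,
}

/// Identify a supported format from the leading bytes of a file.
/// Hypothetical helper for illustration; returns `None` for anything else.
pub fn detect_format(bytes: &[u8]) -> Option<ImageFormat> {
    const PNG_MAGIC: &[u8] = &[0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

    if bytes.starts_with(PNG_MAGIC) {
        Some(ImageFormat::Png)
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some(ImageFormat::Jpeg)
    } else if bytes.starts_with(b"GIF87a") || bytes.starts_with(b"GIF89a") {
        Some(ImageFormat::Gif)
    } else if bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WEBP" {
        // WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
        Some(ImageFormat::WebP)
    } else {
        None
    }
}
```

Reading only the first 12 bytes of a dropped file is enough to call this function, which keeps validation cheap even for large images.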

File Size Limits

  • Maximum size sent as-is: 10 MB (analysis.max_image_size_mb)
  • Automatic optimization: Files over that threshold are optimized before being sent to providers (analysis.optimize_large_images)
  • Cache limit: 100 MB total cache with LRU eviction
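The size rule above can be sketched as a small decision function. This is an illustration only: SizeAction and size_action are hypothetical names, "MB" is assumed to mean 1024 * 1024 bytes, and rejecting oversized files when optimization is turned off is an assumption rather than documented behavior.

```rust
/// What to do with an image of a given size.
#[derive(Debug, PartialEq, Eq)]
pub enum SizeAction {
    SendAsIs,
    OptimizeFirst,
}

// Assumption: the guide's "MB" is treated as 1024 * 1024 bytes.
const BYTES_PER_MB: u64 = 1024 * 1024;

/// Decide how to handle a file, given the threshold and the optimization flag.
/// Hypothetical helper; the error case for disabled optimization is an assumption.
pub fn size_action(
    file_size_bytes: u64,
    max_image_size_mb: u64,
    optimize_large_images: bool,
) -> Result<SizeAction, String> {
    let limit = max_image_size_mb * BYTES_PER_MB;
    if file_size_bytes <= limit {
        Ok(SizeAction::SendAsIs)
    } else if optimize_large_images {
        Ok(SizeAction::OptimizeFirst)
    } else {
        Err(format!(
            "image is {} bytes, over the {} MB limit, and optimization is disabled",
            file_size_bytes, max_image_size_mb
        ))
    }
}
```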

Terminal Support

Image Support works in any terminal; how images are rendered depends on what the terminal supports:

  • Full support: iTerm2, Windows Terminal, GNOME Terminal, Konsole, Alacritty
  • Fallback support: Other terminals show ASCII placeholders

How to Use

Adding Images

Method 1: Drag and Drop

The easiest way to add images:

# In the ricecoder TUI:
# 1. Open a chat session
# 2. Drag an image file into the terminal window
# 3. Image is automatically added to the prompt

Method 2: File Path

Specify image paths directly:

# In the prompt:
# "Analyze this screenshot: /path/to/screenshot.png"
# 
# RiceCoder detects the file path and includes the image
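One simple way to detect image paths in prompt text is to split on whitespace and keep tokens whose extension matches a supported format. The sketch below assumes that approach; extract_image_paths is a hypothetical name, the function does not check that the file exists, and paths containing spaces are not handled.

```rust
use std::path::Path;

const IMAGE_EXTENSIONS: [&str; 5] = ["png", "jpg", "jpeg", "gif", "webp"];

/// Return the tokens in `prompt` that look like image file paths.
/// Hypothetical helper: matches on extension only, ignoring case.
pub fn extract_image_paths(prompt: &str) -> Vec<String> {
    prompt
        .split_whitespace()
        // Strip surrounding quotes and punctuation, then a sentence-ending dot.
        .map(|token| {
            token
                .trim_matches(|c: char| matches!(c, '"' | '\'' | ',' | ';' | ':' | '(' | ')'))
                .trim_end_matches('.')
        })
        .filter(|token| {
            Path::new(token)
                .extension()
                .and_then(|ext| ext.to_str())
                .map(|ext| IMAGE_EXTENSIONS.iter().any(|known| ext.eq_ignore_ascii_case(known)))
                .unwrap_or(false)
        })
        .map(str::to_owned)
        .collect()
}
```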

Viewing Images

Images are displayed in the terminal with metadata:

┌─────────────────────────────────────────┐
│ Image: screenshot.png                   │
│ Format: PNG | Size: 2.5 MB              │
│ Dimensions: 1920x1080                   │
├─────────────────────────────────────────┤
│ [Image preview - 80x30 max]             │
└─────────────────────────────────────────┘
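For terminals without image rendering, a text box like the one above can be built from the metadata alone. This is a minimal sketch under stated assumptions: render_placeholder is a hypothetical name, the interior width of 41 characters is chosen for illustration, and the preview row is replaced by a fixed notice.

```rust
/// Build a metadata box as a list of lines, one per terminal row.
/// Hypothetical helper; the width and the wording are illustrative assumptions.
pub fn render_placeholder(
    name: &str,
    format_name: &str,
    size_mb: f64,
    width_px: u32,
    height_px: u32,
) -> Vec<String> {
    const INNER: usize = 41; // characters between the two vertical borders

    // Left-align text after one space of padding; clip text that is too long.
    let row = |text: String| {
        let clipped: String = text.chars().take(INNER - 1).collect();
        format!("│ {:<w$}│", clipped, w = INNER - 1)
    };
    let rule = |left: char, right: char| format!("{}{}{}", left, "─".repeat(INNER), right);

    vec![
        rule('┌', '┐'),
        row(format!("Image: {}", name)),
        row(format!("Format: {} | Size: {:.1} MB", format_name, size_mb)),
        row(format!("Dimensions: {}x{}", width_px, height_px)),
        rule('├', '┤'),
        row("[Image preview unavailable]".to_string()),
        rule('└', '┘'),
    ]
}
```

Returning lines instead of printing them leaves it to the caller to decide where in the TUI the box is drawn.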

Removing Images

Remove an image from the prompt:

# In the TUI:
# 1. Navigate to the image
# 2. Press 'Delete' or 'Backspace'
# 3. Image is removed from the prompt

Multiple Images

Include multiple images in a single prompt:

# In the TUI:
# 1. Drag first image
# 2. Drag second image
# 3. Continue adding images as needed
# 
# Images are analyzed sequentially
# All analyses are included in the prompt
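Sequential analysis means the next image is only sent once the previous call has returned. The sketch below shows that control flow with the provider call passed in as a closure. The names analyze_sequentially and build_prompt_context are hypothetical, and continuing after a failed image (rather than stopping) is an assumption.

```rust
/// Analyze images one at a time, in the order they were added.
/// `analyze` stands in for the provider call (hypothetical signature).
pub fn analyze_sequentially<F>(
    paths: &[&str],
    mut analyze: F,
) -> Vec<(String, Result<String, String>)>
where
    F: FnMut(&str) -> Result<String, String>,
{
    let mut results = Vec::with_capacity(paths.len());
    for &path in paths {
        // The next call starts only after this one has returned.
        let outcome = analyze(path);
        results.push((path.to_string(), outcome));
    }
    results
}

/// Join the successful analyses into one block of prompt context.
pub fn build_prompt_context(results: &[(String, Result<String, String>)]) -> String {
    results
        .iter()
        .filter_map(|(path, outcome)| {
            outcome.as_ref().ok().map(|text| format!("[{}] {}", path, text))
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```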

Configuration

Image Settings

Configure image support in .ricecoder/config.yaml:

images:
  # Supported formats
  formats:
    - png
    - jpg
    - jpeg
    - gif
    - webp
  
  # Display settings
  display:
    max_width: 80          # Max width for terminal display
    max_height: 30         # Max height for terminal display
    placeholder_char: ""  # ASCII placeholder character
  
  # Cache settings
  cache:
    enabled: true
    ttl_seconds: 86400     # 24 hours
    max_size_mb: 100       # LRU limit
  
  # Analysis settings
  analysis:
    timeout_seconds: 10    # Provider timeout
    max_image_size_mb: 10  # Optimization threshold
    optimize_large_images: true

Per-Project Configuration

Override settings for specific projects:

# Create project-specific config
mkdir -p .agent/config
cat > .agent/config/images.yaml << 'EOF'
images:
  cache:
    ttl_seconds: 3600      # 1 hour for this project
    max_size_mb: 50        # Smaller cache for this project
  
  analysis:
    timeout_seconds: 5     # Faster timeout
EOF

User-Level Configuration

Set defaults for all projects:

# Create user config
mkdir -p ~/.ricecoder/config
cat > ~/.ricecoder/config/images.yaml << 'EOF'
images:
  display:
    max_width: 100
    max_height: 40
  
  cache:
    ttl_seconds: 172800    # 48 hours
    max_size_mb: 200
EOF
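With user-level and per-project files both in play, each setting has to be resolved from several layers. The sketch below assumes the precedence project, then user, then built-in default, which matches the "override" and "defaults" wording above but is not confirmed by the source. CacheOverrides, CacheSettings, and resolve_cache_settings are hypothetical names.

```rust
/// Values a config file may or may not set.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct CacheOverrides {
    pub ttl_seconds: Option<u64>,
    pub max_size_mb: Option<u64>,
}

/// Fully resolved cache settings.
#[derive(Debug, Clone, PartialEq)]
pub struct CacheSettings {
    pub ttl_seconds: u64,
    pub max_size_mb: u64,
}

/// Resolve each field as: project value, else user value, else built-in default.
/// The precedence order is an assumption made for this sketch.
pub fn resolve_cache_settings(user: &CacheOverrides, project: &CacheOverrides) -> CacheSettings {
    CacheSettings {
        ttl_seconds: project.ttl_seconds.or(user.ttl_seconds).unwrap_or(86_400),
        max_size_mb: project.max_size_mb.or(user.max_size_mb).unwrap_or(100),
    }
}
```

Resolving field by field means a project file that only sets ttl_seconds still inherits max_size_mb from the user file.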

Usage Examples

Example 1: Analyze a Screenshot

Analyze a screenshot of your code:

# In ricecoder TUI:
# 1. Drag screenshot.png into the terminal
# 2. Type: "What issues do you see in this code?"
# 3. Press Enter
# 
# RiceCoder:
# - Analyzes the screenshot
# - Identifies code issues
# - Provides suggestions

Example 2: Design Review

Get feedback on a design mockup:

# In ricecoder TUI:
# 1. Drag mockup.png into the terminal
# 2. Type: "Review this UI design for usability"
# 3. Press Enter
# 
# RiceCoder:
# - Analyzes the design
# - Provides UX feedback
# - Suggests improvements

Example 3: Multiple Images

Compare multiple images:

# In ricecoder TUI:
# 1. Drag before.png into the terminal
# 2. Drag after.png into the terminal
# 3. Type: "What changed between these two screenshots?"
# 4. Press Enter
# 
# RiceCoder:
# - Analyzes both images
# - Identifies differences
# - Explains the changes

Example 4: Error Diagnosis

Debug an error from a screenshot:

# In ricecoder TUI:
# 1. Drag error-screenshot.png into the terminal
# 2. Type: "Help me debug this error"
# 3. Press Enter
# 
# RiceCoder:
# - Analyzes the error
# - Identifies the root cause
# - Suggests fixes

Example 5: Architecture Diagram

Analyze an architecture diagram:

# In ricecoder TUI:
# 1. Drag architecture.png into the terminal
# 2. Type: "Explain this architecture and suggest improvements"
# 3. Press Enter
# 
# RiceCoder:
# - Analyzes the diagram
# - Explains the architecture
# - Suggests optimizations

Caching

How Caching Works

Image analysis results are cached automatically:

First analysis:
1. Image is hashed (SHA256)
2. Cache is checked (miss)
3. Image is sent to provider
4. Analysis result is cached
5. Result is returned

Second analysis (same image):
1. Image is hashed (SHA256)
2. Cache is checked (hit)
3. Cached result is returned immediately
4. No provider call needed
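The two flows above amount to a content-addressed lookup: hash the bytes, return the stored result on a hit, and call the provider only on a miss. Here is a minimal in-memory sketch. AnalysisCache and get_or_analyze are hypothetical names, and the standard library's DefaultHasher stands in for SHA256 only to keep the example dependency-free; it is not a cryptographic hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// In-memory cache keyed by a hash of the image bytes (illustrative only).
pub struct AnalysisCache {
    entries: HashMap<u64, String>,
}

/// Stand-in for the SHA256 step; a real cache would use a cryptographic hash.
fn content_key(image_bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    image_bytes.hash(&mut hasher);
    hasher.finish()
}

impl AnalysisCache {
    pub fn new() -> Self {
        AnalysisCache { entries: HashMap::new() }
    }

    /// Return the cached analysis, or run `analyze` once and store its result.
    pub fn get_or_analyze<F>(&mut self, image_bytes: &[u8], analyze: F) -> String
    where
        F: FnOnce(&[u8]) -> String,
    {
        let key = content_key(image_bytes);
        if let Some(cached) = self.entries.get(&key) {
            return cached.clone(); // hit: no provider call
        }
        let result = analyze(image_bytes); // miss: call the provider
        self.entries.insert(key, result.clone());
        result
    }
}
```

Because the key depends only on the bytes, renaming or moving a file does not cause a second provider call.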

Cache Performance

Caching provides significant performance improvements:

  • Cache hit: < 50 ms
  • Cache miss: up to 10 seconds (bounded by the provider timeout)
  • Speedup: up to 200x for cached images (10 s worst-case miss vs. 50 ms hit)

Cache Management

View and manage the cache:

# View cache statistics
rice cache stats images

# Clear cache
rice cache clear images

# Clear old entries
rice cache clean images --older-than 24h

# View cache entries
rice cache list images

Cache Locations

Caches are stored in two locations:

  • User cache: ~/.ricecoder/cache/images/ (persistent across projects)
  • Project cache: projects/ricecoder/.agent/cache/images/ (project-specific)

Cache Expiration

Cache entries expire after 24 hours by default:

# .ricecoder/config.yaml
images:
  cache:
    ttl_seconds: 86400     # 24 hours

When an entry expires, the image is reanalyzed on next use.
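An expiry check like this compares an entry's age against the TTL. The sketch below is one way to write it; CacheEntry and is_expired are hypothetical names, and treating an entry as expired exactly at the TTL boundary is an assumption.

```rust
use std::time::{Duration, SystemTime};

/// A cached analysis plus the time it was stored (illustrative type).
pub struct CacheEntry {
    pub analysis: String,
    pub stored_at: SystemTime,
}

/// True when the entry is at least `ttl` old at time `now`.
pub fn is_expired(entry: &CacheEntry, now: SystemTime, ttl: Duration) -> bool {
    match now.duration_since(entry.stored_at) {
        Ok(age) => age >= ttl,
        // The clock moved backwards: treat the entry as fresh instead of failing.
        Err(_) => false,
    }
}
```

Passing the current time in as a parameter, instead of calling SystemTime::now() inside, keeps the check easy to test.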

LRU Eviction

When the cache exceeds 100 MB (the default max_size_mb), the least recently used entries are removed:

# .ricecoder/config.yaml
images:
  cache:
    max_size_mb: 100       # LRU limit
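Size-bounded LRU eviction can be sketched with a queue ordered from least to most recently used. This is an illustration, not RiceCoder's implementation: LruSizeCache is a hypothetical name, only keys and sizes are tracked, and the linear scans are acceptable for a sketch but not for a large cache.

```rust
use std::collections::VecDeque;

/// Tracks entry sizes and evicts the least recently used when over the limit.
pub struct LruSizeCache {
    max_bytes: u64,
    used_bytes: u64,
    // Front = least recently used, back = most recently used.
    entries: VecDeque<(String, u64)>,
}

impl LruSizeCache {
    pub fn new(max_bytes: u64) -> Self {
        LruSizeCache { max_bytes, used_bytes: 0, entries: VecDeque::new() }
    }

    pub fn used_bytes(&self) -> u64 {
        self.used_bytes
    }

    /// Mark `key` as just used (call on a cache hit). Returns false if absent.
    pub fn touch(&mut self, key: &str) -> bool {
        match self.entries.iter().position(|(k, _)| k == key) {
            Some(index) => {
                if let Some(entry) = self.entries.remove(index) {
                    self.entries.push_back(entry);
                }
                true
            }
            None => false,
        }
    }

    /// Insert or replace an entry, then evict from the front until the
    /// total fits. Returns the evicted keys, oldest first.
    pub fn insert(&mut self, key: &str, size_bytes: u64) -> Vec<String> {
        if let Some(index) = self.entries.iter().position(|(k, _)| k == key) {
            if let Some((_, old_size)) = self.entries.remove(index) {
                self.used_bytes -= old_size;
            }
        }
        self.entries.push_back((key.to_string(), size_bytes));
        self.used_bytes += size_bytes;

        let mut evicted = Vec::new();
        // Keep the newest entry even if it alone exceeds the limit.
        while self.used_bytes > self.max_bytes && self.entries.len() > 1 {
            if let Some((old_key, old_size)) = self.entries.pop_front() {
                self.used_bytes -= old_size;
                evicted.push(old_key);
            }
        }
        evicted
    }
}
```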

Providers

Supported Providers

Image Support works with all ricecoder providers:

  • OpenAI - GPT-4 Vision, GPT-4 Turbo
  • Anthropic - Claude 3 Vision models
  • Google - Gemini Vision models
  • Ollama - Local vision models (llava, etc.)
  • Zen - Zen AI provider
  • Custom - Any provider with vision support

Provider Configuration

Configure which provider to use:

# .ricecoder/config.yaml
providers:
  default: openai
  
  openai:
    model: gpt-4-vision
    api_key: ${OPENAI_API_KEY}
  
  anthropic:
    model: claude-3-vision
    api_key: ${ANTHROPIC_API_KEY}
  
  ollama:
    model: llava
    base_url: http://localhost:11434

Provider-Specific Behavior

Different providers handle images differently:

  • OpenAI: Supports PNG, JPG, GIF, WebP
  • Anthropic: Supports PNG, JPG, GIF, WebP
  • Google: Supports PNG, JPG, GIF, WebP
  • Ollama: Supports PNG, JPG (local processing)
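The list above can be turned into a lookup that tells you whether an image needs converting before it is sent. This sketch encodes only what the list states; supported_formats and provider_accepts are hypothetical names, and treating jpeg as an alias for jpg is an assumption.

```rust
const COMMON_FORMATS: &[&str] = &["png", "jpg", "gif", "webp"];
const OLLAMA_FORMATS: &[&str] = &["png", "jpg"];

/// Formats each provider accepts, per the list in this guide.
/// Returns `None` for providers the list does not cover.
pub fn supported_formats(provider: &str) -> Option<&'static [&'static str]> {
    match provider.to_ascii_lowercase().as_str() {
        "openai" | "anthropic" | "google" => Some(COMMON_FORMATS),
        "ollama" => Some(OLLAMA_FORMATS),
        _ => None,
    }
}

/// `Some(true)` if the provider accepts the format, `Some(false)` if the
/// image would need converting first, `None` if the provider is unknown.
pub fn provider_accepts(provider: &str, extension: &str) -> Option<bool> {
    let ext = extension.to_ascii_lowercase();
    // Assumption: "jpeg" and "jpg" name the same format.
    let normalized = if ext == "jpeg" { "jpg" } else { ext.as_str() };
    supported_formats(provider).map(|formats| formats.iter().any(|f| *f == normalized))
}
```

For example, a GIF sent to Ollama yields Some(false), which is the cue to convert it to PNG or JPG first.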

Token Counting

Image tokens are counted and included in usage:

# View token usage
rice info tokens

# Example output:
# Image tokens: 765 (GPT-4 Vision)
# Text tokens: 234
# Total: 999 tokens

Performance

Performance Targets

Image Support is optimized for performance:

  • Drag-and-drop detection: < 100ms
  • Format validation: < 500ms
  • Analysis: < 10 seconds (provider timeout)
  • Cache lookup: < 50ms
  • Display rendering: < 200ms

Optimization Tips

Optimize image analysis performance:

# .ricecoder/config.yaml
images:
  analysis:
    # Reduce timeout for faster feedback
    timeout_seconds: 5
    
    # Optimize large images automatically
    optimize_large_images: true
    
    # Reduce image size threshold
    max_image_size_mb: 5

Large Image Handling

Large images are automatically optimized:

Original image: 15 MB
↓
Optimization:
- Reduce resolution
- Compress quality
- Convert format if needed
↓
Optimized image: 2 MB
↓
Send to provider

Troubleshooting

Issue: Image not recognized

Symptoms: Dragged image is not added to prompt

Solutions:

  1. Check file format (PNG, JPG, GIF, WebP)
  2. Check file size (max 10 MB)
  3. Verify file is readable
  4. Try a different image

# Check image format
file image.png

# Check file size
ls -lh image.png

# Try converting format
convert image.png image.jpg

Issue: Image analysis fails

Symptoms: Image is added but analysis fails

Solutions:

  1. Check provider configuration
  2. Check API key is valid
  3. Check network connectivity
  4. Try a different provider

# Test provider connection
rice test provider openai

# Check that the API key is set (without printing the secret)
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set"

# Try different provider
rice config set providers.default anthropic

Issue: Cache not working

Symptoms: Same image is analyzed multiple times

Solutions:

  1. Check cache is enabled
  2. Check cache directory exists
  3. Clear cache and try again

# Check cache status
rice cache stats images

# Enable cache
rice config set images.cache.enabled true

# Clear cache
rice cache clear images

Issue: Image display is broken

Symptoms: Image shows as ASCII placeholder or doesn't display

Solutions:

  1. Check terminal supports images
  2. Try different terminal
  3. Disable image display

# Check terminal support
rice info terminal

# Try different terminal
# iTerm2, Windows Terminal, GNOME Terminal, etc.

# Disable image display
rice config set images.display.enabled false

Issue: Performance is slow

Symptoms: Image analysis takes too long

Solutions:

  1. Reduce image size
  2. Use faster provider
  3. Enable caching
  4. Reduce timeout

# .ricecoder/config.yaml
images:
  analysis:
    timeout_seconds: 5
    max_image_size_mb: 5
    optimize_large_images: true
  
  cache:
    enabled: true

Issue: Out of memory

Symptoms: RiceCoder crashes when adding images

Solutions:

  1. Reduce cache size
  2. Clear cache
  3. Use smaller images

# Reduce cache size
rice config set images.cache.max_size_mb 50

# Clear cache
rice cache clear images

# Optimize image
convert large.png -resize 50% small.png

Issue: Provider doesn't support images

Symptoms: Error message about unsupported format

Solutions:

  1. Use different provider
  2. Convert image format
  3. Check provider documentation

# List providers with image support
rice info providers --filter vision

# Convert image format
convert image.png image.jpg

# Check provider docs
rice help provider openai

Advanced Usage

Batch Image Processing

Process multiple images programmatically:

# Analyze all PNG images in the current directory
for image in *.png; do
  [ -e "$image" ] || continue   # skip the literal pattern when nothing matches
  rice analyze image "$image"
done

Custom Image Analysis

Create custom analysis workflows:

# Create a workflow
cat > .agent/workflows/image-analysis.yaml << 'EOF'
name: Image Analysis
steps:
  - name: Load Image
    action: load_image
    input: ${image_path}
  
  - name: Analyze
    action: analyze_image
    input: ${image}
  
  - name: Report
    action: generate_report
    input: ${analysis}
EOF

# Run workflow
rice workflow run image-analysis --image screenshot.png

Integration with Other Tools

Combine image analysis with other ricecoder features:

# Analyze image and generate code
rice chat << 'EOF'
Analyze this screenshot and generate code to fix the issues.
[image: screenshot.png]
EOF

# Analyze image and create spec
rice spec create from-image --image mockup.png

Programmatic Access

Use image analysis in your code:

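use ricecoder_images::{ImageHandler, ImageAnalyzer};

// Run this inside an async function that returns a Result, since the calls
// below use .await?. The config and provider values are built elsewhere.

// Load the image and read its metadata
let handler = ImageHandler::new(config);
let metadata = handler.read_image("screenshot.png").await?;

// Send the image to the provider for analysis
let analyzer = ImageAnalyzer::new(provider);
let analysis = analyzer.analyze(&metadata).await?;

println!("Analysis: {}", analysis.text);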

Best Practices

1. Use High-Quality Images

Provide clear, high-quality images for better analysis:

# Good: Clear screenshot with good contrast
# Bad: Blurry or low-resolution image

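# When exporting to JPG, keep the quality setting high
# (for PNG output, -quality sets the compression level, not visual quality)
convert image.png -quality 95 optimized.jpg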

2. Include Context

Provide context in your prompt along with the image:

# Good: "Fix the layout issues in this screenshot"
# Bad: "What do you think?"

# Include specific questions, for example:
# "Analyze this error message and suggest fixes"

3. Use Appropriate Formats

Choose the right format for your image:

# PNG: Screenshots, diagrams (lossless)
# JPG: Photos, complex images (lossy)
# GIF: Animated images
# WebP: Modern format (efficient)

4. Manage Cache

Keep cache clean for optimal performance:

# Clear old cache entries
rice cache clean images --older-than 7d

# Monitor cache size
rice cache stats images

5. Test with Different Providers

Different providers may give different results:

# Test with OpenAI
rice config set providers.default openai
rice chat

# Test with Anthropic
rice config set providers.default anthropic
rice chat

# Compare results

6. Optimize Large Images

Optimize large images before analysis:

# Reduce resolution
convert large.png -resize 50% small.png

# Compress with lossy JPG encoding
# (for PNG output, -quality only sets the lossless compression level)
convert image.png -quality 80 compressed.jpg

# Convert format
convert image.png image.webp

7. Document Image Sources

Document where images come from:

# Good: "Screenshot from production error"
# Bad: "Some image"

# Include metadata, for example:
# "Screenshot from 2025-12-09 showing login error"
