Merged
65 changes: 65 additions & 0 deletions Changelog.md
@@ -0,0 +1,65 @@
# Changelog

All notable changes to speak will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [Unreleased]

### Added
- Hardware detection via /proc/meminfo (RAM, CPU cores, AVX2)
- Disk-based KV cache (ds4-style) with SHA1 token keys
- Persistent user memory across sessions via user.md
- Agent loop for multi-step tool calls (up to 10 iterations)
- Tool system with read_file, search_web, remember, finish
- Web search via DuckDuckGo (30s timeout, 10 results max)
- Model registry with 13+ pre-configured models
- Interactive model selection on first run
- Auto-setup mode for headless installation
- Resumable model downloads with progress bar
- Multi-threaded downloads via aria2c (falls back to HTTP)
- Streaming chat interface with Readline support
- System prompt customization via embedded system_prompt.txt
- Hardware-aware config.json with detected and active sections
- Memory mapping (mmap) for low-RAM systems (<8GB)
- LRU cache cleanup (max 50 cache files)
- JSON serialization for all config structures
- Crystal spec tests for core functionality
- GitHub Actions CI workflow

### Changed
- N/A (initial development)

### Fixed
- N/A (initial development)

### Removed
- N/A (initial development)

### Security
- Path traversal protection in read_file tool
- File size limit (13MB) for read operations
- Working directory restriction for file access

## [0.12.0-beta] - 2026-05-27

### Added
- First public beta release
- Nanbeige 4.1 3B model support (Q2_K, Q4_K_M, Q6_K)
- Basic chat functionality
- Command history with Readline
- Save and load conversation history
- Memory commands (memory, clearmemory)
- Clear screen command
- Help text in interface

### Known Issues
- macOS support is experimental
- Windows not yet supported
- Web search may be slow on first query
- Large files (>13MB) cannot be read
- Model download requires stable internet connection

### Notes
This is a beta release. Expect bugs and breaking changes.
Please report issues on GitHub.
2 changes: 2 additions & 0 deletions README.md
@@ -9,6 +9,8 @@

 [![Crystal](https://img.shields.io/badge/Crystal-1.12-000000?logo=crystal)](https://crystal-lang.org/)
 [![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
+[![Security Policy](https://img.shields.io/badge/Security-Policy-blue)](Security.md)
+[![Changelog](https://img.shields.io/badge/Change-Log-white)](Changelog.md)
 [![CI](https://github.com/zendrx/speak/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/zendrx/speak/actions/workflows/ci.yml)
 [![Lines of Code](https://img.shields.io/badge/Lines-1689-blue)](https://github.com/zendrx/speak)
 [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)
128 changes: 128 additions & 0 deletions Security.md
@@ -0,0 +1,128 @@

# Security Policy

## Supported Versions

Only the latest stable version of speak receives security updates.

| Version | Supported |
|---------|-----------|
| latest | ✅ |
| < latest | ❌ |

## Reporting a Vulnerability

If you discover a security vulnerability in speak, please report it privately.

**Do NOT report security issues through public GitHub issues.**

### How to Report

1. Email the maintainer directly at [ynwghosted@icloud.com]
2. Include detailed steps to reproduce the issue
3. Include your system information (OS, Crystal version)
4. Allow up to 48 hours for an initial response

### What to Expect

- You will receive acknowledgment of your report within 48 hours
- The maintainer will investigate and confirm the vulnerability
- A fix will be developed and tested
- A security advisory will be published after the fix is released

## Security Measures in speak

### File System Protection

- Path traversal attacks are blocked (paths containing `..` are rejected)
- File reading is restricted to the current working directory
- Maximum file size is limited to 13MB
- Directory reading is not allowed
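
The checks listed above could be combined roughly as follows. This is a minimal Crystal sketch, not the actual `read_file` implementation: the helper name `read_rejection`, the `MAX_READ_BYTES` constant, the error strings, and the reading of 13MB as 13 × 1024 × 1024 bytes are all assumptions made for illustration.

```crystal
# Assumption: "13MB" is interpreted as 13 * 1024 * 1024 bytes.
MAX_READ_BYTES = 13_i64 * 1024 * 1024

# Hypothetical helper (not part of speak): returns a rejection reason,
# or nil when the path passes every check.
def read_rejection(path : String, root : String = Dir.current) : String?
  # Reject any path that contains a ".." component.
  return "path contains '..'" if Path[path].parts.includes?("..")

  # Resolve against the root and require the result to stay inside it.
  full = File.expand_path(path, root)
  root_full = File.expand_path(root)
  inside = full == root_full || full.starts_with?("#{root_full}#{File::SEPARATOR}")
  return "outside working directory" unless inside

  # Directories cannot be read; only regular files within the size limit can.
  return "is a directory" if Dir.exists?(full)
  return "not found" unless File.file?(full)
  return "larger than 13MB" if File.size(full) > MAX_READ_BYTES
  nil
end

puts read_rejection("../secret.txt") || "ok"
```

Note that `File.expand_path` normalizes the path but does not resolve symbolic links, so a sketch like this would still need a separate symlink check if links inside the working directory are a concern.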

### Memory Safety

- speak is written in Crystal, a memory-safe language
- No unsafe pointers or manual memory management
- Bounds checking is performed on all array accesses

### Network Security

- Web search uses DuckDuckGo (no API key required)
- No telemetry or data collection
- All network requests are HTTPS only
- Model downloads verify file size integrity

### Input Validation

- All user input is sanitized before processing
- Tool call arguments are validated before execution
- JSON parsing includes error handling

## Known Limitations

| Area | Limitation | Mitigation |
|------|------------|------------|
| Model files | Downloaded from Hugging Face over HTTPS | File size is verified after download |
| Web search | DuckDuckGo HTML scraping | No API key, only used when user requests |
| Dependencies | llama.cpp is C++ code | Upstream library, security updates tracked |

## Responsible Disclosure

We follow responsible disclosure practices:

1. Vulnerability is reported privately
2. Maintainer confirms and fixes the issue
3. Fix is tested and released
4. Security advisory is published
5. Public announcement after fix is available

## Cryptographic Measures

speak does not implement any cryptographic functions directly. It relies on Crystal's standard library for SHA1 hashing (used for KV cache keys). The SHA1 algorithm is used only for cache key generation, not for security-critical purposes.
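
As an illustration of this kind of non-security use, a cache key could be derived from a token sequence as sketched below. The helper name `cache_key`, the comma-joined encoding of the token IDs, and the `.kv` file extension are assumptions for the example; the source does not specify how speak serializes tokens before hashing.

```crystal
require "digest/sha1"

# Hypothetical helper (not part of speak): maps a token sequence to a
# 40-character hex digest that can serve as a cache file name.
def cache_key(tokens : Array(Int32)) : String
  # Joining with a separator keeps [1, 23] and [12, 3] distinct.
  Digest::SHA1.hexdigest(tokens.join(","))
end

key = cache_key([1, 15043, 3186])
puts "#{key}.kv" # the ".kv" extension is an assumption
```

The same token prefix always yields the same key, which is all a cache lookup needs; collision resistance against an attacker is not required here.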

## Third-Party Dependencies

| Dependency | Purpose | Security Notes |
|------------|---------|----------------|
| llama.cr | Crystal bindings to llama.cpp | Upstream library, monitor for updates |
| llama.cpp | Inference engine | C++ library, monitor for CVEs |
| readline | Command line input | Standard system library |

## Reporting Format

When reporting a vulnerability, please include:

```yaml
# Example report format
version: "0.12.0-beta"
os: "Ubuntu 24.04"
crystal: "1.12.0"

description: |
  Detailed description of the issue

steps_to_reproduce: |
  1. Run ./speak
  2. Type specific command
  3. Observe unexpected behavior

impact: |
  What an attacker could potentially do

proposed_fix: |
  Optional: suggested solution
```

## Security Contact

- Email: [ynwghosted@icloud.com]
- GitHub: @zendrx
- Response time: 24-48 hours

## Acknowledgments

We thank the following people for reporting security issues:

- List will be updated with contributors who report vulnerabilities


136 changes: 100 additions & 36 deletions src/speak.cr
@@ -1,51 +1,115 @@
 # speak.cr - Main entry point for speak
 # Integrates hardware detection, model selection, configuration, and chat

 require "llama"
[Check failure on line 4 in src/speak.cr (GitHub Actions / Code Coverage): can't find file 'llama']
 require "./speak/system"
 require "./speak/config"
+require "./speak/model"
 require "./speak/install"
+require "./speak/disk"
+require "./speak/tool"
+require "./speak/memory"
 require "./speak/launch"
+
+CONFIG_PATH = "./speak/config.json"

 def main
-  config = Speak::Config.load_or_create
-  settings = config.apply_overrides
+  # Parse command line arguments
+  auto_setup = ARGV.includes?("--auto-setup")
+  force_setup = ARGV.includes?("--setup")
+  use_case = ARGV.includes?("--coding") ? "coding" : "general"

-  # Get available RAM to decide mmap
-  available_ram = Speak::System.available_ram_mb
+  # Check if config exists and we're not forcing setup
+  if !File.exists?(CONFIG_PATH) || force_setup
+    puts "speak - First time setup"
+    puts "=" * 40
+
+    manager = Speak::ModelManager.new(use_case)
+
+    if auto_setup || force_setup
+      success = manager.auto_setup
+    else
+      success = manager.setup
+    end
+
+    unless success
+      puts "Setup failed. Exiting."
+      exit 1
+    end
+
+    puts "\nSetup complete. Run ./speak again to start chatting."
+    exit 0
+  end

-  # Use mmap only if RAM is tight (< 8GB)
-  use_mmap = available_ram < 8000
+  # Load existing configuration
+  config = Speak::Config.load?
+  unless config
+    puts "Error: Config file exists but cannot be loaded."
+    puts "Please run: ./speak --setup"
+    exit 1
+  end

-  model_path = "./speak/models/#{settings.model_file}"
+  settings = config.active
+  detected = config.detected
+
+  puts "speak - Local AI Assistant"
+  puts "=" * 40
+  puts "Hardware: #{detected.total_ram_mb} MB RAM, #{settings.cpu_cores} cores"
+  puts "Model: #{settings.model_file}"
+  puts "Context: #{settings.context_size} tokens"
+  puts "=" * 40

-  model = if File.exists?(model_path)
-    puts "Loading model: #{model_path}"
-    puts "Available RAM: #{available_ram} MB"
-    puts "mmap: #{use_mmap} #{use_mmap ? "(RAM saving mode)" : "(Full RAM mode)"}"
-    Llama::Model.new(model_path, use_mmap: use_mmap)
-  else
-    puts "Model file not found: #{model_path}, installing..."
-    installer = Speak::Install.new
-    installer.install_model(settings.model_quant)
-
-    if File.exists?(model_path)
-      puts "Model installed successfully: #{model_path}"
-      Llama::Model.new(model_path, use_mmap: use_mmap)
-    else
-      puts "Failed to install model: #{model_path}"
-      exit(1)
-    end
-  end
-
-  begin
-    context = Llama::Context.new(
-      model: model,
-      n_ctx: settings.context_size.to_u32
-    )
-    launcher = Speak::Launch.new(context, model, settings)
-    launcher.run
-  rescue ex : Exception
-    puts "Error: #{ex.message}"
-    exit(1)
+  # Check if model file exists
+  model_path = "./speak/models/#{settings.model_file}"
+
+  unless File.exists?(model_path)
+    puts "Model file not found: #{model_path}"
+    puts "Downloading model..."
+
+    installer = Speak::Install.new
+    success = installer.install_model(settings.model_quant)
+
+    unless success && File.exists?(model_path)
+      puts "Failed to download model. Please check your internet connection."
+      puts "You can also download the model manually and place it in: #{model_path}"
+      exit 1
+    end
+
+    puts "Model downloaded successfully."
   end
+
+  # Determine if we should use mmap based on available RAM
+  use_mmap = settings.use_mmap && detected.available_ram_mb < 8000
+
+  # Load the model
+  puts "Loading model..."
+  model = Llama::Model.new(model_path, use_mmap: use_mmap)
+
+  # Create context
+  context = Llama::Context.new(
+    model: model,
+    n_ctx: settings.context_size
+  )
+
+  # Launch chat interface
+  puts "Starting chat interface..."
+  puts "Type 'exit' to quit, 'help' for commands"
+  puts "-" * 40
+
+  launcher = Speak::Launch.new(context, model, settings)
+  launcher.run
 end
+
+# Handle interrupt signals gracefully
+Signal::INT.trap do
+  puts "\n\nInterrupted. Goodbye."
+  exit 0
+end
+
+Signal::TERM.trap do
+  puts "\n\nTerminated. Goodbye."
+  exit 0
+end
+
+# Run main function
+main
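
The `use_mmap` decision in `main` depends on an available-RAM figure in megabytes, and the changelog says hardware detection reads `/proc/meminfo`. A minimal sketch of how such a figure could be obtained is shown below. It assumes the value comes from the `MemAvailable` field; the helper name `parse_available_ram_mb` and the sample text are hypothetical and not taken from `src/speak/system.cr`.

```crystal
# Hypothetical helper (not part of speak): extracts MemAvailable, in MB,
# from /proc/meminfo-style text. Returns nil when the field is missing.
def parse_available_ram_mb(meminfo : String) : Int64?
  meminfo.each_line do |line|
    if match = line.match(/^MemAvailable:\s+(\d+)\s+kB/)
      return match[1].to_i64 // 1024 # the kernel reports kB; convert to MB
    end
  end
  nil
end

# Sample input for illustration; on Linux this would be File.read("/proc/meminfo").
sample = "MemTotal:       16303428 kB\nMemAvailable:    6291456 kB\n"

# Same 8000 MB threshold as in main; fall back to no mmap if detection fails.
use_mmap = if mb = parse_available_ram_mb(sample)
             mb < 8000
           else
             false
           end
puts "mmap: #{use_mmap}"
```

Taking the text as a parameter rather than reading the file inside the helper keeps the parsing testable on systems without `/proc`, such as macOS.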