# SOHAM Digital Life Companion
A comprehensive AI agent system that combines memory, vision, voice, automation, and proactive monitoring capabilities through a modular architecture.
## Overview
SOHAM (Digital Life Companion) is designed as a locally running AI agent that:
- **Remembers Everything**: Persistent memory using vector and structured databases
- **Sees Your Screen**: Advanced vision capabilities with OCR and AI analysis
- **Speaks and Listens**: Natural voice interaction with wake word detection
- **Controls Your Computer**: Intelligent automation with vision-guided interactions
- **Monitors Proactively**: Background health and activity monitoring
- **Thinks Intelligently**: Multi-provider AI with automatic failover
## Architecture
The system consists of six core modules coordinated by a central Context Orchestrator:
- **Memory Module**: ChromaDB (vector) + SQLite (structured) storage
- **Brain Module**: Multi-provider AI management (Groq, Cerebras, Ollama)
- **Voice Module**: Faster-Whisper + Edge TTS for speech I/O
- **Vision Module**: OCR + Moondream vision model for screen analysis
- **Automation Module**: PyAutoGUI + JSON protocols for system control
- **Observer Module**: Proactive monitoring and notifications
## Quick Start
### Prerequisites
- Python 3.9+
- Git
- (Optional) Ollama for local AI models
### Installation
1. Clone the repository:
```bash
git clone https://github.com/soham-ai/digital-companion.git
cd digital-companion
```
2. Create virtual environment:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Set up environment variables:
```bash
cp .env.example .env
# Edit .env with your API keys and preferences
```
5. Initialize the system:
```bash
python -m soham init
```
### Basic Usage
```python
import asyncio

from soham import ContextOrchestrator


async def main():
    # Initialize SOHAM
    orchestrator = ContextOrchestrator()
    await orchestrator.initialize()

    # Process user input
    response = await orchestrator.process_user_input(
        "Hello SOHAM, what's on my screen?"
    )
    print(response.data.get("response"))

    await orchestrator.shutdown()


if __name__ == "__main__":
    asyncio.run(main())
```
## Configuration
SOHAM uses YAML configuration files with environment variable support:
```yaml
# config/default.yaml
brain:
  providers:
    groq:
      api_key: "${GROQ_API_KEY}"
      models: ["llama-3.1-8b-instant"]
      priority: 1
```
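The `${GROQ_API_KEY}` placeholder above is resolved from the environment at load time. A minimal sketch of how that expansion can work (the helper name `expand_env_vars` is illustrative, not SOHAM's actual API):

```python
import os
import re


def expand_env_vars(text: str) -> str:
    """Replace ${VAR} placeholders with values from the environment."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), text)


# Set a demo value so the placeholder resolves
os.environ["GROQ_API_KEY"] = "gsk-demo-key"
raw = 'api_key: "${GROQ_API_KEY}"'
print(expand_env_vars(raw))  # api_key: "gsk-demo-key"
```

Unset variables expand to an empty string here; a real loader might instead raise an error so missing keys fail fast.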
## Modules
### Memory Module
- **Vector Storage**: Semantic search using ChromaDB
- **Structured Storage**: Relational data using SQLite
- **Context Injection**: Intelligent context retrieval for queries
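Context injection ranks stored memories by relevance to the current query. The real module uses ChromaDB embeddings; the sketch below substitutes simple word overlap as a stand-in for semantic similarity (function names are illustrative):

```python
def overlap_score(query: str, doc: str) -> float:
    """Crude relevance score: fraction of query words present in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def retrieve_context(query: str, memory: list, k: int = 2) -> list:
    """Return the k most relevant stored items for context injection."""
    return sorted(memory, key=lambda doc: overlap_score(query, doc), reverse=True)[:k]


memory = [
    "user prefers dark mode in all apps",
    "meeting with alex scheduled for friday",
    "favorite editor is VS Code",
]
print(retrieve_context("when is my meeting with alex", memory, k=1))
# ['meeting with alex scheduled for friday']
```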
### Brain Module
- **Multi-Provider**: Groq, Cerebras, Ollama support
- **Intelligent Routing**: Automatic provider selection based on task complexity
- **Failover**: Seamless fallback between providers
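The failover logic amounts to trying providers in priority order and returning the first success. A self-contained sketch with fake providers (SOHAM's real interface is async and catches narrower errors):

```python
def ask_with_failover(prompt, providers):
    """Try providers in priority order; return (name, reply) from the first success."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch specific errors
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {errors}")


def groq(prompt):
    raise ConnectionError("rate limited")


def ollama(prompt):
    return f"local reply to: {prompt}"


name, reply = ask_with_failover("hello", [("groq", groq), ("ollama", ollama)])
print(name, "->", reply)  # ollama -> local reply to: hello
```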
### Voice Module
- **Speech-to-Text**: Faster-Whisper for accurate transcription
- **Text-to-Speech**: Edge TTS for natural voice synthesis
- **Wake Word**: Always-listening activation with "Hey SOHAM"
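Once Faster-Whisper produces a transcript, wake-word activation can reduce to a prefix check on the text. A minimal sketch (the phrase list and function name are assumptions; production wake-word engines usually match on audio, not text):

```python
WAKE_PHRASES = ("hey soham", "ok soham")


def detect_wake_word(transcript: str) -> bool:
    """Check whether a transcribed utterance begins with a wake phrase."""
    return transcript.lower().strip().startswith(WAKE_PHRASES)


print(detect_wake_word("Hey SOHAM, what's the weather?"))  # True
print(detect_wake_word("nothing to see here"))             # False
```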
### Vision Module
- **Screen Capture**: Multi-monitor screenshot capabilities
- **OCR**: Text extraction using Tesseract
- **AI Vision**: UI understanding using Moondream model
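OCR and vision results feed automation as bounding boxes, which must be converted to click coordinates. Tesseract reports boxes as (left, top, width, height); a sketch of the conversion (helper name is illustrative):

```python
def bbox_center(left: int, top: int, width: int, height: int):
    """Center point of an OCR bounding box, e.g. as a click target."""
    return (left + width // 2, top + height // 2)


# Tesseract-style box for a "Submit" button found on screen
print(bbox_center(100, 200, 80, 30))  # (140, 215)
```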
### Automation Module
- **Protocol Engine**: JSON-based workflow execution
- **System Control**: Application and window management
- **Vision-Guided**: Dynamic coordinate detection for interactions
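A JSON protocol is a named list of steps, each dispatched to a registered handler. The sketch below logs actions instead of calling PyAutoGUI; the step schema and decorator are assumptions about the protocol format, not SOHAM's exact one:

```python
import json

ACTIONS = {}


def action(name):
    """Register a handler for a protocol step type."""
    def register(fn):
        ACTIONS[name] = fn
        return fn
    return register


@action("open_app")
def open_app(params, log):
    log.append(f"open:{params['name']}")


@action("type_text")
def type_text(params, log):
    log.append(f"type:{params['text']}")


def run_protocol(spec: str) -> list:
    """Execute each step of a JSON protocol in order, returning an action log."""
    log = []
    for step in json.loads(spec)["steps"]:
        ACTIONS[step["action"]](step.get("params", {}), log)
    return log


protocol = '''{"name": "write-note", "steps": [
    {"action": "open_app", "params": {"name": "notes"}},
    {"action": "type_text", "params": {"text": "buy milk"}}
]}'''
print(run_protocol(protocol))  # ['open:notes', 'type:buy milk']
```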
### Observer Module
- **Health Monitoring**: CPU, memory, battery, disk usage
- **Activity Detection**: User behavior and pattern analysis
- **Proactive Alerts**: Calendar reminders and system notifications
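Health monitoring boils down to sampling metrics and comparing them against thresholds. A sketch using a fake metrics dict (a real implementation would likely sample via `psutil`; the threshold values here are illustrative):

```python
THRESHOLDS = {"cpu_percent": 90, "memory_percent": 85, "disk_percent": 95}


def health_alerts(metrics: dict) -> list:
    """Compare sampled metrics against thresholds and return alert messages."""
    return [
        f"{name} at {metrics[name]}% exceeds {limit}%"
        for name, limit in THRESHOLDS.items()
        if metrics.get(name, 0) > limit
    ]


sample = {"cpu_percent": 96, "memory_percent": 60, "disk_percent": 40}
print(health_alerts(sample))  # ['cpu_percent at 96% exceeds 90%']
```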
## Testing
SOHAM uses pytest with Hypothesis for comprehensive testing:
```bash
# Run all tests
pytest

# Run specific module tests
pytest tests/test_memory/ -v

# Run property-based tests only
pytest -m property

# Run with coverage
pytest --cov=soham --cov-report=html
```
### Property-Based Testing
Each correctness property is implemented as a property-based test:
```python
import pytest
from hypothesis import given, strategies as st


@pytest.mark.asyncio  # requires pytest-asyncio (or a comparable async plugin)
@given(st.text(min_size=1))
async def test_property_conversation_storage_completeness(conversation_text):
    """Feature: soham-core-memory, Property 1: Conversation Storage Completeness"""
    # Test implementation
    pass
```
## Development
### Setting Up Development Environment
```bash
# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run code formatting and linting
black soham/
flake8 soham/

# Type checking
mypy soham/
```
### Adding New Modules
1. Create module directory: `soham/your_module/`
2. Implement interface: `class YourModule(BaseModule)`
3. Register with orchestrator: `orchestrator.register_module(your_module)`
4. Add tests: `tests/test_your_module/`
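The steps above can be sketched end to end. This is an assumed shape for `BaseModule` (the real abstract base lives in `soham`); it shows the async lifecycle a new module must implement:

```python
import abc
import asyncio


class BaseModule(abc.ABC):
    """Illustrative sketch of the module interface."""

    name: str = "base"

    @abc.abstractmethod
    async def initialize(self) -> None: ...

    @abc.abstractmethod
    async def shutdown(self) -> None: ...


class GreeterModule(BaseModule):
    """A trivial module implementing the required lifecycle hooks."""

    name = "greeter"

    async def initialize(self) -> None:
        self.ready = True

    async def shutdown(self) -> None:
        self.ready = False


module = GreeterModule()
asyncio.run(module.initialize())
print(module.name, module.ready)  # greeter True
```

After this, the instance would be passed to `orchestrator.register_module(module)` as in step 3.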
## API Reference
### Core Interfaces
- `BaseModule`: Abstract base for all modules
- `MemoryInterface`: Memory storage and retrieval
- `BrainInterface`: AI provider management
- `VoiceInterface`: Speech input/output
- `VisionInterface`: Screen analysis
- `AutomationInterface`: System control
- `ObserverInterface`: Monitoring and notifications
### Data Models
- `ConversationItem`: Chat history storage
- `VisionAnalysisResult`: Screen analysis results
- `ProtocolDefinition`: Automation workflow scripts
- `SystemState`: Overall system status
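As one example of the data-model layer, `ConversationItem` can be imagined as a small dataclass; the field names below are illustrative, not the exact schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConversationItem:
    """Sketch of a chat-history record (field names are illustrative)."""

    role: str  # e.g. "user" or "assistant"
    text: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


item = ConversationItem(role="user", text="Hello SOHAM")
print(item.role, item.text)  # user Hello SOHAM
```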
## Contributing
1. Fork the repository
2. Create feature branch: `git checkout -b feature/amazing-feature`
3. Make changes and add tests
4. Run test suite: `pytest`
5. Commit changes: `git commit -m 'Add amazing feature'`
6. Push to branch: `git push origin feature/amazing-feature`
7. Open Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- ChromaDB for vector database capabilities
- Faster-Whisper for speech recognition
- Edge TTS for speech synthesis
- Moondream for vision understanding
- PyAutoGUI for system automation
- Hypothesis for property-based testing
## Support
- Documentation: [docs.soham.ai](https://docs.soham.ai)
- Issues: [GitHub Issues](https://github.com/soham-ai/digital-companion/issues)
- Discussions: [GitHub Discussions](https://github.com/soham-ai/digital-companion/discussions)
- Email: support@soham.ai

SOHAM stands for Self-Organizing Hyper-Adaptive Machine.