🚧 👷‍♂️ Work in progress: a Financial Audit Agentic System that will use LangGraph and PydanticAI for agent orchestration and logic, OpenRouter as the LLM provider, DSPy for prompts, and reinforcement learning via ART (Agent Reinforcement Trainer).

SamoraDC/FinancialAuditAgenticSystem


# Financial Audit Agentic System

> **Enterprise-grade AI-powered financial audit system with multi-agent coordination and advanced statistical analysis**

[![CI/CD Pipeline](https://github.com/yourusername/FinancialAuditAgenticSystem/workflows/CI/CD%20Pipeline/badge.svg)](https://github.com/yourusername/FinancialAuditAgenticSystem/actions)
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Overview

The Financial Audit Agentic System is a comprehensive AI-powered platform that revolutionizes financial auditing through:

- **Multi-Agent AI Coordination** - Specialized AI agents for different audit tasks
- **Advanced Statistical Analysis** - Benford's Law, Zipf's Law, and anomaly detection
- **Intelligent Document Processing** - Universal document extraction with OCR fallback
- **Enterprise Security** - GuardRails AI integration with PII protection
- **Real-time Monitoring** - OpenTelemetry and observability stack
- **Production-Ready** - Docker, CI/CD, and cloud deployment ready

## Architecture

```mermaid
graph TB
    A[Frontend - Next.js] --> B[API Gateway - FastAPI]
    B --> C[Document Processor]
    B --> D[Statistical Analyzer]
    B --> E[LangGraph Workflow]
    C --> F[langextract + OCR]
    D --> G[Benford's Law]
    D --> H[Zipf's Law]
    E --> I[Groq LLM]
    E --> J[GuardRails AI]
    B --> K[DuckDB]
    E --> L[Redis State Store]
    M[OpenTelemetry] --> N[Jaeger + Prometheus]
```

## Quick Start

### Prerequisites

- **Python 3.12+**
- **uv** package manager (`pip install uv`)
- **Node.js 20+** (for frontend)
- **Docker & Docker Compose** (for full stack)

### 1. Clone and Setup

```bash
git clone https://github.com/yourusername/FinancialAuditAgenticSystem.git
cd FinancialAuditAgenticSystem

# Install Python dependencies
uv sync

# Setup environment
cp .env.template .env
# Edit .env with your API keys
```

### 2. Configure Environment

```bash
# Required: Add your Groq API key to .env
GROQ_API_KEY=your-groq-api-key-here

# Optional: Configure other services
REDIS_URL=redis://localhost:6379/0
DUCKDB_PATH=./audit_data.duckdb
```

### 3. Start Development

#### Backend Only

```bash
cd backend
uv run uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

#### Full Stack with Docker

```bash
# Update docker-compose.yml for DuckDB (see deployment notes)
docker-compose up -d
```

#### Frontend Development

```bash
cd frontend
npm install
npm run dev
```

## Project Structure

```
FinancialAuditAgenticSystem/
├── backend/                          # FastAPI Backend
│   ├── api/                          # API routes and endpoints
│   ├── core/                         # Configuration and logging
│   ├── database/                     # DuckDB operations
│   ├── services/                     # Business logic services
│   │   ├── document_processor.py     # langextract integration
│   │   ├── statistical_analyzer.py   # Benford's & Zipf's Law
│   │   ├── groq_llm_service.py       # LLM integration
│   │   ├── guardrails_service.py     # Security & validation
│   │   └── observability_service.py  # Monitoring
│   ├── workflows/                    # LangGraph state machines
│   └── tests/                        # Comprehensive test suite
├── frontend/                         # Next.js Frontend
│   ├── src/components/               # React components
│   ├── src/hooks/                    # Custom React hooks
│   └── src/types/                    # TypeScript definitions
├── .github/workflows/                # CI/CD pipeline
├── config/                           # Configuration files
├── docs/                             # Documentation
└── docker-compose.yml                # Container orchestration
```

## Core Features

### 1. Document Processing Pipeline

- **Universal Format Support**: PDF, DOCX, TXT, Markdown
- **OCR Fallback**: Handles non-selectable PDFs
- **Smart Extraction**: Financial entities and transaction data
- **Batch Processing**: Multiple documents in parallel

### 2. Statistical Analysis Engine

- **Benford's Law**: First-digit frequency analysis
- **Newcomb-Benford's Law**: Two-digit pattern detection
- **Zipf's Law**: Vendor payment distribution analysis
- **Anomaly Scoring**: Statistical deviation identification

### 3. AI-Powered Analysis

- **Groq LLM Integration**: `openai/gpt-oss-120b` for analysis
- **Security Model**: `meta-llama/llama-guard-4-12b` for content filtering
- **Multi-Model Approach**: Specialized models for different tasks

### 4. Workflow Orchestration

- **LangGraph State Machines**: 10-step audit process
- **Redis Persistence**: Fault-tolerant execution
- **Human-in-the-Loop**: Review and approval workflow
- **Progress Tracking**: Real-time status monitoring

### 5. Enterprise Security

- **GuardRails AI**: PII detection and content filtering
- **Input Validation**: Comprehensive security scanning
- **Audit Trail**: Complete operation logging
- **Compliance**: GAAP, SOX framework support

## Testing

```bash
# Run all tests
uv run pytest backend/tests/ --cov=backend

# Run specific test suites
uv run pytest backend/tests/unit/
uv run pytest backend/tests/integration/
uv run pytest backend/tests/performance/

# Performance testing with Locust
uv run locust --locustfile backend/tests/performance/test_load.py
```

## Deployment

### Development Deployment

```bash
# Local development
uv run uvicorn backend.main:app --reload

# Docker development
docker-compose up -d
```

### Production Deployment

#### 1. Update Configuration

```bash
# Production environment variables
ENVIRONMENT=production
DEBUG=false
SECRET_KEY=your-production-secret-key
GROQ_API_KEY=your-production-groq-key
```

#### 2. Database Setup

```bash
# DuckDB will be created automatically
# For persistent storage, mount volume:
# -v ./data:/app/data
```

#### 3. Deploy with Docker

```bash
# Build production images
docker-compose -f docker-compose.prod.yml up -d

# Or use the CI/CD pipeline for automated deployment
```

## Security Considerations

### Before GitHub Deployment

1. **Secrets Management**
   - API keys moved to environment variables
   - `.env` added to `.gitignore`
   - Production secrets in CI/CD secrets
2. **Security Scanning**
   - Bandit SAST scanning enabled
   - Safety vulnerability checking
   - Semgrep security analysis
3. **GuardRails Integration**
   - PII detection and redaction
   - Content safety filtering
   - Input validation and sanitization

## Monitoring & Observability

### OpenTelemetry Stack

- **Distributed Tracing**: Request flow visualization
- **Metrics Collection**: Business and technical KPIs
- **Log Aggregation**: Centralized logging
- **Performance Monitoring**: Real-time bottleneck detection

### Key Metrics

- **Audit Completion Rate**: Business KPI tracking
- **Document Processing Time**: Performance monitoring
- **Anomaly Detection Accuracy**: ML model performance
- **API Response Times**: Technical performance

## Contributing

1. **Fork the repository**
2. **Create feature branch**: `git checkout -b feature/amazing-feature`
3. **Run tests**: `uv run pytest`
4. **Commit changes**: `git commit -m 'Add amazing feature'`
5. **Push to branch**: `git push origin feature/amazing-feature`
6. **Open Pull Request**

### Development Guidelines

- Follow PEP 8 for Python code
- Use TypeScript for frontend development
- Write comprehensive tests for new features
- Update documentation for API changes

## External APIs and Integrations

### Core API Integrations

#### Groq LLM API

- **Purpose**: Advanced AI analysis and insight generation
- **Models**: `openai/gpt-oss-120b`, `meta-llama/llama-guard-4-12b`
- **Setup**: Get API key from [Groq Console](https://console.groq.com)
- **Documentation**: [API Integration Guide](docs/api-integration-guide.md)

#### GuardRails AI

- **Purpose**: Content safety, PII detection, and input validation
- **Setup**: Get API key from [GuardRails AI](https://www.guardrailsai.com)
- **Features**: PII detection, toxic language filtering, SQL injection prevention

#### LangGraph

- **Purpose**: Workflow orchestration and state management
- **Setup**: Optional for advanced features from [LangSmith](https://smith.langchain.com)
- **Features**: Multi-step audit workflows, fault-tolerant execution

### Optional Integrations

#### QuickBooks

- **Purpose**: Import financial data from QuickBooks
- **Setup**: OAuth 2.0 authentication via [Intuit Developer](https://developer.intuit.com)
- **Environment Variables**: `QUICKBOOKS_CLIENT_ID`, `QUICKBOOKS_CLIENT_SECRET`

#### SAP

- **Purpose**: Enterprise resource planning integration
- **Setup**: Contact SAP administrator for credentials
- **Environment Variables**: `SAP_ENDPOINT`, `SAP_USERNAME`, `SAP_PASSWORD`

#### Banking APIs

- **Purpose**: Direct bank transaction import
- **Setup**: Contact your bank for API access
- **Environment Variables**: `BANK_API_KEY`, `BANK_API_ENDPOINT`

For detailed integration guides, see [API Integration Guide](docs/api-integration-guide.md).

## Environment Variables

### Required Variables

```bash
# Core Configuration
GROQ_API_KEY=your-groq-api-key-here   # Get from https://console.groq.com
REDIS_URL=redis://localhost:6379/0    # Redis connection for state management
DUCKDB_PATH=./audit_data.duckdb       # Path to DuckDB database
SECRET_KEY=your-secret-key-here       # Generate with: openssl rand -hex 32
```

### Optional Variables

```bash
# Security
GUARDRAILS_API_KEY=your-guardrails-key   # Content safety and PII detection
SENTRY_DSN=your-sentry-dsn               # Error tracking

# Workflows
LANGGRAPH_API_KEY=your-langgraph-key     # Advanced workflow features

# External Integrations
QUICKBOOKS_CLIENT_ID=your-client-id      # QuickBooks integration
SAP_ENDPOINT=your-sap-endpoint           # SAP integration
BANK_API_KEY=your-bank-api-key           # Banking integration
```

Complete list with descriptions in [.env.example](.env.example)

## API Documentation

Once running, access interactive API documentation:

- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
- **OpenAPI Spec**: http://localhost:8000/openapi.json

### Key Endpoints

```bash
# Health Check
GET /health

# Start Audit
POST /api/v1/audit/start
{
  "document_id": "string",
  "audit_type": "comprehensive"
}

# Get Audit Status
GET /api/v1/audit/{audit_id}/status

# Get Audit Results
GET /api/v1/audit/{audit_id}/results

# List Audits
GET /api/v1/audits
```

For detailed API documentation, see [API Integration Guide](docs/api-integration-guide.md)

## Roadmap

- [ ] **Enhanced ML Models**: Advanced fraud detection algorithms
- [ ] **Multi-tenant Support**: Enterprise customer isolation
- [ ] **Real-time Collaboration**: Multi-user audit sessions
- [ ] **Mobile App**: React Native audit companion
- [ ] **API Integrations**: QuickBooks, SAP, banking APIs

## Documentation

Comprehensive documentation is available in the [docs/](./docs/) directory:

- **[API Integration Guide](docs/api-integration-guide.md)**: External API setup and configuration
- **[Deployment Guide](docs/deployment-guide.md)**: Production deployment and CI/CD
- **[Developer Guide](docs/developer-guide.md)**: Development setup and contribution guidelines
- **[User Guide](docs/user-guide.md)**: Feature overview and usage instructions

## Support

- **Documentation**: [docs/](./docs/)
- **Issues**: [GitHub Issues](https://github.com/yourusername/FinancialAuditAgenticSystem/issues)
- **Discussions**: [GitHub Discussions](https://github.com/yourusername/FinancialAuditAgenticSystem/discussions)

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- **langextract** for universal document processing
- **Groq** for high-performance LLM inference
- **LangGraph** for workflow orchestration
- **GuardRails AI** for enterprise security
- **DuckDB** for analytical database capabilities

---

**Built with ❤️ using Claude Flow MCP coordination for enterprise-grade financial auditing**
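
To make the Benford's Law first-digit analysis from the Statistical Analysis Engine more concrete, here is a minimal, standard-library-only sketch. It is not the code in `statistical_analyzer.py`: the names `first_digit`, `benford_first_digit_test`, and the result keys are hypothetical, and the chi-square cutoff (15.507, the 5% critical value for 8 degrees of freedom) is an assumed threshold that a real audit would likely tune.

```python
import math
from collections import Counter
from typing import Iterable, Optional

# Expected first-digit probabilities under Benford's Law: P(d) = log10(1 + 1/d)
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Assumed cutoff: chi-square critical value for 8 degrees of freedom at p = 0.05
CHI2_CRITICAL_DF8_P05 = 15.507


def first_digit(value: float) -> Optional[int]:
    """Return the leading nonzero digit of |value|, or None for zero/non-finite input."""
    value = abs(float(value))
    if value == 0 or not math.isfinite(value):
        return None
    # Scientific notation always places the leading digit first, e.g. "4.8215...e+03"
    return int(f"{value:.15e}"[0])


def benford_first_digit_test(amounts: Iterable[float]) -> dict:
    """Compare observed first-digit frequencies against Benford's Law (hypothetical helper)."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    n = len(digits)
    if n == 0:
        raise ValueError("no nonzero finite amounts to analyze")
    counts = Counter(digits)
    observed = {d: counts.get(d, 0) / n for d in BENFORD}
    # Pearson chi-square statistic over the nine digit bins
    chi_square = sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items())
    # Mean absolute deviation between observed and expected proportions
    mad = sum(abs(observed[d] - BENFORD[d]) for d in BENFORD) / len(BENFORD)
    return {
        "n": n,
        "observed": observed,
        "chi_square": chi_square,
        "mad": mad,
        "suspicious": chi_square > CHI2_CRITICAL_DF8_P05,
    }


if __name__ == "__main__":
    # Powers of two are a classic Benford-conforming sequence
    conforming = [2 ** k for k in range(1, 200)]
    # Amounts clustered just above 9000 all share the same leading digit
    clustered = [9000 + i for i in range(100)]
    for label, data in (("conforming", conforming), ("clustered", clustered)):
        result = benford_first_digit_test(data)
        print(label, round(result["chi_square"], 2), result["suspicious"])
```

The chi-square statistic is sensitive to sample size, so on very large ledgers even small deviations can cross the cutoff; that is why the sketch also reports the mean absolute deviation, which can be compared across datasets of different sizes.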
