MFIS is a production-grade financial intelligence and analytics platform that pairs a Bloomberg-style terminal dashboard with a machine learning pipeline (FinBERT, PyTorch LSTM, XGBoost, and Qwen3 RAG).
The system runs real-time price updates, SEC filing scrapers, news wires, feature storage, multi-agent orchestrations, and backtesting simulations.
graph TD
Client[Web Browser Frontend] -->|WebSockets| WS[WebSocket Live Server]
Client -->|HTTP API| API[FastAPI Endpoints]
subgraph FastAPI Backend App
API -->|Copilot Router| Copilot[Qwen3 Copilot Engine]
API -->|RAG Router| RAG[Hybrid Search & QA Engine]
API -->|Agents Router| Agents[Multi-Agent Coordination]
API -->|Backtest Router| Backtest[Backtesting Simulator]
API -->|Stocks Router| Stocks[Database / Feature Store]
API -->|Monitoring Router| Monitoring[Metrics & Health]
API -->|Explainability Router| SHAP[XGBoost & SHAP Explainer]
API -->|Knowledge Graph Router| KG[Knowledge Graph Builder]
end
subgraph Data Stores
Stocks -->|SQLAlchemy| DB[(PostgreSQL Database)]
RAG -->|Dense Search| FAISS[FAISS Vector Store]
RAG -->|Sparse Search| BM25[BM25 Index]
API -->|Caching Layer| Cache[Msgpack In-Memory Cache]
end
WS -->|Live Updates| Stream[Background Streaming Task]
- Frontend: Custom responsive single-page layout styled with vanilla CSS using HSL color tokens, Chart.js for data visualization, and Canvas-rendered network graphs of corporate relationships.
- Backend Services: Built using FastAPI with ASGI concurrency. Features a background WebSocket thread broadcasting live ticks, database connection pooling using SQLAlchemy and asyncpg, and a local Msgpack serialization caching layer.
- Ingestion Pipeline (ETL):
  - Yahoo Finance API: Imports historical price bars and volume data.
  - SEC Filing Scraper: Automated parsing of quarterly/annual filings (Form 10-Q/10-K).
  - RSS News Wire: Real-time RSS feeds monitoring news sentiment.
- Analytical Database (PostgreSQL): Stores company profiles, stock listings, historical price points, sentiment ratings, calculated feature records, machine learning outputs, and system logs.
The platform processes market news wires and parsed SEC disclosures using FinBERT (ProsusAI/finbert).
- Model Type: Pre-trained financial BERT (AutoModelForSequenceClassification).
- Classification Output: Emits probability vectors for positive, negative, and neutral sentiments.
- Cache TTL: Outputs are cached for 30 minutes to optimize inference latency.
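The classification output and the 30-minute cache described above can be sketched without loading the model. This is a minimal illustration, not the platform's code: `SentimentCache`, `score_fn`, and the label order in `LABELS` are assumptions (check the model's `id2label` mapping before relying on the order), and `score_fn` stands in for whatever returns FinBERT logits.

```python
import math
import time

# Assumed label order; verify against the model's id2label mapping.
LABELS = ("positive", "negative", "neutral")
CACHE_TTL_SECONDS = 30 * 60  # 30-minute TTL, as stated in the list above


def softmax(logits):
    """Convert raw classifier logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class SentimentCache:
    """Hypothetical in-memory cache keyed by the input text."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested
        self._entries = {}

    def get_or_compute(self, text, score_fn):
        """Return cached probabilities, or score the text and cache the result."""
        now = self._clock()
        hit = self._entries.get(text)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        probs = dict(zip(LABELS, softmax(score_fn(text))))
        self._entries[text] = (now, probs)
        return probs
```

Passing the clock in as a parameter keeps the expiry logic testable without sleeping for 30 minutes.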
Provides multi-horizon forecasts projecting 1, 7, and 30-day target prices.
- Model Architecture: Multi-layer Long Short-Term Memory (LSTM) network built in PyTorch.
- Inputs: 60-day historical sequence of normalized open, high, low, close, and volume features combined with FinBERT sentiment parameters.
- Training Strategy: Retrained on startup when stored data exists; synthetic datasets are never used as a substitute.
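The input and output shapes described above can be illustrated by the windowing step that usually precedes LSTM training. The sketch below is an assumption about how samples might be built, not the platform's pipeline: `make_windows` is a hypothetical helper, and each row is taken to be one day's already-normalized feature vector (OHLCV plus sentiment).

```python
def make_windows(rows, targets, seq_len=60, horizons=(1, 7, 30)):
    """Pair each seq_len-day window with the target price at each horizon.

    rows:    list of per-day feature vectors (already normalized)
    targets: list of per-day closing prices, same length as rows
    """
    if len(rows) != len(targets):
        raise ValueError("rows and targets must have the same length")
    max_h = max(horizons)
    samples = []
    # A window covers days [end - seq_len, end); its last day is end - 1,
    # and horizon h looks h days past that last day.
    for end in range(seq_len, len(rows) - max_h + 1):
        window = rows[end - seq_len:end]
        future = [targets[end - 1 + h] for h in horizons]
        samples.append((window, future))
    return samples
```

With 100 days of history, a 60-day window, and a 30-day maximum horizon, this yields 11 samples, which shows how quickly the longest horizon eats into a short history.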
Evaluates volatility, financial, and market risk scores using gradient boosted trees and tree-explainers.
- Core Classifier: XGBoost models training on Technical, Volatility, Sentiment, and Fundamental groups.
- SHAP (SHapley Additive exPlanations): Tree SHAP explainers extract exact mathematical feature contributions for each classification probability.
- Guarantees: Fails explicitly with a 503 Service Unavailable error if model files are missing or uninitialized instead of falling back to simulated scores.
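The fail-explicitly guarantee can be sketched in framework-neutral Python. Everything named here (`ModelUnavailableError`, `RiskModelRegistry`, `handle_risk_request`) is hypothetical and only illustrates the pattern; in a FastAPI route the same check would typically raise an HTTP exception with status 503.

```python
from http import HTTPStatus


class ModelUnavailableError(RuntimeError):
    """Raised when a required model artifact is missing or not loaded."""

    status_code = HTTPStatus.SERVICE_UNAVAILABLE  # 503


class RiskModelRegistry:
    """Hypothetical holder for loaded risk models, keyed by risk type."""

    def __init__(self):
        self._models = {}

    def register(self, name, model):
        self._models[name] = model

    def require(self, name):
        model = self._models.get(name)
        if model is None:
            # No simulated fallback: surface the failure to the caller.
            raise ModelUnavailableError(f"risk model '{name}' is not loaded")
        return model


def handle_risk_request(registry, name, features):
    """Return (status_code, body) the way an HTTP handler might."""
    try:
        model = registry.require(name)
    except ModelUnavailableError as exc:
        return int(exc.status_code), {"detail": str(exc)}
    return 200, {"risk": model(features)}
```

Failing loudly keeps a missing model file from being mistaken for a genuine low-risk score.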
Implements double-retrieval QA over corporate disclosures:
- Dense Vector Retrieval: Uses SentenceTransformer (all-MiniLM-L6-v2) to encode documents and index them in a FAISS index.
- Sparse Term Retrieval: Tokenizes document corpuses for keywords using a BM25 index (rank-bm25).
- Reranker: Employs CrossEncoder (ms-marco-MiniLM-L-6-v2) to score query-document pairs, keeping the top k candidates.
- Generator: Intersects candidate fragments and passes them to the Qwen3 (Qwen/Qwen2.5-1.5B-Instruct) pipeline to build readable reports.
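The retrieve-then-rerank flow above can be sketched with plain Python stand-ins. This is an illustration under stated assumptions: the two candidate lists are combined with a simple union (the platform may combine them differently), and `token_overlap` is a toy scorer standing in for the cross-encoder. All function names are hypothetical.

```python
def merge_candidates(dense_hits, sparse_hits):
    """Union two ranked lists of document ids, preserving first-seen order."""
    seen = set()
    merged = []
    for doc_id in list(dense_hits) + list(sparse_hits):
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


def rerank(query, doc_ids, corpus, score_fn, top_k=3):
    """Score each (query, document) pair and keep the top_k best."""
    scored = [(score_fn(query, corpus[d]), d) for d in doc_ids]
    # Python's sort is stable, so ties keep their merged order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def token_overlap(query, document):
    """Toy stand-in for a cross-encoder: count shared lowercase tokens."""
    return len(set(query.lower().split()) & set(document.lower().split()))
```

Keeping the scorer as a parameter means the same `rerank` call works unchanged once a real cross-encoder scoring function is plugged in.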
Coordinates independent specialist agents to draft a comprehensive investment thesis:
- Sentiment Agent: Scans public RSS wire sentiment scores.
- Forecast Agent: Queries the LSTM forecast sequence.
- Risk Agent: Scans XGBoost SHAP volatility weights.
- Fundamental Agent: Evaluates debt-to-equity and price-to-earnings ratios.
- Coordinator: Aggregates all reports, runs a chain of thought, and saves the final thesis in the database.
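The coordination pattern above can be sketched with `asyncio`. Running the agents concurrently is an assumption suggested by their independence; the stub agents, `make_stub_agent`, and `build_thesis` are hypothetical, and the reasoning and database steps of the real coordinator are left out.

```python
import asyncio


def make_stub_agent(name):
    """Build a placeholder agent coroutine; real agents would query models."""

    async def agent(ticker):
        await asyncio.sleep(0)  # yield control, as real I/O would
        return {"agent": name, "ticker": ticker}

    agent.__name__ = name
    return agent


async def build_thesis(ticker, agents):
    """Run all specialist agents concurrently and collect their reports."""
    results = await asyncio.gather(
        *(agent(ticker) for agent in agents), return_exceptions=True
    )
    reports, failures = [], []
    for agent, result in zip(agents, results):
        if isinstance(result, Exception):
            # One failing agent should not discard the other reports.
            failures.append({"agent": agent.__name__, "error": str(result)})
        else:
            reports.append(result)
    return {"ticker": ticker, "reports": reports, "failures": failures}
```

A call such as `asyncio.run(build_thesis("AAPL", agents))` returns the per-agent reports alongside any failures, so the coordinator can decide whether a partial thesis is acceptable.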
All API endpoints are prefixed with /api. Authenticated routes require a valid JWT bearer header: Authorization: Bearer <token>.
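A client might prepare the token request and an authenticated call as follows. This is a minimal standard-library sketch: `BASE_URL` assumes a local development server, the helper names are hypothetical, and sending an empty JSON object when no API key is given is an assumption about how the optional field is handled.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local development server


def token_request(api_key=None):
    """Build (but do not send) the POST request for a JWT access token."""
    body = {"api_key": api_key} if api_key else {}
    return urllib.request.Request(
        f"{BASE_URL}/api/auth/token",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authorized_request(path, token):
    """Build a GET request that carries the bearer token."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def fetch_json(request, timeout=10):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `fetch_json(token_request())["access_token"]` would supply the token to pass to `authorized_request("/api/stocks/AAPL", token)`.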
- POST /api/auth/token: Generates a JWT access token.
  - Request Body (JSON): {"api_key": "string"} (optional)
  - Response (JSON): {"access_token": "token_str", "token_type": "bearer", "expires_in_hours": 24}
- GET /api/stocks: Paginated coverage list. Query params: sector, active_only, page, page_size. (No auth required)
- GET /api/stocks/{ticker}: Company profile, status, and latest close price. Requires JWT Header. (Cached for 5 minutes)
- GET /api/stocks/{ticker}/prices: Historical closing bars. Query params: start_date, end_date, limit (max 2520). (No auth required, cached for 15 minutes)
- GET /api/features/{ticker}: Computed technical/volatility/fundamental feature map. Query params: groups, recompute. (Cached for 15 minutes)
- GET /api/explainability/{ticker}: XGBoost risk levels and SHAP values for volatility, financial, and market risks.
- GET /api/agents/analyze/{ticker}: Orchestrates Multi-Agent analysis. Requires JWT Header.
- GET /api/knowledge-graph: Emits relationships map of companies, events, executives, products, and competitors. Query params: ticker (optional).
- GET /api/backtesting/run: Backtests trading strategies on historical data. Query params: ticker, strategy (sentiment, momentum, hybrid), capital.
- POST /api/portfolio/analyze: Computes returns, volatility, Sharpe ratio, and diversification metrics for custom portfolios.
  - Body: {"portfolio": [{"ticker": "AAPL", "weight": 40.0}, ...]}
- POST /api/copilot/query: Chatbot answering finance queries. Body: {"query": "string"}
- POST /api/rag/query: QA over indexed documents. Body: {"query": "string", "top_k": 3}
- GET /api/monitoring/health: Basic liveness check.
- GET /api/monitoring/detailed-health: Liveness details for Postgres database and Redis cache backend.
- GET /monitoring/dashboard: ETL schedule details, health summaries, and cache stats.
- GET /metrics: Prometheus metric exposition text.
- WS /ws/live: Live streaming socket. Emits price updates (price_update ticks) and RSS news alerts (news_alert blocks).
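A consumer of the live socket needs to tell the two event kinds apart. The sketch below assumes each frame is a JSON object whose event name sits under a `type` key; that key and the payload fields `ticker`, `price`, and `headline` are assumptions, since the message schema is not specified here.

```python
import json


def dispatch(raw_message, handlers):
    """Route one JSON text frame to the handler registered for its type."""
    message = json.loads(raw_message)
    kind = message.get("type")  # assumption: event name lives under "type"
    handler = handlers.get(kind)
    if handler is None:
        return False  # unknown event types are ignored rather than fatal
    handler(message)
    return True


# Example wiring with hypothetical payload fields.
prices, alerts = [], []
handlers = {
    "price_update": lambda m: prices.append((m.get("ticker"), m.get("price"))),
    "news_alert": lambda m: alerts.append(m.get("headline")),
}
dispatch('{"type": "price_update", "ticker": "AAPL", "price": 101.5}', handlers)
dispatch('{"type": "news_alert", "headline": "Example headline"}', handlers)
```

Returning `False` for unrecognized types lets the client keep running if the server later adds new event kinds.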
The backend features an authentication layer that safeguards critical operations:
- Token Generation: Access tokens expire after 24 hours and are signed using standard HMAC SHA-256 (HS256) with a cryptographically secure SECRET_KEY.
- Dependency Injection: Route endpoints use FastAPI's Depends(verify_token) validation. If the Authorization header is missing, malformed, or has an expired token, the system returns a 401 Unauthorized or 403 Forbidden response.
- Frontend Integration: On load, static/js/app.js issues a request to /api/auth/token to acquire a bearer token, storing it in-memory and appending it to all protected fetches.
- Python 3.11+
- PostgreSQL database engine
- (Optional) Redis cache server
Copy .env.example to .env and configure your credentials:
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_secure_password
POSTGRES_DB=mfis
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
DATABASE_URL=postgresql+asyncpg://postgres:your_secure_password@localhost:5432/mfis
# Authentication Secret
SECRET_KEY=generate_a_secure_token_key_here
Initialize your virtual environment, install the requirements, and upgrade database tables using Alembic:
# Set up virtual environment
python -m venv venv
# Activate virtual environment
# On Windows:
.\venv\Scripts\activate
# On Linux/macOS:
source venv/bin/activate
# Install requirements
pip install -r requirements.txt
# Run migrations
alembic upgrade head
Start the uvicorn development server:
uvicorn app.main:app --host 127.0.0.1 --port 8000 --reload
Open http://localhost:8000/ in your web browser. The platform will automatically seed reference assets and train ML models on startup.
To run all tests (including ETL, caching, and API endpoint suites):
pytest
The system includes configuration files to orchestrate the backend app, PostgreSQL, and Redis in individual containers:
# Build and launch containers
docker-compose up --build -d
# Verify logs and statuses
docker-compose ps
docker-compose logs -f
For production, avoid running uvicorn directly. Use Gunicorn with Uvicorn workers to enable process management and multiple workers:
gunicorn -w 4 -k uvicorn.workers.UvicornWorker app.main:app --bind 0.0.0.0:8000
- Cause: The route requires a JWT bearer header.
- Solution: Confirm your client makes a POST /api/auth/token request and sets the returned token as a bearer header: Authorization: Bearer <token_string>.
- Cause: The ETL scheduler runs daily or hourly in the background. If you just initialized the database, historical prices may not have downloaded yet.
- Solution: Let the backend run for a few minutes or trigger data ingestion manually for a ticker via the API. Under standard development settings, querying /api/stocks/{ticker} automatically triggers ingestion via ensure_ticker_data.
- Cause: PyTorch, transformers, or XGBoost libraries fail to load pre-trained models because the host cannot connect to the Hugging Face Hub.
- Solution: Ensure your host machine has internet access on first startup to download the models. Model parameters are cached locally in your standard cache directories.