API Documentation

Overview

The Financial News Scraper provides a RESTful API built with FastAPI for accessing scraped news data programmatically.

Base URL

http://localhost:8000

Authentication

Currently, the API does not require authentication. For production deployments, implement API key authentication.
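
As a rough illustration of the API key check recommended above, the sketch below validates a key taken from request headers using only the standard library. The `X-API-Key` header name, the `VALID_API_KEYS` store, and the `is_authorized` helper are assumptions made for this example; none of them exist in the current API.

```python
import hmac

# Hypothetical key store. A real deployment would load keys from the
# environment or a secret manager instead of hard-coding them.
VALID_API_KEYS = {"example-key-123"}


def is_authorized(headers: dict) -> bool:
    """Return True if the headers carry a known key in the assumed X-API-Key header."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    supplied = normalized.get("x-api-key", "")
    # compare_digest keeps the comparison time independent of where the
    # strings differ; bytes are used so non-ASCII input cannot raise.
    return any(
        hmac.compare_digest(supplied.encode(), key.encode())
        for key in VALID_API_KEYS
    )
```

In a FastAPI application a check like this would typically run in a dependency or middleware that returns a 401 response when it fails.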

Endpoints

Health Check

Check API status and version.

Endpoint: GET /health

Response:

{
  "status": "healthy",
  "version": "1.0.0",
  "database": "connected"
}

Get Articles

Retrieve articles with optional filtering.

Endpoint: GET /articles

Query Parameters:

limit (integer, optional): Maximum number of articles (default: 50, max: 1000)
offset (integer, optional): Pagination offset (default: 0)
source (string, optional): Filter by news source
start_date (string, optional): Start date (ISO 8601 format)
end_date (string, optional): End date (ISO 8601 format)
sentiment (string, optional): Filter by sentiment (positive/negative/neutral)

Example Request:

curl "http://localhost:8000/articles?limit=10&source=yahoo_finance&sentiment=positive"

Response:

{
  "total": 150,
  "limit": 10,
  "offset": 0,
  "articles": [
    {
      "id": 1,
      "title": "Market Rally Continues",
      "content": "...",
      "url": "https://...",
      "source": "yahoo_finance",
      "published_date": "2026-02-03T15:30:00",
      "sentiment_score": 0.75,
      "sentiment_label": "positive",
      "mentioned_stocks": ["AAPL", "TSLA"],
      "mentioned_companies": ["Apple", "Tesla"],
      "word_count": 450,
      "read_time_minutes": 2
    }
  ]
}
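
The query parameters above can be assembled and checked on the client before a request is sent. This is a minimal sketch, not part of the project: `build_articles_url` is a hypothetical helper, and the lower bound on `limit` and the non-negative `offset` are assumptions, since the documentation states only the defaults and the maximum.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def build_articles_url(limit=50, offset=0, source=None,
                       start_date=None, end_date=None, sentiment=None):
    """Build a GET /articles URL, rejecting values the documentation rules out."""
    # The documented maximum is 1000; the lower bound of 1 is an assumption.
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    # A non-negative offset is assumed; the documentation only gives the default.
    if offset < 0:
        raise ValueError("offset must not be negative")
    if sentiment is not None and sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"sentiment must be one of {sorted(ALLOWED_SENTIMENTS)}")

    params = {"limit": limit, "offset": offset}
    optional = {"source": source, "start_date": start_date,
                "end_date": end_date, "sentiment": sentiment}
    # Only send the filters the caller actually set.
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{BASE_URL}/articles?{urlencode(params)}"


print(build_articles_url(limit=10, source="yahoo_finance", sentiment="positive"))
```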

Get Article by ID

Retrieve a specific article by its ID.

Endpoint: GET /articles/{article_id}

Path Parameters:

article_id (integer, required): Article ID

Example Request:

curl http://localhost:8000/articles/1

Response:

{
  "id": 1,
  "title": "Market Rally Continues",
  "content": "Full article content...",
  "summary": "Brief summary...",
  "url": "https://...",
  "source": "yahoo_finance",
  "author": "John Doe",
  "published_date": "2026-02-03T15:30:00",
  "scraped_date": "2026-02-03T16:00:00",
  "sentiment_score": 0.75,
  "sentiment_label": "positive",
  "mentioned_stocks": ["AAPL", "TSLA"],
  "mentioned_companies": ["Apple", "Tesla"],
  "mentioned_persons": ["Tim Cook", "Elon Musk"],
  "category": "Technology",
  "tags": ["stocks", "earnings"],
  "word_count": 450,
  "read_time_minutes": 2
}

Search Articles

Search articles by keyword.

Endpoint: GET /search

Query Parameters:

q (string, required): Search query
limit (integer, optional): Maximum number of results (default: 50, max: 1000)
offset (integer, optional): Pagination offset (default: 0)

Example Request:

curl "http://localhost:8000/search?q=Tesla&limit=20"

Response:

{
  "query": "Tesla",
  "total": 45,
  "limit": 20,
  "offset": 0,
  "results": [
    {
      "id": 5,
      "title": "Tesla Reports Record Earnings",
      "snippet": "...Tesla announced record quarterly earnings...",
      "url": "https://...",
      "source": "cnbc",
      "published_date": "2026-02-03T14:00:00",
      "relevance_score": 0.95
    }
  ]
}

Get Statistics

Retrieve aggregated statistics.

Endpoint: GET /stats

Query Parameters:

start_date (string, optional): Start date for statistics
end_date (string, optional): End date for statistics

Example Request:

curl "http://localhost:8000/stats?start_date=2026-02-01&end_date=2026-02-03"

Response:

{
  "period": {
    "start": "2026-02-01",
    "end": "2026-02-03"
  },
  "total_articles": 450,
  "sources": {
    "yahoo_finance": 120,
    "cnbc": 95,
    "bbc_business": 85,
    "guardian_business": 75,
    "marketwatch": 45,
    "seeking_alpha": 30
  },
  "sentiment_distribution": {
    "positive": 180,
    "neutral": 200,
    "negative": 70
  },
  "average_sentiment": 0.15,
  "top_stocks": {
    "AAPL": 45,
    "TSLA": 38,
    "MSFT": 32,
    "GOOGL": 28,
    "AMZN": 25
  },
  "articles_per_day": {
    "2026-02-01": 145,
    "2026-02-02": 155,
    "2026-02-03": 150
  }
}
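
A statistics response can be reduced to percentages on the client side. The sketch below assumes only the `sentiment_distribution` field shown above; `sentiment_shares` is a hypothetical helper, not an API feature.

```python
def sentiment_shares(stats: dict) -> dict:
    """Convert the sentiment_distribution counts into percentages of the total."""
    counts = stats["sentiment_distribution"]
    total = sum(counts.values())
    # Avoid dividing by zero when the period contains no articles.
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


sample = {"sentiment_distribution": {"positive": 180, "neutral": 200, "negative": 70}}
print(sentiment_shares(sample))
# {'positive': 40.0, 'neutral': 44.4, 'negative': 15.6}
```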

Get Sources

List all available news sources.

Endpoint: GET /sources

Response:

{
  "sources": [
    {
      "name": "yahoo_finance",
      "display_name": "Yahoo Finance",
      "article_count": 1250,
      "last_scraped": "2026-02-03T16:00:00",
      "status": "active"
    },
    {
      "name": "cnbc",
      "display_name": "CNBC",
      "article_count": 980,
      "last_scraped": "2026-02-03T16:00:00",
      "status": "active"
    }
  ]
}

Export Data

Export articles in various formats.

Endpoint: GET /export

Query Parameters:

format (string, required): Export format (json/csv/xml/parquet)
start_date (string, optional): Start date
end_date (string, optional): End date
source (string, optional): Filter by source

Example Request:

curl "http://localhost:8000/export?format=csv&start_date=2026-02-01" -o export.csv

Response: A file download in the requested format

Get Sentiment Trends

Retrieve sentiment trends over time.

Endpoint: GET /sentiment/trends

Query Parameters:

start_date (string, optional): Start date
end_date (string, optional): End date
granularity (string, optional): day/week/month (default: day)

Example Request:

curl "http://localhost:8000/sentiment/trends?granularity=day"

Response:

{
  "granularity": "day",
  "data": [
    {
      "date": "2026-02-01",
      "average_sentiment": 0.12,
      "positive_count": 60,
      "neutral_count": 70,
      "negative_count": 15
    },
    {
      "date": "2026-02-02",
      "average_sentiment": 0.18,
      "positive_count": 65,
      "neutral_count": 68,
      "negative_count": 12
    }
  ]
}
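
To combine several trend buckets into one figure, each bucket's average can be weighted by its article count. This sketch assumes that `average_sentiment` is the mean over all articles in the bucket and that the three counts cover every article; the documentation does not state either. `weighted_average_sentiment` is a hypothetical helper.

```python
def weighted_average_sentiment(trend_points: list) -> float:
    """Average sentiment across buckets, weighting each bucket by its article count."""
    weighted_sum = 0.0
    total_articles = 0
    for point in trend_points:
        count = (point["positive_count"]
                 + point["neutral_count"]
                 + point["negative_count"])
        weighted_sum += point["average_sentiment"] * count
        total_articles += count
    # Avoid dividing by zero when the range contains no articles.
    return weighted_sum / total_articles if total_articles else 0.0


trend_data = [
    {"date": "2026-02-01", "average_sentiment": 0.12,
     "positive_count": 60, "neutral_count": 70, "negative_count": 15},
    {"date": "2026-02-02", "average_sentiment": 0.18,
     "positive_count": 65, "neutral_count": 68, "negative_count": 12},
]
print(round(weighted_average_sentiment(trend_data), 4))
```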

Error Responses

400 Bad Request

{
  "error": "Bad Request",
  "message": "Invalid date format. Use ISO 8601 format (YYYY-MM-DD)",
  "status_code": 400
}

404 Not Found

{
  "error": "Not Found",
  "message": "Article with ID 999 not found",
  "status_code": 404
}

500 Internal Server Error

{
  "error": "Internal Server Error",
  "message": "Database connection failed",
  "status_code": 500
}
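
Because every error body shares the `error`, `message`, and `status_code` fields, a client can convert error responses into a single exception type. The following is a sketch under that assumption; `ApiError` and `parse_response` are hypothetical names introduced here.

```python
import json


class ApiError(Exception):
    """Raised when the API returns one of the error bodies shown above."""

    def __init__(self, status_code: int, error: str, message: str):
        super().__init__(f"{status_code} {error}: {message}")
        self.status_code = status_code
        self.error = error
        self.message = message


def parse_response(status_code: int, body: str) -> dict:
    """Return the decoded JSON body, or raise ApiError for 4xx/5xx statuses."""
    payload = json.loads(body)
    if status_code >= 400:
        # Fall back to the HTTP status if the body omits a field.
        raise ApiError(
            payload.get("status_code", status_code),
            payload.get("error", "Unknown Error"),
            payload.get("message", ""),
        )
    return payload
```

A caller can then catch `ApiError` once and branch on `status_code`, rather than inspecting each response body by hand.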

Rate Limiting

The current implementation does not enforce rate limits. For production deployments:

Recommended limit: 100 requests per minute per IP address
Enforce it in middleware or at an API gateway
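
One way such a limit could work is a sliding window per client IP. The sketch below is an in-memory illustration only, not the project's implementation; `SlidingWindowLimiter` is a hypothetical class, and a deployment with multiple processes would need shared storage instead.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests=100, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip: str) -> bool:
        now = self._clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A middleware would call `allow()` with the caller's IP address and return a 429 response when it yields `False`.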

CORS

CORS is enabled for all origins in development. Configure appropriately for production.

Pagination

For endpoints returning multiple results:

Use limit and offset parameters
Maximum limit: 1000
Default limit: 50

Example:

# Page 1
curl "http://localhost:8000/articles?limit=50&offset=0"

# Page 2
curl "http://localhost:8000/articles?limit=50&offset=50"
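
The same paging pattern can be wrapped in a generator so callers iterate over articles without tracking offsets. In this sketch `iter_articles` is a hypothetical helper, and `fetch_page` stands for any function that performs `GET /articles` with the given `limit` and `offset` and returns the decoded response.

```python
from typing import Callable, Iterator


def iter_articles(fetch_page: Callable[[int, int], dict],
                  page_size: int = 50) -> Iterator[dict]:
    """Yield every article by walking limit/offset pages until 'total' is reached."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        articles = page["articles"]
        yield from articles
        offset += len(articles)
        # Stop on an empty page or once all reported articles have been seen.
        if not articles or offset >= page["total"]:
            break


# Stand-in for a real HTTP call, so the example runs without a server.
_data = [{"id": i} for i in range(120)]


def fake_fetch(limit: int, offset: int) -> dict:
    return {"total": len(_data), "limit": limit, "offset": offset,
            "articles": _data[offset:offset + limit]}


print(sum(1 for _ in iter_articles(fake_fetch)))  # 120
```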

Filtering

Combine multiple filters:

curl "http://localhost:8000/articles?source=yahoo_finance&sentiment=positive&start_date=2026-02-01&limit=100"

Date Formats

All dates use ISO 8601 format:

Date: YYYY-MM-DD (e.g., 2026-02-03)
DateTime: YYYY-MM-DDTHH:MM:SS (e.g., 2026-02-03T15:30:00)
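
Both forms can be parsed with Python's standard library. The sketch below assumes the values carry no timezone suffix, as in the examples in this document; `parse_api_date` is a hypothetical helper.

```python
from datetime import date, datetime
from typing import Union


def parse_api_date(value: str) -> Union[date, datetime]:
    """Parse either documented form: YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS."""
    # The "T" separator distinguishes a full timestamp from a plain date.
    if "T" in value:
        return datetime.fromisoformat(value)
    return date.fromisoformat(value)


print(parse_api_date("2026-02-03"))           # 2026-02-03
print(parse_api_date("2026-02-03T15:30:00"))  # 2026-02-03 15:30:00
```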

Response Formats

Responses are JSON with UTF-8 encoding. The exception is /export, which returns a file in the requested format (json, csv, xml, or parquet).

Client Examples

Python

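import requests

BASE_URL = 'http://localhost:8000'

# Get articles (the list is nested under the "articles" key)
response = requests.get(f'{BASE_URL}/articles', params={
    'limit': 10,
    'source': 'yahoo_finance'
}, timeout=10)
response.raise_for_status()  # Raise an exception on 4xx/5xx responses
articles = response.json()['articles']

# Search (matches are nested under the "results" key)
response = requests.get(f'{BASE_URL}/search', params={
    'q': 'Tesla'
}, timeout=10)
response.raise_for_status()
results = response.json()['results']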

JavaScript

// Get articles
fetch('http://localhost:8000/articles?limit=10')
  .then(response => response.json())
  .then(data => console.log(data));

// Search
fetch('http://localhost:8000/search?q=Tesla')
  .then(response => response.json())
  .then(data => console.log(data));

cURL

# Get articles
curl -X GET "http://localhost:8000/articles?limit=10"

# Search
curl -X GET "http://localhost:8000/search?q=Tesla"

# Export
curl -X GET "http://localhost:8000/export?format=csv" -o export.csv

WebSocket Support

Real-time updates over WebSocket are a planned feature and are not available yet. The intended client usage is:

const ws = new WebSocket('ws://localhost:8000/ws');

ws.onmessage = (event) => {
  const article = JSON.parse(event.data);
  console.log('New article:', article);
};

API Versioning

Current version: v1

Future versions will be accessible via:

http://localhost:8000/v2/articles

Support

For API issues or questions:

Check the Usage Guide
Review the Troubleshooting Guide
Open a GitHub issue
