Lightning-fast Wikipedia search API serving 4.8M articles with millisecond response times

Built-Simple/wikipedia-api

Wikipedia API - Lightning-Fast Search

High-performance Wikipedia search API serving 4.8M articles, with exact-match and cached lookups under 1ms and full-text searches in 20-50ms.

Features

  • Lightning-fast search: Sub-millisecond exact title matches and optimized FTS5 full-text queries
  • Dual search strategy: Exact title matching plus full-text search with FTS5
  • Result caching: LRU cache for ultra-fast repeated queries
  • Async logging: Non-blocking performance monitoring
  • Production-ready: Gunicorn WSGI server with 8 workers

Architecture

  • Framework: Flask
  • Database: SQLite with FTS5 full-text search
  • WSGI Server: Gunicorn with multi-worker configuration
  • Caching: Python functools.lru_cache for query results
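The dual search strategy and result caching described above can be sketched roughly as follows. This is a minimal illustration, not the code in wikipedia_api_lightning.py: the table names (articles, articles_fts), the column layout, the sample rows, and the search function are all assumptions, since this README does not document the real schema. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3
from functools import lru_cache

# Hypothetical schema and sample data; the real database layout is not documented here.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    CREATE INDEX idx_articles_title ON articles (title COLLATE NOCASE);
    CREATE VIRTUAL TABLE articles_fts USING fts5(title, body);
""")
rows = [
    (1, "Python (programming language)", "Python is a high-level programming language."),
    (2, "SQLite", "SQLite is an embedded relational database engine."),
    (3, "Flask (web framework)", "Flask is a micro web framework written in Python."),
]
conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", rows)
conn.executemany("INSERT INTO articles_fts (rowid, title, body) VALUES (?, ?, ?)", rows)


@lru_cache(maxsize=1024)
def search(query: str, limit: int = 10) -> tuple:
    """Try an indexed exact title match first, then fall back to FTS5."""
    exact = conn.execute(
        "SELECT id, title FROM articles WHERE title = ? COLLATE NOCASE LIMIT ?",
        (query, limit),
    ).fetchall()
    if exact:
        return tuple(exact)
    # Quote the input so it is matched as a phrase rather than parsed as FTS5 syntax.
    phrase = '"' + query.replace('"', '""') + '"'
    fts = conn.execute(
        "SELECT rowid, title FROM articles_fts "
        "WHERE articles_fts MATCH ? ORDER BY rank LIMIT ?",
        (phrase, limit),
    ).fetchall()
    return tuple(fts)
```

Results are returned as tuples so the cached value cannot be mutated by callers. Note that functools.lru_cache lives in process memory, so with several Gunicorn workers each worker would keep its own cache.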

API Endpoints

Search Articles

GET /api/search?q=query&limit=10

Parameters:

  • q (required): Search query string
  • limit (optional): Number of results (default: 10, max: 20)
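A client can build the request URL along these lines. This is a sketch under assumptions: build_search_url and the base URL are hypothetical, and because this README does not say how the server treats an out-of-range limit, the helper clamps it on the client side to the documented maximum of 20.

```python
from urllib.parse import urlencode

DEFAULT_LIMIT = 10  # documented default
MAX_LIMIT = 20      # documented maximum


def build_search_url(base_url: str, query: str, limit: int = DEFAULT_LIMIT) -> str:
    """Build a /api/search URL with q URL-encoded and limit kept within 1..20."""
    if not query or not query.strip():
        raise ValueError("q is required")
    limit = max(1, min(int(limit), MAX_LIMIT))
    params = urlencode({"q": query.strip(), "limit": limit})
    return f"{base_url.rstrip('/')}/api/search?{params}"


# Hypothetical host; replace with wherever the API is deployed.
url = build_search_url("http://localhost", "alan turing", limit=50)
```

The resulting URL can be fetched with any HTTP client, for example urllib.request.urlopen. The response format is not documented in this README, so it is not parsed here.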

Installation

  1. Install dependencies: pip install -r requirements.txt
  2. Ensure the SQLite database file exists at the configured path and is readable by the server process
  3. Run with Gunicorn: gunicorn --config gunicorn.conf.py wsgi:application

Configuration

  • Workers: 8
  • Bind: 0.0.0.0:80
  • Database: /var/www/wikipedia-api/wikipedia_production.db (7GB, 4.8M articles)
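For reference, the worker and bind values above map onto Gunicorn settings like this. It is a minimal sketch of what a gunicorn.conf.py with these values could contain, not a copy of the file in this repository, which may set further options.

```python
# gunicorn.conf.py (sketch; the actual file in this repository may differ)

# Listen on all interfaces on port 80. Binding to a port below 1024
# usually requires elevated privileges on Linux.
bind = "0.0.0.0:80"

# Number of worker processes handling requests.
workers = 8
```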

Performance

  • Exact match: <1ms
  • FTS search: 20-50ms
  • Cached queries: <1ms
  • Memory usage: ~70MB per worker

Files

  • wsgi.py - WSGI entry point
  • wikipedia_api_lightning.py - Main application
  • async_worker.py - Async logging worker
  • gunicorn.conf.py - Gunicorn configuration
  • requirements.txt - Python dependencies
