Add Docusaurus plugin for client-side semantic search #1

@vaibhaviitk

Overview

Add docusaurus-plugin-altor-vec package to enable client-side semantic search for Docusaurus sites using the altor-vec WASM vector search engine.

Motivation

Docusaurus users need semantic search capabilities without:

  • Server-side infrastructure
  • Third-party services (for example, Algolia, cited here at $0.50 per 1K searches)
  • API keys for basic functionality
  • Per-query costs

This plugin provides:

  • 🚀 Zero-config semantic search
  • ⚡ Client-side execution (privacy-first)
  • 🎯 Semantic understanding with Transformers.js or OpenAI
  • 📦 Automatic index building during build
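  • 🔒 No data sent to external servers when embeddings are generated with Transformers.js (the OpenAI option sends text to the OpenAI API)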
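
As a rough illustration of what "zero-config" could look like for a site owner, the sketch below registers the plugin in a Docusaurus config. Only the package name `docusaurus-plugin-altor-vec` comes from this issue. The site fields are placeholders, and the commented-out option name is a hypothetical stand-in, since the plugin's actual options are not listed here.

```typescript
// Sketch of a docusaurus.config.ts fragment. Only the package name comes from
// this issue; the site fields are placeholders and the commented-out option
// name is hypothetical.
const config = {
  title: "My Site",
  url: "https://example.com",
  baseUrl: "/",
  plugins: [
    // Zero-config form: register the plugin by package name.
    "docusaurus-plugin-altor-vec",
    // Hypothetical form with options (option names are assumptions):
    // ["docusaurus-plugin-altor-vec", { embeddingProvider: "transformers" }],
  ],
};

export default config;
```

Docusaurus accepts either a bare package name or a `[name, options]` tuple in `plugins`, so the same entry could later carry options without changing its position in the list.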

Implementation

Package Structure

  • Location: packages/docusaurus-plugin-altor-vec/
  • Language: TypeScript
  • Architecture: Modular with SOLID principles

Features Implemented

  • ✅ Content extraction from markdown files
  • ✅ Embedding generation (Transformers.js & OpenAI)
  • ✅ HNSW index building using altor-vec
  • ✅ React search UI component
  • ✅ Web Worker integration (off-main-thread)
  • ✅ Full configuration system with validation
  • ✅ Comprehensive error handling
  • ✅ Structured logging
  • ✅ i18n support
  • ✅ Security: Path traversal protection, file size limits, fetch validation
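
To make the search step concrete, here is a minimal sketch of ranking documents by embedding similarity. It deliberately uses a brute-force cosine scan rather than HNSW, and it does not use the altor-vec API. The `Doc` type and the `cosine` and `topK` helpers are hypothetical names introduced for illustration. An HNSW index returns approximately the same neighbours without scanning every vector.

```typescript
// Brute-force nearest-neighbour search over embeddings.
// This is a stand-in for illustration, not the plugin's HNSW implementation.
type Doc = { id: string; vector: number[] };

// Cosine similarity between two vectors of equal dimension.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Score every document against the query and keep the k best matches.
function topK(query: number[], docs: Doc[], k: number): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosine(query, d.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The scan is linear in the number of documents, which is the cost an HNSW index is meant to avoid on larger sites.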

Security Fixes

  • Path traversal validation in ContentExtractor
  • File size limits (10MB per file, 100MB index)
  • Fetch response validation in Web Worker
  • API key sanitization in logs
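
The checks above could be sketched roughly as follows. This is not the code in `ContentExtractor`. The helper names `isInsideRoot`, `isWithinSizeLimit`, and `redactSecrets` are hypothetical, and the redaction pattern assumes keys with an `sk-` prefix, as used by OpenAI.

```typescript
import * as path from "node:path";

// Mirrors the 10MB per-file limit listed above.
const MAX_FILE_BYTES = 10 * 1024 * 1024;

// Returns true when `candidate` resolves to a location inside `rootDir`.
// Rejects "../" escapes and absolute paths that point elsewhere.
function isInsideRoot(rootDir: string, candidate: string): boolean {
  const root = path.resolve(rootDir);
  const target = path.resolve(root, candidate);
  const rel = path.relative(root, target);
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

// Returns true when a file of `bytes` size is within the allowed limit.
function isWithinSizeLimit(bytes: number, limit: number = MAX_FILE_BYTES): boolean {
  return bytes >= 0 && bytes <= limit;
}

// Masks anything that looks like an "sk-" API key before it reaches the logs.
// The pattern is an assumption about key format, not a complete secret scanner.
function redactSecrets(message: string): string {
  return message.replace(/sk-[A-Za-z0-9_-]{8,}/g, "sk-***");
}
```

Comparing the relative path against `..` plus the separator, rather than a bare `startsWith("..")`, avoids rejecting legitimate files whose names merely begin with two dots.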

Documentation

  • Comprehensive README with quick start
  • Security best practices section
  • Browser compatibility notes
  • Troubleshooting guide
  • Updated main README with plugin reference
  • Updated CONTRIBUTING.md for monorepo

Testing

  • ✅ TypeScript compilation successful
  • ✅ All Rust tests passing (31 tests)
  • ✅ Cargo clippy with no warnings
  • ✅ Code properly formatted
  • 🔄 Integration testing to be done post-merge

Breaking Changes

None - this is a new package addition.

Checklist

  • Code follows project style guidelines
  • Tests pass (Rust: 31/31)
  • Documentation updated
  • Security vulnerabilities addressed
  • No hardcoded values
  • LICENSE files consistent
  • Integration testing (post-merge)

Related

  • Closes #[issue-number-will-be-here]

Labels: enhancement (New feature or request)
