ccascio/ConvertPrivately

Privacy-first conversion and utility tools for documents, images, data, media, and measurement workflows.

This repository powers the ConvertPrivately product: a React frontend with a Flask API for OCR and file conversion, plus supporting Cloudflare workers for payments and YouTube-related flows.

What This Project Includes

  • PDF and image to DOCX, TXT, XLSX, and Markdown
  • Client-side file, image, data, text, and unit-conversion tools
  • OCR-based extraction using Tesseract
  • Website-to-document utilities
  • Stripe-backed credit flows
  • YouTube AI tooling backed by Gemini
  • Programmatic SEO pages and prerendered static frontend output

Product Principles

  • Privacy-first by default
  • Client-side processing whenever possible
  • Minimal upload surface for tasks that require backend processing
  • Fast, utility-oriented UX
  • SEO-friendly static delivery for the frontend

Architecture

Frontend

  • Location: api/frontend
  • Stack: React 19, TypeScript, Vite, React Router, Tailwind
  • Deployment target: Cloudflare Pages
  • Output: prerendered static pages plus lazy-loaded tool routes

API

  • Location: api
  • Stack: Flask, Gunicorn, Tesseract OCR, Poppler, Playwright
  • Deployment target: containerized service such as Cloud Run
  • Purpose: OCR, document conversion, webpage fetch/render fallback, payments, contact handling, AI proxying

Workers

  • Location: worker
  • Purpose: supporting flows for payments and YouTube transcripts

Key Features

Conversion Backend

  • Convert PDFs and supported images through convert.py
  • Supported inputs: .pdf, .jpg, .jpeg, .png, .webp, .tiff, .tif, .bmp
  • Supported outputs: docx, txt, xlsx, md
  • OCR language support includes English, Italian, German, French, Spanish, Portuguese, Dutch, and Polish
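
The input and output lists above can be expressed as a small pre-upload check. This is a minimal sketch, not the logic in convert.py: the function name `validate_request` is hypothetical, and only the extension and format values come from this README.

```python
from pathlib import Path

# Values copied from the supported input/output lists in this README.
SUPPORTED_INPUTS = {".pdf", ".jpg", ".jpeg", ".png", ".webp", ".tiff", ".tif", ".bmp"}
SUPPORTED_OUTPUTS = {"docx", "txt", "xlsx", "md"}


def validate_request(filename: str, output_format: str) -> None:
    """Raise ValueError if the input extension or output format is unsupported.

    Hypothetical helper for illustration; the real backend may validate differently.
    """
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_INPUTS:
        raise ValueError(f"unsupported input type: {ext or '(none)'}")
    if output_format.lower() not in SUPPORTED_OUTPUTS:
        raise ValueError(f"unsupported output format: {output_format}")
```

Checking the extension case-insensitively means `scan.PDF` and `scan.pdf` are treated the same. A production check would likely also inspect the file content rather than trust the name.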

Frontend Tool Surface

  • Document tools
  • Image tools
  • Audio and media tools
  • Validators and diff tools
  • Text and developer utilities
  • Large unit-conversion suite with shared conversion logic

Webpage Extraction

  • Public URL fetch endpoint in webpage.py
  • Playwright fallback for JS-heavy pages when enabled
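
The fallback behavior described above might look roughly like the following. This is a sketch under stated assumptions: `fetch_page`, the injected `fetch_static` and `render_js` callables, and the `min_chars` heuristic are all hypothetical, and webpage.py may decide when to render quite differently. The `js_enabled` flag stands in for ENABLE_WEBPAGE_JS_RENDER.

```python
from typing import Callable


def fetch_page(
    url: str,
    fetch_static: Callable[[str], str],
    render_js: Callable[[str], str],
    js_enabled: bool,
    min_chars: int = 200,  # assumed threshold, not taken from the project
) -> str:
    """Return the static HTML unless it looks empty and JS rendering is enabled."""
    html = fetch_static(url)
    if js_enabled and len(html.strip()) < min_chars:
        # Very thin markup often means the page builds its content in the browser,
        # so hand the URL to the (slower) headless renderer instead.
        return render_js(url)
    return html
```

Passing the two fetchers in as callables keeps the decision logic testable without a network connection or a browser.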

Repository Layout

.
├── api/
│   ├── app/                  # Flask app, routes, services, utilities
│   ├── config/               # Runtime settings
│   ├── frontend/             # React/Vite frontend
│   ├── tests/                # API tests
│   └── Dockerfile            # API container image
├── worker/                   # Cloudflare Workers
├── UNIT_CONVERSION_ROADMAP.md
├── LEGAL_CHANGELOG.md
└── README.md

Local Development

1. Frontend

Requirements:

  • Node.js 20+
  • npm

Install and run:

cd api/frontend
npm install
npm run dev

Useful commands:

npm test
npm run build

The Vite dev server proxies /api requests to http://localhost:8080.

2. API

Requirements:

  • Python 3.11+
  • Tesseract OCR
  • Poppler
  • Chromium dependencies if using Playwright locally

Create an environment and install dependencies:

cd api
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run locally:

python app/main.py

Default local API address:

http://localhost:8080

Health check:

GET /api/health
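
A quick way to confirm the API is up from a script, using only the standard library. This sketch assumes the endpoint answers with HTTP 200 when healthy; the response body format is not documented here, so it is not inspected. The helper names are hypothetical.

```python
import urllib.request
from urllib.parse import urljoin


def health_url(base_url: str = "http://localhost:8080") -> str:
    """Build the health-check URL from the API base address."""
    return urljoin(base_url.rstrip("/") + "/", "api/health")


def is_healthy(base_url: str = "http://localhost:8080", timeout: float = 3.0) -> bool:
    """Return True if GET /api/health answers with status 200 (assumed convention)."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses.
        return False
```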

Docker

The API image is defined in api/Dockerfile.

Build:

cd api
docker build -t convertprivately-api .

Run:

docker run --rm -p 8080:8080 convertprivately-api

The image installs:

  • Tesseract OCR with multiple language packs
  • Poppler utilities
  • Playwright Chromium

Configuration

Runtime configuration is defined in settings.py.

Important environment variables:

  • PORT
  • DEBUG
  • MAX_FILE_SIZE_MB
  • DEFAULT_DPI
  • DEFAULT_LANGUAGE
  • RATE_LIMIT_PER_MINUTE
  • ALLOWED_ORIGIN
  • ENABLE_WEBPAGE_JS_RENDER
  • WEBPAGE_JS_TIMEOUT_SECONDS
  • WEBPAGE_JS_SETTLE_MS
  • JWT_SECRET
  • PAYMENT_WORKER_URL
  • GOOGLE_CLOUD_API_KEY
  • GEMINI_MODEL
  • CONTACT_RECIPIENT_EMAIL
  • SMTP_HOST
  • SMTP_PORT
  • SMTP_USERNAME
  • SMTP_PASSWORD
  • SMTP_FROM_EMAIL
  • SMTP_FROM_NAME
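
For orientation, here is one way a few of these variables could be read into a typed object. It is an illustrative sketch, not the contents of settings.py: the `Settings` class, `load_settings`, and every fallback value except port 8080 (the default local address given in this README) are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    port: int
    debug: bool
    max_file_size_mb: int
    rate_limit_per_minute: int
    enable_webpage_js_render: bool


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Fallbacks other than PORT are placeholders for this sketch,
    # not the project's actual defaults.
    return Settings(
        port=int(env.get("PORT", "8080")),
        debug=_as_bool(env.get("DEBUG", "false")),
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "20")),
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "30")),
        enable_webpage_js_render=_as_bool(env.get("ENABLE_WEBPAGE_JS_RENDER", "false")),
    )
```

Accepting the environment as a mapping makes it easy to test with a plain dictionary, for example `load_settings({"PORT": "9000", "DEBUG": "true"})`.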

Batch Entitlements

Batch Pro now uses the Cloudflare payment worker plus D1 as the source of truth for remaining batch balances.

Required worker-side setup:

Required API-side setup:

  • set PAYMENT_WORKER_URL so the Flask backend can consume and refund backend batch usage against the worker-managed D1 ledger
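
The consume-and-refund pattern mentioned above can be sketched as a wrapper that debits before the work starts and credits back on failure. Everything here is hypothetical: `run_with_entitlement` and `InMemoryLedger` are illustration-only names, and the in-memory ledger merely stands in for the worker-managed D1 ledger. The worker's real endpoints are not documented in this README, so no HTTP calls are shown.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class InMemoryLedger:
    """Stand-in for the worker-managed ledger (assumption for this sketch)."""

    def __init__(self, balance: int) -> None:
        self.balance = balance

    def consume(self, units: int) -> None:
        if units > self.balance:
            raise ValueError("insufficient batch balance")
        self.balance -= units

    def refund(self, units: int) -> None:
        self.balance += units


def run_with_entitlement(
    consume: Callable[[int], None],
    refund: Callable[[int], None],
    job: Callable[[], T],
    units: int,
) -> T:
    """Debit `units` before running `job`; credit them back if the job fails."""
    consume(units)  # expected to raise when the balance is too low
    try:
        return job()
    except Exception:
        refund(units)
        raise
```

Debiting first avoids running expensive conversions for users without balance, and the refund in the error path keeps failed jobs from costing credits.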

Main API Endpoints

Conversion

  • POST /api/convert
  • POST /api/batch-convert
  • POST /api/convert/info

Webpage

  • POST /api/webpage/fetch

Other Services

  • health
  • payment
  • contact
  • AI proxy

See api/app/routes for the full route set.

Testing

Frontend

cd api/frontend
npm test

API

There are API tests under api/tests. Run them with your preferred Python test runner after installing backend dependencies.

Deployment Model

  • Frontend: Cloudflare Pages
  • API: separate containerized deployment
  • Workers: Cloudflare Workers

This split keeps the static frontend fast and cacheable while reserving backend compute for OCR, document conversion, payments, and AI-assisted flows.

Notes

  • The frontend contains a large number of utility pages and lazy-loaded routes.
  • Many tools are fully client-side and do not send user files to the backend.
  • Backend-heavy flows are isolated to the API and protected with rate limiting or credit gating where needed.
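
To illustrate what per-minute rate limiting involves, here is a minimal sliding-window limiter. It is a generic sketch, not the project's implementation: the class name, the in-memory storage, and keying by a client identifier are assumptions. The `limit` argument plays the role of RATE_LIMIT_PER_MINUTE.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class PerMinuteLimiter:
    """Allow at most `limit` requests per client in any 60-second window."""

    def __init__(self, limit: int, clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the 60-second window.
        while hits and now - hits[0] >= 60.0:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

An in-memory limiter like this only works within a single process. A deployment with several container instances would need shared storage for the counters.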

Roadmaps and Internal Docs

License

No license file is currently present in this repository. Add one before publishing or accepting external contributions.

About

ConvertPrivately is a free collection of 190+ file conversion, formatting, and validation tools. The difference: most tools run entirely in your browser. For those tools, your files are never uploaded to a server; they're read into memory, converted using WebAssembly and JavaScript libraries, and downloaded locally.
