SpeedyF

Documents are hard. Especially in serverless environments. SpeedyF ("sPDF") is a one-click deployable solution for file manipulation, with support for using object-storage URLs as input.

Contrary to the name, SpeedyF is not just for PDFs, but can be used to manipulate a variety of files. The details for that are below.

Supported Document Formats

Format	Extension	MIME Type
PDF	`.pdf`	`application/pdf`
Word	`.docx`	`application/vnd.openxmlformats-officedocument.wordprocessingml.document`
PowerPoint	`.pptx`	`application/vnd.openxmlformats-officedocument.presentationml.presentation`
EPUB	`.epub`	`application/epub+zip`
RTF	`.rtf`	`application/rtf`
OpenDocument Text	`.odt`	`application/vnd.oasis.opendocument.text`
XML DocBook / JATS XML / OPML	`.xml` or `.opml`	`application/xml`
FictionBook	`.fb2`	`application/x-fictionbook+xml`
Troff man pages	`.1`, `.man`	`application/x-troff-man`
Jupyter notebooks	`.ipynb`	`application/x-ipynb+json`

Supported Image Formats

Format	Extension	MIME Type
JPEG	`.jpg`, `.jpeg`	`image/jpeg`
PNG	`.png`	`image/png`
AVIF	`.avif`	`image/avif`
TIFF	`.tiff`	`image/tiff`
GIF	`.gif`	`image/gif`
HEIC / HEIF	`.heic`, `.heif`	`image/heic`, `image/heif`
BMP	`.bmp`	`image/bmp`
WebP	`.webp`	`image/webp`

Development and Deployment

To develop locally:

Fill the environment variables:

cp .env.example .env.local 
# or 
vc env pull

A MISTRAL_API_KEY is required for the markdown endpoint. You can make an API account and keys here.

bun install
vc dev

open http://localhost:3000

To build locally:

npm install
vc build

To deploy:

npm install
vc deploy

API Documentation

Authentication

All API endpoints require bearer token authentication. The token is configured via the INTERNAL_API_KEY environment variable.

Include the token in the Authorization header:

curl -X GET "http://localhost:3000/truncate?url=..." \
  -H "Authorization: Bearer YOUR_API_KEY"

Example (TypeScript):

const response = await fetch('http://localhost:3000/truncate', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.INTERNAL_API_KEY}`
  },
  body: JSON.stringify({ url: 'https://example.com/document.pdf' })
})

Requests without a valid bearer token will receive a 401 Unauthorized response.

File Truncation

POST `/truncate`

Truncate files to fit within specified page count and file size limits. For PDFs, uses binary search optimization. For DOCX files, truncates based on paragraph count with a configurable heuristic for pages-to-paragraphs conversion.

Accepts both JSON with URL and multipart file upload:

Option 1: JSON with URL

Request:

curl -X POST http://localhost:3000/truncate \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/document.pdf",
    "maxPages": 1000,
    "maxSizeBytes": 52428800
  }'

Request Body:

Field	Type	Required	Default	Description
`url`	string	Yes	-	URL of the file to truncate
`maxPages`	number	No	1000	Maximum number of pages
`maxSizeBytes`	number	No	52428800 (50MB)	Maximum file size in bytes
`paragraphsPerPage`	number	No	15	(DOCX only) Paragraphs per page heuristic for estimating page count

Option 2: Multipart File Upload

Request:

curl -X POST http://localhost:3000/truncate \
  -F "file=@/path/to/document.pdf" \
  -F "maxPages=1000" \
  -F "maxSizeBytes=52428800"

Form Fields:

Field	Type	Required	Default	Description
`file`	File	Yes	-	File to truncate
`maxPages`	string	No	"1000"	Maximum number of pages
`maxSizeBytes`	string	No	"52428800"	Maximum file size in bytes
`paragraphsPerPage`	string	No	"15"	(DOCX only) Paragraphs per page heuristic for estimating page count

Response:

Content-Type: Matches input file MIME type
Body: Binary file data (truncated)
Headers:
- X-Was-Truncated: "true" or "false"
- X-Original-Size: Original file size in bytes
- X-Final-Size: Final file size in bytes

Example (TypeScript):

// With URL
const response = await fetch('http://localhost:3000/truncate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    url: 'https://example.com/document.pdf',
    maxPages: 500
  })
})
const fileBlob = await response.blob()

// With File Upload
const formData = new FormData()
formData.append('file', Bun.file('/path/to/document.docx'))
formData.append('maxPages', '10')
formData.append('paragraphsPerPage', '20')  // Adjust if document has dense formatting

const response = await fetch('http://localhost:3000/truncate', {
  method: 'POST',
  body: formData
})
const docxBlob = await response.blob()

GET `/truncate`

Truncate a file from a URL using query parameters.

Request:

curl -X GET "http://localhost:3000/truncate?url=https://example.com/document.pdf&maxPages=1000&maxSizeBytes=52428800"

Query Parameters:

Field	Type	Required	Default	Description
`url`	string	Yes	-	URL of the file to truncate
`maxPages`	number	No	1000	Maximum number of pages
`maxSizeBytes`	number	No	52428800 (50MB)	Maximum file size in bytes
`paragraphsPerPage`	number	No	15	(DOCX only) Paragraphs per page heuristic for estimating page count

Response: Same as POST /truncate above.

Example (TypeScript):

const url = 'https://example.com/document.pdf'
const response = await fetch(`http://localhost:3000/truncate?url=${encodeURIComponent(url)}&maxPages=500`)
const fileBlob = await response.blob()

// For DOCX with custom paragraphs per page
const docxUrl = 'https://example.com/document.docx'
const response2 = await fetch(`http://localhost:3000/truncate?url=${encodeURIComponent(docxUrl)}&maxPages=10&paragraphsPerPage=20`)
const docxBlob = await response2.blob()

PDF to Markdown (OCR)

Convert PDFs to markdown using Mistral's OCR API. PDFs are automatically truncated to 999 pages or 49MB before processing.

POST `/markdown`

Convert a PDF to markdown using binary search optimization.

Accepts both JSON with URL and multipart file upload:

Option 1: JSON with URL

Request:

curl -X POST http://localhost:3000/markdown \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/document.pdf",
    "maxPages": 999,
    "maxSizeBytes": 51380224
  }'

Request Body:

Field	Type	Required	Default	Description
`url`	string	Yes	-	URL of the PDF to convert
`maxPages`	number	No	999	Maximum pages before OCR
`maxSizeBytes`	number	No	51380224 (49MB)	Maximum size before OCR

Option 2: Multipart File Upload

Request:

curl -X POST http://localhost:3000/markdown \
  -F "file=@/path/to/document.pdf" \
  -F "maxPages=999" \
  -F "maxSizeBytes=51380224"

Form Fields:

Field	Type	Required	Default	Description
`file`	File	Yes	-	PDF file to convert
`maxPages`	string	No	"999"	Maximum pages before OCR
`maxSizeBytes`	string	No	"51380224"	Maximum size before OCR

Response Fields:

Field	Type	Description
`markdown`	string	Extracted markdown content with pages separated by `---`
`remainingFields`	object	The remaining fields from the Mistral OCR Response, which are detailed here.

Example (TypeScript):

// With URL
const response = await fetch('http://localhost:3000/markdown', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    url: 'https://example.com/document.pdf',
    maxPages: 500
  })
})
const { markdown, pages, usage } = await response.json()

// With File Upload
const formData = new FormData()
formData.append('file', Bun.file('/path/to/document.pdf'))
formData.append('maxPages', '500')

const response = await fetch('http://localhost:3000/markdown', {
  method: 'POST',
  body: formData
})
const { markdown, ...remainingFields } = await response.json()

GET `/markdown`

Convert a PDF from a URL to markdown using query parameters.

Request:

curl -X GET "http://localhost:3000/markdown?url=https://example.com/document.pdf&maxPages=999&maxSizeBytes=51380224"

Query Parameters:

Field	Type	Required	Default	Description
`url`	string	Yes	-	URL of the PDF to convert
`maxPages`	number	No	999	Maximum pages before OCR
`maxSizeBytes`	number	No	51380224 (49MB)	Maximum size before OCR

Response: Same as POST /markdown above.

Example (TypeScript):

const url = 'https://example.com/document.pdf'
const response = await fetch(`http://localhost:3000/markdown?url=${encodeURIComponent(url)}&maxPages=500`)
const { markdown, ...remainingFields } = await response.json()
console.log(markdown)

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
src		src
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
bun.lock		bun.lock
package.json		package.json
tsconfig.json		tsconfig.json
vercel.json		vercel.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SpeedyF

Supported Document Formats

Supported Image Formats

Development and Deployment

API Documentation

Authentication

File Truncation

POST `/truncate`

Option 1: JSON with URL

Option 2: Multipart File Upload

GET `/truncate`

PDF to Markdown (OCR)

POST `/markdown`

Option 1: JSON with URL

Option 2: Multipart File Upload

GET `/markdown`

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

SpeedyF

Supported Document Formats

Supported Image Formats

Development and Deployment

API Documentation

Authentication

File Truncation

POST /truncate

Option 1: JSON with URL

Option 2: Multipart File Upload

GET /truncate

PDF to Markdown (OCR)

POST /markdown

Option 1: JSON with URL

Option 2: Multipart File Upload

GET /markdown

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

POST `/truncate`

GET `/truncate`

POST `/markdown`

GET `/markdown`

Packages