
MalachiGreen/Base81


Base81/62

Multi-radix binary-to-text codec. Zero dependencies. Production-hardened.


Quick Start

from base81 import encode, decode

# Standard radix-81 (98.1% efficiency)
data = b"Hello, World!"
encoded = encode(data)
decoded = decode(encoded)
assert decoded == data

# URL-safe radix-62 (94.6% efficiency)
url_safe = encode(data, alphabet_type="url")
decoded = decode(url_safe, alphabet_type="url")
assert decoded == data

CLI

# Encode with self-describing header
$ echo -n "Hello" | base81 encode --header
^b81:7:standard^8pJTDW^^

# Decode with whitespace tolerance
$ echo "^b81:7:standard^8pJTDW^^" | base81 decode --header --ignore-ws
Hello
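The framed output above can be taken apart with a few lines of Python. This is a minimal sketch, not the library's parser: the layout `^b81:<N>:<alphabet>^<payload>^^` is inferred only from the single sample shown, and `split_frame` and `Frame` are hypothetical names. The README does not say what `N` denotes (it is 7 in the sample, which matches the default `block_size`), so the sketch only extracts it.

```python
import re
from typing import NamedTuple

# Assumed frame layout, inferred from the one sample in the CLI section:
#   ^b81:<N>:<alphabet>^<payload>^^
FRAME_RE = re.compile(r"\^b81:(\d+):([a-z]+)\^(.*)\^\^", re.DOTALL)


class Frame(NamedTuple):
    n: int          # numeric header field; its meaning is not documented here
    alphabet: str   # e.g. "standard" or "url"
    payload: str    # encoded text between the header and the closing ^^


def split_frame(text: str) -> Frame:
    """Split a framed string into its header fields and payload."""
    match = FRAME_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError("input does not look like a ^b81:N:alphabet^ frame")
    n, alphabet, payload = match.groups()
    return Frame(int(n), alphabet, payload)


frame = split_frame("^b81:7:standard^8pJTDW^^")
print(frame.n, frame.alphabet, frame.payload)
```

The payload would still need to go through `decode` with the matching `alphabet_type`; the sketch stops at splitting the frame.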

Features

  • Two alphabets: Standard (81 chars, 98.1% efficient) and URL-safe (62 chars, 94.6% efficient)
  • No padding: Variable-length tail blocks remove the need for padding characters such as = or ^
  • Streaming API: Process multi-gigabyte data with bounded memory
  • DoS hardened: Configurable max_input_length and max_buffer guards
  • Canonical encoding: Every byte sequence maps to exactly one valid string
  • Self-describing headers: Optional ^b81:N:alphabet^ framing for protocol use
  • Zero dependencies: Pure Python 3.8+, standard library only
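To illustrate the idea behind fixed-size radix blocks, here is a minimal sketch of converting one block of bytes into base-N digits and back. It is not the library's implementation: the function names are hypothetical, big-endian digit order is an assumption, and the sketch works with digit values (0..radix-1) rather than characters because the actual alphabets are not listed here.

```python
from typing import List, Sequence


def block_to_digits(block: bytes, radix: int, width: int) -> List[int]:
    """Convert one block of bytes into `width` digits in the given radix."""
    if radix ** width < 256 ** len(block):
        raise ValueError("width is too small to hold this many bytes")
    value = int.from_bytes(block, "big")
    digits = []
    for _ in range(width):
        value, digit = divmod(value, radix)
        digits.append(digit)
    return digits[::-1]  # most significant digit first (assumed order)


def digits_to_block(digits: Sequence[int], radix: int, length: int) -> bytes:
    """Inverse of block_to_digits; rejects digit strings that overflow."""
    value = 0
    for digit in digits:
        if not 0 <= digit < radix:
            raise ValueError("digit out of range for this radix")
        value = value * radix + digit
    if value >= 256 ** length:
        # Such a digit string cannot come from any real block, so a
        # decoder has to reject it rather than silently truncate.
        raise ValueError("digits encode a value too large for the block")
    return value.to_bytes(length, "big")


digits = block_to_digits(b"Hello, ", radix=81, width=9)
assert digits_to_block(digits, radix=81, length=7) == b"Hello, "
```

Nine radix-81 digits suffice for seven bytes because 81^9 exceeds 2^56, and seven radix-62 digits suffice for five bytes because 62^7 exceeds 2^40.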

Supported Codecs

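| Alphabet | Radix | Block (bytes→chars) | Efficiency | Use Case |
|----------|-------|---------------------|------------|----------|
| standard | 81 | 7→9 | 98.1% | Maximum density |
| url | 62 | 5→7 | 94.6% | URLs, filenames, shells |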
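The efficiency column can be explored with a short calculation. This sketch assumes that "efficiency" means payload bits divided by the information capacity of the output characters (chars × log2(radix)); the README does not define the term, and `block_efficiency` and `expansion` are hypothetical helper names.

```python
import math


def block_efficiency(radix: int, bytes_in: int, chars_out: int) -> float:
    """Payload bits divided by the capacity of the output characters."""
    return (8 * bytes_in) / (chars_out * math.log2(radix))


def expansion(bytes_in: int, chars_out: int) -> float:
    """Output characters produced per input byte for a full block."""
    return chars_out / bytes_in


for name, radix, b, c in (("standard", 81, 7, 9), ("url", 62, 5, 7)):
    print(f"{name}: efficiency={block_efficiency(radix, b, c):.3f}, "
          f"expansion={expansion(b, c):.3f}")
```

Under this assumed definition the 7→9 block reproduces the 98.1% figure. The 5→7 block evaluates to roughly 96%, so the 94.6% figure presumably rests on a different definition, for example one that accounts for tail blocks.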

API

One-Shot

encode(data, *, line_width=None, block_size=7, alphabet_type="standard") -> str
decode(s, *, ignore_whitespace=False, validate_canonical=True, block_size=7, max_input_length=None, alphabet_type="standard") -> bytes

Streaming

enc = Encoder(block_size=7, alphabet_type="standard", max_input_length=None, max_buffer=1048576)
enc.update(data) -> str
enc.finalize() -> str
enc.diagnostics() -> dict

dec = Decoder(block_size=7, alphabet_type="standard", ignore_whitespace=False, validate_canonical=True, max_input_length=None, max_buffer=1048576)
dec.update(s) -> bytes
dec.finalize() -> bytes
dec.diagnostics() -> dict
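The `update`/`finalize` pair suggests the usual incremental pattern: feed chunks as they arrive, then flush whatever is left. Below is a minimal sketch of that loop. `encode_chunks` is a hypothetical helper, and `HexEncoder` is a stand-in with the same two methods so the example runs without the library; it is not part of base81.

```python
from typing import Iterable, Iterator


def encode_chunks(encoder, chunks: Iterable[bytes]) -> Iterator[str]:
    """Feed chunks to an encoder-like object and yield text as it appears."""
    for chunk in chunks:
        text = encoder.update(chunk)
        if text:
            yield text
    tail = encoder.finalize()  # flush any buffered partial block
    if tail:
        yield tail


class HexEncoder:
    """Stand-in exposing update()/finalize(); NOT the base81 Encoder."""

    def update(self, data: bytes) -> str:
        return data.hex()

    def finalize(self) -> str:
        return ""


print("".join(encode_chunks(HexEncoder(), [b"He", b"llo"])))  # 48656c6c6f
```

With the library installed, passing `Encoder(alphabet_type="url")` in place of the stand-in should work the same way, assuming `update` and `finalize` behave as their signatures above suggest. Reading the chunks from a file in fixed-size pieces keeps memory use independent of the total input size.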

Exceptions

| Exception | When |
|-----------|------|
| ValidationError | Invalid parameters, non-canonical input |
| CorruptStreamError | Malformed blocks, structural defects |
| BoundaryError | Input/buffer limits exceeded |

Performance

| Metric | Value |
|--------|-------|
| Encode throughput | 200 MB/s (single thread, 1 MB input) |
| Decode throughput | 200 MB/s |
| Streaming overhead | <5% vs one-shot |
| Memory (streaming) | <8 KB typical, configurable cap |

Installation

⚠️ Not yet on PyPI. Install from GitHub:

pip install git+https://github.com/MalachiGreen/Base81.git

License

MIT

About

Base81/62: A binary-to-text encoding library with 81-character (dense) and 62-character (URL-safe) alphabets, supporting streaming, memory limits, and canonical encoding.
