Yhash: Complete Technical Documentation

Version: 1.0.0
Developer: Adekunle Abdulmujeeb
License: MIT
Language: Python 3.8+

Table of Contents

1. Overview
2. Installation
3. Project Structure
4. Technology Stack
5. Architecture & Internal Workings
- 5.1 Entry Point & argv Normalisation
- 5.2 CLI Dispatch Logic
- 5.3 Hashing Engine
- 5.4 Output Layer
- 5.5 Utilities Layer
6. Algorithms
7. Feature Reference
- 7.1 Default File Hashing
- 7.2 Algorithm Selection Flags
- 7.3 Multiple Algorithms in One Pass
- 7.4 Chained Hashing
- 7.5 Text / String Hashing
- 7.6 Multiple Files
- 7.7 Recursive Directory Hashing
- 7.8 Hash Verification
- 7.9 Manifest Creation
- 7.10 Clipboard Copy
- 7.11 JSON Output
- 7.12 Algorithm Reference Table
- 7.13 About Panel
- 7.14 Help
- 7.15 Version
8. Flag Compatibility Matrix
9. Progress Bars
10. Hash Display Design
11. Security Model
12. Memory Safety
13. Error Handling
14. Test Suite
15. Module Reference

1. Overview

Yhash (Yung Hash) is a command-line tool for computing cryptographic hashes of files, directories, and text strings. It is written entirely in Python and exposes a single yhash command with composable flags.

Design goals:

Memory safety — files are never fully loaded into memory; they stream through 8 KB chunks.
Speed — multiple algorithms applied in a single file-read pass; directories hashed in parallel.
Clarity — every hash value always prints in full, regardless of terminal width.
Simplicity — single-dash flags, sensible defaults, no subcommands.
Security — all algorithm names are validated against a strict whitelist before any I/O begins; no eval, exec, or shell calls are made anywhere.

2. Installation

From PyPI

pip install yhash

Using pipx (recommended for CLI tools)

pipx installs the tool in an isolated virtual environment and adds yhash to your PATH without affecting your system Python packages.

pipx install yhash

To install from a local directory (from source):

pipx install .

From source with pip

git clone https://github.com/yhash/yhash.git
cd yhash
pip install .

Editable install (for development)

pip install -e .

Verify the installation:

yhash --version
# Yhash  v1.0.0

3. Project Structure

yhash/
├── pyproject.toml          # PEP 621 packaging metadata and entry point
├── README.md               # Quick-start reference
├── LICENSE                 # MIT licence
├── src/
│   └── yhash/
│       ├── __init__.py     # Package version metadata
│       ├── constants.py    # All shared constants — algorithms, thresholds, flags
│       ├── hasher.py       # Core hashing engine (the only file that calls hashlib)
│       ├── utils.py        # File collection, manifest I/O, clipboard, validation
│       ├── formatter.py    # All Rich terminal output
│       ├── cli.py          # Click CLI definition and dispatch logic
│       └── py.typed        # PEP 561 type-checking marker
└── tests/
    ├── test_hasher.py      # Unit tests — hashing engine
    ├── test_utils.py       # Unit tests — utilities
    ├── test_cli.py         # Integration tests — CLI flags and flows
    ├── test_security.py    # Security and vulnerability tests
    └── test_system.py      # System tests — end-to-end workflows

The src/ layout keeps the package out of the project root, so it cannot be imported by accident from a checkout during development; it has to be installed (normally or in editable mode) before it is importable.

4. Technology Stack

Click ≥ 8.1

What it is: A Python package for building command-line interfaces via decorators.

How it is used: Every flag (-sha256, -chain, -text, etc.) is declared as a @click.option decorator on the main() function. Click handles all argument parsing, type conversion, help generation, and version output. The CliRunner from click.testing is also used in the integration test suite to invoke the CLI in-process without spawning a subprocess.

Key settings used:

context_settings = {"help_option_names": ["-help", "--help"]} — enables -help as a valid help flag alongside the standard --help.
no_args_is_help=False — prevents Click from printing help when no arguments are given; Yhash handles that case itself (showing the banner and usage).
@click.argument("files", nargs=-1) — captures zero or more positional file paths into a tuple.
@click.option("-algo", "--algorithms", "show_algo", is_flag=True) — one option declared with two names, so both -algo and --algorithms work.

Rich ≥ 13.7

What it is: A Python library for rich text and beautiful terminal formatting.

How it is used:

Rich component	Where used	Purpose
`Console(soft_wrap=True)`	`formatter.py`	Shared output sink; `soft_wrap=True` prevents Rich from truncating long hash strings
`Rule`	`formatter.py`	Horizontal separator between output sections
`Text`	`formatter.py`	Styled inline text with per-character style control
`Panel`	`formatter.py`	Bordered panels for MATCH/MISMATCH verdict, About, Manifest Created
`Table`	`formatter.py`	Multi-file hash results table, algorithm reference table, verification meta info
`Progress`	`cli.py`	Progress bars for large files (> 5 MB) and recursive directory batches
`SpinnerColumn`	`cli.py`	Animated spinner on the left of the progress bar
`BarColumn`	`cli.py`	The actual filled progress bar
`TaskProgressColumn`	`cli.py`	Percentage complete
`TransferSpeedColumn`	`cli.py`	MB/s read speed
`TimeRemainingColumn`	`cli.py`	Estimated time to completion

Why soft_wrap=True: By default, Rich measures the width of each print() call and inserts a newline at the terminal boundary. For a 128-character SHA-512 or BLAKE2b digest in a narrow terminal, this would produce a truncated line ending in …. With soft_wrap=True, Rich emits the full string and leaves line-wrapping to the terminal, which visually wraps the characters but never drops any of them.

overflow="fold" on table columns: Long digests in multi-file tables are folded (wrapped) within the cell rather than being clipped. This is set on every hash-value column of the results table.

hashlib (Python standard library)

What it is: Python's built-in interface to cryptographic hash functions, backed by OpenSSL.

How it is used: All cryptographic operations live exclusively in hasher.py. No other module calls hashlib directly.

# For all algorithms except BLAKE2b:
hasher = hashlib.new("sha256")
hasher.update(chunk)
digest = hasher.hexdigest()

# For BLAKE2b (dedicated constructor):
hasher = hashlib.blake2b()

The hashlib.new(name) factory is used for all algorithms except blake2b, which is created through its dedicated hashlib.blake2b() constructor (hashlib.new("blake2b") would also work). This choice is abstracted behind the internal _make_hasher(algorithm) function.

pyperclip ≥ 1.9

What it is: A cross-platform Python clipboard module.

How it is used: Called only from utils.copy_to_clipboard(). The function wraps the call in a broad except Exception so that a clipboard failure (common on headless servers, CI environments, or Linux systems without xclip/xsel) is never a crash — it returns False and cli.py shows a warning instead.

concurrent.futures (Python standard library)

What it is: Python's high-level threading and multiprocessing interface.

How it is used: ThreadPoolExecutor is used in hasher.hash_files_parallel() and in the recursive directory flow inside cli.py. File hashing is I/O-bound, making threads (not processes) the correct tool — threads allow concurrent disk reads without the serialisation overhead of multiprocessing. Worker count is capped at min(8, number_of_files) to avoid thread thrashing on large directories.

pyproject.toml / setuptools ≥ 68

What it is: The modern Python packaging standard (PEP 517/621).

How it is used: All project metadata (name, version, dependencies, Python requirement, entry point) lives in pyproject.toml. The entry point yhash = "yhash.cli:cli_entry" wires the yhash shell command to the cli_entry() function. The [tool.setuptools] section uses an explicit package-dir = {"": "src"} declaration so that both pip install and pipx install resolve the package correctly without relying on auto-discovery.

5. Architecture & Internal Workings

Yhash is divided into five layers. Data flows strictly from top to bottom in this diagram; only cli.py imports from all other layers.

 Shell
  │
  ▼
cli_entry()           ← cli.py     (argv normalisation + Click dispatch)
  │
  ├──► hasher.py      (cryptographic operations — hashlib only)
  ├──► utils.py       (file I/O, manifest, clipboard, validation)
  └──► formatter.py   (all terminal output — Rich only)
            │
            └──► constants.py  (shared constants, no imports from other modules)

5.1 Entry Point & argv Normalisation

The shell command yhash calls cli_entry(), not main() directly.

def cli_entry() -> None:
    sys.argv = [sys.argv[0]] + _normalise_argv(sys.argv[1:])
    main()

Before Click parses anything, _normalise_argv() iterates over every argument and lower-cases any token whose lowercase form appears in NORMALISABLE_FLAGS:

NORMALISABLE_FLAGS = {"-sha256", "-sha512", "-sha384", "-sha1", "-md5", "-blake2b",
                      "-chain", "-copy", "-create", "-json", "-about", "-algo", "-text"}

This means -SHA256, -Sha256, and -sha256 are all normalised to -sha256 before Click sees them. File paths, hash values, and chain strings are left untouched because their lowercase forms do not match any flag in the set. This is a pure string-set lookup — O(1) per token.
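A minimal sketch of this normalisation step, assuming the flag set shown above. The function name normalise_argv and its body are illustrative and are not taken from the Yhash source:

```python
from typing import List

# Copied from the flag set shown above.
NORMALISABLE_FLAGS = {"-sha256", "-sha512", "-sha384", "-sha1", "-md5", "-blake2b",
                      "-chain", "-copy", "-create", "-json", "-about", "-algo", "-text"}


def normalise_argv(args: List[str]) -> List[str]:
    """Lower-case tokens that are known flags; leave every other token untouched."""
    return [tok.lower() if tok.lower() in NORMALISABLE_FLAGS else tok for tok in args]


print(normalise_argv(["-SHA256", "-Text", "Hello World", "Report.PDF"]))
# ['-sha256', '-text', 'Hello World', 'Report.PDF']
```

The text payload and the file name keep their original case because their lowercase forms are not members of the set.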

5.2 CLI Dispatch Logic

main() is the single Click command. After argument parsing, it dispatches to one of nine flows in strict priority order:

1. show_algo  → display algorithm table, return
2. show_about → display about panel, return
3. No input   → display banner + help, return
4. chain      → chain hashing (file or text)
5. text_input → independent text hashing
6. check_hash → hash verification
7. recursive  → recursive directory hashing
8. files > 1  → multi-file hashing (sequential; see Section 7.6)
9. files == 1 → single file hashing

Mutual-exclusion checks happen before dispatch:

-chain + any algorithm flag → error (the chain string already specifies all algorithms)
-text + file arguments → error (text and file inputs cannot be mixed)
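The two checks above can be sketched as a small guard function. The name check_flag_conflicts, its parameters, and the use of ValueError are assumptions made for illustration; the real main() may report these errors differently:

```python
from typing import Optional, Sequence


def check_flag_conflicts(
    chain: Optional[str],
    algorithm_flags: Sequence[bool],
    text_input: Optional[str],
    files: Sequence[str],
) -> None:
    """Raise ValueError if mutually exclusive inputs are combined."""
    if chain is not None and any(algorithm_flags):
        raise ValueError("-chain cannot be combined with algorithm flags")
    # `is not None` matters here: -text "" is a valid (empty) input, not "no text".
    if text_input is not None and files:
        raise ValueError("-text cannot be combined with file arguments")


check_flag_conflicts(chain="md5,sha256", algorithm_flags=[False] * 6,
                     text_input=None, files=["video.mp4"])  # passes silently
```

Running the checks before any hashing means an invalid combination fails fast, without touching the file system.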

5.3 Hashing Engine

hasher.py is the only module that imports hashlib. All other modules call functions from hasher.py.

`_make_hasher(algorithm)`

Internal factory. Validates the algorithm against SUPPORTED_ALGORITHMS and returns the appropriate hashlib object. Raises ValueError for any unsupported name. This check is the security boundary: a name that is not on the whitelist never reaches hashlib.
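A sketch of what such a factory could look like. The contents of SUPPORTED_ALGORITHMS are assumed from the algorithm table in Section 6, and the function body is illustrative and not the actual Yhash implementation:

```python
import hashlib

# Assumed whitelist, based on the six algorithms listed in Section 6.
SUPPORTED_ALGORITHMS = ("sha256", "sha512", "sha384", "sha1", "md5", "blake2b")


def make_hasher(algorithm: str):
    """Return a fresh hashlib object, rejecting any name not on the whitelist."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"Unsupported algorithm: {algorithm!r}")
    if algorithm == "blake2b":
        return hashlib.blake2b()  # dedicated constructor
    return hashlib.new(algorithm)


hasher = make_hasher("sha256")
hasher.update(b"hello world")
print(hasher.hexdigest() == hashlib.sha256(b"hello world").hexdigest())  # True
```

Because the whitelist check comes first, arbitrary user input is never passed to hashlib.new().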

`compute_hashes(file_path, algorithms, *, progress, task)`

The core file-hashing function. Opens the file in binary mode and reads it in chunks of CHUNK_SIZE = 8192 bytes (8 KB). Every hasher is updated on every chunk, so N algorithms cost N times the CPU work per chunk, but the file is read only once regardless of N.

File on disk:
 ┌──────────┬──────────┬──────────┬──────────┐
 │ chunk 0  │ chunk 1  │ chunk 2  │ chunk 3  │  ...
 └──────────┴──────────┴──────────┴──────────┘
      │            │
      ▼            ▼
  sha256.update  sha256.update  ...
  sha512.update  sha512.update  ...
  blake2b.update blake2b.update ...

After the loop ends, hexdigest() is called on each hasher. The result dict preserves the order of the input algorithms list.

Optional progress and task parameters allow the caller (cli.py) to inject a Rich Progress context for live progress bars on large files; the engine calls progress.update(task, advance=len(chunk)) on every iteration if they are provided.
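The single-pass loop described above can be sketched as follows. This is a simplified illustration, not the real compute_hashes(): the progress and task parameters are omitted, and hashlib.new() is used for every algorithm for brevity:

```python
import hashlib
import os
import tempfile
from typing import Dict, List

CHUNK_SIZE = 8192  # 8 KB, as stated above


def compute_hashes_sketch(file_path: str, algorithms: List[str]) -> Dict[str, str]:
    """Hash one file with several algorithms while reading it only once."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(file_path, "rb") as fh:
        while True:
            chunk = fh.read(CHUNK_SIZE)
            if not chunk:  # an empty bytes object signals end of file
                break
            for hasher in hashers.values():
                hasher.update(chunk)
    # Dicts keep insertion order, so results follow the input algorithm order.
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Demo: the streamed digest equals hashing the whole payload at once.
payload = os.urandom(3 * CHUNK_SIZE + 17)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
try:
    result = compute_hashes_sketch(tmp.name, ["sha256", "sha512"])
    print(result["sha256"] == hashlib.sha256(payload).hexdigest())  # True
finally:
    os.remove(tmp.name)
```

Memory use stays bounded by the chunk size no matter how large the file is, which is the property the Memory safety design goal refers to.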

`chain_hash_file(file_path, algorithms, *, progress, task)`

Implements the chain hashing algorithm:

Open and stream the file with the first algorithm only (8 KB chunks). Produce hex string H0.
For each subsequent algorithm Ai, create a fresh hasher and call hasher.update(H_{i-1}.encode("utf-8")). Produce Hi.
Return all intermediate and final hashes as an ordered dict.

The key design decision is that each chained step hashes the UTF-8 encoding of the previous hex string, not the raw bytes of the previous digest. This means the chain input is always a printable ASCII string of hexadecimal digits.

File bytes  →  [SHA256]  →  "a3f9..."   (64 hex chars)
                              │
                         .encode("utf-8")
                              │
                              ▼
               [SHA512]  →  "8b2e..."   (128 hex chars)  ← Final Chained Hash

`chain_hash_text(text, algorithms)`

Identical to chain_hash_file but the initial input is text.encode("utf-8") rather than file bytes. No streaming or progress bar is needed since text is always small.
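A minimal sketch of the chaining rule for text input. The function name chain_hash_text_sketch is hypothetical and the body is an illustration of the rule described above, not the Yhash source:

```python
import hashlib
from typing import Dict, List


def chain_hash_text_sketch(text: str, algorithms: List[str]) -> Dict[str, str]:
    """Hash text with the first algorithm, then feed each hex digest to the next."""
    results: Dict[str, str] = {}
    data = text.encode("utf-8")
    for name in algorithms:
        digest = hashlib.new(name, data).hexdigest()
        results[name] = digest
        data = digest.encode("utf-8")  # next step hashes the hex string, not raw bytes
    return results


chain = chain_hash_text_sketch("my passphrase", ["sha256", "sha512"])
step1 = hashlib.sha256(b"my passphrase").hexdigest()
print(chain["sha512"] == hashlib.sha512(step1.encode("utf-8")).hexdigest())  # True
```

Hashing the raw digest bytes instead (hashlib.sha256(...).digest()) would give a different final value, so any tool reproducing a Yhash chain needs to follow the hex-string convention.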

`hash_text(text, algorithms)`

Hashes the same UTF-8 bytes independently with each algorithm. All algorithms see the same input — this is not chained. Returns a dict mapping algorithm → hexdigest.
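For contrast with chaining, independent text hashing can be sketched like this. The name hash_text_sketch is hypothetical; only the behaviour (same bytes, every algorithm, ordered dict) comes from the description above:

```python
import hashlib
from typing import Dict, List


def hash_text_sketch(text: str, algorithms: List[str]) -> Dict[str, str]:
    """Hash the same UTF-8 bytes independently with each algorithm."""
    data = text.encode("utf-8")  # encoded once, shared by every algorithm
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


digests = hash_text_sketch("", ["sha256"])
print(digests["sha256"])
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
```

The empty string produces the well-known SHA-256 digest of an empty byte sequence.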

`hash_files_parallel(file_paths, algorithms, *, max_workers)`

Uses ThreadPoolExecutor to hash multiple files concurrently. Each file is submitted as an independent task. as_completed() yields results in completion order (not submission order), which is handled correctly — results are stored by path string key, not by order.

Worker count: min(max_workers, len(file_paths)) — capped to avoid creating more threads than files.

Error handling: if compute_hashes() raises for a specific file, the exception is caught inside the worker and stored as {"error": str(exc)} in the result dict. This prevents one failing file from aborting the entire batch.
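The pattern described above (capped worker count, results keyed by path, errors captured per file) could look roughly like this. The helper names are hypothetical and the per-file hashing is inlined to keep the sketch self-contained:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Dict, List, Tuple


def _hash_worker(path: str, algorithms: List[str]) -> Tuple[str, Dict[str, str]]:
    """Hash one file; a failure becomes an {"error": ...} entry instead of raising."""
    try:
        hashers = {name: hashlib.new(name) for name in algorithms}
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(8192), b""):
                for hasher in hashers.values():
                    hasher.update(chunk)
        return path, {name: h.hexdigest() for name, h in hashers.items()}
    except Exception as exc:  # broad on purpose: one bad file must not abort the batch
        return path, {"error": str(exc)}


def hash_files_parallel_sketch(
    file_paths: List[str], algorithms: List[str], max_workers: int = 8
) -> Dict[str, Dict[str, str]]:
    results: Dict[str, Dict[str, str]] = {}
    if not file_paths:
        return results  # ThreadPoolExecutor rejects max_workers=0
    workers = min(max_workers, len(file_paths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(_hash_worker, p, algorithms) for p in file_paths]
        for future in as_completed(futures):  # completion order, not submission order
            path, outcome = future.result()
            results[path] = outcome  # keyed by path, so ordering does not matter
    return results
```

A caller can then separate successes from failures by checking each entry for an "error" key.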

5.4 Output Layer

formatter.py owns all terminal output. cli.py imports its display functions and never calls print() or console.print() directly.

Shared Console instance:

console = Console(soft_wrap=True)

This singleton is imported by both formatter.py and cli.py so that Rich Progress bars (which need access to the console to avoid interleaving with other output) share the same output stream.

Hash display — two-line layout:

Every hash value is printed on its own dedicated line below the algorithm label, using _print_hash():

def _print_hash(digest: str, style: str = CLR_HASH) -> None:
    console.print(Text(_HASH_INDENT + digest, style=style, no_wrap=True))

no_wrap=True on the Text object combined with soft_wrap=True on the Console means Rich will not split, fold, or truncate the hash regardless of terminal width. The terminal itself handles visual wrapping if needed, but no character is ever dropped.

Verification layout:

The display_verification() function prints file metadata in a Table(box=None) (invisible borders for alignment), then prints the expected and computed hashes each on their own _print_hash() line. The verdict (MATCH or MISMATCH) appears in a coloured Panel.

JSON output:

display_json_results() creates a fresh Console(highlight=False, markup=False) for each call to avoid any Rich markup processing interfering with JSON content:

def display_json_results(data: Any) -> None:
    _json_console = Console(soft_wrap=True, highlight=False, markup=False)
    _json_console.print(json.dumps(data, indent=2, ensure_ascii=False))

5.5 Utilities Layer

utils.py provides helpers with no knowledge of the CLI or display.

format_file_size(size_bytes) — converts bytes to a human-readable string using iterative division by 1024.
collect_files_recursive(path) — uses Path.rglob("*") to find all files under a directory, sorted alphabetically. If given a file path, returns that single file.
collect_files_flat(path) — uses Path.iterdir() for a non-recursive listing.
validate_algorithms(algos) — validates and deduplicates a list of algorithm names. Deduplication preserves insertion order using a seen set alongside the result list.
parse_chain_algorithms(chain_str) — splits on comma, strips whitespace from each part, calls validate_algorithms, and enforces the minimum length of 2.
create_manifest_file(file_hashes, algorithms, output_path) — serialises a manifest dict to UTF-8 JSON and writes it with a single write_text call (one write, though not an atomic replacement of an existing file). The manifest always ends with a newline.
load_manifest_file(manifest_path) — reads and parses a manifest file, raising FileNotFoundError or json.JSONDecodeError on failure.
copy_to_clipboard(text) — wraps pyperclip.copy() in a try/except and returns bool.
get_final_hash(results) — returns the last value in a dict (Python 3.7+ dicts are insertion-ordered).
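As an illustration of the two validation helpers, here is a sketch of order-preserving deduplication and chain parsing. The whitelist contents are assumed from Section 6, and details such as the error messages and whether deduplication runs before the minimum-length check are assumptions:

```python
from typing import Iterable, List

# Assumed whitelist, based on the six algorithms listed in Section 6.
SUPPORTED_ALGORITHMS = {"sha256", "sha512", "sha384", "sha1", "md5", "blake2b"}


def validate_algorithms(algos: Iterable[str]) -> List[str]:
    """Validate names against the whitelist and drop duplicates, keeping first-seen order."""
    seen = set()
    result: List[str] = []
    for raw in algos:
        name = raw.strip().lower()
        if name not in SUPPORTED_ALGORITHMS:
            raise ValueError(f"Unsupported algorithm: {raw!r}")
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result


def parse_chain_algorithms(chain_str: str) -> List[str]:
    """Split a comma-separated chain string and require at least two algorithms."""
    algos = validate_algorithms(part.strip() for part in chain_str.split(","))
    if len(algos) < 2:
        raise ValueError("A chain needs at least two distinct algorithms")
    return algos


print(parse_chain_algorithms(" MD5 , sha256,md5"))  # ['md5', 'sha256']
```

The seen set gives constant-time membership checks while the result list carries the ordering.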

6. Algorithms

Flag	Algorithm	Digest Length	Security Status	Notes
`-sha256`	SHA-256	64 hex chars	Recommended	Default when no flag is given
`-sha512`	SHA-512	128 hex chars	Recommended	Higher security, larger output
`-sha384`	SHA-384	96 hex chars	Recommended	SHA-2 family, truncated SHA-512
`-sha1`	SHA-1	40 hex chars	Deprecated	Collision attacks exist; legacy only
`-md5`	MD5	32 hex chars	Weak	Checksums only; not cryptographically safe
`-blake2b`	BLAKE2b	128 hex chars	Recommended	Fast in software; not vulnerable to length-extension attacks

All flags are case-insensitive. -SHA256, -Sha256, and -sha256 are all valid. This is achieved by normalising sys.argv before Click parses it.

The default algorithm is sha256. It is applied whenever no algorithm flag is specified.

7. Feature Reference

7.1 Default File Hashing

Hash a single file with the default SHA-256 algorithm.

Syntax:

yhash <file>

Example:

yhash document.pdf

Output:

────────────────────────────────────────────
  File    document.pdf  (1.2 MB)

  →  SHA-256
       a665a45920422f9d417e4867efdc4fb8...

How it works:

cli_entry() normalises sys.argv and calls main().
Click parses document.pdf into the files tuple.
main() calls _build_algorithms() — no flags were set, so it returns ["sha256"].
_hash_file() checks the file size. If ≤ 5 MB, calls compute_hashes() directly. If > 5 MB, wraps the call in a Rich Progress context.
compute_hashes() opens the file, reads 8 KB chunks, feeds each chunk into the SHA-256 hasher.
display_hash_results() prints the source name, file size, algorithm label, and digest.

7.2 Algorithm Selection Flags

Choose a specific hashing algorithm.

Syntax:

yhash -<algorithm> <file>

Examples:

yhash -sha512 archive.tar.gz
yhash -blake2b firmware.bin
yhash -md5 installer.exe
yhash -sha1 legacy_backup.zip
yhash -sha384 contract.pdf

How it works: Each algorithm flag is a Click boolean option (is_flag=True). _build_algorithms() inspects all six boolean parameters and appends matching algorithm names to the selected list in declaration order. If none are set, it returns [DEFAULT_ALGORITHM] ("sha256").
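A sketch of this selection logic. The declaration order below is an assumption taken from the order of the algorithm table in Section 6, and the keyword-argument signature is illustrative; the real _build_algorithms() receives Click's parsed booleans:

```python
from typing import List

DEFAULT_ALGORITHM = "sha256"
# Assumed declaration order, taken from the algorithm table in Section 6.
DECLARATION_ORDER = ("sha256", "sha512", "sha384", "sha1", "md5", "blake2b")


def build_algorithms(**flags: bool) -> List[str]:
    """Return the selected algorithms in declaration order, or the default if none are set."""
    selected = [name for name in DECLARATION_ORDER if flags.get(name, False)]
    return selected or [DEFAULT_ALGORITHM]


print(build_algorithms())                           # ['sha256']
print(build_algorithms(blake2b=True, sha512=True))  # ['sha512', 'blake2b']
```

The second call shows that the result follows declaration order, not the order in which the flags were supplied.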

Case insensitivity: -SHA512, -Sha512, and -sha512 are all accepted. The _normalise_argv() function checks each argv token against NORMALISABLE_FLAGS (a set of lowercase flag strings) and lowercases any match before Click processes them.

7.3 Multiple Algorithms in One Pass

Compute several hashes in a single file read.

Syntax:

yhash -<algo1> -<algo2> [-<algo3> ...] <file>

Example:

yhash -sha256 -sha512 -blake2b document.pdf

Output:

────────────────────────────────────────────
  File    document.pdf  (1.2 MB)

  →  SHA-256
       a665a45920422f9d417e4867efdc4fb8...

  →  SHA-512
       cf83e1357eefb8bdf1542850d66d8007...

  →  BLAKE2b
       786a02f742015903c6c6fd852552d272...

How it works: compute_hashes(file_path, ["sha256", "sha512", "blake2b"]) creates three hashlib objects before opening the file. On every 8 KB chunk, all three objects are updated in a single loop. The file is read exactly once. Adding more algorithms increases CPU work linearly but does not increase I/O.

This is more efficient than running yhash three times, which would read the file three times.

7.4 Chained Hashing

Apply algorithms sequentially: the output of each step becomes the input of the next.

Syntax:

yhash -chain <algo1,algo2,...> <file>

At least two algorithms must be specified. The chain string is case-insensitive and whitespace around commas is ignored.

Example:

yhash -chain md5,sha256,sha512 video.mp4

Output (digest values are illustrative placeholders):

────────────────────────────────────────────
  Chain   video.mp4

  →  MD5
       d41d8cd98f00b204e9800998ecf8427e

  →  SHA-256
       e3b0c44298fc1c149afbf4c8996fb924...

  →  SHA-512   <- Final Chained Hash
       cf83e1357eefb8bdf1542850d66d8007...

How chaining works — step by step:

Step 1:  Read raw bytes of video.mp4 through MD5 hasher
         → MD5 hex digest: "d41d8cd98f00b204..." (32 hex chars)

Step 2:  Encode that hex string as UTF-8 bytes: b"d41d8cd98f00b204..."
         Feed those bytes into a fresh SHA-256 hasher
         → SHA-256 hex digest: "e3b0c44298fc1c..." (64 hex chars)

Step 3:  Encode that hex string as UTF-8 bytes: b"e3b0c44298fc1c..."
         Feed those bytes into a fresh SHA-512 hasher
         → SHA-512 hex digest: "cf83e1357eefb8..." (128 hex chars)  <- Final

Each step after the first hashes the UTF-8 encoding of the previous hexadecimal string, not the raw bytes of the previous digest. In the example above, the input to step 2 is the 32-character hex string encoded as bytes, not the 16-byte binary MD5 digest.

Chain with text:

yhash -chain sha256,blake2b -text "my passphrase"

Constraint: -chain cannot be combined with individual algorithm flags (-sha256, etc.). All algorithms in the chain must be specified in the chain string.

How it is implemented: cli.py calls parse_chain_algorithms(chain) which splits the comma-separated string, normalises case, deduplicates, and validates all names against the whitelist. Then chain_hash_file() or chain_hash_text() in hasher.py performs the sequential hashing.

7.5 Text / String Hashing

Hash an arbitrary string directly, without creating a file.

Syntax:

yhash -text "<string>"

Examples:

yhash -text "hello world"
yhash -sha512 -text "my secret key"
yhash -sha256 -sha512 -text "password123"
yhash -chain md5,sha256 -text "chain this string"

Output:

────────────────────────────────────────────
  Text    "hello world"

  →  SHA-256
       b94d27b9934d3e08a52e52d7da7dabfa...

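How it works: The string is encoded to bytes with text.encode("utf-8") and hashed by hash_text() or chain_hash_text() in hasher.py. No temporary file is created. Any string that can be passed as a command-line argument and encoded as UTF-8 is accepted, including spaces, special characters, emoji, and control characters. A null byte is the exception at the CLI level: operating systems do not allow one inside a command-line argument, although the hashing functions themselves would accept it.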

Constraints:

-text and file arguments cannot be combined in the same command.
-text "" (empty string) is valid and produces the well-known hash of an empty byte sequence.
-text is compatible with -chain, -copy, -json, and all algorithm flags.

7.6 Multiple Files

Hash several files in a single command, displayed as a table.

Syntax:

yhash [algo flags] <file1> <file2> [file3 ...]

Example:

yhash -sha256 -sha512 report.pdf data.csv archive.zip

Output:

┌──────────────┬─────────────────────────────┬─────────────────────────────┐
│ File         │ SHA-256                     │ SHA-512                     │
├──────────────┼─────────────────────────────┼─────────────────────────────┤
│ report.pdf   │ a665a459...                 │ cf83e135...                 │
│ data.csv     │ 7f8e9a0b...                 │ 4fddee6d...                 │
│ archive.zip  │ 3b00361b...                 │ 5d9c46ef...                 │
└──────────────┴─────────────────────────────┴─────────────────────────────┘

How it works: When len(valid_fps) > 1, cli.py iterates through each file and calls _hash_file() sequentially (not in parallel at this level — parallel processing is reserved for the -r recursive mode, which may have hundreds of files). Results are collected into results_map and displayed via display_multi_file_results().

The table uses Rich's overflow="fold" on each hash column so that long hashes (SHA-512, BLAKE2b — 128 chars) wrap inside the cell rather than being clipped.

If a file fails (e.g. permission denied), its row shows the error message in red; processing continues for the remaining files.

7.7 Recursive Directory Hashing

Hash all files under a directory and its subdirectories.

Syntax:

yhash -r <directory>

Examples:

yhash -r ./project
yhash -sha512 -r ./backups
yhash -sha256 -sha512 -r ./documents

Output:

Found 47 file(s) — processing...

┌────────────────────┬──────────────────────────────────────┐
│ File               │ SHA-256                              │
├────────────────────┼──────────────────────────────────────┤
│ main.py            │ a665a45920422f9d...                  │
│ utils.py           │ cf83e1357eefb8bd...                  │
│ README.md          │ 7f8e9a0b1c2d3e4f...                  │
│ ...                │ ...                                  │
└────────────────────┴──────────────────────────────────────┘

How it works:

collect_files_recursive(path) walks the directory tree with Path.rglob("*"), returning a sorted list of all files.
Each file is submitted as a separate _hash_one_rec() task to a ThreadPoolExecutor capped at min(8, file_count) workers.
A Rich Progress bar tracks completion as as_completed() yields finished futures.
Results are displayed in a table.

Multiple directories and files can be combined:

yhash -r ./src ./tests README.md

Combined with -create: When -r and -create are combined, instead of a table, a .yhash manifest file is written. See Section 7.9.

7.8 Hash Verification

Compare a file's computed hash against a known expected value.

Syntax:

yhash -check <expected_hash> <file>

Examples:

yhash -check a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3 document.pdf

# Verify with SHA-512 instead of the default SHA-256
yhash -sha512 -check cf83e1357eefb8bdf1542850d66d800... document.pdf

Output on match:

────────────────────────────────────────────
  File          document.pdf
  Algorithm     SHA-256

  Expected
       a665a45920422f9d417e4867efdc4fb8...

  Computed
       a665a45920422f9d417e4867efdc4fb8...

╭──────────────────────────────────────────╮
│  MATCH  —  The file is intact...         │
╰──────────────────────────────────────────╯

Output on mismatch:

╭──────────────────────────────────────────╮
│  MISMATCH  —  Hash does not match.      │
│  File may be corrupted or tampered with. │
╰──────────────────────────────────────────╯

How it works:

The file is hashed using _hash_file() with only the first algorithm in the algorithms list (SHA-256 unless an algorithm flag is set).
The computed digest is compared to the expected value using case-insensitive string comparison: expected_hash.lower() == computed_hash.lower().
Both hashes are displayed in full regardless of terminal width (using _print_hash()).
The verdict is displayed in a green (MATCH) or red (MISMATCH) Panel.

The comparison is always case-insensitive. Uppercase and lowercase hex strings are treated identically.
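The comparison itself is small enough to show in full. The helper name digests_match is hypothetical; the expression is the one quoted above:

```python
import hashlib


def digests_match(expected_hash: str, computed_hash: str) -> bool:
    """Case-insensitive comparison of two hex digests."""
    return expected_hash.lower() == computed_hash.lower()


computed = hashlib.sha256(b"hello world").hexdigest()
print(digests_match(computed.upper(), computed))  # True
print(digests_match("0" * 64, computed))          # False
```

Lower-casing both sides means a digest copied from a site that publishes uppercase hex still verifies.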

7.9 Manifest Creation

Generate a .yhash manifest file containing hash mappings for one or more files.

Syntax:

yhash -create [algo flags] <file1> [file2 ...]
yhash -create [algo flags] -r <directory>

Examples:

# Manifest for specific files
yhash -create report.pdf data.csv

# Recursive manifest with SHA-512
yhash -sha512 -create -r ./project

# Manifest with multiple algorithms
yhash -sha256 -sha512 -create -r ./dist

Manifest file format (manifest.yhash):

{
  "yhash_version": "1.0.0",
  "created_at": "2025-01-15T10:30:00.123456+00:00",
  "algorithms": ["sha256"],
  "entries": {
    "/absolute/path/to/report.pdf": {
      "sha256": "a665a45920422f9d417e4867efdc4fb8..."
    },
    "/absolute/path/to/data.csv": {
      "sha256": "cf83e1357eefb8bdf1542850d66d8007..."
    }
  }
}

How it works:

Files are collected (either the provided list or recursively via -r).
Each file is hashed with the selected algorithms using compute_hashes().
create_manifest_file() builds a dict with version, timestamp (UTC ISO 8601), algorithm list, and per-file hash entries.
The dict is serialised to UTF-8 JSON with json.dumps(..., indent=2, ensure_ascii=False) and written with a single write_text() call.
The manifest is saved as manifest.yhash in the current working directory.

The timestamp uses datetime.now(tz=timezone.utc) for a timezone-aware UTC value. The manifest is always valid JSON: path keys and hash values containing special characters are safely escaped by Python's json module.
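The steps above can be sketched as follows. The function name create_manifest_sketch, the version parameter, and the demo path are assumptions for illustration; the field names follow the manifest format shown earlier in this section:

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List


def create_manifest_sketch(
    file_hashes: Dict[str, Dict[str, str]],
    algorithms: List[str],
    output_path: Path,
    version: str = "1.0.0",
) -> Path:
    """Write a manifest as UTF-8 JSON with a trailing newline."""
    manifest = {
        "yhash_version": version,
        "created_at": datetime.now(tz=timezone.utc).isoformat(),  # timezone-aware UTC
        "algorithms": list(algorithms),
        "entries": file_hashes,
    }
    text = json.dumps(manifest, indent=2, ensure_ascii=False) + "\n"
    output_path.write_text(text, encoding="utf-8")  # single write call
    return output_path


with tempfile.TemporaryDirectory() as tmp_dir:
    out = create_manifest_sketch(
        {"/abs/report.pdf": {"sha256": "..."}},  # placeholder digest
        ["sha256"],
        Path(tmp_dir) / "manifest.yhash",
    )
    loaded = json.loads(out.read_text(encoding="utf-8"))
    print(sorted(loaded))  # ['algorithms', 'created_at', 'entries', 'yhash_version']
```

Because json.dumps() handles escaping, paths containing quotes or non-ASCII characters still round-trip through json.loads().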

7.10 Clipboard Copy

Copy the final hash to the system clipboard.

Syntax:

yhash -copy <file>
yhash -copy -sha512 <file>
yhash -copy -text "string"
yhash -copy -chain md5,sha256 <file>

Examples:

# Copy SHA-256 of a file
yhash -copy firmware.bin

# Copy SHA-512
yhash -sha512 -copy firmware.bin

# Copy the final chained hash
yhash -copy -chain md5,sha256,sha512 archive.tar.gz

Output:

  →  SHA-256
       a665a45920422f9d...

  OK    Copied to clipboard  a665a45920422f9d...

How it works: After the hash result is computed and displayed, _clipboard() in cli.py calls get_final_hash(results) to get the last entry in the results dict (which is always the final algorithm's output — in chained mode, this is the final chained hash). It then calls copy_to_clipboard(final) from utils.py.

copy_to_clipboard() calls pyperclip.copy(text) inside a try/except. On failure (e.g., no display server, no clipboard tool), it returns False and cli.py shows a warning message with installation instructions rather than crashing.

With multiple algorithms: The hash of the last algorithm in the results is copied. Example: -sha256 -sha512 -copy file copies the SHA-512 hash because, of the flags that were set, -sha512 is the last one in declaration order, which is the order _build_algorithms() uses (see Section 7.2).

With chaining: The final chained hash (last algorithm in the chain) is always what gets copied.

7.11 JSON Output

Print results as machine-readable JSON instead of the default formatted display.

Syntax:

yhash -json [other flags] <file>

Examples:

yhash -json document.pdf
yhash -json -sha512 document.pdf
yhash -json -text "hello world"
yhash -json -chain md5,sha256 archive.zip
yhash -json -sha256 -r ./folder

Single file output:

{
  "mode": "file",
  "file": "/path/to/document.pdf",
  "size_bytes": 1258496,
  "algorithms": ["sha256"],
  "hashes": {
    "sha256": "a665a45920422f9d417e4867efdc4fb8..."
  }
}

Text output:

{
  "mode": "text",
  "input": "hello world",
  "algorithms": ["sha256"],
  "hashes": {
    "sha256": "b94d27b9934d3e08a52e52d7da7dabfa..."
  }
}

Chain output:

{
  "mode": "chain",
  "file": "/path/to/archive.zip",
  "size_bytes": 45056,
  "chain": {
    "md5": "d41d8cd98f00b204e9800998ecf8427e",
    "sha256": "e3b0c44298fc1c149afbf4c8996fb924..."
  }
}

Recursive output:

{
  "mode": "recursive",
  "algorithms": ["sha256"],
  "file_count": 3,
  "results": {
    "/path/to/a.txt": {"hashes": {"sha256": "..."}},
    "/path/to/b.txt": {"hashes": {"sha256": "..."}},
    "/path/to/c.txt": {"error": "Permission denied"}
  }
}

How it works: -json is a boolean flag that, when set, routes all result data through display_json_results() instead of the Rich formatter functions. display_json_results() creates a dedicated Console(highlight=False, markup=False) to print the JSON string without any Rich markup processing, ensuring the output is clean and parseable by tools like jq.

Using with jq:

yhash -json document.pdf | jq '.hashes.sha256'
yhash -json -r ./folder | jq '.results | to_entries[] | {file: .key, hash: .value.hashes.sha256}'
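The same JSON can be consumed from Python. The sketch below parses the recursive sample shown above (placeholder digests included) and separates hashed files from failed ones; in practice the string would come from the command's standard output:

```python
import json

# Sample payload copied from the recursive output example above.
sample = """
{
  "mode": "recursive",
  "algorithms": ["sha256"],
  "file_count": 3,
  "results": {
    "/path/to/a.txt": {"hashes": {"sha256": "..."}},
    "/path/to/b.txt": {"hashes": {"sha256": "..."}},
    "/path/to/c.txt": {"error": "Permission denied"}
  }
}
"""

payload = json.loads(sample)
hashed = {path: entry["hashes"]["sha256"]
          for path, entry in payload["results"].items() if "hashes" in entry}
failed = {path: entry["error"]
          for path, entry in payload["results"].items() if "error" in entry}

print(sorted(hashed))  # ['/path/to/a.txt', '/path/to/b.txt']
print(failed)          # {'/path/to/c.txt': 'Permission denied'}
```

Checking for the "hashes" key before indexing avoids a KeyError on entries that only carry an "error" field.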

7.12 Algorithm Reference Table

Display a table of all supported algorithms with security ratings.

Syntax:

yhash -algo
# or
yhash --algorithms

Output:

┌──────────┬──────────┬─────────────┬─────────────┬────────────────────────────────────┐
│   Flag   │ Algorithm│ Output Size │ Security    │ Recommended Use                    │
├──────────┼──────────┼─────────────┼─────────────┼────────────────────────────────────┤
│ -sha256  │ SHA-256  │   256-bit   │ Recommended │ General-purpose cryptographic...   │
│ -sha512  │ SHA-512  │   512-bit   │ Recommended │ High-security / large data         │
│ ...      │ ...      │   ...       │ ...         │ ...                                │
└──────────┴──────────┴─────────────┴─────────────┴────────────────────────────────────┘

Security ratings are colour-coded: green for Recommended, yellow for Deprecated or Weak.

How it works: display_algo_table() in formatter.py reads the ALGO_INFO dict from constants.py and builds a Rich Table. The Security column value is wrapped in Rich markup at render time to apply the colour.

7.13 About Panel

Display tool metadata in a formatted panel.

Syntax:

yhash -about

Output:

╭─────────────────── About Yhash ────────────────────╮
│ Yhash  (Yung Hash)                                 │
│                                                    │
│ Version       1.0.0                                │
│ Description   Modern, fast CLI hashing utility.   │
│ License       MIT                                  │
│ Python        >= 3.9                               │
│ Algorithms    SHA-256, SHA-512, SHA-384, ...       │
│ Install       pip install yhash / pipx install ... │
│ Repository    https://github.com/yhash/yhash       │
╰────────────────────────────────────────────────────╯

7.14 Help

Display the full usage reference.

Syntax:

yhash -help
# or
yhash --help

Prints all flags organised into sections: Basic Usage, Algorithm Flags, Chained Hashing, Text Hashing, Files & Directories, Verification, Manifest, Output Options, Install, and Info.

How it works: Click's context_settings = {"help_option_names": ["-help", "--help"]} registers both -help and --help as equivalent help triggers. When no arguments are provided at all, main() also calls display_help() alongside print_banner().

7.15 Version

Print the version number.

Syntax:

yhash --version

Output:

Yhash  v1.0.0

Implemented via Click's @click.version_option(VERSION, "--version", prog_name="Yhash", message="%(prog)s v%(version)s").

8. Flag Compatibility Matrix

Flag	`-sha*` / `-md5` / `-blake2b`	`-chain`	`-text`	`-r`	`-copy`	`-check`	`-create`	`-json`
`-sha*` / `-md5` / `-blake2b`	✅ Combine freely	❌ Mutual exclusion	✅	✅	✅	✅	✅	✅
`-chain`	❌ Mutual exclusion	—	✅	—	✅	—	—	✅
`-text`	✅	✅	—	❌ No files	✅	—	—	✅
`-r`	✅	—	❌	—	—	—	✅	✅
`-copy`	✅	✅	✅	—	—	—	—	—
`-check`	✅ (first algo used)	—	—	—	—	—	—	—
`-create`	✅	—	—	✅	—	—	—	—
`-json`	✅	✅	✅	✅	—	—	—	—

Mutual exclusions enforced in main() before any hashing begins:

-chain + any algorithm flag: error — all algorithms must be in the chain string.
-text + file arguments: error — text and file inputs cannot be mixed.
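
The two checks above can be sketched as a pure function that runs before any I/O. The function name find_flag_conflict, its parameters, and the message wording are assumptions for illustration; only the two rules themselves come from this document.

```python
from typing import Optional, Sequence


def find_flag_conflict(
    chain: Optional[str],
    algo_flags: Sequence[str],
    text: Optional[str],
    files: Sequence[str],
) -> Optional[str]:
    """Return an error message for the first conflicting combination, or None."""
    if chain is not None and algo_flags:
        # All algorithms must be listed in the chain string itself.
        return "-chain cannot be combined with algorithm flags"
    if text is not None and files:
        # Text and file inputs cannot be mixed.
        return "-text cannot be combined with file arguments"
    return None


print(find_flag_conflict("md5,sha256", ["sha512"], None, ["a.bin"]))
print(find_flag_conflict(None, ["sha512"], None, ["a.bin"]))
```

Returning a message instead of exiting keeps the rule testable; the caller decides how to report it and when to exit.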

9. Progress Bars

A progress bar is shown automatically for any single file larger than 5 MB (LARGE_FILE_THRESHOLD = 5 * 1024 * 1024 bytes). Progress bars are also shown during recursive directory hashing regardless of individual file size.

What the progress bar shows:

⠋ Hashing firmware.bin  (120.0 MB) ████████████░░░░░░ 68% 42.1 MB/s 0:00:01

Spinner: animated rotating character (Rich SpinnerColumn)
Description: filename and size
Bar: filled proportionally to bytes read (Rich BarColumn, width 38)
Percentage: TaskProgressColumn
Speed: TransferSpeedColumn — bytes read per second
ETA: TimeRemainingColumn

How it works: _hash_file() in cli.py uses _progress_ctx() to create a Progress context manager. The task total is set to the file size in bytes. Inside compute_hashes(), after each 8 KB chunk is processed, progress.update(task, advance=len(chunk)) advances the bar. The progress bar is transient=True — it disappears after completion, leaving only the hash result.

The progress bar and hash output share the same console instance (the singleton from formatter.py), so Rich can correctly erase the progress bar before printing the result without output corruption.
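
The interaction between chunked reading and progress updates can be sketched without Rich by passing a plain callback in place of progress.update. The names hash_with_progress, on_advance, and needs_progress_bar are hypothetical; CHUNK_SIZE and LARGE_FILE_THRESHOLD use the values documented here, and SHA-256 is chosen only as an example algorithm.

```python
import hashlib
from pathlib import Path
from typing import Callable

CHUNK_SIZE = 8192                        # bytes read per chunk
LARGE_FILE_THRESHOLD = 5 * 1024 * 1024   # 5 MB


def needs_progress_bar(size_bytes: int, recursive: bool = False) -> bool:
    """Show a bar for files larger than the threshold, or always in recursive mode."""
    return recursive or size_bytes > LARGE_FILE_THRESHOLD


def hash_with_progress(path: Path, on_advance: Callable[[int], None]) -> str:
    """Stream a file through SHA-256, reporting each chunk's length to the callback."""
    hasher = hashlib.sha256()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(CHUNK_SIZE)
            if not chunk:
                break
            hasher.update(chunk)
            on_advance(len(chunk))  # stand-in for progress.update(task, advance=...)
    return hasher.hexdigest()
```

Because the callback receives len(chunk) rather than a fixed CHUNK_SIZE, the final short chunk advances the bar by the correct amount and the total reaches exactly the file size.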

10. Hash Display Design

Problem: Rich enforces an internal line-width limit on standard terminal output. In an 80-column terminal, a 128-character SHA-512 or BLAKE2b digest would be truncated with an ellipsis (…).

Solution — two mechanisms combined:

Console(soft_wrap=True) — tells Rich not to enforce its own line-length limit when printing plain text. Rich emits the full string; the terminal handles visual wrapping.
Two-line layout — each hash is printed on its own dedicated line below the algorithm label. This gives the hash the full terminal width, not a portion of it shared with a label.

def _print_hash(digest: str, style: str = CLR_HASH) -> None:
    console.print(Text(_HASH_INDENT + digest, style=style, no_wrap=True))

no_wrap=True on the Text object reinforces the instruction: this object should not be wrapped by Rich at any level.

Effect in practice:

On a 200-column terminal: entire 128-char hash on one visual line.
On an 80-column terminal: hash wraps visually at column 80, continues on the next line. All characters are present.
The hash value is never truncated, ellipsised, or partially hidden regardless of terminal width.

For multi-file tables: hash columns use overflow="fold" (not overflow="ellipsis") so the content wraps within the cell visually but the full hash is always present in the output.

11. Security Model

Algorithm Whitelist

The first line of defence is _make_hasher() in hasher.py:

def _make_hasher(algorithm: str) -> Any:
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"Unsupported algorithm: {algorithm!r}. Supported: ...")
    return hashlib.blake2b() if algorithm == "blake2b" else hashlib.new(algorithm)

SUPPORTED_ALGORITHMS is a fixed dict with exactly six keys. Any string not in that set raises ValueError before any hashlib call is made. This prevents all forms of algorithm-name injection:

Injection attempt	Result
`sha256; rm -rf /`	`ValueError: Unsupported algorithm`
`sha256 && cat /etc/passwd`	`ValueError`
`../../../etc/shadow`	`ValueError`
`$(whoami)`	`ValueError`
`sha256\x00sha512`	`ValueError`

Validation happens at two independent layers: validate_algorithm() (single name) and validate_algorithms() (list). Both are called before any file is opened.
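
A minimal sketch of whitelist validation with normalisation follows. It is not the project's validate_algorithm(); the name validate_name is hypothetical, and ALLOWED is an illustrative subset containing only algorithm names that appear in this document, not the full six-entry SUPPORTED_ALGORITHMS dict.

```python
import hashlib

# Illustrative subset (assumption): names mentioned in this document only.
ALLOWED = {"sha256", "sha384", "sha512", "md5", "blake2b"}


def validate_name(name: str) -> str:
    """Normalise case and surrounding whitespace, then require an exact whitelist match."""
    normalised = name.strip().lower()
    if normalised not in ALLOWED:
        raise ValueError(f"Unsupported algorithm: {name!r}")
    return normalised


# Injection attempts never reach hashlib because membership is checked first.
for attempt in ["sha256; rm -rf /", "$(whoami)", "sha256\x00sha512"]:
    try:
        validate_name(attempt)
    except ValueError as exc:
        print("rejected:", exc)

# A valid name with odd casing and whitespace is normalised before use.
print(hashlib.new(validate_name(" SHA256 ")).name)
```

An exact set-membership test is deliberately stricter than prefix or regex matching: a string that merely starts with a valid name is still rejected.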

No Shell Calls

Yhash never calls subprocess, os.system(), eval(), or exec(). hashlib.new(name) is a Python API call, not a shell command. pyperclip.copy() may invoke a system clipboard tool internally, but the hash value passed to it is a hex string (characters [0-9a-f] only), which cannot be misinterpreted as shell syntax.

File Reading Is Read-Only

compute_hashes() and chain_hash_file() open files with open(path, "rb") — binary read mode. No write, no execute, no interpretation. Hashing a shell script does not run it.

Memory Safety

Files are never fully loaded into memory. The maximum memory overhead of the hashing engine is one CHUNK_SIZE (8192-byte) buffer plus the hasher state objects. A 100 GB file and a 1 KB file use the same peak memory.

Manifest Integrity

Manifest files are written with json.dumps() which safely escapes all special characters in keys and values. The output is always valid JSON — never executable code. Paths containing quotes, backslashes, or newlines are correctly serialised.
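
The escaping guarantee can be checked with a short round trip. This sketch uses only the standard json module; the field names "algorithms" and "entries" and the awkward paths are assumptions for illustration and may not match the actual manifest schema.

```python
import hashlib
import json

digest = hashlib.sha256(b"").hexdigest()

# Paths containing a quote, a backslash, a newline, and non-ASCII characters.
awkward_paths = ['quote".txt', "back\\slash.txt", "new\nline.txt", "ünïcode.txt"]

# Hypothetical manifest layout (assumption): one digest per path.
manifest = {
    "algorithms": ["sha256"],
    "entries": {path: {"sha256": digest} for path in awkward_paths},
}

serialised = json.dumps(manifest, indent=2)
restored = json.loads(serialised)

# Every path survives serialisation unchanged.
print(restored == manifest)
```

The raw newline in the third path is written as the two-character escape \n inside the JSON string, so the file stays valid JSON and the original path is recovered on load.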

Clipboard Safety

The value passed to pyperclip.copy() is always the output of hexdigest() — a string of lowercase hexadecimal characters ([0-9a-f]+). This cannot be misinterpreted as shell commands even if pasted into a terminal.
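
The property described above is easy to assert in code. This is a sketch only; the helper name is_clipboard_safe is hypothetical and the document does not say that Yhash performs such a check at runtime.

```python
import hashlib
import re

HEX_ONLY = re.compile(r"[0-9a-f]+")


def is_clipboard_safe(value: str) -> bool:
    """True when the whole string consists of lowercase hexadecimal characters."""
    return HEX_ONLY.fullmatch(value) is not None


# The digest of hostile-looking input is still plain hex.
print(is_clipboard_safe(hashlib.sha256(b"payload; rm -rf /").hexdigest()))  # True
print(is_clipboard_safe("abc; rm -rf /"))  # False
```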

Error Isolation in Parallel Processing

In hash_files_parallel() and the recursive directory flow, each file's hashing is wrapped in an individual try/except. A permission error, broken symlink, or read failure on one file produces {"error": str(exc)} in the result dict for that file and does not abort processing of other files.
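
The isolation pattern can be sketched with ThreadPoolExecutor from the standard library. This is not the project's hash_files_parallel(); the names hash_one and hash_many are hypothetical, SHA-256 is hard-coded for brevity, and the sketch catches OSError specifically, whereas the actual code may catch a broader exception type.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Dict, List


def hash_one(path: Path) -> Dict[str, Any]:
    """Hash one file; convert any I/O failure into an error entry instead of raising."""
    try:
        hasher = hashlib.sha256()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(8192), b""):
                hasher.update(chunk)
        return {"hashes": {"sha256": hasher.hexdigest()}}
    except OSError as exc:  # missing file, permission denied, broken symlink, ...
        return {"error": str(exc)}


def hash_many(paths: List[Path], max_workers: int = 4) -> Dict[str, Any]:
    """Hash files concurrently; one failing file never aborts the others."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        return {str(p): r for p, r in zip(paths, pool.map(hash_one, paths))}
```

Catching the exception inside the worker, rather than around the whole pool, is what keeps a single failure local: the executor never sees an exception, so every other future completes normally.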

12. Memory Safety

The core design principle is that Yhash's memory usage is O(1) with respect to file size — it does not scale with how large the file is.

Implementation:

with open(file_path, "rb") as fh:
    while True:
        chunk = fh.read(CHUNK_SIZE)   # CHUNK_SIZE = 8192 bytes
        if not chunk:
            break
        for hasher in hashers.values():
            hasher.update(chunk)      # hash state updated, chunk discarded

Each chunk becomes unreachable as soon as the next read rebinds chunk, so the interpreter can free the buffer right away rather than accumulating data. The hashers keep only their internal state (a fixed-size structure for each algorithm, typically 32–200 bytes), never a copy of the data.

Verified: In the test suite, test_30mb_file_memory_safe (in test_security.py) measures RSS memory before and after hashing a 30 MB file with three algorithms and asserts the increase is under 8 MB — well within normal Python interpreter overhead.
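
A similar check can be sketched with the standard library alone. Note the substitution: the test described above measures process RSS, while this sketch uses tracemalloc, which tracks only allocations made through Python's allocator. The function names, the 8 MiB file size, and the three algorithms chosen are assumptions for illustration.

```python
import hashlib
import os
import tempfile
import tracemalloc

CHUNK_SIZE = 8192


def peak_alloc_while_hashing(path: str) -> int:
    """Return the peak number of traced bytes allocated while streaming the file."""
    hashers = [hashlib.sha256(), hashlib.sha512(), hashlib.blake2b()]
    tracemalloc.start()
    try:
        with open(path, "rb") as fh:
            while True:
                chunk = fh.read(CHUNK_SIZE)
                if not chunk:
                    break
                for hasher in hashers:
                    hasher.update(chunk)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def measure(size_bytes: int) -> int:
    """Create a temporary file of the given size, hash it, and report peak allocation."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(b"\0" * size_bytes)  # written before tracing starts
        return peak_alloc_while_hashing(path)
    finally:
        os.remove(path)


print(measure(8 * 1024 * 1024))
```

If the file were read in one call instead of in chunks, the reported peak would be at least the file size; with streaming it stays a small fraction of it.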

13. Error Handling

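All errors are displayed via display_error() in formatter.py, which prints a styled ERROR prefix followed by the message. Fatal errors then exit with code 1 via sys.exit(1); in batch and recursive modes, a failure on one file is reported and processing continues with the remaining files.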

Error conditions and their handling:

Condition	Detection point	Behaviour
File not found	`cli.py` before hashing	Error message, `sys.exit(1)` for single file; skipped with error for batches
Directory given without `-r`	`cli.py`	Error message, `sys.exit(1)`
Permission denied	`compute_hashes()` `open()`	`PermissionError` caught in `cli.py`, error message
Unsupported algorithm	`_make_hasher()`	`ValueError` with list of supported algorithms
`-chain` with < 2 algorithms	`parse_chain_algorithms()`	`ValueError` with example
`-chain` + algorithm flags	`main()`	Error message, `sys.exit(1)`
`-text` + file arguments	`main()`	Error message, `sys.exit(1)`
`-r` without path	`main()`	Error message, `sys.exit(1)`
`-check` without file	`main()`	Error message, `sys.exit(1)`
Clipboard unavailable	`copy_to_clipboard()`	Returns `False`; warning shown; execution continues
Empty directory with `-r`	`collect_files_recursive()`	Warning message; no hash attempted
Malformed manifest	`load_manifest_file()`	`json.JSONDecodeError` propagates to caller

Warnings (non-fatal) use display_warning() which prints a WARN prefix in yellow. Execution continues after a warning.

14. Test Suite

The test suite is written with Python's built-in unittest module and requires no additional testing framework. It consists of 199 tests across five modules.

Running the tests:

# From the project root, after pip install -e .
python3 -m unittest discover -s tests -v

Test modules:

`tests/test_hasher.py` — 48 tests (Unit)

Tests every public function in hasher.py in isolation.

Class	What it tests
`TestValidateAlgorithm`	All valid names, case normalisation, whitespace stripping, unsupported names, injection attempts
`TestComputeHashes`	Each of the 6 algorithms independently, all 6 together, empty file, binary file, 12 MB streaming correctness, missing file, result key ordering, multi-algo vs individual consistency
`TestChainHashFile`	2-algo chain, 3-algo chain, 6-algo chain, order sensitivity, chain vs independent hash, key ordering, error on < 2 algorithms
`TestChainHashText`	Same as above for text input; unicode, null bytes, empty string, single-algo rejection
`TestHashText`	Each algorithm, multiple independent algorithms, empty string, unicode, null byte, 1M character string, control characters
`TestHashFilesParallel`	All files hashed, hash correctness, multiple algorithms, nonexistent file returns error dict, deterministic across 5 concurrent runs, empty file list

`tests/test_utils.py` — 55 tests (Unit)

Tests every function in utils.py in isolation.

Class	What it tests
`TestFormatFileSize`	Zero, bytes boundary, KB, MB, GB, TB, return type
`TestCollectFilesRecursive`	Nested structure (4 files), files only, single file input, sorted order, empty directory
`TestCollectFilesFlat`	Non-recursive, single file, sorted order
`TestValidateAlgorithms`	Valid list, case, deduplication, order preservation, all 6, unsupported, error message
`TestParseChainAlgorithms`	2 and 3 algos, case, whitespace, single-algo rejection, empty string, unsupported, injection, 6 algos, error message
`TestCreateManifestFile`	File creation, valid JSON, all required fields, entries preserved, algorithms preserved, version type, default cwd path, unicode keys, special chars in keys, ISO timestamp, trailing newline
`TestLoadManifestFile`	Valid load, missing file, malformed JSON, round-trip
`TestGetFinalHash`	Single, multiple (returns last), empty dict
`TestCopyToClipboard`	Returns bool, no crash on unavailable clipboard, no crash on empty string

`tests/test_cli.py` — 47 tests (Integration)

Tests the CLI end-to-end using Click's CliRunner, which invokes main() in-process and captures stdout.

Class	What it tests
`TestInfoCommands`	No args shows usage, `--version`, `-algo`, `--algorithms`, `-about`, `-help`
`TestFileHashing`	All 6 algorithm flags, all 6 together, multiple algorithm flags, missing file, directory without `-r`, JSON output
`TestTextHashing`	Default SHA-256, with `-sha512`, multiple algorithms, empty string, string with spaces, `-text` + file conflict, JSON output
`TestChainHashing`	2-algo file chain, 3-algo file chain, chain with `-text`, intermediate hashes in output, `-chain` + algo flag conflict, single-algo chain rejection, missing file graceful handling, JSON output
`TestVerification`	Correct hash → MATCH, wrong hash → MISMATCH, case-insensitive comparison, SHA-512 check, BLAKE2b check, missing file
`TestMultipleFiles`	All hashes in output, JSON output, with SHA-512
`TestRecursive`	Finds all files in nested structure, missing path arg, nonexistent path
`TestManifest`	Single file manifest creation, recursive manifest creation

`tests/test_security.py` — 25 tests (Security)

Verifies that adversarial inputs cannot exploit the tool.

Class	What it tests
`TestAlgorithmInjection`	12 injection attempts rejected by `validate_algorithm`, `validate_algorithms`, `parse_chain_algorithms`, and the CLI chain flag
`TestFileOperationSafety`	Nonexistent path raises, script file not executed during hashing, manifest written only to specified path, manifest contains no executable code, manifest always valid JSON with special chars, malformed manifest raises not crashes, missing manifest raises FileNotFoundError
`TestInputSanitisation`	Null bytes, null bytes in chain, ANSI escape sequences, 7 unicode edge cases (zero-width space, BOM, emoji, combining chars, repeated nulls), 5 MB string, CLI verify with 10,000-char hash value
`TestMemoryBounds`	30 MB file with 3 algorithms; RSS increase must be < 8 MB
`TestCLIAdversarialInputs`	Conflicting chain + algo flags, text + file conflict, `-r` without path, `-check` without file, nonexistent file, directory as file argument
`TestConcurrencySafety`	10 files hashed in parallel, 5 repeated runs — all results correct and deterministic

`tests/test_system.py` — 24 tests (System / End-to-End)

Tests complete user workflows without mocking any internal components.

Class	What it tests
`TestHashAndVerifyRoundTrip`	All 6 algorithms: hash then verify → MATCH; tampered file → MISMATCH; default algo is SHA-256
`TestChainHashWorkflow`	3-step chain with all intermediates correct; chain with BLAKE2b as final; chain order changes output; chain text 3 steps
`TestMultiAlgorithmWorkflow`	All 6 simultaneously; JSON output contains correct digests; output contains algorithm labels
`TestManifestWorkflow`	Manifest created for all files in a 3-file corpus; all hashes are correct (verified by reading files and comparing); SHA-512 manifest entries correct; algorithms field matches
`TestRecursiveDirectoryWorkflow`	4-file nested structure; all hashes found in output; SHA-512 digests found (handles table folding)
`TestTextHashingWorkflow`	SHA-256 of "abc" correct; MD5 of "" correct; chain deterministic across two runs; unicode text uses UTF-8 encoding
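
For readers adding their own cases, a known-answer test in the same unittest style might look like the sketch below. The class and method names are hypothetical and the tests call hashlib directly rather than Yhash functions; the expected digests are the standard SHA-256 values for "abc" and for empty input.

```python
import hashlib
import unittest


class TestKnownAnswers(unittest.TestCase):
    """Known-answer checks against published SHA-256 test vectors."""

    def test_sha256_abc(self):
        self.assertEqual(
            hashlib.sha256("abc".encode("utf-8")).hexdigest(),
            "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
        )

    def test_sha256_empty(self):
        self.assertEqual(
            hashlib.sha256(b"").hexdigest(),
            "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
        )

    def test_unicode_is_hashed_as_utf8(self):
        # "é" is two bytes in UTF-8, so hashing the text must match hashing those bytes.
        self.assertEqual(
            hashlib.sha256("é".encode("utf-8")).hexdigest(),
            hashlib.sha256(b"\xc3\xa9").hexdigest(),
        )
```

Saved under tests/, a file like this would be picked up by the unittest discover command shown earlier in this section.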

15. Module Reference

`constants.py`

Pure data — no imports from other yhash modules, no side effects.

Name	Type	Value	Purpose
`VERSION`	`str`	`"1.0.0"`	Package version string
`CHUNK_SIZE`	`int`	`8192`	Bytes read per chunk during file streaming
`LARGE_FILE_THRESHOLD`	`int`	`5242880`	File size above which progress bars are shown
`DEFAULT_ALGORITHM`	`str`	`"sha256"`	Algorithm used when no flag is specified
`MANIFEST_EXTENSION`	`str`	`".yhash"`	File extension for manifest files
`SUPPORTED_ALGORITHMS`	`dict[str, str]`	`{"sha256": "SHA-256", ...}`	Internal name → display name mapping
`ALGO_FLAGS`	`set[str]`	`{"-sha256", ...}`	CLI flags for algorithms (lowercase)
`NORMALISABLE_FLAGS`	`set[str]`	`ALGO_FLAGS ∪ {...}`	All flags normalised by argv pre-processor
`ALGO_INFO`	`dict[str, tuple]`	`{"sha256": ("SHA-256", "256-bit", ...)}`	Algorithm metadata for the `-algo` table

`hasher.py`

Function	Signature	Description
`validate_algorithm`	`(str) -> str`	Validate and normalise a single algorithm name
`compute_hashes`	`(Path, List[str], *, progress, task) -> Dict[str, str]`	Stream a file through all hashers simultaneously
`chain_hash_file`	`(Path, List[str], *, progress, task) -> Dict[str, str]`	Chain-hash a file; output of each step feeds the next
`chain_hash_text`	`(str, List[str]) -> Dict[str, str]`	Chain-hash a UTF-8 string
`hash_text`	`(str, List[str]) -> Dict[str, str]`	Hash a string independently with each algorithm
`hash_files_parallel`	`(List[Path], List[str], *, max_workers) -> Dict[str, Any]`	Hash multiple files concurrently via ThreadPoolExecutor

`utils.py`

Function	Signature	Description
`format_file_size`	`(int) -> str`	Convert bytes to human-readable string
`collect_files_recursive`	`(Path) -> List[Path]`	All files under a directory (sorted)
`collect_files_flat`	`(Path) -> List[Path]`	Files directly inside a directory (non-recursive)
`validate_algorithms`	`(List[str]) -> List[str]`	Validate, normalise, and deduplicate algorithm names
`parse_chain_algorithms`	`(str) -> List[str]`	Parse a comma-separated chain spec
`create_manifest_file`	`(Dict, List[str], Optional[Path]) -> Path`	Write a `.yhash` JSON manifest file
`load_manifest_file`	`(Path) -> Dict[str, Any]`	Read and parse a manifest file
`copy_to_clipboard`	`(str) -> bool`	Copy text to system clipboard via pyperclip
`get_final_hash`	`(Dict[str, str]) -> str`	Return the last value in a results dict

`formatter.py`

Function	Description
`print_banner()`	ASCII art banner with version subtitle
`display_error(message)`	Red `ERROR` prefix + message
`display_warning(message)`	Yellow `WARN` prefix + message
`display_success(message)`	Green `OK` prefix + message
`display_hash_results(name, results, *, is_text, file_size)`	Two-line layout for file or text hash results
`display_chain_results(name, chain_results, *, is_text)`	Chain layout with `<- Final Chained Hash` marker
`display_multi_file_results(all_results, algorithms)`	Rich table for multiple files
`display_verification(file_path, expected, computed, algorithm)`	MATCH/MISMATCH panel with full hash display
`display_algo_table()`	Algorithm reference table
`display_about()`	About panel
`display_manifest_created(path, count)`	Success panel after manifest creation
`display_clipboard_success(hash_value)`	Clipboard copy confirmation
`display_json_results(data)`	Plain JSON output (no Rich markup)
`display_help()`	Full usage reference

`cli.py`

Symbol	Description
`cli_entry()`	Shell entry point; normalises argv then calls `main()`
`main()`	Click command; dispatches to the appropriate flow
`_normalise_argv(argv)`	Lowercases recognisable flags for case-insensitive input
`_build_algorithms(...)`	Converts boolean flag values to algorithm name list
`_hash_file(path, algorithms)`	Hash one file; conditionally shows progress bar
`_chain_file(path, algorithms)`	Chain-hash one file; conditionally shows progress bar
`_clipboard(results)`	Copy final hash to clipboard; show warning on failure
`_progress_ctx()`	Factory for a configured Rich Progress instance

Yhash v1.0.0 — MIT License
