Source of truth:
jdm-java(smartbranch) as of 2026-06-20. This document describes the behaviour of the Java implementation so that a functionally-equivalent Python port can be built without referencing the Java source directly.
pydiskmark is a cross-platform disk benchmark utility. It measures sustained sequential and random I/O performance and reports bandwidth (MB/s), latency (ms/block), and IOPS.
The tool runs in two modes:
- CLI — primary mode; invoked with
python -m pydiskmark run. - GUI — desktop interface built with Tkinter + matplotlib; launch with
python -m pydiskmark gui.
| Term | Meaning |
|---|---|
| Benchmark | A single top-level run, may contain one WRITE and/or one READ operation. |
| Operation | One directed I/O phase — either WRITE or READ — within a benchmark. |
| Sample | A single timed measurement unit: writes/reads numBlocks blocks and records elapsed time. |
| Block | The atomic I/O unit; size is configurable (e.g. 512 KB, 4 KB). |
| txSize | Total KB transferred by one operation = blockSizeKb × numBlocks × numSamples. |
| Bandwidth | MB/s = bytes-transferred / elapsed-seconds. |
| Latency | Average time per block in milliseconds = elapsed_ns / 1_000_000 / numBlocks. |
| IOPS | Blocks (ops) per second across the entire operation = totalOps / elapsed_sec. |
KILOBYTE = 1_024 # bytes
MEGABYTE = 1_024 * 1_024 # bytes
GIGABYTE = 1_024 * 1_024 * 1_024
APP_NAME = "pydiskmark"
DATADIRNAME = "pdm-data" # sub-directory created inside the chosen location
PROPERTIES_FILENAME = "pdm.properties"
Binary units (powers of 2) are used throughout — never decimal SI units.
READ — read-only benchmark (files written silently in a prep phase)
WRITE — write-only benchmark
READ_WRITE — write phase followed by a read phase on the same data
READ
WRITE
SEQUENTIAL — blocks accessed in order 0, 1, 2 … N-1
RANDOM — each block index chosen via randint(0, numBlocks-1)
MODERN — positional I/O via os.pwrite / os.pread (POSIX) or
CreateFileW / WriteFile (Windows), with optional Direct I/O
(O_DIRECT on Linux/macOS; FILE_FLAG_NO_BUFFERING on Windows).
Aligned buffers are required when Direct I/O is active.
| Constant | Bytes | Display Label |
|---|---|---|
NONE |
-1 (OS default) | None (OS Default) |
ALIGN_512 |
512 | 512 B (Legacy) |
ALIGN_4K |
4 096 | 4 KB (Standard) — default |
ALIGN_8K |
8 192 | 8 KB (Enterprise) |
ALIGN_16K |
16 384 | 16 KB (High-End) |
ALIGN_64K |
65 536 | 64 KB (RAID/Stripe) |
Each enum member stores a (bytes, display) tuple, with .bytes and .display
properties and a __str__ that returns the display label.
When MODERN engine is used with Direct I/O, the write/read buffer must
be aligned to the selected value. In Python, VirtualAlloc (Windows) returns
64 KB-aligned memory and mmap (POSIX) returns page-aligned memory, both of
which satisfy all alignment values.
Captured once at benchmark start; immutable during a run.
| Field | Type | Default | Notes |
|---|---|---|---|
app_version |
str | from version file | |
profile |
BenchmarkProfile | QUICK_TEST | |
profile_modified |
bool | False | |
benchmark_type |
BenchmarkType | READ_WRITE | |
block_order |
BlockSequence | SEQUENTIAL | |
num_blocks |
int | 32 | blocks per sample |
block_size |
int | bytes (e.g. 512*1024) | block_size_kb * KILOBYTE |
num_samples |
int | 200 | |
num_threads |
int | 1 | |
tx_size |
int | KB | block_size_kb * num_blocks * num_samples |
io_engine |
IoEngine | MODERN | |
direct_io_enabled |
bool | False | |
write_sync_enabled |
bool | False | |
sector_alignment |
SectorAlignment | ALIGN_4K | |
multi_file_enabled |
bool | True | |
test_dir |
str | path | absolute path to data directory |
Helper predicates:
def has_write_operation(self) -> bool:
return self.benchmark_type in (BenchmarkType.WRITE, BenchmarkType.READ_WRITE)
def has_read_operation(self) -> bool:
return self.benchmark_type in (BenchmarkType.READ, BenchmarkType.READ_WRITE)Captured once at benchmark start from the runtime environment.
| Field | Type | Notes |
|---|---|---|
os |
str | platform.system() |
arch |
str | platform.machine() |
processor_name |
str | CPU brand string |
runtime |
str | "Python x.y.z" replaces Java's jdk field |
location_dir |
str | absolute path of the test location |
| Field | Type | Notes |
|---|---|---|
drive_model |
str | human-readable device model |
partition_id |
str | drive letter (Windows) or partition path (Linux/macOS) |
percent_used |
int | 0–100 |
used_gb |
float | |
total_gb |
float |
One timed I/O measurement.
| Field | JSON key | Type | Notes |
|---|---|---|---|
type |
— | IOMode | not serialised |
sample_num |
sn |
int | 1-based sample index |
bw_mb_sec |
bw |
float | bandwidth for this sample (MB/s) |
cum_avg |
bt |
float | running cumulative average bandwidth |
cum_max |
mx |
float | running cumulative maximum |
cum_min |
mn |
float | running cumulative minimum |
access_time_ms |
la |
float | latency: elapsed_ns/1e6 / num_blocks |
cum_acc_time_ms |
lt |
float | running cumulative average latency |
All float fields are rounded to 4 decimal places in JSON/YAML/CSV output.
Aggregated results of one I/O phase.
| Field | Type | Notes |
|---|---|---|
io_mode |
IOMode | READ or WRITE |
block_order |
BlockSequence | |
num_blocks |
int | |
block_size |
int | bytes |
num_samples |
int | |
tx_size |
int | KB |
num_threads |
int | |
write_sync_enabled |
bool | None | None for READ operations |
start_time |
datetime | set on construction |
end_time |
datetime | None | set after all samples complete |
samples |
list[Sample] | all samples in insertion order |
bw_avg |
float | final average bandwidth (MB/s) |
bw_max |
float | |
bw_min |
float | |
acc_avg |
float | final average latency (ms) |
iops |
int | total blocks / elapsed seconds (see §8.4) |
Display helpers:
def get_mode_display(self) -> str:
if self.io_mode == IOMode.WRITE and self.write_sync_enabled:
return "Write*"
return self.io_mode.value
def get_duration_ms(self) -> int | None:
if self.end_time is None:
return None
return int((self.end_time - self.start_time).total_seconds() * 1000)Top-level container for a complete benchmark run.
| Field | JSON key | Type | Notes |
|---|---|---|---|
id |
_id |
str (UUID4) | primary key; string in JSON |
username |
— | str | os.getlogin() or "anonymous" |
system_info |
— | BenchmarkSystemInfo | |
drive_info |
— | BenchmarkDriveInfo | |
config |
— | BenchmarkConfig | |
start_time |
— | datetime | set by record_start_time() |
end_time |
— | datetime | None | set by record_end_time() |
operations |
— | list[BenchmarkOperation] | 1 or 2 elements |
Result text output format:
-------------------------------------------
pydiskmark Benchmark Results (vX.Y)
-------------------------------------------
Profile: <name>
Benchmark: <type>
Drive: <model>
Capacity: <percent>% (<used>/<total> GB)
Timestamp: <start_time>
CPU: <processor_name>
System: <os> / <arch>
Runtime: <runtime_string>
Path: <location_dir>
-------------------------------------------
Order: SEQUENTIAL|RANDOM
IOMode: Read|Write
Thread(s): N
Blocks(size): N(B)
Samples: N
TxSize(KB): N
Speed(MB/s): N.NN
SpeedMin(MB/s): N.NN
SpeedMax(MB/s): N.NN
Latency(ms): N.NN
IOPS: N
-------------------------------------------
Each profile is a named constant with fixed defaults. CLI users can override individual parameters without switching profiles.
| Symbol | Name | Type | Order | Threads | Samples | Blocks | BlockKB | Direct | Sync | Alignment | MultiFile |
|---|---|---|---|---|---|---|---|---|---|---|---|
QUICK_TEST |
Quick Test | READ_WRITE | SEQ | 1 | 50 | 32 | 1024 | Yes | No | 4K | No |
MAX_THROUGHPUT |
Max Throughput | READ_WRITE | SEQ | 1 | 100 | 256 | 1024 | Yes | No | 4K | No |
HIGH_LOAD_RANDOM_T32 |
Random 4K (T32) | READ_WRITE | RAND | 32 | 200 | 128 | 4 | Yes | No | 4K | Yes |
LOW_LOAD_RANDOM_T1 |
Random 4K (T1) | READ_WRITE | RAND | 1 | 150 | 64 | 4 | Yes | No | 4K | No |
MAX_WRITE_STRESS |
Max Write Stress (T4) | WRITE | SEQ | 4 | 250 | 512 | 512 | Yes | Yes | 4K | Yes |
MEDIA_PLAYBACK |
Media Playback | READ | SEQ | 1 | 160 | 64 | 2048 | Yes | No | 4K | No |
VIDEO_EXPORTING |
Video Exporting | WRITE | SEQ | 4 | 500 | 128 | 1024 | Yes | No | 4K | No |
PHOTO_LIBRARY |
Photo Library | READ | RAND | 8 | 1000 | 8 | 128 | Yes | No | 4K | Yes |
<location_dir>/
└── pdm-data/ ← DATADIRNAME
├── testdata.pdm ← single-file mode
└── testdata0.pdm ← multi-file mode (one per sample)
testdata1.pdm
…
testdataN.pdm
File path selection (per sample):
def get_test_file(sample_num: int, config: BenchmarkConfig) -> Path:
if config.multi_file_enabled:
return Path(config.test_dir) / f"testdata{sample_num}.pdm"
return Path(config.test_dir) / "testdata.pdm"Samples are divided evenly across threads. Remainder samples are distributed one-per-thread to the leading threads.
def divide_into_ranges(start_index: int, end_index: int, num_threads: int) -> list[tuple[int,int]]:
"""Returns list of (start, end) exclusive ranges, one per thread."""
n = end_index - start_index
ranges = []
range_size, remainder = divmod(n, num_threads)
start = start_index
for i in range(num_threads):
end = start + range_size + (1 if remainder > 0 else 0)
remainder = max(0, remainder - 1)
ranges.append((start, end))
start = end
return ranges1. Compute total units (for progress bar):
blocks_per_phase = num_blocks × num_samples
w_units = blocks_per_phase if has_write OR is READ (prep phase reuses write counter)
r_units = blocks_per_phase if has_read
units_total = w_units + r_units
2. Query drive info (model, partition, disk usage).
3. Construct Benchmark object; populate system/drive info.
4. Partition sample range into thread ranges:
start = next_sample_number (1-based, global counter)
end = start + num_samples
ranges = divide_into_ranges(start, end, num_threads)
5. record_start_time()
6. Execution:
a. If has_write: run_operation(WRITE, ranges)
b. Else (READ-only): run_read_preparation(ranges) # writes files silently
7. Force progress to 100 % (throttled_progress_update(force=True))
8. Cache drop (if not cancelled AND has_read AND not direct_io OR (direct_io AND macOS)):
call listener.attempt_cache_drop()
9. If has_read AND not cancelled: run_operation(READ, ranges)
10. record_end_time()
11. Return benchmark object.
1. Create BenchmarkOperation for this mode (copy params from config).
2. Launch one thread per range.
3. Per thread, for each sample index s in range:
a. Create Sample(type=mode, sample_num=s)
b. Perform I/O (measure_write or measure_read)
c. update_metrics(sample) — update global running stats
d. Update op cumulative stats: bw_max, bw_min, bw_avg, acc_avg
e. op.samples.append(sample)
f. Increment write_units_complete or read_units_complete (thread-safe)
g. listener.on_sample_complete(sample)
h. throttled_progress_update()
4. Wait for all threads to finish.
5. Set op.end_time; compute op.iops (see §8.4).
def set_total_ops(op: BenchmarkOperation, total_ops: int) -> None:
elapsed_ns = (op.end_time - op.start_time).total_seconds() * 1e9
if elapsed_ns > 0:
op.iops = round(total_ops / (elapsed_ns / 1e9))total_ops is the sum of blocks completed across all threads for this mode
(i.e. write_units_complete or read_units_complete).
Called after each sample, before notifying the listener.
# For WRITE samples:
if w_max == -1 or sample.bw_mb_sec > w_max: w_max = sample.bw_mb_sec
if w_min == -1 or sample.bw_mb_sec < w_min: w_min = sample.bw_mb_sec
n = sample.sample_num
if w_avg == -1:
w_avg = sample.bw_mb_sec
else:
w_avg = ((n - 1) * w_avg + sample.bw_mb_sec) / n
if w_acc == -1:
w_acc = sample.access_time_ms
else:
w_acc = ((n - 1) * w_acc + sample.access_time_ms) / n
sample.cum_avg = w_avg
sample.cum_max = w_max
sample.cum_min = w_min
sample.cum_acc_time_ms = w_acc
# Mirror for READ using r_* variablesProgress updates to the listener are rate-limited:
UPDATE_INTERVAL_MS = 25
def throttled_progress_update(force: bool = False) -> None:
now_ms = time.monotonic_ns() // 1_000_000
elapsed = now_ms - last_update_ms
completed = write_units_complete + read_units_complete
if force or elapsed >= UPDATE_INTERVAL_MS:
percent = int(completed / units_total * 100)
percent = max(0, min(100, percent))
listener.on_progress_update(percent, 100)
last_update_ms = now_msOpen file with: WRITE | CREATE
+ DSYNC if write_sync_enabled
+ O_DIRECT if direct_io_enabled (Linux/macOS only)
Allocate aligned buffer of block_size bytes (alignment = sector_alignment.bytes,
or default system alignment if NONE).
For b in range(num_blocks):
if cancelled: break
block_index = randint(0, num_blocks-1) if RANDOM else b
byte_offset = block_index * block_size
pwrite(fd, buffer, byte_offset) or equivalent positional write
total_bytes_written += block_size
runner.update_write_progress()
elapsed_ns = end_ns - start_ns
access_time_ms = (elapsed_ns / 1e6) / num_blocks
bw_mb_sec = (total_bytes_written / MEGABYTE) / (elapsed_ns / 1e9)
Open file with: READ
+ O_DIRECT if direct_io_enabled
For b in range(num_blocks):
if cancelled: break
block_index = randint(0, num_blocks-1) if RANDOM else b
byte_offset = block_index * block_size
pread(fd, buffer, byte_offset)
total_bytes_read += block_size
runner.update_read_progress()
elapsed_ns = end_ns - start_ns
access_time_ms = (elapsed_ns / 1e6) / num_blocks
bw_mb_sec = (total_bytes_read / MEGABYTE) / (elapsed_ns / 1e9)
Used when benchmark_type == READ (no prior WRITE phase). Writes sequential
data to the test file without timing or recording bandwidth. Uses the write
units counter so the progress bar reflects preparation work.
Open file with: WRITE | CREATE | TRUNCATE
For b in range(num_blocks):
if cancelled: break
byte_offset = b * block_size
pwrite(fd, buffer, byte_offset)
runner.update_write_progress()
Must be attempted before the READ phase of a READ or READ_WRITE benchmark
(to prevent reads hitting the page cache).
| OS | Privileged | Action |
|---|---|---|
| Linux | root | sync then echo 1 > /proc/sys/vm/drop_caches |
| Linux | non-root | Print instructions; block until user presses Enter |
| macOS | root | sync; sudo purge |
| macOS | non-root | Print instructions; block until user presses Enter |
| Windows | admin | Run EmptyStandbyList.exe from install dir |
| Windows | non-admin | Print instructions; block until user presses Enter |
Skip cache drop entirely when:
- Direct I/O is enabled and OS is not macOS (kernel bypasses page cache).
- Benchmark was cancelled.
| OS | Method |
|---|---|
| Linux | lsblk -o NAME,MODEL, resolve via /sys/block/<dev>/device/model |
| macOS | diskutil info <device> → Device / Media Name |
| Windows | WMI query Win32_DiskDrive → Model, mapped via drive letter |
| OS | Field |
|---|---|
| Linux | /proc/mounts or df → resolves symlinks to /dev/sdX or /dev/nvmeXnY |
| macOS | df then diskutil info |
| Windows | Extract drive letter from path root (C, D, …) |
| OS | Command |
|---|---|
| Linux / macOS | df -k <path> — parse 1K-blocks, Used, Use% columns |
| Windows | WMI Win32_LogicalDisk or GetDiskFreeSpaceEx |
Result fields: percent_used, used_gb, total_gb.
| OS | Method |
|---|---|
| Linux | Parse /proc/cpuinfo → model name |
| macOS | sysctl -n machdep.cpu.brand_string |
| Windows | WMIC CPU get Name or winreg → HKLM\HARDWARE\DESCRIPTION\System\CentralProcessor\0\ProcessorNameString |
| OS | Support |
|---|---|
| Linux | os.O_DIRECT flag; buffer must be sector-aligned |
| macOS | fcntl.F_NOCACHE via fcntl(fd, fcntl.F_NOCACHE, 1) |
| Windows | FILE_FLAG_NO_BUFFERING via CreateFile (ctypes or pywin32) |
If Direct I/O open fails, fall back silently to buffered I/O and log a warning.
| OS | Mechanism |
|---|---|
| Linux | Open with O_SYNC or O_DSYNC; alternatively os.fsync(fd) per write |
| macOS | Same as Linux |
| Windows | FILE_FLAG_WRITE_THROUGH via CreateFile |
Settings are stored in a Java .properties-style flat-text file:
~/.pdm/<version>/pdm.properties
Format: key=value, one per line, # comment lines.
| Key | Default | Notes |
|---|---|---|
activeProfile |
QUICK_TEST |
|
profileModified |
false |
|
benchmarkType |
READ_WRITE |
|
blockSequence |
SEQUENTIAL |
|
numOfSamples |
200 |
|
numOfBlocks |
32 |
|
blockSizeKb |
512 |
|
numOfThreads |
1 |
|
ioEngine |
MODERN |
|
writeSyncEnable |
false |
|
directEnable |
false |
|
sectorAlignment |
ALIGN_4K |
|
multiFile |
true |
|
autoRemoveData |
true |
|
autoReset |
true |
Note: The Python port also persists
theme,locationDirto cover the most-used GUI preferences. Properties not relevant to the Python port (palette,sharePortal,uploadResourceLocator,uploadProtocol) are omitted.
Entry point: python -m pydiskmark run [OPTIONS]
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
--profile |
-p |
str | QUICK_TEST |
Named profile |
--type |
-t |
str | profile default | READ, WRITE, READ_WRITE |
--threads |
-T |
int | profile default | Number of concurrent threads |
--order |
-o |
str | profile default | SEQUENTIAL, RANDOM |
--blocks |
-b |
int | profile default | Blocks per sample |
--block-size |
-z |
int | profile default | Block size in KB |
--samples |
-n |
int | profile default | Number of samples |
--direct |
-d |
flag | False | Enable Direct I/O |
--write-sync |
-y |
flag | False | Enable write-sync |
--alignment |
-a |
str | ALIGN_4K |
NONE, ALIGN_512, ALIGN_4K, ALIGN_8K, ALIGN_16K, ALIGN_64K (profile default used if not specified) |
--multi-file |
-m |
flag | False | One file per sample |
--location |
-l |
path | $HOME |
Directory for test files |
--export |
-e |
path | None | Export results to JSON file |
--save |
-s |
flag | False | Persist to local database |
--clean |
-c |
flag | False | Delete existing data dir first |
--verbose |
-v |
flag | False | Verbose logging |
Override precedence: explicit CLI option > profile default.
1. Load profile defaults into App state.
2. Apply any explicit CLI overrides.
3. Set location_dir; derive data_dir = location_dir / "pdm-data".
4. Validate location_dir is writable.
5. If --clean and data_dir exists: delete recursively.
6. Create data_dir if not present.
7. init() — collect OS/CPU info.
8. Print progress bar during benchmark.
9. Print result text after completion.
10. Export JSON if --export specified.
11. Save to DB if --save specified.
12. Remove data_dir if auto_remove_data is True.
Progress: [########## ] 50% (50/100 units)
- Length: 50 characters of
#/ space. - Rendered with
\r(carriage return, no newline) atUPDATE_INTERVAL = 25 ms. - Cursor hidden during run (
\x1b[?25l), restored after (\x1b[?25h).
Full serialisation of the Benchmark object tree using the JSON field names
defined in §5. Pretty-printed, 2-space indent.
{
"_id": "a1b2c3d4-e5f6-...",
"username": "james",
"config": { ... },
"systemInfo": { ... },
"driveInfo": { ... },
"startTime": "2026-06-20T14:30:00",
"endTime": "2026-06-20T14:31:30",
"operations": [
{
"ioMode": "WRITE",
"samples": [
{ "sn": 1, "bw": 523.4, "bt": 523.4, "la": 0.95, "lt": 0.95, "mn": 523.4, "mx": 523.4 }
],
"bandwidth": 523.4,
"latency": 0.95,
"iops": 12345
}
]
}Same structure as JSON, emitted as YAML without --- document-start marker.
Flat table of samples with metadata header as # comment lines.
# pydiskmark x.y Benchmark Summary
# ---------------------------
# Date: 2026-06-20 14:30:00
# Model: Samsung 990 Pro
# Profile: Quick Test
# Type: READ_WRITE
# Threads: 1
# Order: SEQUENTIAL
# Blocks: 32
# BlockSize: 524288
# Samples: 50
# WRITE Result: bw 523.40 MB/s, lat 0.95 ms, iops 12345
# READ Result: bw 610.20 MB/s, lat 0.82 ms, iops 14321
# ---------------------------
sn,ioMode,bw,bt,la,lt,mn,mx
1,WRITE,523.4,523.4,0.95,0.95,523.4,523.4
...
The Java version uses Apache Derby (embedded SQL) with JPA/Hibernate. For the
Python port use SQLite (sqlite3 stdlib) as the embedded database.
Schema tables: benchmark, benchmark_operation.
benchmark.id— UUID string (primary key)benchmark_operation.benchmark_id— foreign key
The database file lives at:
~/.pdm/<version>/pdm.db
Operations to implement: save, find_all, delete_all, delete_by_ids.
BenchmarkRunner is decoupled from output via a listener protocol:
class BenchmarkListener(Protocol):
def on_sample_complete(self, sample: Sample) -> None: ...
def on_progress_update(self, completed: int, total: int) -> None: ...
def is_cancelled(self) -> bool: ...
def attempt_cache_drop(self) -> None: ...The CLI implementation of attempt_cache_drop calls the OS-specific cache
drop logic (§10.1) and blocks until the user confirms or completes.
These globals are equivalent to Java's static App.* fields.
In Python, encapsulate in an AppState dataclass or module-level variables.
| Variable | Type | Default | Notes |
|---|---|---|---|
location_dir |
Path | None | where test files go |
data_dir |
Path | None | location_dir / "pdm-data" |
export_path |
Path | None | |
auto_save |
bool | False | persist to DB |
verbose |
bool | False | |
multi_file |
bool | True | |
auto_remove_data |
bool | True | delete data dir after run |
auto_reset |
bool | True | reset running stats before run |
direct_enable |
bool | False | |
write_sync_enable |
bool | False | |
io_engine |
IoEngine | MODERN | |
sector_alignment |
SectorAlignment | ALIGN_4K | |
active_profile |
BenchmarkProfile | QUICK_TEST | |
profile_modified |
bool | False | |
benchmark_type |
BenchmarkType | READ_WRITE | |
block_sequence |
BlockSequence | SEQUENTIAL | |
num_of_samples |
int | 200 | |
num_of_blocks |
int | 32 | |
block_size_kb |
int | 512 | |
num_of_threads |
int | 1 | |
next_sample_number |
int | 1 | global monotonic counter |
os |
str | — | from platform |
arch |
str | — | from platform |
processor_name |
str | — | |
username |
str | — | from os |
Running stats (reset between benchmarks when auto_reset = True):
w_max = w_min = w_avg = w_acc = w_iops = -1.0
r_max = r_min = r_avg = r_acc = r_iops = -1.0jdm-python/
├── pydiskmark/
│ ├── __init__.py
│ ├── __main__.py ← entry point: python -m pydiskmark
│ ├── app.py ← global state, init(), get_config()
│ ├── benchmark.py ← Benchmark, BenchmarkConfig, BenchmarkSystemInfo,
│ │ BenchmarkDriveInfo, BenchmarkType, IOMode,
│ │ BlockSequence, IoEngine, SectorAlignment
│ ├── benchmark_operation.py
│ ├── benchmark_profile.py ← enum of pre-defined profiles
│ ├── benchmark_runner.py ← BenchmarkRunner, BenchmarkListener
│ ├── cli.py ← argparse entry point, CliListener, results printer
│ ├── exporter.py ← JSON / YAML / CSV export
│ ├── io_engine.py ← cross-platform Direct I/O: alloc_aligned, open_file,
│ │ pwrite, pread, close_file, free_aligned
│ ├── sample.py ← Sample, measure_write, measure_read, prepare_read
│ ├── util.py ← randint, delete_directory, etc.
│ └── util_os.py ← OS-specific: drive model, cache drop, disk usage,
│ processor name, partition id
├── tests/
│ ├── test_phase1.py
│ ├── test_phase2.py
│ └── test_phase3.py
├── pyproject.toml
├── README.md
└── SPEC.md ← this file
| Java concern | Python equivalent |
|---|---|
ExecutorService.newFixedThreadPool(N) |
concurrent.futures.ThreadPoolExecutor(N) |
LongAdder / AtomicLong |
threading.Lock + int, or threading.local partial sums |
RandomAccessFile (legacy) |
removed — only MODERN engine is implemented |
FileChannel + MemorySegment (modern) |
os.open() + os.pwrite() / os.pread() |
ExtendedOpenOption.DIRECT |
os.O_DIRECT (Linux); fcntl.F_NOCACHE (macOS) |
StandardOpenOption.DSYNC |
os.O_DSYNC (Linux/macOS) or os.fsync() per write |
| JPA/Derby database | sqlite3 (stdlib) |
| Jackson JSON serialiser | json stdlib or dataclasses-json |
System.nanoTime() |
time.perf_counter_ns() |
| GC detection / hints | Not applicable in Python — removed from model |
picocli |
argparse or click |
Single-instance lock (FileLock) |
Not required for CLI-only port |
- SMART data collection (
Smart.java,SmartPanel.java) - Community portal upload (
Portal.java) - Windows MSI / Linux DEB / macOS PKG packaging
Status: Implemented. Launch with
python -m pydiskmark gui.
The GUI provides a visual desktop interface that replicates the Java Swing
frontend's layout and functionality. It uses the same BenchmarkRunner /
BenchmarkListener pipeline as the CLI — no changes to the engine layer.
Chosen: Tkinter + matplotlib + sv-ttk
| Factor | Decision |
|---|---|
| Toolkit | tkinter (stdlib) — zero runtime cost, ships with Python |
| Theme | sv-ttk (Sun Valley) — modern flat dark/light appearance |
| Chart | matplotlib via FigureCanvasTkAgg — dual-axis, real-time, well-packaged |
| Threading | Queue + root.after(50ms) polling — all Tkinter mutations on main thread |
Additional runtime dependencies: matplotlib>=3.7, sv-ttk>=2.5, Pillow>=9 (optional — used for splash and About icons; falls back to emoji if absent).
┌────────────────────────────────────────────────────────────────┐
│ Menu: File | Action | Options | Help │
├───┬────────────────────────────────────────────────────────────┤
│ D │ Drives tab selected: │
│ r │ DrivesPanel fills the ENTIRE content area │
│ i │ (drive selector, info card, all-drives table, test dir) │
│ v │ │
│ e │ Benchmark tab selected: │
│ s │ ┌──────────────┬──────────────────────────────────────┐ │
│ │ │ ControlPanel │ ChartPanel │ │
│ B │ │ (320 px) │ (matplotlib dual-axis, fills rest) │ │
│ e │ │ settings / │ │ │
│ n │ │ start/stop / │ │ │
│ c │ │ metrics grid │ │ │
│ h │ └──────────────┴──────────────────────────────────────┘ │
├───┴────────────────────────────────────────────────────────────┤
│ [Benchmark Operations] [Events] │
│ HistoryPanel — treeview of past operations │
├────────────────────────────────────────────────────────────────┤
│ Status text [progress bar] Total Tx (KB): N │
└────────────────────────────────────────────────────────────────┘
pydiskmark/
├── db.py ← SQLite persistence (Phase 5)
└── gui/
├── __init__.py ← launch_gui() entry point
├── theme.py ← sv_ttk dark/light + chart colour palette
├── listener.py ← GuiListener — queue-based BenchmarkListener
├── chart_panel.py ← matplotlib FigureCanvasTkAgg, dual-axis
├── control_panel.py ← Benchmark tab — settings combos + results grid
├── drives_panel.py ← Drives tab — drive list, info card, dir chooser
├── history_panel.py ← Benchmark Operations tab — DB history treeview
└── main_window.py ← MainWindow orchestrator
- Top: drive selector dropdown (Drive letter — capacity)
- Left sub-panel: Drive Info card
- Drive model, partition, usage percentage, used/total GB
- Access indicators: Read ✓ / Write ✓
- Usage progress bar
- Right sub-panel: All Drives table
- Columns: Drive/Mount | Total (GB) | Used (GB) | Free (GB) | Usage %
- Click a row → update drive info card + set
app.location_dir
- Bottom: Test Directory path display + Browse button
- Profile, Type, Threads, Block Order, Blocks/Sample, Block Size (KB), Samples — dropdowns
- Start / Stop button (toggles, disables combos during run)
- Results grid: 3-column (Metric | Write | Read) for MB/s, Lat (ms), IOPS
- Left Y-axis: Bandwidth (MB/s)
- Right Y-axis: Latency (ms)
- X-axis: Sample number
- Series:
- Write BW — solid orange line
- Write Avg — dashed orange line
- Write Latency — small orange square markers (right axis)
- Read BW — solid cyan line
- Read Avg — dashed cyan line
- Read Latency — small cyan square markers (right axis)
- Batched redraws every 3rd sample to avoid lag at high sample counts
set_title(str)— updates the chart suptitle with drive inforetheme()— reapply colours after dark/light toggle
Worker thread (BenchmarkRunner)
└─ GuiListener.on_sample_complete() ──→ queue.put((EVT_SAMPLE, sample))
└─ GuiListener.on_progress_update() ──→ queue.put((EVT_PROGRESS, %, %))
└─ GuiListener.attempt_cache_drop() ──→ queue.put((EVT_CACHE_DROP, event))
blocks until main thread sets event
└─ _run_worker() completes ──→ queue.put((EVT_COMPLETE, benchmark))
Main thread (Tkinter)
└─ _poll_queue() every 50 ms via root.after()
drains queue → updates chart, progress bar, metrics labels
on EVT_COMPLETE → saves to DB, refreshes history, re-enables controls
After the worker thread dies, _poll_queue performs one extra drain to
catch EVT_COMPLETE posted just before thread exit (eliminates the
"Benchmark cancelled" false-positive race condition).
The main window is built entirely off-screen to avoid flicker:
1. root.withdraw() — hide window before building
2. _SplashScreen shown — borderless Toplevel, progress bar 0–100
3. Each build step advances the progress bar (10 % → 80 %)
4. DrivesPanel.refresh() — FAST phase only (shutil.disk_usage, <5 ms)
Background thread starts; results delivered via queue.Queue
5. update_idletasks() + retheme()
6. root.wm_attributes("-alpha", 0) — compositor-level invisible
7. root.deiconify()
8. root.update() — drain full event queue (background
patches, matplotlib draws) while invisible
9. root.wm_attributes("-alpha", 1) — snap to fully-rendered in one frame
10. splash.close() — dismiss only after main window is opaque
DrivesPanel two-phase refresh:
| Phase | Thread | What happens |
|---|---|---|
| FAST | Main | shutil.disk_usage() only — populates table with "..." in Model column, <5 ms |
| SLOW | Daemon | get_drive_model(), get_filesystem(), get_bus_type(), get_sector_sizes() per drive; selected drive processed first |
Results are handed back to the main thread via queue.Queue + a
50 ms after() poller (_poll_update_queue). self.after() is
never called from the background thread — Tkinter's _register()
is not thread-safe in Python 3.14+.
DB location: ~/.pdm/<version>/pdm.db (SQLite, stdlib sqlite3).
Schema — one row per benchmark operation:
CREATE TABLE benchmark_ops (
id INTEGER PRIMARY KEY AUTOINCREMENT,
group_id TEXT, -- UUID shared by ops from the same run
drive_model TEXT,
partition_id TEXT,
profile TEXT,
benchmark_type TEXT,
io_mode TEXT, -- WRITE | READ
block_order TEXT,
num_samples INTEGER,
num_blocks INTEGER,
block_size_kb INTEGER,
num_threads INTEGER,
start_time TEXT,
elapsed_ms INTEGER,
lat_avg_ms REAL,
iops INTEGER,
bw_mb_sec REAL,
data_json TEXT -- full benchmark JSON for chart replay
);Benchmarks are auto-saved after each successful run. Double-clicking a row in the "Benchmark Operations" history tab replays that benchmark in the chart without re-running I/O.
1. User selects drive (Drives tab) — sets location_dir
2. User selects profile / adjusts settings (Benchmark tab)
3. User clicks Start (or Ctrl+R)
4. GUI disables controls, clears chart, resets metrics
5. BenchmarkRunner.execute() runs in a daemon thread
6. on_sample_complete() → queue → chart.add_sample() every 50 ms
7. on_progress_update() → queue → progress bar update
8. attempt_cache_drop() → queue → modal dialog, blocks worker until dismissed
9. EVT_COMPLETE → db.save_benchmark() → history.refresh() → re-enable controls
10. Elapsed time shown in status bar; drive info refreshed
- File → Export: file dialog → JSON / YAML / CSV (reuses
exporter.export()) - Double-click history row: replay historical benchmark in chart (no re-run)
- Left-tab layout matches jdm-java (Drives | Benchmark vertical tabs)
- Dark mode by default via sv_ttk; Options → Toggle Theme switches to light
- Settings persisted to
~/.pdm/<version>/pdm.propertieson exit; loaded at startup - Default
benchmarkTypeisREAD_WRITEwhen no config file exists - Keyboard shortcuts:
Ctrl+R(start),Esc(stop) - About dialog centred on the parent window (
transient + grab_set) - Window title:
pydiskmark <version> — <arch> — <CPU> - Chart suptitle:
<drive model> — <partition>: <pct>% (<used>/<total> GB)
The ControlPanel widget uses a 3-column Tkinter grid that replicates
jdiskmark's Swing layout. The container frame has a fixed width=320 px
(pack_propagate(False)).
| Col | Role | Weight | Effect |
|---|---|---|---|
| 0 | Narrow label anchor (Profile / Type rows only) | 0 (fixed) | minsize=60 px — holds the short "Profile" / "Type" label text |
| 1 | Middle — label overflow + combo left-edge | 2 | Receives ~40 % of the expandable space |
| 2 | Combo-only column (rows 3–7) | 3 | Receives ~60 % of the expandable space |
Columns 1 and 2 together fill all space beyond col 0's 60 px minimum. The 2 : 3 weight ratio produces a 40 % : 60 % split, matching jdiskmark.
Row │ Widget │ Col span (label) │ Col span (combo)
────┼─────────────────┼──────────────────┼──────────────────
0 │ Profile │ col 0 only │ cols 1+2, sticky="ew"
1 │ Type │ col 0 only │ cols 1+2, sticky="ew"
2 │ Threads │ cols 0+1 │ col 2 only, sticky="ew"
3 │ Block Order │ cols 0+1 │ col 2 only, sticky="ew"
4 │ Blocks / Sample │ cols 0+1 │ col 2 only, sticky="ew"
5 │ Block Size (KB) │ cols 0+1 │ col 2 only, sticky="ew"
6 │ Samples │ cols 0+1 │ col 2 only, sticky="ew"
7 │ Start button │ cols 0+1+2 (columnspan=3), sticky="ew"
8 │ Separator │ cols 0+1+2 (columnspan=3), sticky="ew"
9 │ Results frame │ cols 0+1+2 (columnspan=3), sticky="ew"
Col 0 (≥60 px) │ Col 1 (40 %) │ Col 2 (60 %) │
────────────────┼────────────────────────────────────┤
Profile │ [Profile combo ── spans 1+2 ──] ▼│
Type │ [Type combo ──── spans 1+2 ──] ▼│
[Threads ─── spans 0+1 ──────] │ [ 1 ] ▼│
[Block Order ─ spans 0+1 ──] │ [Sequential] ▼│
[Blocks / Sample ─ spans 0+1] │ [ 512 ] ▼│
[Block Size (KB) ─ spans 0+1] │ [ 512 ] ▼│
[Samples ─── spans 0+1 ──────] │ [ 250 ] ▼│
[────────────── Start ──────────────────────────] │
- No
width=hint on combos. All combos usesticky="ew"to fill their column. Hardcodedwidth=values fight with Tkinter's grid geometry and should not be set. pack_propagate(False)onctrl_frameis essential. Without it the frame would shrink to the minimum size of its children.- 40 : 60 split ensures the shorter numeric combos (Threads, Samples, …) are proportionally narrower than the Profile / Type combos, matching jdiskmark.
- Column 0
weight=0prevents the label anchor from expanding; all growth goes to cols 1 and 2 in a 2 : 3 ratio.
control_panel.py—_build_ui()methodmain_window.py—ctrl_frame = ttk.Frame(bench_page, width=320)
End of specification.