dprism

Decompose your data at the speed of light.

htop meets pandas-profiling -- written in Rust.

What is dprism?

dprism is a terminal-native tool that lets you explore, profile, and understand datasets instantly -- without leaving your terminal, without spinning up Jupyter, without writing a single line of code.

Built with Polars for lightning-fast data processing and Ratatui for a beautiful, responsive TUI.

┌─────────────────────────────────────────────────────────────────┐
│ dprism   sales.csv   1,000,000 rows x 12 cols | 48 MiB     │
├──────────────────────────┬──────────────────────────────────────┤
│  Columns (12)        │  Column Profile                  │
│                          │                                      │
│  #  Column      Type  N  │  revenue  [f64]                     │
│  1  id          i64  0%  │                                      │
│  2  date        date 0%  │  ─── Overview ───                   │
│ >3  revenue     f64  2%  │  Count ............ 1,000,000       │
│  4  region      str  0%  │  Null count ........ 20,000         │
│  5  category    str  1%  │  Null % ............ 2.00%          │
│                          │  Unique ............ 847,231        │
│                          │                                      │
│                          │  ─── Statistics ───                  │
│                          │  Mean ............. 4,521.87        │
│                          │  Median ........... 3,200.00        │
│                          │  Std Dev .......... 2,876.43        │
│                          │  Q1 (25%) ......... 1,250.00       │
│                          │  Q3 (75%) ......... 6,800.00       │
│                          │  Min .............. 0.50            │
│                          │  Max .............. 99,999.99       │
│                          │  (!) Outliers ........ 42 (0.4%)     │
│                          │                                      │
│                          │  ─── Distribution ───               │
│                          │      0.5 ▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 312       │
│                          │  10000.5 ▓▓▓▓▓▓▓▓▓▓ 247            │
│                          │  20000.5 ▓▓▓▓▓▓▓▓ 198              │
│                          │                                      │
│                          │  ─── Top Values ───                 │
│                          │  99.99        ████████████ 1,204    │
│                          │  49.99        ████████ 892          │
├──────────────────────────┴──────────────────────────────────────┤
│ ↑/k Up  ↓/j Down  / Search  Tab Cycle  c Corr  f Filter  q Q  │
└─────────────────────────────────────────────────────────────────┘

Features

Data Loading

Multi-format -- CSV, Parquet, and Arrow IPC out of the box
Instant loading -- powered by Polars, handles multi-GB files
stdin piping -- cat data.csv | dprism explore -
Progress indicator -- spinner for large files

Profiling & Statistics

Column profiling -- mean, median, std dev, Q1/Q3, min/max, null %, unique count
Inline histograms -- distribution plots for numeric columns
Correlation matrix -- pairwise Pearson correlations between numeric columns
Outlier detection -- IQR-based outlier highlighting
Top values -- frequency bar chart per column
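
To make the outlier rule concrete, here is a minimal, dependency-free sketch of IQR-based detection. It is not taken from the dprism source: the function names, the linear-interpolation quantile, and the conventional 1.5 x IQR fence are all assumptions, and the input is assumed to contain no NaN values.

```rust
/// Quantile by linear interpolation between the two nearest ranks.
/// `sorted` must be non-empty and sorted ascending; `q` is in [0, 1].
fn quantile(sorted: &[f64], q: f64) -> f64 {
    let pos = q * (sorted.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo as f64)
}

/// Returns the values lying outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
/// The 1.5 multiplier is the usual convention, assumed here.
fn iqr_outliers(values: &[f64]) -> Vec<f64> {
    if values.is_empty() {
        return Vec::new();
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let q1 = quantile(&sorted, 0.25);
    let q3 = quantile(&sorted, 0.75);
    let iqr = q3 - q1;
    let (low, high) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr);
    // Keep the original order of the flagged values.
    values
        .iter()
        .copied()
        .filter(|v| *v < low || *v > high)
        .collect()
}
```

For the nine values 1 through 8 plus 100, Q1 is 3 and Q3 is 7, so the fences are -3 and 13 and only 100 is flagged.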

Interactive TUI

Data preview -- toggle between profile and raw data view with Tab
Column search -- press / to find columns by name
Interactive filtering -- press f to filter rows (e.g., age>30, name=Alice)
Correlation view -- press c to see the correlation matrix
Type detection -- colour-coded data types (int, float, string, bool, date)
Keyboard-driven -- vim-style navigation (j/k, g/G)

CLI Tools

Schema validation -- validate datasets against YAML rules
Dataset diff -- compare two datasets and see changes
JSON export -- dump stats for CI/automation
Config file -- ~/.dprism.toml for persistent preferences

Installation

From source (requires Rust 1.75+)

git clone https://github.com/whispem/dprism.git
cd dprism
cargo install --path .

From crates.io

cargo install dprism

Usage

Explore a dataset

dprism explore data.csv
dprism explore warehouse.parquet
dprism explore events.arrow

Pipe from stdin

cat data.csv | dprism explore -
curl -s https://example.com/data.csv | dprism explore -
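
The `-` argument follows the common Unix convention of standing in for standard input. One way a tool could support it is sketched below using only the standard library. The helper names `open_input` and `count_lines` are hypothetical and say nothing about how dprism itself is written.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, Read};

/// Opens `path` for reading, treating "-" as standard input.
fn open_input(path: &str) -> io::Result<Box<dyn Read>> {
    if path == "-" {
        Ok(Box::new(io::stdin()))
    } else {
        Ok(Box::new(File::open(path)?))
    }
}

/// Counts the lines in any reader, so files and pipes share one code path.
fn count_lines<R: Read>(reader: R) -> io::Result<usize> {
    let mut n = 0;
    for line in BufReader::new(reader).lines() {
        line?; // propagate I/O or UTF-8 errors
        n += 1;
    }
    Ok(n)
}
```

Returning a boxed `Read` keeps the rest of the loader unaware of where the bytes come from.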

Explore with filtering

dprism explore data.csv --filter "age > 30"
dprism explore data.csv --filter "department = Engineering"
dprism explore data.csv --filter "city ~ York"

Options

dprism explore data.csv --delimiter ';'              # Custom CSV delimiter
dprism explore data.csv --no-header                  # First row is data
dprism explore data.csv --head 10000                 # Load first 10k rows
dprism explore data.csv --export-stats stats.json    # Export & exit

Tip: The alias ex works too: dprism ex data.csv

Validate a dataset against a YAML schema

dprism validate data.csv schema.yaml

Example schema (schema.yaml):

columns:
  age:
    type: Int64
    nullable: false
    min: 0
    max: 150
  department:
    type: String
    values: [Engineering, Marketing, Sales, HR]
  salary:
    type: Float64
    nullable: true
    min: 0

Schema rules: type, nullable, min, max, unique, values.
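
As an illustration of what checking the nullable, min, and max rules involves, the sketch below validates one numeric column held in memory. The `NumericRule` struct, the `check_column` function, and the message wording are hypothetical; the real validator and its output format may differ.

```rust
/// Hypothetical in-memory form of one numeric column rule from the schema.
struct NumericRule {
    nullable: bool,
    min: Option<f64>,
    max: Option<f64>,
}

/// Checks each cell (None = null) and returns one message per violation.
fn check_column(name: &str, rule: &NumericRule, cells: &[Option<f64>]) -> Vec<String> {
    let mut errors = Vec::new();
    for (row, cell) in cells.iter().enumerate() {
        match cell {
            None if !rule.nullable => {
                errors.push(format!("{}[{}]: null not allowed", name, row));
            }
            None => {}
            Some(v) => {
                if rule.min.map_or(false, |m| *v < m) {
                    errors.push(format!("{}[{}]: {} is below min", name, row, v));
                }
                if rule.max.map_or(false, |m| *v > m) {
                    errors.push(format!("{}[{}]: {} is above max", name, row, v));
                }
            }
        }
    }
    errors
}
```

With the `age` rule from the example schema (not nullable, min 0, max 150), a column containing a null, a -1, and a 200 would produce three messages.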

Compare two datasets

dprism diff old.csv new.csv
dprism diff v1.parquet v2.parquet

Output includes schema changes (columns added/removed/type changed), row count deltas, and per-column value summaries.
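
The schema part of such a comparison can be sketched with two maps from column name to type name. This is an assumption-laden illustration: `SchemaChange` and `diff_schemas` are hypothetical names, and types are plain strings rather than Polars data types.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum SchemaChange {
    Added(String),
    Removed(String),
    TypeChanged { column: String, from: String, to: String },
}

/// Compares two schemas given as (column name, type name) pairs.
/// Removed and retyped columns come first (sorted by name), then added ones.
fn diff_schemas(old: &[(&str, &str)], new: &[(&str, &str)]) -> Vec<SchemaChange> {
    let old_map: BTreeMap<&str, &str> = old.iter().copied().collect();
    let new_map: BTreeMap<&str, &str> = new.iter().copied().collect();
    let mut changes = Vec::new();
    for (name, old_ty) in &old_map {
        match new_map.get(name) {
            None => changes.push(SchemaChange::Removed(name.to_string())),
            Some(new_ty) if new_ty != old_ty => changes.push(SchemaChange::TypeChanged {
                column: name.to_string(),
                from: old_ty.to_string(),
                to: new_ty.to_string(),
            }),
            Some(_) => {}
        }
    }
    for name in new_map.keys() {
        if !old_map.contains_key(name) {
            changes.push(SchemaChange::Added(name.to_string()));
        }
    }
    changes
}
```

A row count delta would be a simple subtraction alongside this, for example `new_rows as i64 - old_rows as i64`.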

Config file

Create ~/.dprism.toml for persistent defaults:

[defaults]
delimiter = ";"
head = 50000
# no_header = true

[ui]
theme = "dark"
histogram_bins = 20
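
To show what a setting like histogram_bins controls, here is a small sketch of equal-width binning. Equal-width buckets are an assumption about how the histogram is built, the function name is hypothetical, and NaN values are assumed to be absent.

```rust
/// Splits [min, max] into `bins` equal-width buckets and counts values in each.
/// Returns (lower edge, count) pairs; empty input or zero bins yields no buckets.
fn histogram(values: &[f64], bins: usize) -> Vec<(f64, usize)> {
    if values.is_empty() || bins == 0 {
        return Vec::new();
    }
    let min = values.iter().copied().fold(f64::INFINITY, f64::min);
    let max = values.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let width = (max - min) / bins as f64;
    let mut counts = vec![0usize; bins];
    for &v in values {
        let idx = if width > 0.0 {
            // The maximum falls on the last edge, so clamp it into the final bucket.
            (((v - min) / width) as usize).min(bins - 1)
        } else {
            0 // all values are identical
        };
        counts[idx] += 1;
    }
    counts
        .into_iter()
        .enumerate()
        .map(|(i, c)| (min + i as f64 * width, c))
        .collect()
}
```

More bins give a finer picture of the distribution at the cost of more rows in the panel.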

Keyboard shortcuts

Key	Action
`↑` / `k`	Previous column
`↓` / `j`	Next column
`/`	Search columns by name
`Tab`	Cycle views: Profile -> Preview -> Corr
`c`	Jump to correlation matrix
`f`	Filter rows (or clear active filter)
`g` / `Home`	Jump to first column
`G` / `End`	Jump to last column
`q` / `Esc`	Quit (or exit search/filter)

Filter expressions

These expressions work in both the --filter CLI flag and the interactive f mode:

Expression	Meaning
`age > 30`	Numeric greater-than
`salary >= 50000`	Numeric greater-or-equal
`name = Alice`	String equality
`name != Bob`	String inequality
`city ~ York`	String contains
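
The grammar in the table is small enough to parse by hand. The sketch below splits an expression into column, operator, and value for the five operators listed above. The `Op`, `Filter`, and `parse_filter` names are hypothetical and not taken from filter.rs.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Gt,       // >
    Ge,       // >=
    Eq,       // =
    Ne,       // !=
    Contains, // ~
}

#[derive(Debug, PartialEq)]
struct Filter {
    column: String,
    op: Op,
    value: String,
}

/// Parses expressions such as "age > 30" or "city ~ York".
/// Returns None if no operator is found or either side is empty.
fn parse_filter(expr: &str) -> Option<Filter> {
    // Two-character operators are listed first so ">=" is not read as ">".
    let ops = [
        (">=", Op::Ge),
        ("!=", Op::Ne),
        (">", Op::Gt),
        ("=", Op::Eq),
        ("~", Op::Contains),
    ];
    // Pick the operator that occurs earliest; on a tie the earlier list entry wins.
    let mut best: Option<(usize, &str, Op)> = None;
    for (sym, op) in ops {
        if let Some(pos) = expr.find(sym) {
            if best.map_or(true, |(p, _, _)| pos < p) {
                best = Some((pos, sym, op));
            }
        }
    }
    let (pos, sym, op) = best?;
    let column = expr[..pos].trim();
    let value = expr[pos + sym.len()..].trim();
    if column.is_empty() || value.is_empty() {
        return None;
    }
    Some(Filter {
        column: column.to_string(),
        op,
        value: value.to_string(),
    })
}
```

Keeping the value as a string defers the numeric-versus-string decision to evaluation time, when the column type is known.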

Architecture

src/
├── main.rs          # Entry point & command routing
├── lib.rs           # Public API for tests
├── cli.rs           # Clap-based CLI (explore, validate, diff)
├── config.rs        # ~/.dprism.toml config file support
├── error.rs         # Custom error types (thiserror)
├── data/
│   ├── mod.rs
│   ├── loader.rs    # Polars-powered CSV, Parquet & Arrow IPC loading
│   ├── stats.rs     # Per-column statistics (with histograms & outliers)
│   ├── correlation.rs # Pearson correlation matrix
│   ├── filter.rs    # Row filter expression parser
│   ├── schema.rs    # YAML schema validation
│   └── diff.rs      # Dataset comparison
└── ui/
    ├── mod.rs
    └── explorer.rs  # Ratatui TUI (profile, preview, correlation, filter)
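
The tree lists correlation.rs as computing a Pearson correlation matrix. For readers unfamiliar with the calculation, a dependency-free sketch follows. It works on plain slices rather than Polars columns, the function names are hypothetical, and null handling is left out.

```rust
/// Pearson correlation coefficient of two equal-length samples.
/// Returns None when the lengths differ, fewer than two points exist,
/// or either sample has zero variance.
fn pearson(x: &[f64], y: &[f64]) -> Option<f64> {
    if x.len() != y.len() || x.len() < 2 {
        return None;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;
    let (mut cov, mut var_x, mut var_y) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (a, b) in x.iter().zip(y) {
        let (dx, dy) = (a - mean_x, b - mean_y);
        cov += dx * dy;
        var_x += dx * dx;
        var_y += dy * dy;
    }
    if var_x == 0.0 || var_y == 0.0 {
        return None;
    }
    Some(cov / (var_x.sqrt() * var_y.sqrt()))
}

/// Pairwise coefficients for every pair of numeric columns.
fn correlation_matrix(columns: &[Vec<f64>]) -> Vec<Vec<Option<f64>>> {
    columns
        .iter()
        .map(|a| columns.iter().map(|b| pearson(a, b)).collect())
        .collect()
}
```

The matrix is symmetric with ones on the diagonal for any non-constant column, so a real implementation could compute only the upper triangle.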

Roadmap

v1.1 -- Pipeline Mode

dprism watch pipeline.yaml -- real-time file monitoring
Webhook alerts for data quality
JSON / NDJSON format support
Excel (.xlsx) format support

v2.0 -- Extensibility

Plugin system (custom stats, custom views)
ML model benchmark runner
Homebrew & apt packages
WASM playground

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch (git checkout -b feat/amazing-feature)
Write tests for new functionality
Ensure cargo clippy and cargo fmt pass
Open a PR with a clear description

License

MIT -- see LICENSE for details.

Built with love and Rust by @whispem
