XLSX file CLI built with Agent Experience (AX) in mind.
agent-xlsx gives LLM agents the same depth of understanding of Excel workbooks that a human gets by opening them in Excel — structure, data, formatting, charts, formulas, VBA, and visual layout — all accessible through a single CLI that returns token-efficient JSON.
agent-xlsx probe report.xlsx
agent-xlsx read report.xlsx "Sales!A1:F50"
agent-xlsx search report.xlsx "revenue" --ignore-case
agent-xlsx screenshot report.xlsx
LLM agents working with Excel files face a fundamental problem: existing libraries are designed for humans writing Python scripts, not for agents that need to build understanding of a workbook incrementally and efficiently.
agent-xlsx solves this with three design principles:
- Progressive Disclosure: probe (structure) → screenshot (visual) → read (data) → inspect (metadata). Each layer adds detail only when needed. No wasted tokens.
- Speed First: The primary data backend is Polars + fastexcel (Rust/Calamine), delivering 7-10x faster reads than openpyxl with zero-copy Arrow integration. A full workbook profile completes in under 50ms.
- Token Efficiency: Every output is optimised for minimal token consumption. Aggregation over enumeration. Capped lists with counts. An agent builds comprehensive understanding of a workbook in 1-2 round-trips, not 10.
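As a rough illustration of "capped lists with counts", the sketch below truncates a long list and reports the total alongside it. The function name cap_list, the cap values, and the output keys are assumptions made for this example; they are not agent-xlsx's internals or its output schema.

```python
from typing import Any


def cap_list(items: list[Any], cap: int = 5) -> dict[str, Any]:
    """Return at most `cap` items plus the total count and a truncation flag."""
    return {
        "items": items[:cap],           # only the first `cap` entries are emitted
        "total": len(items),            # the full count is always reported
        "truncated": len(items) > cap,  # tells the reader that more entries exist
    }


# Hypothetical input: 100 cell references that would be costly to list in full.
locations = [f"Sales!F{row}" for row in range(1, 101)]
summary = cap_list(locations, cap=3)
print(summary)
```

The reader still learns how many entries exist and can request more detail only when it matters, which is the trade-off the principle describes.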
uvx agent-xlsx probe report.xlsx
uv tool install agent-xlsx
Also available via pipx install agent-xlsx.
pip install agent-xlsx
Give AI agents built-in knowledge of agent-xlsx commands and workflows:
npx skills add apetta/agent-xlsx
Compatible with Claude Code, Cursor, Gemini CLI, and 20+ other agents.
agent-xlsx includes Aspose.Cells for Python as a dependency for cross-platform screenshot, recalc, and objects support.
Important: Aspose.Cells is a proprietary, commercially licensed library by Aspose Pty Ltd. It is not covered by this project's Apache-2.0 licence. By installing agent-xlsx you also install Aspose.Cells and agree to Aspose's EULA. A separate Aspose licence is required for production use without watermarks.
Without a licence, Aspose runs in evaluation mode (watermarks on rendered images, 100-file-per-session limit). To remove watermarks, purchase and set an Aspose licence:
agent-xlsx license --set /path/to/Aspose.Cells.lic
Or via environment variable:
export ASPOSE_LICENSE_PATH=/path/to/Aspose.Cells.lic
macOS:
brew install --cask libreoffice
Ubuntu / Debian / ECS:
apt install libreoffice-calc
Alpine:
apk add libreoffice-calc
All other commands (probe, read, search, export, write, format, inspect, overview, sheet, vba) work with zero system dependencies.
The recommended agent workflow is probe first, then drill down:
- Profile the workbook (lean skeleton in <10ms):
agent-xlsx probe workbook.xlsx
- Drill into types / samples if needed:
agent-xlsx probe workbook.xlsx --types --sample 3
- Visual understanding (see formatting, charts, layout):
agent-xlsx screenshot workbook.xlsx
- Read specific data:
agent-xlsx read workbook.xlsx --sheet Sales "A1:F100"
- Inspect metadata (formulas, charts, merged cells, conditional formatting):
agent-xlsx inspect workbook.xlsx --sheet Sales
probe is the first command an agent should run. Lean by default: it returns sheet names, dimensions, and headers with zero data parsing (<10ms). Use flags to opt into richer detail.
agent-xlsx probe data.xlsx
agent-xlsx probe data.xlsx --types
agent-xlsx probe data.xlsx --sample 3
agent-xlsx probe data.xlsx --stats
agent-xlsx probe data.xlsx --full
agent-xlsx probe data.xlsx --sheet "Sales"
| Flag | Effect |
|---|---|
| (none) | Sheet names, dims, headers only |
| --types | Add column types + null counts |
| --sample N | Add N head + N tail rows |
| --stats | Full stats (implies --types) |
| --full | Shorthand for --types --sample 3 --stats |
| --sheet | Target a single sheet |
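The flag relationships in the table can be summarised as a small resolver. This is an illustrative sketch only: resolve_probe_flags and its return shape are hypothetical, and the real CLI may resolve its options differently (for instance, how an explicit --sample N combines with --full is an assumption here).

```python
def resolve_probe_flags(
    types: bool = False,
    sample: int = 0,
    stats: bool = False,
    full: bool = False,
) -> dict[str, object]:
    """Expand probe flags as the table describes them."""
    if full:
        # --full is shorthand for --types --sample 3 --stats
        types, stats = True, True
        sample = sample or 3  # assumption: an explicit --sample N is kept
    if stats:
        # --stats implies --types
        types = True
    return {"types": types, "sample": sample, "stats": stats}


print(resolve_probe_flags(stats=True))
print(resolve_probe_flags(full=True))
```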
Default output (~250 tokens for 6 sheets):
{
"file": "data.xlsx",
"size_bytes": 107679,
"format": "xlsx",
"probe_time_ms": 7.9,
"sheets": [
{
"name": "txns",
"index": 0,
"visible": true,
"rows": 255,
"cols": 34,
"headers": ["user_id", "txn_day", "txn_month", "amount", "currency", "..."]
}
]
}
With --full (types + sample + stats):
{
"file": "data.xlsx",
"size_bytes": 107679,
"format": "xlsx",
"probe_time_ms": 18.5,
"sheets": [
{
"name": "txns",
"index": 0,
"visible": true,
"rows": 255,
"cols": 34,
"headers": ["user_id", "txn_day", "txn_month", "amount", "currency", "..."],
"column_types": {
"user_id": "string",
"txn_day": "float64",
"amount": "float64",
"txn_date": "datetime",
"category": "string"
},
"null_counts": {"user_id": 0, "amount": 0, "currency": 0},
"sample": {
"head": [["8bb055ad-...", 1, 12, -39.0, "GBP"]],
"tail": [["8bb055ad-...", 1, 8, -150.0, "GBP"]]
},
"numeric_summary": {
"amount": {"min": -4888.06, "max": 5000.0, "mean": -142.3, "std": 892.1}
},
"string_summary": {
"category": {"unique": 12, "top_values": ["Software & Technology", "Sales", "Employees"]}
}
}
]
}
overview focuses on elements that probe cannot detect: formulas, charts, tables, named ranges. It uses openpyxl for metadata that the Rust backend doesn't expose.
agent-xlsx overview data.xlsx
agent-xlsx overview data.xlsx --include-formulas
agent-xlsx overview data.xlsx --include-formatting
{
"file": "data.xlsx",
"size_bytes": 107679,
"overview_time_ms": 157.2,
"sheets": [
{
"name": "txns",
"index": 0,
"dimensions": "A1:AZ324",
"row_count": 324,
"col_count": 52,
"has_formulas": false,
"has_charts": true,
"chart_count": 1,
"has_tables": false
}
]
}
Read data from any range or sheet. Default path uses Polars + fastexcel for speed. Use --formulas to fall back to openpyxl for formula string extraction.
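The read output shown further down carries headers and data as parallel arrays, so a consumer can zip them into per-row records. The sketch below works on a trimmed copy of that sample output; rows_to_records is a hypothetical helper, not part of agent-xlsx.

```python
import json

# Trimmed copy of the sample read output from this document.
raw = """
{
  "range": "A1:E5",
  "headers": ["user_id", "txn_day", "txn_month", "txn_year", "txn_hour"],
  "data": [
    ["8bb055ad-caa1-40b6-a577-832425b02408", 1, 12, 2024, 8],
    ["8bb055ad-caa1-40b6-a577-832425b02408", 1, 12, 2024, 4]
  ],
  "truncated": false
}
"""


def rows_to_records(result: dict) -> list[dict]:
    """Pair each row's values with the column headers."""
    headers = result["headers"]
    return [dict(zip(headers, row)) for row in result["data"]]


records = rows_to_records(json.loads(raw))
print(records[0]["txn_hour"])  # 8
```

Keeping headers separate from rows avoids repeating every column name on every row, which is presumably why the output is shaped this way.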
agent-xlsx read data.xlsx
agent-xlsx read data.xlsx "A1:F50"
agent-xlsx read data.xlsx --sheet Sales "B2:G100"
agent-xlsx read data.xlsx --limit 500 --offset 100
agent-xlsx read data.xlsx --formulas
agent-xlsx read data.xlsx --sort amount --descending
{
"range": "A1:E5",
"dimensions": {"rows": 4, "cols": 5},
"headers": ["user_id", "txn_day", "txn_month", "txn_year", "txn_hour"],
"data": [
["8bb055ad-caa1-40b6-a577-832425b02408", 1, 12, 2024, 8],
["8bb055ad-caa1-40b6-a577-832425b02408", 1, 12, 2024, 4]
],
"row_count": 4,
"truncated": false,
"backend": "polars+fastexcel",
"read_time_ms": 8.9
}
Search for values across all sheets. Supports regex and case-insensitive matching.
agent-xlsx search data.xlsx "revenue"
agent-xlsx search data.xlsx "rev.*" --regex
agent-xlsx search data.xlsx "stripe" --ignore-case
agent-xlsx search data.xlsx "error" --sheet Summary
agent-xlsx search data.xlsx "SUM(" --in-formulas
{
"query": "Stripe",
"match_count": 25,
"matches": [
{"sheet": "txns", "column": "txn_description", "row": 12, "value": "Stripe DemoCompany Ltd. Payout UK"},
{"sheet": "txns", "column": "merchant_name", "row": 12, "value": "Stripe"}
],
"truncated": true,
"search_time_ms": 18.8
}
Deep inspection of workbook elements: formulas, charts, merged cells, named ranges, comments, conditional formatting, data validation, and hyperlinks.
agent-xlsx inspect data.xlsx --sheet Sales
agent-xlsx inspect data.xlsx --sheet Sales --range A1:C10
agent-xlsx inspect data.xlsx --names
agent-xlsx inspect data.xlsx --charts
agent-xlsx inspect data.xlsx --comments
agent-xlsx inspect data.xlsx --conditional "A1:Z100"
agent-xlsx inspect data.xlsx --validation Sales
agent-xlsx inspect data.xlsx --hyperlinks Sales
Export workbook sheets as HD PNG images. Three rendering engines auto-detected in order: Aspose.Cells (cross-platform, included) → Excel (xlwings, highest fidelity) → LibreOffice (free fallback). Use --engine to force a specific backend.
agent-xlsx screenshot data.xlsx
agent-xlsx screenshot data.xlsx --sheet Summary
agent-xlsx screenshot data.xlsx --sheet "Sales,Summary"
agent-xlsx screenshot data.xlsx "Sales!A1:F20"
agent-xlsx screenshot data.xlsx --engine aspose
agent-xlsx screenshot data.xlsx --dpi 300
agent-xlsx screenshot data.xlsx --output ./shots/
agent-xlsx screenshot data.xlsx --timeout 60
Single sheet/range output:
{
"status": "success",
"format": "png",
"path": "/tmp/agent-xlsx/data_Summary.png",
"sheet": "Summary",
"size_bytes": 245000,
"dpi": 200,
"capture_time_ms": 3200.0,
"engine": "libreoffice+pymupdf"
}
Multi-sheet output:
{
"status": "success",
"format": "png",
"dpi": 200,
"sheets": [
{"name": "Sales", "path": "/tmp/agent-xlsx/data_Sales.png", "size_bytes": 245000},
{"name": "Summary", "path": "/tmp/agent-xlsx/data_Summary.png", "size_bytes": 89000}
],
"capture_time_ms": 4100.0,
"engine": "libreoffice+pymupdf"
}
Export entire sheets to JSON, CSV, or Markdown.
agent-xlsx export data.xlsx --format csv
agent-xlsx export data.xlsx --format markdown
agent-xlsx export data.xlsx --format json
agent-xlsx export data.xlsx --format csv --output out.csv
agent-xlsx export data.xlsx --format csv --sheet Sales
Write values or formulas to cells. Supports single cells, ranges (via JSON), and CSV file imports.
agent-xlsx write data.xlsx "A1" "Hello"
agent-xlsx write data.xlsx "A1" "=SUM(B1:B100)" --formula
agent-xlsx write data.xlsx "A1:C3" --json '[[1,2,3],[4,5,6],[7,8,9]]'
agent-xlsx write data.xlsx "A1" --from-csv import.csv
agent-xlsx write data.xlsx "A1" "42" --number-format "0.00%"
agent-xlsx write data.xlsx "A1" "Hello" --sheet Summary
agent-xlsx write data.xlsx "A1" "Hello" --output new_file.xlsx
Use --output to write to a new file and preserve the original.
Read or modify cell formatting: fonts, fills, borders, number formats.
Read formatting:
agent-xlsx format data.xlsx "A1" --read --sheet Sales
Apply formatting:
agent-xlsx format data.xlsx "A1:D1" --font '{"bold": true, "size": 14}'
agent-xlsx format data.xlsx "B2:B100" --number-format "#,##0.00"
agent-xlsx format data.xlsx "A1:D10" --fill '{"color": "FFFF00"}'
agent-xlsx format data.xlsx "A1:D10" --border '{"style": "thin"}'
agent-xlsx format data.xlsx "A1:D10" --copy-from "G1"
{
"cell": "A1",
"value": "user_id",
"font": {"name": "Aptos Narrow", "size": 12.0, "bold": false, "italic": false},
"fill": {"type": "solid", "color": "indexed:9"},
"border": {
"top": {"style": "thin", "color": "indexed:10"},
"bottom": {"style": "thin", "color": "indexed:10"}
},
"alignment": {"horizontal": null, "vertical": null, "wrap_text": null},
"number_format": "@"
}
List, create, rename, delete, copy, hide, and unhide sheets.
agent-xlsx sheet data.xlsx --list
agent-xlsx sheet data.xlsx --create "New Sheet"
agent-xlsx sheet data.xlsx --rename "Old Name" --new-name "New Name"
agent-xlsx sheet data.xlsx --delete "Temp"
agent-xlsx sheet data.xlsx --copy "Template" --new-name "Q1 Report"
agent-xlsx sheet data.xlsx --hide "Internal"
agent-xlsx sheet data.xlsx --unhide "Internal"
Extract and analyse VBA macros using oletools. Works headless on all platforms without Microsoft Excel.
agent-xlsx vba macros.xlsm --list
agent-xlsx vba macros.xlsm --read Main
agent-xlsx vba macros.xlsm --read-all
agent-xlsx vba macros.xlsm --security
Scan for formula errors or trigger a full recalculation. Auto-detects engine: Aspose → Excel → LibreOffice.
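The --check-only output shown below groups errors by type under error_summary. A caller that wants a flat list of cells to fix could reshape it as in this sketch, which uses a trimmed copy of the sample output from this document; flatten_errors is a hypothetical helper, not part of agent-xlsx.

```python
# Trimmed copy of the sample --check-only output from this document.
sample = {
    "status": "success",
    "mode": "check_only",
    "total_errors": 3,
    "error_summary": {
        "#REF!": {"count": 2, "locations": ["Sales!F12", "Sales!F15"]},
        "#DIV/0!": {"count": 1, "locations": ["Summary!C8"]},
    },
}


def flatten_errors(result: dict) -> list[tuple[str, str]]:
    """Return (cell, error_type) pairs sorted by cell reference."""
    pairs = [
        (cell, error_type)
        for error_type, info in result.get("error_summary", {}).items()
        for cell in info["locations"]
    ]
    return sorted(pairs)


todo = flatten_errors(sample)
for cell, error_type in todo:
    print(f"{cell}: {error_type}")
```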
Scan for errors (no engine needed):
agent-xlsx recalc data.xlsx --check-only
Full recalculation (requires Excel, Aspose, or LibreOffice):
agent-xlsx recalc data.xlsx
agent-xlsx recalc data.xlsx --engine aspose
agent-xlsx recalc data.xlsx --timeout 120
{
"status": "success",
"mode": "check_only",
"total_formulas": 847,
"total_errors": 3,
"check_time_ms": 184.1,
"error_summary": {
"#REF!": {"count": 2, "locations": ["Sales!F12", "Sales!F15"]},
"#DIV/0!": {"count": 1, "locations": ["Summary!C8"]}
}
}
agent-xlsx uses a multi-backend architecture, choosing the fastest backend capable of satisfying each request:
                            agent-xlsx CLI
                                   |
    +------------------+-----------+-------------+------------+
    |                  |           |             |            |
Polars+fastexcel   openpyxl    Aspose.Cells  xlwings      LibreOffice
(Rust/Calamine)    (Pure Py)   (Cross-plat)  (Excel)      (Headless)
Data reads         Metadata    Screenshots   Screenshots  Screenshots
Profiling          Formulas    Recalc        Recalc       Recalc
Search             Formatting  Objects       Objects
Export             Writes
+ oletools (VBA extraction & analysis)
Rendering engine auto-detection: Aspose.Cells → Excel (xlwings) → LibreOffice. Use --engine to force a specific backend.
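The fallback order can be pictured as a first-match search over a priority list. The sketch below is a simplified model under stated assumptions: pick_engine is a hypothetical function, the engine identifiers "excel" and "libreoffice" are guesses (only "aspose" appears as an --engine value in this document), and how availability is actually detected is not shown here.

```python
from typing import Optional

# Detection order as stated in this document.
ENGINE_ORDER = ["aspose", "excel", "libreoffice"]


def pick_engine(available: set[str], forced: Optional[str] = None) -> str:
    """Return the forced engine if usable, else the first available one."""
    if forced is not None:
        if forced not in available:
            raise RuntimeError(f"Requested engine not available: {forced}")
        return forced
    for engine in ENGINE_ORDER:
        if engine in available:
            return engine
    # Mirrors the NO_RENDERING_BACKEND error code listed in this document.
    raise RuntimeError("NO_RENDERING_BACKEND")


print(pick_engine({"libreoffice", "excel"}))                        # excel
print(pick_engine({"aspose", "libreoffice"}, forced="libreoffice"))  # libreoffice
```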
| Backend | Role | Speed | Used by |
|---|---|---|---|
| Polars + fastexcel | Primary data engine | 7-10x faster than openpyxl | probe, read, search, export |
| openpyxl | Metadata + writes | Baseline | overview, inspect, write, format, sheet |
| Aspose.Cells (separately licensed) | Cross-platform rendering (default) | ~1-3s per sheet | screenshot, recalc, objects |
| xlwings (Excel) | Highest-fidelity rendering | ~2s per sheet | screenshot, recalc, objects, vba --run |
| LibreOffice + PyMuPDF | Free rendering fallback | ~3s per sheet | screenshot, recalc |
| oletools | VBA extraction | Fast | vba |
openpyxl creates a Python object for every cell. For a 100K-row workbook, that's millions of allocations and ~50x the file size in RAM. Polars + fastexcel reads the same data through Rust with zero-copy Arrow transfer — the data never touches Python's heap until the agent needs it.
| Format | Extension | Read | Write | Screenshot | VBA |
|---|---|---|---|---|---|
| Excel (Open XML) | .xlsx | Yes | Yes | Yes | N/A |
| Excel (Macro-enabled) | .xlsm | Yes | Yes | Yes | Yes |
| Excel (Binary) | .xlsb | Yes | - | Yes | Yes |
| Excel (Legacy) | .xls | Yes | - | Yes | - |
| OpenDocument | .ods | Yes | - | Yes | - |
agent-xlsx is designed for headless deployment in agentic infrastructure — no GUI, no Excel installation, no Docker requirement.
Aspose.Cells is included as a dependency (see licence note). Add system fonts for rendered output on Linux:
FROM python:3.12-slim
RUN apt-get update && \
apt-get install -y --no-install-recommends libgdiplus libfontconfig1 fonts-liberation && \
apt-get clean && rm -rf /var/lib/apt/lists/*
RUN pip install agent-xlsx
RUN agent-xlsx --help
Optionally add LibreOffice as a fallback rendering engine:
RUN apt-get update && \
apt-get install -y --no-install-recommends libreoffice-calc && \
apt-get clean && rm -rf /var/lib/apt/lists/*
All errors return structured JSON with an error code, message, and actionable suggestions:
{
"error": true,
"code": "FILE_NOT_FOUND",
"message": "File not found: missing.xlsx",
"suggestions": [
"Check the file path is correct",
"Ensure the file exists and is readable"
]
}
Error codes include: FILE_NOT_FOUND, INVALID_FORMAT, SHEET_NOT_FOUND, INVALID_RANGE, EXCEL_REQUIRED, ASPOSE_NOT_INSTALLED, LIBREOFFICE_REQUIRED, NO_RENDERING_BACKEND, and more.
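A caller can turn this error shape into an exception so that failures are not mistaken for results. The sketch below is a minimal example, assuming the payload has already been captured as text (this document does not say whether it arrives on stdout or stderr); AgentXlsxError and parse_output are hypothetical names, not part of agent-xlsx.

```python
import json


class AgentXlsxError(Exception):
    """Raised when the CLI output is a structured error payload."""

    def __init__(self, code: str, message: str, suggestions: list[str]):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.suggestions = suggestions


def parse_output(text: str) -> dict:
    """Decode CLI JSON output, raising AgentXlsxError for error payloads."""
    payload = json.loads(text)
    if isinstance(payload, dict) and payload.get("error") is True:
        raise AgentXlsxError(
            payload.get("code", "UNKNOWN"),
            payload.get("message", ""),
            payload.get("suggestions", []),
        )
    return payload


# The sample error from this document, as a JSON string.
error_text = json.dumps({
    "error": True,
    "code": "FILE_NOT_FOUND",
    "message": "File not found: missing.xlsx",
    "suggestions": ["Check the file path is correct"],
})

try:
    parse_output(error_text)
except AgentXlsxError as exc:
    print(exc.code, exc.suggestions)
```

Branching on the code field rather than on the message text keeps the caller stable if the wording of messages changes.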
Benchmarked on a 255-row, 34-column, 6-sheet workbook:
| Operation | Time | Backend |
|---|---|---|
| probe (default, lean) | ~8ms | Polars + fastexcel |
| probe --full (types + sample + stats) | ~20ms | Polars + fastexcel |
| read (range) | ~9ms | Polars + fastexcel |
| search (cross-workbook) | ~19ms | Polars + fastexcel |
| overview | ~157ms | openpyxl |
| inspect | ~120ms | openpyxl |
| recalc --check-only | ~184ms | openpyxl |
| screenshot (PNG, per-sheet) | ~3s + ~0.1s/page | LibreOffice + PyMuPDF |
| recalc (full) | ~2.5s | LibreOffice |
The Polars + fastexcel backend maintains sub-50ms response times even on workbooks with 100K+ rows.
Clone and install:
git clone https://github.com/apetta/agent-xlsx.git
cd agent-xlsx
uv sync
Run commands:
uv run agent-xlsx probe sample_data.xlsx
Lint:
uv run ruff check src/
uv run ruff format src/
src/agent_xlsx/
    cli.py                      # Typer CLI entry point
    commands/                   # 14 command implementations
        probe.py                # Ultra-fast profiling (Polars)
        overview.py             # Structural metadata (openpyxl)
        read.py                 # Data extraction (Polars)
        search.py               # Cross-workbook search (Polars)
        export.py               # Bulk export (Polars)
        inspect.py              # Deep inspection (openpyxl)
        write.py                # Write operations (openpyxl)
        format.py               # Formatting read/write (openpyxl)
        sheet.py                # Sheet management (openpyxl)
        screenshot.py           # Visual capture (Excel/Aspose/LO)
        objects.py              # Embedded objects (Excel/Aspose)
        vba.py                  # VBA analysis (oletools)
        recalc.py               # Recalculation (Excel/Aspose/LO)
        license_cmd.py          # Aspose licence management
    adapters/                   # Backend adapters
        polars_adapter.py       # Polars + fastexcel (primary data)
        openpyxl_adapter.py     # openpyxl (metadata + writes)
        xlwings_adapter.py      # xlwings/Excel (rendering + objects)
        aspose_adapter.py       # Aspose.Cells (cross-platform rendering)
        libreoffice_adapter.py  # LibreOffice headless (fallback rendering)
        oletools_adapter.py     # oletools (VBA extraction)
    formatters/                 # Output formatting
        json_formatter.py       # Token-efficient JSON output
        token_optimizer.py      # Output capping and aggregation
    utils/                      # Shared utilities
        errors.py               # Error types and handler
        validation.py           # File and range validation
        constants.py            # Caps and limits
        memory.py               # Memory budget checking
        dates.py                # Date detection and serial→ISO conversion
        config.py               # Persistent config (~/.agent-xlsx/)
This project is licensed under Apache-2.0 — see LICENSE for details.
Third-party notice: agent-xlsx depends on Aspose.Cells for Python, which is proprietary software by Aspose Pty Ltd, distributed under its own EULA. Aspose.Cells is not covered by this project's Apache-2.0 licence. A separate commercial licence from Aspose is required for production use without evaluation watermarks. By installing agent-xlsx you accept responsibility for complying with Aspose's licensing terms.