torch2pprof

A tool to convert PyTorch profiler traces to the pprof format for visualization with go tool pprof.

Overview

PyTorch's profiler outputs traces in Chrome Trace Event format (JSON), which is difficult to analyze directly. This tool converts those traces into pprof's binary format, allowing you to:

Visualize call stacks
Identify performance bottlenecks
Analyze CPU usage patterns
Use the full suite of pprof analysis tools

Installation

From Source

git clone https://github.com/yourusername/torch2pprof
cd torch2pprof
make install

Using Go

go install github.com/yourusername/torch2pprof/cmd/torch2pprof@latest

Usage

Converting PyTorch Traces to pprof

# Using the convert subcommand
torch2pprof convert input_trace.json output_profile.pb.gz

# Works with compressed files too
torch2pprof convert input_trace.json.gz output_profile.pb.gz

# Or use the default behavior (for backward compatibility)
torch2pprof input_trace.json output_profile.pb.gz

This will:

Load the PyTorch trace JSON file (supports both .json and .json.gz files)
Parse all complete events (ph=X) with positive durations
Build call stacks by analyzing event nesting
Encode to pprof protobuf format with gzip compression

Note: Input files can be either plain JSON or gzip-compressed. The tool automatically detects compression based on file extension (.gz) or file content (magic number detection).

Analyzing Traces

# Show top 20 operations (default)
torch2pprof analyze input_trace.json

# Works with compressed files
torch2pprof analyze input_trace.json.gz

# Show top 50 operations
torch2pprof analyze -top 50 input_trace.json.gz

This displays:

Total number of events and statistics
Time breakdown by category
Top operations by total time

Note: Both .json and .json.gz files are supported.
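To illustrate what "top operations by total time" means, here is a rough sketch of one way such a ranking could be computed. The Event struct, the opTotal type, and topByTotalTime are hypothetical names chosen for this example; only the field names (name, cat, ph, dur, traceEvents) come from the Chrome Trace Event format. Durations are summed inclusively, so time spent in nested child events is also counted in their parents; torch2pprof's own accounting may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Event holds the Chrome Trace Event fields this sketch needs.
type Event struct {
	Name string  `json:"name"`
	Cat  string  `json:"cat"`
	Ph   string  `json:"ph"`
	Dur  float64 `json:"dur"` // microseconds
}

// opTotal pairs an operation name with its summed duration.
type opTotal struct {
	Name  string
	Total float64
}

// topByTotalTime sums the duration of complete events (ph=X, dur>0) per
// name and returns the n largest totals, ties broken by name.
func topByTotalTime(events []Event, n int) []opTotal {
	totals := make(map[string]float64)
	for _, e := range events {
		if e.Ph == "X" && e.Dur > 0 {
			totals[e.Name] += e.Dur
		}
	}
	out := make([]opTotal, 0, len(totals))
	for name, t := range totals {
		out = append(out, opTotal{Name: name, Total: t})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Total != out[j].Total {
			return out[i].Total > out[j].Total
		}
		return out[i].Name < out[j].Name
	})
	if n >= 0 && len(out) > n {
		out = out[:n]
	}
	return out
}

func main() {
	raw := `{"traceEvents":[
		{"name":"aten::mm","cat":"cpu_op","ph":"X","ts":0,"dur":30},
		{"name":"aten::add","cat":"cpu_op","ph":"X","ts":40,"dur":5},
		{"name":"aten::mm","cat":"cpu_op","ph":"X","ts":50,"dur":20},
		{"name":"marker","cat":"user","ph":"i","ts":55}
	]}`
	var trace struct {
		TraceEvents []Event `json:"traceEvents"`
	}
	if err := json.Unmarshal([]byte(raw), &trace); err != nil {
		panic(err)
	}
	for _, op := range topByTotalTime(trace.TraceEvents, 2) {
		fmt.Printf("%-12s %.0f us\n", op.Name, op.Total)
	}
}
```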

Viewing with pprof

After conversion, analyze the profile with go tool pprof:

go tool pprof output_profile.pb.gz

Common pprof commands:

top - Show top functions by time
list <function> - Show source code with annotations
web - Generate a graph visualization (requires graphviz)
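go tool pprof -http=:8080 output_profile.pb.gz - Open the pprof web UI, which includes a flame graph view (run this from the shell; the interactive prompt has no flame command)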

Commands

convert

Convert PyTorch trace to pprof format.

torch2pprof convert <input.json|input.json.gz> <output.pb.gz>

Arguments:

input.json|input.json.gz - PyTorch trace file in Chrome Trace Event format (plain or gzip-compressed)
output.pb.gz - Output pprof profile (gzip compressed)

Features:

Automatically detects gzip compression via .gz extension or magic number
Supports both plain JSON and compressed JSON files

analyze

Analyze PyTorch trace and show statistics.

torch2pprof analyze [options] <input.json|input.json.gz>

Options:

-top N - Show top N operations (default: 20)

Arguments:

input.json|input.json.gz - PyTorch trace file to analyze (plain or gzip-compressed)

Features:

Automatically detects gzip compression via .gz extension or magic number
Supports both plain JSON and compressed JSON files

Project Structure

torch2pprof/
├── cmd/                          # Command-line applications
│   └── torch2pprof/              # Main tool with subcommands
│       └── main.go               # Entry point with convert & analyze commands
│
├── internal/                     # Private packages (not for external import)
│   ├── profile/
│   │   └── profile.go            # pprof protobuf encoding
│   └── converter/                # Core conversion and analysis logic
│       ├── trace.go              # Trace loading, processing, and conversion
│       └── analyzer.go           # Trace analysis and statistics
│
├── test/                         # Test data and utilities
│   └── pprof_verification.py     # Python script to verify pprof output
│
├── data/                         # Sample data
│   └── trace.json.gz             # Example PyTorch trace
│
├── doc/                          # Documentation
├── go.mod                        # Go module definition
├── go.sum                        # Dependency checksums
├── Makefile                      # Build automation
└── README.md                     # User documentation

Building

# Build binary
make build

# Run tests
make test

# Run tests with coverage
make test-coverage

# Run tests with race detector
make test-race

# Format code
make fmt

# Lint code
make vet

# Build for multiple platforms
make dist

Testing

The project has comprehensive unit tests with high code coverage:

Test Coverage: 96.2% (converter), 93.0% (profile)
Total Tests: 20 unit tests
CI/CD: Automated testing on Linux, macOS, Windows
Race Detection: All tests run with the race detector

See TESTING.md for detailed testing documentation.

# Run all tests
make test

# Run with coverage report
make test-coverage
# Open coverage.html in browser

# Run with race detector
make test-race

How It Works

Trace Conversion Algorithm

Load Trace: Parse the JSON trace file containing Chrome Trace Event format
Filter Events: Keep only complete events (ph=X) with positive duration
Group by Thread: Organize events by their thread ID
Build Stacks: For each event, determine its call stack by analyzing event overlaps:
- Events that temporally contain other events represent parent functions
- Uses a linear-time stack-based algorithm instead of O(n²) comparison
Aggregate: Combine identical stacks and sum their durations
Encode: Convert to pprof protobuf format and compress with gzip
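The stack-building and aggregation steps above can be sketched for a single thread as follows. This is an illustrative reconstruction under stated assumptions, not the project's code: the Event struct and buildStacks are hypothetical names, events on a thread are assumed to nest properly (no partial overlaps), and the sketch sorts its input first, so the linear bound applies only to the stack pass itself, where each event is pushed and popped at most once.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Event is a complete (ph=X) event on one thread.
type Event struct {
	Name string
	Ts   float64 // start time, microseconds
	Dur  float64 // duration, microseconds
}

// buildStacks returns summed durations keyed by "root;child;leaf".
// Durations are inclusive: a parent's value contains its children's time.
func buildStacks(events []Event) map[string]float64 {
	sorted := append([]Event(nil), events...)
	// Order by start time; on ties the longer event is the parent.
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].Ts != sorted[j].Ts {
			return sorted[i].Ts < sorted[j].Ts
		}
		return sorted[i].Dur > sorted[j].Dur
	})

	var open []Event // currently open frames, outermost first
	totals := make(map[string]float64)
	for _, e := range sorted {
		// Pop frames that ended at or before this event's start.
		for len(open) > 0 {
			top := open[len(open)-1]
			if top.Ts+top.Dur > e.Ts {
				break // top still contains e
			}
			open = open[:len(open)-1]
		}
		open = append(open, e)

		// The open frames are exactly e's call stack.
		names := make([]string, len(open))
		for i, f := range open {
			names[i] = f.Name
		}
		totals[strings.Join(names, ";")] += e.Dur
	}
	return totals
}

func main() {
	events := []Event{
		{Name: "forward", Ts: 0, Dur: 100},
		{Name: "linear", Ts: 10, Dur: 40},
		{Name: "mm", Ts: 15, Dur: 20},
		{Name: "relu", Ts: 60, Dur: 30},
	}
	totals := buildStacks(events)
	keys := make([]string, 0, len(totals))
	for k := range totals {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%-20s %.0f us\n", k, totals[k])
	}
}
```

Whether the real converter records inclusive or self time per stack is not stated here; a self-time variant would subtract each child's duration from its parent before aggregating.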

Performance

Linear time complexity for stack building (O(n) per thread)
Parallel processing across multiple threads
Efficient memory usage with string interning
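One plausible reading of "parallel processing across multiple threads" is that each trace thread's events are handled by a separate goroutine, since stacks on different threads are independent. The sketch below shows that pattern under this assumption; groupByThread and totalPerThread are hypothetical names, and the per-thread work is reduced to summing durations as a stand-in for stack building.

```go
package main

import (
	"fmt"
	"sync"
)

// Event carries the thread ID used for grouping.
type Event struct {
	Name string
	Tid  int
	Dur  float64 // microseconds
}

// groupByThread buckets events by thread ID, keeping input order
// within each bucket.
func groupByThread(events []Event) map[int][]Event {
	groups := make(map[int][]Event)
	for _, e := range events {
		groups[e.Tid] = append(groups[e.Tid], e)
	}
	return groups
}

// totalPerThread processes every thread's events in its own goroutine.
// Summing durations stands in for the real per-thread work.
func totalPerThread(events []Event) map[int]float64 {
	groups := groupByThread(events)
	var (
		mu sync.Mutex // guards totals
		wg sync.WaitGroup
	)
	totals := make(map[int]float64, len(groups))
	for tid, evs := range groups {
		wg.Add(1)
		go func(tid int, evs []Event) {
			defer wg.Done()
			var sum float64
			for _, e := range evs {
				sum += e.Dur
			}
			mu.Lock()
			totals[tid] = sum
			mu.Unlock()
		}(tid, evs)
	}
	wg.Wait()
	return totals
}

func main() {
	events := []Event{
		{Name: "aten::mm", Tid: 1, Dur: 30},
		{Name: "aten::add", Tid: 2, Dur: 5},
		{Name: "aten::relu", Tid: 1, Dur: 10},
	}
	fmt.Println(totalPerThread(events))
}
```

Each goroutine writes only its own result under the mutex, so the merge step stays simple and passes the race detector.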

Requirements

Go 1.24 or later

License

MIT

Contributing

Contributions are welcome! Please ensure:

Code passes go fmt and go vet
All tests pass
New features include tests

Troubleshooting

Large trace files

For very large trace files (>100MB):

Ensure sufficient memory (at least 2GB recommended)
Consider filtering the trace in PyTorch before exporting
Use gzip-compressed files (.json.gz) to save disk space and reduce I/O time
- Example: A 322MB JSON file compresses to 23MB with gzip (93% reduction)

Memory usage

The tool maintains maps for:

String interning (string → index)
Function deduplication (name+file → ID)
Location deduplication (name+file → ID)

For profiles with millions of unique functions, these maps can use several GB of memory.
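The string → index map can be pictured with the small sketch below. The stringTable type and its intern method are hypothetical names, not taken from the project; the one fixed rule is from the pprof profile format, which requires index 0 of the string table to be the empty string.

```go
package main

import "fmt"

// stringTable deduplicates strings and assigns each a stable index,
// mirroring how a pprof profile refers to strings by table position.
type stringTable struct {
	index   map[string]int64
	strings []string
}

// newStringTable reserves index 0 for the empty string, as the pprof
// format requires.
func newStringTable() *stringTable {
	return &stringTable{
		index:   map[string]int64{"": 0},
		strings: []string{""},
	}
}

// intern returns the index of s, adding it to the table on first use.
func (t *stringTable) intern(s string) int64 {
	if i, ok := t.index[s]; ok {
		return i
	}
	i := int64(len(t.strings))
	t.index[s] = i
	t.strings = append(t.strings, s)
	return i
}

func main() {
	t := newStringTable()
	for _, name := range []string{"aten::mm", "aten::add", "aten::mm"} {
		fmt.Printf("%s -> %d\n", name, t.intern(name))
	}
	fmt.Println("table size:", len(t.strings))
}
```

Memory grows with the number of distinct strings rather than the number of events, which is why traces with many unique function names are the expensive case.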

Related Tools

pprof - Profile visualization
PyTorch Profiler - Profile generation
Chrome DevTools - View traces directly
