Architecture Overview
iamvirul edited this page Mar 21, 2026 · 1 revision
DeepDiff DB is organized into five layers with a strict dependency direction: outer layers depend on inner layers, never the reverse.
┌─────────────────────────────────────────────────────────┐
│                        CLI Layer                        │
│  cmd/deepdiffdb/main.go                                 │
│  Commands: check, schema-diff, diff, gen-pack, apply    │
└───────────────────────┬─────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────┐
│                      Config Layer                       │
│  pkg/config/                                            │
│  YAML loading, validation, defaults                     │
└──────────┬────────────────────────────┬─────────────────┘
           │                            │
┌──────────▼──────────┐   ┌─────────────▼───────────────┐
│    Schema Layer     │   │        Content Layer        │
│  internal/schema/   │   │  internal/content/          │
│  Introspect, diff,  │   │  Hash, pack, apply, ignore  │
│  migrate, order     │   │  resolve/                   │
└──────────┬──────────┘   └─────────────┬───────────────┘
           │                            │
┌──────────▼────────────────────────────▼───────────────┐
│                     Driver Layer                      │
│  internal/drivers/                                    │
│  Open connections, build DSNs, retry logic            │
└───────────────────────────────────────────────────────┘
Supporting packages (used by all layers):
pkg/logger/ — Structured logging (slog-based, JSON/text)
pkg/errors/ — Typed errors with codes, context, suggestions, retry
pkg/progress/ — Progress bars and spinners
internal/checkpoint/ — State persistence for resume
internal/report/html/ — HTML report generation
deepdiff-db/
├── cmd/deepdiffdb/
│   └── main.go                 # CLI entry point, command dispatch
│
├── internal/
│   ├── schema/
│   │   ├── model.go            # Column, Index, ForeignKey, Table, Schema types
│   │   ├── introspect.go       # LoadSchema: driver-specific introspection
│   │   ├── diff.go             # DiffSchemas, TableDiff, DiffResult
│   │   ├── migrate.go          # GenerateMigration, MigrationOptions
│   │   ├── ordering.go         # Topological sort for FK-safe operation ordering
│   │   ├── report.go           # WriteReports (JSON + text)
│   │   └── primary_keys.go     # CheckPrimaryKeys
│   │
│   ├── content/
│   │   ├── diff.go             # TableDataDiff, DataDiff, Conflicts types
│   │   ├── hash.go             # HashTable: keyset pagination + full load
│   │   ├── cursor.go           # BuildCursorQuery: driver-specific pagination SQL
│   │   ├── pack.go             # GeneratePack: builds migration_pack.sql
│   │   ├── apply.go            # ApplyPack: executes pack transactionally
│   │   ├── ignore.go           # IgnoreMatcher: glob/exact column ignoring
│   │   ├── report.go           # WriteReports (JSON + text)
│   │   └── resolve/
│   │       ├── resolve.go      # Strategy, Decision, ApplyStrategy, Conflicts
│   │       ├── fetch.go        # FetchConflictRows, CompareRows, FormatValue
│   │       └── persistence.go  # Save/load resolutions to disk
│   │
│   ├── checkpoint/
│   │   ├── checkpoint.go       # Manager: save/load/delete/update
│   │   ├── state.go            # State, HashTableState, GeneratePackState, ApplyPackState
│   │   └── resume.go           # Resume helpers
│   │
│   ├── drivers/
│   │   ├── drivers.go          # Open: connection + retry + pool config
│   │   └── imports.go          # Driver side-effect imports (mysql, pgx, sqlite, etc.)
│   │
│   ├── cli/
│   │   └── prompt.go           # Interactive prompts for resolve-conflicts
│   │
│   └── report/html/
│       ├── types.go            # ReportData, ReportSummary, display types
│       ├── generator.go        # GenerateReport
│       └── template.go         # Embedded HTML template
│
├── pkg/
│   ├── config/
│   │   └── config.go           # Config struct, Load, Validate, defaults
│   ├── logger/
│   │   ├── logger.go           # Logger, New, Debug/Info/Warn/Error
│   │   ├── context.go          # ToContext, FromContext
│   │   └── fields.go           # Field name constants
│   ├── progress/
│   │   ├── manager.go          # Manager, Bar, Spinner
│   │   ├── metrics.go          # Throughput + ETA tracking
│   │   └── context.go          # ToContext, FromContext
│   └── errors/
│       ├── errors.go           # Error type, New, Wrap, With, Suggestions
│       ├── codes.go            # ErrorCode enum
│       ├── suggestions.go      # Actionable suggestion generation
│       └── retry.go            # Retry with exponential backoff + jitter
│
└── tests/
    ├── config/                 # Config unit tests
    ├── content/                # Content unit tests
    ├── checkpoint/             # Checkpoint unit tests
    ├── drivers/                # Driver unit tests
    ├── schema/                 # Schema unit tests (SQLite-based)
    ├── html/                   # HTML report unit tests
    ├── resolve/                # Resolve unit tests
    ├── errors/                 # Error/retry unit tests
    └── integration_test.go     # Full workflow integration tests
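
The tree lists `ordering.go` as a topological sort for FK-safe operation ordering. As a rough illustration of that idea, here is a sketch using Kahn's algorithm over a simplified dependency map. The `orderTables` function and its input shape are hypothetical; the project's actual implementation may differ.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// orderTables is a hypothetical helper. deps maps each table to the tables
// it references through foreign keys. The result lists referenced (parent)
// tables before the tables that depend on them.
func orderTables(deps map[string][]string) ([]string, error) {
	inDegree := make(map[string]int)      // unmet parent references per table
	children := make(map[string][]string) // parent -> dependent tables
	for table, parents := range deps {
		if _, ok := inDegree[table]; !ok {
			inDegree[table] = 0
		}
		for _, p := range parents {
			if p == table {
				continue // self-references do not constrain table order
			}
			if _, ok := inDegree[p]; !ok {
				inDegree[p] = 0
			}
			inDegree[table]++
			children[p] = append(children[p], table)
		}
	}

	var ready []string
	for t, d := range inDegree {
		if d == 0 {
			ready = append(ready, t)
		}
	}
	sort.Strings(ready) // keep the output deterministic

	var order []string
	for len(ready) > 0 {
		t := ready[0]
		ready = ready[1:]
		order = append(order, t)
		for _, c := range children[t] {
			inDegree[c]--
			if inDegree[c] == 0 {
				ready = append(ready, c)
			}
		}
		sort.Strings(ready)
	}
	if len(order) != len(inDegree) {
		return nil, errors.New("foreign key cycle detected")
	}
	return order, nil
}

func main() {
	deps := map[string][]string{
		"order_items": {"orders", "products"},
		"orders":      {"users"},
		"products":    nil,
		"users":       nil,
	}
	order, err := orderTables(deps)
	fmt.Println(order, err) // [products users orders order_items] <nil>
}
```

Inserts and table creation would follow this order, while deletes and drops would follow it in reverse.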
type Column struct {
    Name         string
    DataType     string
    IsNullable   bool
    DefaultValue *string
}

type Index struct {
    Name     string
    Columns  []string
    IsUnique bool
}

type ForeignKey struct {
    Name               string
    ReferencedTable    string
    Columns            []string
    ReferencedColumns  []string
    OnDelete, OnUpdate string
}

type Table struct {
    Name        string
    Columns     map[string]Column
    PrimaryKey  []string
    Indexes     map[string]Index
    ForeignKeys map[string]ForeignKey
}

type Schema struct {
    Tables map[string]Table
}

// Row hashes: map[compositePKString]sha256Hash
type TableHashes map[string]string
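
To make the `TableHashes` values concrete, here is a minimal sketch of reducing one row to a SHA-256 hex digest. The `hashRow` helper, the sorted column order, and the length-prefixed encoding are assumptions for illustration; the encoding actually used in `hash.go` is not shown on this page.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashRow is a hypothetical helper. It hashes column values in sorted
// column order so the result does not depend on map iteration order.
func hashRow(row map[string]any) string {
	cols := make([]string, 0, len(row))
	for c := range row {
		cols = append(cols, c)
	}
	sort.Strings(cols)

	h := sha256.New()
	for _, c := range cols {
		v := fmt.Sprintf("%v", row[c])
		// Length prefixes keep ("ab", "c") distinct from ("a", "bc").
		fmt.Fprintf(h, "%d:%s=%d:%s;", len(c), c, len(v), v)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := hashRow(map[string]any{"id": 1, "name": "alice"})
	b := hashRow(map[string]any{"name": "alice", "id": 1})
	c := hashRow(map[string]any{"id": 1, "name": "bob"})
	fmt.Println(a == b, a == c) // true false
}
```

Note that `%v` formatting is a simplification: a real implementation would need a type-aware encoding so that, for example, a NULL and the string `<nil>` do not collide.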

type TableDataDiff struct {
    Table   string
    Added   []string // Keys present only in dev
    Removed []string // Keys present only in prod
    Updated []string // Same key, different hash
}
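
Given two `TableHashes` maps, a `TableDataDiff` follows from a key-by-key comparison. The sketch below assumes a hypothetical `diffHashes` function and redeclares both types so that it compiles on its own.

```go
package main

import (
	"fmt"
	"sort"
)

type TableHashes map[string]string

type TableDataDiff struct {
	Table   string
	Added   []string
	Removed []string
	Updated []string
}

// diffHashes is a hypothetical helper that compares prod and dev row
// hashes for a single table.
func diffHashes(table string, prod, dev TableHashes) TableDataDiff {
	d := TableDataDiff{Table: table}
	for key, devHash := range dev {
		prodHash, ok := prod[key]
		switch {
		case !ok:
			d.Added = append(d.Added, key) // only in dev
		case prodHash != devHash:
			d.Updated = append(d.Updated, key) // same key, different content
		}
	}
	for key := range prod {
		if _, ok := dev[key]; !ok {
			d.Removed = append(d.Removed, key) // only in prod
		}
	}
	// Sort so the result is stable despite random map iteration order.
	sort.Strings(d.Added)
	sort.Strings(d.Removed)
	sort.Strings(d.Updated)
	return d
}

func main() {
	prod := TableHashes{"1": "aa", "2": "bb", "3": "cc"}
	dev := TableHashes{"1": "aa", "2": "b2", "4": "dd"}
	fmt.Printf("%+v\n", diffHashes("users", prod, dev))
	// {Table:users Added:[4] Removed:[3] Updated:[2]}
}
```

Only keys and hashes are compared here, which is why no row data needs to be transferred to compute the diff.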

type Conflict struct {
    Table, Key, ProdHash, DevHash string
}

All packages propagate shared state via context.Context; nothing is global:

ctx = logger.ToContext(ctx, log)            // structured logger
ctx = progress.ToContext(ctx, progressMgr)  // progress bars
ctx = checkpoint.ToContext(ctx, ckptMgr)    // checkpoint manager

Every function extracts what it needs:

log := logger.FromContext(ctx)
mgr := checkpoint.FromContext(ctx)

All errors are typed with a machine-readable ErrorCode:

type Error struct {
    Code        ErrorCode
    Message     string
    Cause       error
    Context     map[string]any
    Suggestions []string
}

Errors are wrapped at each layer boundary, adding context as they propagate up:

return errors.Wrap(err, errors.ErrHashingFailed, "failed to hash table").
    With("table", tableName).
    With("batch", batchNum).
    WithSuggestion("Check that the table has a primary key")

| Decision | Rationale |
|---|---|
| SHA-256 row hashing | Deterministic, content-addressable comparison without full data transfer |
| Keyset pagination | O(batchSize) memory regardless of table size; cursor stability across pages |
| Single transaction for apply | All-or-nothing guarantee; prod is never left in a partial state |
| Checkpoint atomic write (temp + rename) | Prevents corrupt state file if process is killed mid-write |
| Destructive ops commented out by default | Production safety; operator must explicitly opt in |
| Config hash in checkpoint | Prevents resuming with a different config than what started the operation |
| Context-carried logger/progress | No global state; testable; per-request context isolation |
DeepDiff DB — safe, deterministic database synchronization