38 changes: 38 additions & 0 deletions .github/workflows/publish-docker.yml
@@ -0,0 +1,38 @@
name: Publish Docker image

on:
  push:
    tags:
      - "v*"
  workflow_dispatch:

jobs:
  docker:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository_owner }}/rustream:latest
          platforms: linux/amd64,linux/arm64
82 changes: 82 additions & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions Cargo.toml
@@ -51,6 +51,7 @@ serde_json = "1"
bytes = "1"
glob = "0.3"
arrow-csv = "57"
axum = { version = "0.7", features = ["json"] }

[features]
default = []
47 changes: 47 additions & 0 deletions README.md
@@ -42,6 +42,12 @@ rustream sync --config config.yaml --dry-run

# Run sync
rustream sync --config config.yaml

# If you need to wipe saved watermarks/cursors (full resync)
rustream sync --config config.yaml --reset-state

# Reset state for one table
rustream reset-state --table users
```

### Ingest (S3/local → Postgres)
@@ -61,6 +67,47 @@ RUST_LOG=rustream=debug rustream sync --config config.yaml
RUST_LOG=rustream=debug rustream ingest --config ingest_config.yaml
```

### Jobs / Worker (experimental)

```
# create control-plane table in Postgres
rustream init-jobs --control-db-url "$CONTROL_DB_URL"

# enqueue a table sync job
rustream add-job --control-db-url "$CONTROL_DB_URL" --table users --config config.yaml --interval-secs 300 --timeout-secs 900 --max-concurrent-jobs 1

# run the worker loop (polls every 5s by default, runs up to 4 jobs at once)
rustream worker --control-db-url "$CONTROL_DB_URL" --poll-seconds 5 --max-concurrent 4

# force a job to run ASAP
rustream force-job --control-db-url "$CONTROL_DB_URL" --job-id 1

# status API (optional, returns JSON)
rustream status-api --control-db-url "$CONTROL_DB_URL" --bind 0.0.0.0:8080

# control DB URL can also come from env
# RUSTREAM_CONTROL_DB_URL=postgres://user:pass@host:5432/db rustream worker
# status endpoints:
# /jobs (json, optional ?status=pending), /jobs/html (auto-refresh + filter),
# /jobs/summary, /logs?limit=50, /health, /health/worker
# UI buttons: force run, retry failed, reset state

# reset all state (CLI)
rustream reset-state --state-dir .rustream_state

# reset one table
rustream reset-state --table users

# optional: run a data-quality command on each local output table
# (use {path} placeholder for the Parquet directory)
RUSTREAM_DQ_CMD="dq-prof --input {path}" rustream worker --control-db-url "$CONTROL_DB_URL"
```
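
The `{path}` placeholder in `RUSTREAM_DQ_CMD` can be read as a simple template substitution. The sketch below is an assumption about how such a template might be expanded, not rustream's actual implementation; the function name `expand_dq_command` and the directory `output/users` are hypothetical.

```rust
use std::path::Path;

/// Replace every `{path}` placeholder in a command template with the
/// given Parquet directory. Hypothetical helper: it performs no shell
/// quoting, so paths with spaces would need extra handling.
fn expand_dq_command(template: &str, parquet_dir: &Path) -> String {
    template.replace("{path}", &parquet_dir.to_string_lossy())
}

fn main() {
    // Same template as in the RUSTREAM_DQ_CMD example above.
    let template = "dq-prof --input {path}";
    let cmd = expand_dq_command(template, Path::new("output/users"));
    println!("{cmd}"); // prints "dq-prof --input output/users"
}
```

A template without the placeholder is returned unchanged, so a command that ignores the path would still run under this reading.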

### Production templates
- Local: `examples/production-template/docker-compose.yml`
- K8s: `examples/production-template/helm`
- AWS: `examples/production-template/terraform`

## Configuration

### Specific tables (recommended)
40 changes: 40 additions & 0 deletions examples/production-template/README.md
@@ -0,0 +1,40 @@
# rustream production template (minimal skeleton)

Quick ways to try rustream end-to-end without Kafka:

## Local (docker-compose)
```
cd examples/production-template
cp config.example.yaml config.yaml # edit Postgres/S3/MinIO creds if needed
docker compose up --build
# worker + status API come up; status UI on http://localhost:8080/jobs/html
```

Services:
- Postgres (demo/demo)
- MinIO (minioadmin/minioadmin) at http://localhost:9001
- rustream worker (polls jobs table)
- rustream status API (port 8080)
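
The worker listed above polls the jobs table. As a rough illustration of what one poll might decide, the sketch below selects jobs whose interval has elapsed while respecting a concurrency cap. The `Job` struct, its fields, and both functions are hypothetical and do not reflect rustream's actual schema or scheduler.

```rust
/// Hypothetical in-memory view of a job row; the real control-plane
/// table may use different columns.
#[derive(Debug, Clone)]
struct Job {
    id: u32,
    interval_secs: u64,
    /// Unix time of the last start, or None if the job has never run.
    last_started_at: Option<u64>,
    running: bool,
}

/// A job is due if it is idle and has either never run or its
/// interval has elapsed since the last start.
fn is_due(job: &Job, now: u64) -> bool {
    if job.running {
        return false;
    }
    match job.last_started_at {
        None => true,
        Some(t) => now.saturating_sub(t) >= job.interval_secs,
    }
}

/// Pick the ids of due jobs, leaving room for jobs already running.
fn pick_due(jobs: &[Job], now: u64, max_concurrent: usize) -> Vec<u32> {
    let running = jobs.iter().filter(|j| j.running).count();
    let free_slots = max_concurrent.saturating_sub(running);
    jobs.iter()
        .filter(|j| is_due(j, now))
        .take(free_slots)
        .map(|j| j.id)
        .collect()
}

fn main() {
    let jobs = vec![
        Job { id: 1, interval_secs: 300, last_started_at: None, running: false },
        Job { id: 2, interval_secs: 300, last_started_at: Some(900), running: false },
        Job { id: 3, interval_secs: 300, last_started_at: Some(600), running: true },
        Job { id: 4, interval_secs: 300, last_started_at: Some(650), running: false },
    ];
    // One job is running, so with a cap of 4 there are 3 free slots.
    let due = pick_due(&jobs, 1000, 4);
    println!("{due:?}"); // prints "[1, 4]"
}
```

In this sketch a running job counts against the cap but is never re-selected, which is one plausible way to interpret `--max-concurrent`.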

## Kubernetes (Helm skeleton)
```
cd examples/production-template/helm
helm install rustream . \
  --set controlDbUrl=postgres://user:pass@host:5432/db \
  --set image.repository=ghcr.io/yourorg/rustream \
  --set image.tag=latest
```
Notes:
- The chart is minimal: no Secrets or IAM are wired yet. Add IRSA/kiam annotations and environment variables for AWS credentials.
- Mount your `config.yaml` via a ConfigMap or Secret, and add a bootstrap Job that runs `rustream add-job`.

## AWS Terraform stub
```
cd examples/production-template/terraform
terraform init
terraform apply -var bucket_name=rustream-demo-bucket
```
Creates an S3 bucket and an IAM role/policy for ECS tasks. You still need to wire up the ECS service and task definitions that run rustream.

## GHCR Docker image
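A GitHub Actions workflow (`.github/workflows/publish-docker.yml`) builds and pushes `ghcr.io/<owner>/rustream:latest` for `linux/amd64` and `linux/arm64` when a tag matching `v*` is pushed or the workflow is dispatched manually. Customize the image tags as needed.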
17 changes: 17 additions & 0 deletions examples/production-template/config.example.yaml
@@ -0,0 +1,17 @@
postgres:
  host: postgres
  database: demo
  user: demo
  password: demo

output:
  type: s3
  bucket: rustream-demo
  prefix: raw/postgres
  region: us-east-1
  endpoint: http://minio:9000

tables:
  - name: users
    incremental_column: updated_at
    incremental_tiebreaker_column: id
57 changes: 57 additions & 0 deletions examples/production-template/docker-compose.yml
@@ -0,0 +1,57 @@
version: "3.9"

services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: demo
      POSTGRES_PASSWORD: demo
      POSTGRES_DB: demo
    ports:
      - "5432:5432"

  minio:
    image: quay.io/minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - minio-data:/data

  rustream-worker:
    build: ../..
    depends_on:
      - postgres
      - minio
    environment:
      RUSTREAM_CONTROL_DB_URL: postgres://demo:demo@postgres:5432/demo
      RUSTREAM_S3_ENDPOINT: http://minio:9000
      AWS_ACCESS_KEY_ID: minioadmin
      AWS_SECRET_ACCESS_KEY: minioadmin
      RUST_LOG: rustream=info
    command: >
      sh -c "
      rustream init-jobs --control-db-url $$RUSTREAM_CONTROL_DB_URL &&
      rustream add-job --control-db-url $$RUSTREAM_CONTROL_DB_URL --table users --config /config/config.yaml --interval-secs 300 &&
      rustream worker --control-db-url $$RUSTREAM_CONTROL_DB_URL --poll-seconds 5
      "
    volumes:
      - ./config.yaml:/config/config.yaml:ro

  rustream-status:
    build: ../..
    depends_on:
      - postgres
    environment:
      RUSTREAM_CONTROL_DB_URL: postgres://demo:demo@postgres:5432/demo
      RUST_LOG: rustream=info
    ports:
      - "8080:8080"
    command: rustream status-api --control-db-url ${RUSTREAM_CONTROL_DB_URL:-postgres://demo:demo@postgres:5432/demo} --bind 0.0.0.0:8080

volumes:
  minio-data:
6 changes: 6 additions & 0 deletions examples/production-template/helm/Chart.yaml
@@ -0,0 +1,6 @@
apiVersion: v2
name: rustream
description: Minimal chart to run rustream worker and status API
type: application
version: 0.1.0
appVersion: "0.2.0"
3 changes: 3 additions & 0 deletions examples/production-template/helm/templates/_helpers.tpl
@@ -0,0 +1,3 @@
{{- define "rustream.fullname" -}}
{{- printf "%s" .Release.Name -}}
{{- end -}}