Introduce standard storage alternative #895
00c0b97 to 9bdcb3c
I used Opus 4.8 to perform the following manual tests in a kind cluster:
`--storage-path` (`/data` by default). BadgerDB is the default backend, and
`--storage-value-log-file-size` applies only to that backend.

When the `FluxStorage` feature gate is enabled, the controller uses a filesystem
This looks so nice! We need a benchmark to see how it does with 1K images and 10K tags vs BadgerDB in terms of CPU/IO/MEM (with a default of 64 KiB instead of 1 KiB to disable gzip). I would really like to turn this feature gate on by default in Flux 2.10 if the benchmark is favorable.
A more relevant benchmark for large setups would be 5K images with 2M tags in total, 60-80MiB (uncompressed on disk).
BadgerDB vs Flux storage — benchmark
Single-node kind, one controller (--concurrent=10, threshold 64 KiB). Unique tags per repo.
Each ImageRepository has 1 ImagePolicy, so write path (scan→SetTags) and read path (Tags) both run.
Metrics from the controller process: CPU process_cpu_seconds_total, RSS process_resident_memory_bytes,
disk du /data, IO /proc/1/io.
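The IO columns can be derived from `/proc/<pid>/io` along these lines. This is a minimal sketch, not the script used for the benchmark: the mapping of "disk write/read" to `write_bytes`/`read_bytes` and of "syscall write/read" to `wchar`/`rchar` is an assumption, `ProcIO`, `parseProcIO` and `toMiB` are hypothetical names, and the sample input is synthetic.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// ProcIO holds the /proc/<pid>/io counters assumed to back the IO columns.
type ProcIO struct {
	RChar      uint64 // bytes passed to read-like syscalls (page-cache hits included)
	WChar      uint64 // bytes passed to write-like syscalls
	ReadBytes  uint64 // bytes actually fetched from the storage layer
	WriteBytes uint64 // bytes actually sent to the storage layer
}

// parseProcIO parses the "key: value" lines of a /proc/<pid>/io dump.
// Keys that are not needed here (syscr, syscw, ...) are ignored.
func parseProcIO(content string) (ProcIO, error) {
	var stats ProcIO
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		key, val, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		n, err := strconv.ParseUint(strings.TrimSpace(val), 10, 64)
		if err != nil {
			return stats, fmt.Errorf("parse %q: %w", key, err)
		}
		switch strings.TrimSpace(key) {
		case "rchar":
			stats.RChar = n
		case "wchar":
			stats.WChar = n
		case "read_bytes":
			stats.ReadBytes = n
		case "write_bytes":
			stats.WriteBytes = n
		}
	}
	return stats, sc.Err()
}

// toMiB converts a byte count to MiB.
func toMiB(b uint64) float64 { return float64(b) / (1 << 20) }

func main() {
	// Synthetic sample; a real run would read /proc/1/io before and after a scan pass.
	sample := "rchar: 52428800\nwchar: 18874368\nsyscr: 1200\nsyscw: 900\n" +
		"read_bytes: 0\nwrite_bytes: 3145728\ncancelled_write_bytes: 0\n"
	stats, err := parseProcIO(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("disk write %.0f MiB, disk read %.0f MiB, syscall write %.0f MiB, syscall read %.0f MiB\n",
		toMiB(stats.WriteBytes), toMiB(stats.ReadBytes), toMiB(stats.WChar), toMiB(stats.RChar))
}
```

Taking the difference between two snapshots, one before and one after the scan pass, gives per-run numbers rather than totals since process start.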
Throughput (one full scan pass)
| tags (repos×tags) | backend | CPU s | peak RSS | disk | disk write | disk read | syscall write | syscall read | wall |
|---|---|---|---|---|---|---|---|---|---|
| 10K (1K×10) | badger | 23 | 199 MiB | <1 MiB | 0 MiB | 0 | 17 MiB | 48 MiB | 9s |
| 10K (1K×10) | flux | 29 | 111 MiB | 7 MiB | 3 MiB | 0 | 18 MiB | 50 MiB | 8s |
| 2M (5K×400) | badger | 155 | 366 MiB | 37 MiB | 101 MiB | 0 | 89 MiB | 316 MiB | 19s |
| 2M (5K×400) | flux | 172 | 210 MiB | 97 MiB | 78 MiB | 0 | 159 MiB | 412 MiB | 18s |
| 25M (5K×5K) | badger | 241 | 1441 MiB | 418 MiB | 1705 MiB | 0 | 103 MiB | 1134 MiB | 32s |
| 25M (5K×5K) | flux | 477 | 228 MiB | 234 MiB | 214 MiB | 1 | 319 MiB | 1425 MiB | 43s |
2M: 14 KiB files, no gzip. 25M: 170 KiB files, gzip kicks in (>64 KiB).
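The threshold behaviour described in the note above can be sketched as follows. This is an illustration of size-gated gzip in general, not the code in `fluxcd/pkg/artifact/storage`; `encode` and `gzipThreshold` are hypothetical names, and the payloads are synthetic stand-ins for the 14 KiB and 170 KiB tag files.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipThreshold mirrors the 64 KiB default discussed in this thread.
const gzipThreshold = 64 * 1024

// encode returns the payload unchanged when it is at or below the threshold
// and a gzip-compressed copy when it is larger. The boolean reports whether
// gzip ran.
func encode(payload []byte, threshold int) ([]byte, bool, error) {
	if len(payload) <= threshold {
		return payload, false, nil
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(payload); err != nil {
		return nil, false, err
	}
	// Close flushes the remaining compressed data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, false, err
	}
	return buf.Bytes(), true, nil
}

func main() {
	small := bytes.Repeat([]byte("v1.2.3\n"), 2*1024)  // 14 KiB, stays below the threshold
	large := bytes.Repeat([]byte("v1.2.3\n"), 25*1024) // 175 KiB, crosses the threshold
	for _, p := range [][]byte{small, large} {
		out, compressed, err := encode(p, gzipThreshold)
		if err != nil {
			panic(err)
		}
		fmt.Printf("in=%d B out=%d B gzip=%v\n", len(p), len(out), compressed)
	}
}
```

With a gate like this, the extra CPU for compression is only paid by repositories whose tag list outgrows the threshold, which matches the CPU jump seen only in the 25M rows.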
Idle (zero objects, 60s window)
| state | backend | idle CPU (cores) | GC/60s | RSS | heap in-use | next-GC |
|---|---|---|---|---|---|---|
| empty DB | badger | 0.026 | 0 | 52 MiB | 97 MiB | 180 MiB |
| empty DB | flux | 0.002 | 0 | 49 MiB | 9 MiB | 11 MiB |
Memory sweep — loaded+orphaned badger (2M tags, objects deleted, default GOGC, no GOMEMLIMIT)
| mem limit (=request) | idle CPU (cores) | GC/60s | RSS | heap in-use | restarts |
|---|---|---|---|---|---|
| 8Gi | 0.019 | 0 | 52 MiB | 97 MiB | 0 |
| 2Gi | 0.017 | 0 | 53 MiB | 97 MiB | 0 |
| 1Gi | 0.018 | 0 | 53 MiB | 97 MiB | 0 |
| 512Mi | 0.018 | 0 | 53 MiB | 97 MiB | 0 |
Conclusions
- Memory: flux wins big. 40% less at 2M, 6× less at 25M. Even empty, badger holds 10× the heap (arena + cache at open), which is the overhead reported in "Higher CPU usage without load" (#333).
- Disk: badger smaller under 64 KiB, flux smaller over it (gzip). Small either way (at 25M: flux 234 MiB, badger 418 MiB).
- CPU: about equal, except flux doubles when gzip turns on (huge tag sets only). 64 KiB default keeps gzip off for normal repos — right call.
- Badger writes way more to disk (LSM compaction): 8× at 25M.
- "Bump memory fixes CPU" ("Higher CPU usage without load", #333): not at idle. Idle badger does 0 GC, so memory changes nothing (IRC sets no GOMEMLIMIT). Core-pinning needs active churn, not idle.
- Ship FluxStorage on by default (2.10). Less memory, less disk-write, equal CPU normally, and it frees disk on delete (badger never does: `Delete` is a no-op).
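The ratios quoted in the conclusions can be rechecked against the throughput and idle tables with a few lines of arithmetic. This is only a sanity-check sketch; `ratio` and `savings` are hypothetical helper names, and the inputs are copied from the tables above.

```go
package main

import "fmt"

// ratio returns how many times larger a is than b.
func ratio(a, b float64) float64 { return a / b }

// savings returns the fractional reduction when going from base to other.
func savings(base, other float64) float64 { return (base - other) / base }

func main() {
	// Peak RSS at 2M tags: badger 366 MiB, flux 210 MiB.
	fmt.Printf("RSS saving at 2M: %.0f%%\n", 100*savings(366, 210)) // 43%
	// Peak RSS at 25M tags: badger 1441 MiB, flux 228 MiB.
	fmt.Printf("RSS ratio at 25M: %.1fx\n", ratio(1441, 228)) // 6.3x
	// Idle heap in-use: badger 97 MiB, flux 9 MiB.
	fmt.Printf("idle heap ratio: %.1fx\n", ratio(97, 9)) // 10.8x
	// CPU seconds at 25M tags: flux 477, badger 241.
	fmt.Printf("CPU ratio at 25M: %.1fx\n", ratio(477, 241)) // 2.0x
	// Disk write at 25M tags: badger 1705 MiB, flux 214 MiB.
	fmt.Printf("disk-write ratio at 25M: %.1fx\n", ratio(1705, 214)) // 8.0x
}
```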
Signed-off-by: Matheus Pimenta <matheuscscp@gmail.com>
9bdcb3c to c4cc62a
stefanprodan left a comment
LGTM
Thanks @matheuscscp 🏅
Closes: #333
Introduce feature gate `FluxStorage`. When enabled, the controller uses `fluxcd/pkg/artifact/storage` for storing image tags instead of BadgerDB. The gate is disabled by default.

I wrote a prompt for Opus 4.8 xhigh to write a detailed plan, and handed the plan to GPT 5.5 xhigh for implementation.