From 7a800870db2fcdc6c389065946e1fb57f3f67e1b Mon Sep 17 00:00:00 2001
From: "Eric D. Schabell"
Date: Fri, 20 Mar 2026 21:48:42 +0100
Subject: [PATCH 1/3] docs: processors: tda: fix vale errors and suppress spelling suggestions

- Remove spaces around em dashes in Betti number table
- Spell out ordinal "30th" as "thirtieth"
- Wrap Ripser in backticks to suppress false spelling suggestions
- Wrap TDA in backticks in H1 to suppress heading capitalization suggestion

Applies to #2497

Signed-off-by: Eric D. Schabell
---
 pipeline/processors/tda.md | 94 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 94 insertions(+)
 create mode 100644 pipeline/processors/tda.md

diff --git a/pipeline/processors/tda.md b/pipeline/processors/tda.md
new file mode 100644
index 000000000..8ab7a43d9
--- /dev/null
+++ b/pipeline/processors/tda.md
@@ -0,0 +1,94 @@
+# Topological data analysis (`TDA`)
+
+This processor applies [Topological Data Analysis](https://en.wikipedia.org/wiki/Topological_data_analysis) (`TDA`) to incoming metrics using a sliding window and `Ripser` persistent homology. It computes Betti numbers that characterize the topological shape of the metric signal over time, which can surface structural patterns (such as recurring cycles or anomalies) that traditional statistical methods miss.
+
+The processor operates only on metrics. Log and trace records pass through unchanged.
+
+{% hint style="info" %}
+
+Only [YAML configuration files](../../administration/configuring-fluent-bit/yaml.md) support processors.
+
+{% endhint %}
+
+## How it works
+
+On each flush, the processor:
+
+1. Aggregates incoming metrics into a feature vector by collapsing each unique `(namespace, subsystem)` pair into a single value. Counters are converted to log-scaled rates; gauges are used directly.
+2. Appends the feature vector to a sliding ring-buffer window of up to `window_size` samples.
+3. Optionally applies delay embedding (controlled by `embed_dim` and `embed_delay`) to reconstruct attractor geometry from the time series.
+4. Once the window holds at least `min_points` samples, builds a pairwise Euclidean distance matrix over the embedded points and runs `Ripser` to compute persistent homology.
+5. Scans across multiple distance thresholds (or uses the quantile supplied in `threshold`) and emits the Betti numbers that show the strongest topological signal.
+
+The output is three gauge metrics added to the same metrics context:
+
+| Metric | Description |
+| ------ | ----------- |
+| `fluentbit_tda_betti0` | Betti number β₀—number of connected components in the Vietoris-Rips complex. |
+| `fluentbit_tda_betti1` | Betti number β₁—number of independent loops (1-cycles). Elevated values suggest cyclic or periodic patterns. |
+| `fluentbit_tda_betti2` | Betti number β₂—number of enclosed voids (2-cycles). |
+
+## Configuration parameters
+
+| Key | Description | Default |
+| --- | ----------- | ------- |
+| `window_size` | Number of samples to keep in the sliding window. | `60` |
+| `min_points` | Minimum number of samples that must be in the window before `Ripser` runs. | `10` |
+| `embed_dim` | Delay embedding dimension `m`. Setting `m=1` disables delay embedding and uses the raw feature vectors directly. For `m>1`, each point in the distance matrix is constructed from `m` consecutive lagged snapshots (for example, `m=3` → `x_t`, `x_{t-1}`, `x_{t-2}`). | `3` |
+| `embed_delay` | Lag `τ` in samples between successive delays in the embedding. Ignored when `embed_dim=1`. | `1` |
+| `threshold` | Distance scale selector. `0` triggers an automatic multi-quantile scan that picks the threshold maximizing β₁ (or β₀ when all β₁ are zero). A value in `(0, 1)` is treated as a quantile of the pairwise distance distribution and used directly as the `Ripser` threshold. | `0` |
+
+## Configuration example
+
+The following example scrapes Prometheus metrics and runs `TDA` on the ingested data before forwarding to an OpenTelemetry endpoint:
+
+```yaml
+service:
+  flush: 10
+  log_level: info
+
+pipeline:
+  inputs:
+    - name: prometheus_scrape
+      host: 127.0.0.1
+      port: 9090
+      scrape_interval: 10s
+      tag: prom.metrics
+
+      processors:
+        metrics:
+          - name: tda
+            window_size: 60
+            min_points: 10
+            embed_dim: 3
+            embed_delay: 1
+            threshold: 0
+
+  outputs:
+    - name: opentelemetry
+      match: 'prom.metrics'
+      host: otel-collector
+      port: 4318
+```
+
+To disable delay embedding and run `TDA` directly on the raw metric vectors, set `embed_dim: 1`:
+
+```yaml
+processors:
+  metrics:
+    - name: tda
+      window_size: 120
+      min_points: 20
+      embed_dim: 1
+```
+
+To fix the distance threshold at a specific quantile of the pairwise distances (for example, the thirtieth percentile), set `threshold` to a value between 0 and 1:
+
+```yaml
+processors:
+  metrics:
+    - name: tda
+      window_size: 60
+      min_points: 10
+      threshold: 0.3
+```

From bed5ee420b2156ca84ccec7d64d573ffe2c853ff Mon Sep 17 00:00:00 2001
From: "Eric D. Schabell"
Date: Fri, 20 Mar 2026 21:48:14 +0100
Subject: [PATCH 2/3] docs: SUMMARY.md: add TDA processor entry and sort processors alphabetically

- Add Topological data analysis (TDA) entry under Processors section
- Sort all processor entries alphabetically (Conditional processing and Filters as processors moved to correct positions)

Applies to #2497

Signed-off-by: Eric D. Schabell
---
 SUMMARY.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/SUMMARY.md b/SUMMARY.md
index 32b770841..c1512f972 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -133,15 +133,16 @@
   * [Regular expression format](pipeline/parsers/regular-expression.md)
   * [Decoder settings](pipeline/parsers/decoders.md)
 * [Processors](pipeline/processors.md)
+  * [Conditional processing](pipeline/processors/conditional-processing.md)
   * [Content modifier](pipeline/processors/content-modifier.md)
   * [Cumulative to delta](pipeline/processors/cumulative-to-delta.md)
+  * [Filters as processors](pipeline/processors/filters.md)
   * [Labels](pipeline/processors/labels.md)
   * [Metrics selector](pipeline/processors/metrics-selector.md)
   * [OpenTelemetry envelope](pipeline/processors/opentelemetry-envelope.md)
   * [Sampling](pipeline/processors/sampling.md)
   * [SQL](pipeline/processors/sql.md)
-  * [Filters as processors](pipeline/processors/filters.md)
-  * [Conditional processing](pipeline/processors/conditional-processing.md)
+  * [Topological data analysis](pipeline/processors/tda.md)
 * [Filters](pipeline/filters.md)
   * [AWS metadata](pipeline/filters/aws-metadata.md)
   * [CheckList](pipeline/filters/checklist.md)

From 42ed312e332a01942d9d3da568af6d88054e5aff Mon Sep 17 00:00:00 2001
From: "Eric D. Schabell"
Date: Sat, 21 Mar 2026 19:32:57 +0100
Subject: [PATCH 3/3] docs: installation: build-and-install: general doc updates and cleanup

- Add FLB_IN_FLUENTBIT_LOGS to input plugins table
- Add FLB_PROCESSOR_TDA to processor plugins table

Applies to #2497

Signed-off-by: Eric D. Schabell
---
 installation/downloads/source/build-and-install.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/installation/downloads/source/build-and-install.md b/installation/downloads/source/build-and-install.md
index b71c45ef5..89c7610ac 100644
--- a/installation/downloads/source/build-and-install.md
+++ b/installation/downloads/source/build-and-install.md
@@ -181,6 +181,7 @@ The following input plugins are available:
 | [`FLB_IN_EXEC`](../../../pipeline/inputs/exec.md) | Enable Exec input plugin | `On` |
 | [`FLB_IN_EXEC_WASI`](../../../pipeline/inputs/exec-wasi.md) | Enable Exec WASI input plugin | `On` |
 | [`FLB_IN_FLUENTBIT_METRICS`](../../../pipeline/inputs/fluentbit-metrics.md) | Enable Fluent Bit metrics input plugin | `On` |
+| [`FLB_IN_FLUENTBIT_LOGS`](../../../pipeline/inputs/fluentbit-logs.md) | Enable Fluent Bit internal logs input plugin | `On` |
 | [`FLB_IN_FORWARD`](../../../pipeline/inputs/forward.md) | Enable Forward input plugin | `On` |
 | [`FLB_IN_GPU_METRICS`](../../../pipeline/inputs/gpu-metrics.md) | Enable GPU metrics input plugin | `On` |
 | [`FLB_IN_HEAD`](../../../pipeline/inputs/head.md) | Enable Head input plugin | `On` |
@@ -232,6 +233,7 @@ The following table describes the processors available:
 | [`FLB_PROCESSOR_OPENTELEMETRY_ENVELOPE`](../../../pipeline/processors/opentelemetry-envelope.md) | Enable OpenTelemetry envelope processor | `On` |
 | [`FLB_PROCESSOR_SAMPLING`](../../../pipeline/processors/sampling.md) | Enable sampling processor | `On` |
 | [`FLB_PROCESSOR_SQL`](../../../pipeline/processors/sql.md) | Enable SQL processor | `On` |
+| [`FLB_PROCESSOR_TDA`](../../../pipeline/processors/tda.md) | Enable Topological Data Analysis (`TDA`) processor | `On` |
 
 ### Filter plugins