Problem Statement
When alerts fire, the response is still mostly manual. Alertmanager can send alerts, but it is not built to run remediation steps or operational workflows.
This creates a gap between detecting a problem and reacting to it in a clear and repeatable way. Operators still need to investigate the issue, decide what to do, and execute the action manually.
Keep helps close this gap. It adds workflow execution and playbook-based handling on top of the existing alerting setup. This makes it possible to move from simple notifications to guided or automated remediation.
This work follows #4 and applies it to one concrete demo setup that is representative of Earth observation (EO) platforms: an APISIX -> eoapi -> PostgreSQL stack serving STAC APIs. The goal is to show how reusable remediation patterns can be implemented and tested in a real platform scenario.
Scope
Introduce GitOps-managed remediation workflows and playbooks for the APISIX -> eoapi -> PostgreSQL setup.
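To make the difference between guided and automated remediation concrete, here is a minimal Python sketch of a playbook registry for the three layers of the stack. It is not Keep's workflow format or API. The alert names, step descriptions, and the `handle_alert` helper are all hypothetical and serve only to illustrate the pattern; in a GitOps setup such a registry would presumably live in version-controlled workflow definitions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Playbook:
    """A remediation entry: ordered steps plus a flag for unattended execution."""
    component: str           # layer of the stack the playbook targets
    steps: List[str]         # human-readable steps for the operator or the runner
    automated: bool = False  # True: run without confirmation; False: guide only


# Hypothetical alert names and steps, chosen only for illustration.
PLAYBOOKS: Dict[str, Playbook] = {
    "ApisixHigh5xxRate": Playbook(
        component="apisix",
        steps=["check upstream health for the eoapi route",
               "inspect recent route changes"],
    ),
    "EoapiPodCrashLooping": Playbook(
        component="eoapi",
        steps=["collect pod logs", "restart the deployment"],
        automated=True,
    ),
    "PostgresConnectionsSaturated": Playbook(
        component="postgresql",
        steps=["list long-running queries", "review connection pool settings"],
    ),
}


def handle_alert(alert_name: str, run_step: Callable[[str], None]) -> str:
    """Pick a response mode for an alert and run its steps if automation is allowed."""
    playbook = PLAYBOOKS.get(alert_name)
    if playbook is None:
        # No playbook registered: fall back to a plain notification.
        return "notify-only"
    if playbook.automated:
        for step in playbook.steps:
            run_step(step)
        return "automated"
    # Guided mode: the steps are shown to the operator, nothing is executed.
    return "guided"
```

Keeping the `automated` flag per playbook means a remediation can start out guided and be switched to automated through a reviewed change once it has proven safe.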
Tasks
Outcome
A repeatable setup for guided or automated remediation that improves response time and operational consistency, demonstrated on a realistic EO platform use case.