Fixes #353 - Add health check command #431
Could you please show the output? Thx! |
|
Example output when all checks pass: Example output when a service is down: The command exits non-zero on failure, so it can be used in scripts like |
|
This is a good start. Can you look into following the conventions we've established for |
d6a97eb to d702027
|
Good point, refactored to follow the
Force-pushed. |
|
Design question for you:
|
d702027 to 9ea20cc
|
Good questions.
Service vs API checks - do we need both? They operate at different layers. Service checks verify that systemd units are running - this works offline and does not depend on a functional API. The API check (
Conditional checks based on features - yes, agreed. Updated Pulp and Candlepin checks to only run when
Validation - could you clarify what you have in mind? An |
2c44427 to 4c3bd96
src/playbooks/health/health.yaml
Outdated
    - "../../vars/base.yaml"
  tasks:
    - name: Execute health checks
      ansible.builtin.include_tasks: ../../roles/checks/tasks/execute_check.yml
I think this is a better pattern:
ansible.builtin.include_role:
  name: common
  tasks_from: login
Done - switched to include_role: name: checks, tasks_from: execute_check to reuse the existing checks role and avoid the relative path.
  loop:
    - pulp-api
    - pulp-content
  when: enabled_features | select('contains', 'content/') | list | length > 0
Seems like we could use a better way to do this pattern of checking. Is there a filter or some other Ansible construct we can encapsulate feature checking logic in?
Added a has_content_features filter to foremanctl.py - the condition is now just enabled_features | has_content_features.
|
I would expect this to be added to the end of our GitHub Actions tests to verify that the command works and ensure that it passes. |
4c3bd96 to 15e6135
|
Updated based on your feedback:
|
I can't know in the PR if this is working if the CI is split off to another PR :) Testing and changes to CI should generally be kept together with the change. |
src/filter_plugins/foremanctl.py
Outdated
    return compact_list(plugins)


def has_content_features(features):
Do you foresee having a method per feature? Or should it be a more generic method that a feature can be passed into?
Good point - a generic filter makes more sense. I'll change it to has_feature_prefix(features, prefix) so it can be reused for any feature group, e.g. enabled_features | has_feature_prefix('content/').
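For illustration only, a minimal sketch of what such a generic filter could look like. The body below is an assumption, not the code from this PR: it matches on a true string prefix (`startswith`), which is slightly stricter than the substring match done by the earlier `select('contains', 'content/')` expression. The `FilterModule` class is the standard Ansible entry point for filter plugins; the real `foremanctl.py` presumably already defines one, so only the dictionary entry would be new there. The feature names used in the comments are hypothetical.

```python
def has_feature_prefix(features, prefix):
    """Return True if any enabled feature name starts with `prefix`.

    Assumption: `features` is a list of feature-name strings such as
    'content/rpm' (hypothetical name); None is treated as an empty list.
    """
    return any(str(feature).startswith(prefix) for feature in (features or []))


class FilterModule(object):
    """Standard Ansible filter-plugin entry point."""

    def filters(self):
        # Maps the name used in templates to the Python callable.
        return {"has_feature_prefix": has_feature_prefix}
```

With a filter like this registered, the task condition stays a one-liner: `when: enabled_features | has_feature_prefix('content/')`.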
|
What are your thoughts on how |
|
|
Add a new foremanctl health command that verifies the state of all Foreman services after installation or during troubleshooting.

Checks performed:
- Core services running (foreman, httpd, redis, postgresql)
- Dynflow workers running (orchestrator, worker, worker-hosts-queue)
- Pulp services running (pulp-api, pulp-content)
- Candlepin service running
- Foreman API responding (GET /api/v2/ping)
- Foreman tasks status (via Katello ping response)

Reports a summary of all failures and exits non-zero if any check fails, making it suitable for scripting and CI use.
15e6135 to 845cd50
Adds a new foremanctl health command that performs post-install health checks on all Foreman services.
Checks performed:
The command reports individual results for each check and provides a summary. It exits non-zero when any check fails, making it suitable for scripting and CI use.
As discussed in the issue, this can also be integrated as a final step of deployment in the future.
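As a rough sketch of the scripting use case, a wrapper can rely on the exit status alone. This assumes only what is stated above (zero on success, non-zero when any check fails); `checks_pass` is a hypothetical helper name, and the example invocation in the comment assumes `foremanctl` is on `PATH` and that `health` needs no further arguments.

```python
import subprocess


def checks_pass(command):
    """Run `command` and return True only if it exits with status 0."""
    # check=False: a non-zero exit is an expected outcome, not an exception.
    result = subprocess.run(command, check=False)
    return result.returncode == 0


# Example use in a deployment script (assumed invocation, not run here):
#     if not checks_pass(["foremanctl", "health"]):
#         raise SystemExit("health checks failed")
```

Note that `subprocess.run` raises `FileNotFoundError` if the executable is missing, so a caller may want to treat that case separately from a failed check.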