The health check system provides lightweight operational endpoints for liveness and readiness monitoring. In the current starter implementation, it supports:
- a generic liveness endpoint
- a readiness endpoint with dependency checks
- database connectivity monitoring
- background worker heartbeat and queue-state monitoring
- queued email backlog monitoring
- structured JSON responses for external probes
The health system is intended to support local debugging, container orchestration, and deployment monitoring without exposing the full internals of the application.
This starter includes built-in health checks so a new project can answer basic operational questions immediately:
- is the web app process alive
- is the database reachable
- is the background worker still running
- are background emails piling up
That gives developers a baseline operational surface without having to build health infrastructure first.
The checks are registered in Program.cs:
| Check name | Class | Tags |
|---|---|---|
database |
DatabaseHealthCheck |
ready |
background-worker |
BackgroundWorkerHealthCheck |
ready |
stuck-emails |
StuckEmailsHealthCheck |
ready |
There are currently no checks registered without the ready tag.
The web app exposes two health endpoints.
/health
This endpoint is configured with:
Predicate = _ => falseThat means it does not execute dependency checks. It only confirms that the web process is up and can respond.
/health is intended to stay the minimal public-facing endpoint.
/health/ready
This endpoint runs all checks tagged with ready, which currently means:
- database connectivity
- background worker status
- stuck email detection
This is the endpoint to use for deployment readiness or operational dashboards.
/health/ready is no longer broadly public.
The current access rule is:
- local requests are allowed
- non-local requests must include a shared secret header
- if access is not allowed, the endpoint returns
404
This is handled by HealthEndpointAccess, which checks:
- whether the request is loopback or local to the host
- whether the configured header name exists
- whether the header value exactly matches the configured secret
The goal is to keep operational detail available for trusted probes without advertising the readiness surface publicly.
The access settings are configured through:
"HealthEndpoints": {
"ReadyHeaderName": "X-Health-Key",
"ReadySharedSecret": ""
}| Key | Meaning |
|---|---|
HealthEndpoints:ReadyHeaderName |
Header name expected on non-local readiness requests |
HealthEndpoints:ReadySharedSecret |
Shared secret value required for non-local readiness requests |
If ReadySharedSecret is empty:
- local requests to
/health/readystill work - non-local requests to
/health/readyare denied
If ReadySharedSecret is configured:
- non-local callers must send the configured header and matching value
Example:
GET /health/ready
X-Health-Key: your-secret-here
Responses are written by HealthCheckResponseWriter as JSON.
The shape is:
{
"status": "Healthy",
"totalDuration": 12.34,
"checks": [
{
"name": "database",
"status": "Healthy",
"description": "Database connection is healthy.",
"duration": 1.23,
"data": null,
"exception": null
}
]
}| Field | Meaning |
|---|---|
status |
Aggregate health status across all executed checks |
totalDuration |
Total time spent running the health report, in milliseconds |
checks |
Per-check results |
checks[].name |
Registered check name |
checks[].status |
Healthy, Degraded, or Unhealthy |
checks[].description |
Human-readable summary |
checks[].duration |
Per-check duration in milliseconds |
checks[].data |
Optional structured diagnostic data |
checks[].exception |
Exception message if the check failed with an exception |
DatabaseHealthCheck is the simplest check.
It calls:
_dbContext.Database.CanConnectAsync(...)Healthyif the app can connect to the configured databaseUnhealthyif it cannot connectUnhealthyif the connectivity test throws
This check does not validate:
- schema correctness
- migration state
- query latency
- transaction behavior
- specific table availability
It is strictly a connectivity check.
BackgroundWorkerHealthCheck is the most operationally detailed check in the starter.
It combines two signal sources:
- in-memory worker state from
IBackgroundWorker - persisted queue state from the
BackgroundTaskstable
The check currently includes:
localImmediateQueueLengthdbQueuedCountdbDueQueuedCountdbRunningCountdbStaleRunningCountlastHeartbeatelapsedSeconds
Degradedif the worker has not started yetUnhealthyif the worker heartbeat is staleDegradedif stale running tasks are detectedHealthyotherwise
The worker exposes LastHeartbeat, and the health check compares it against a hardcoded threshold of 2 minutes.
If the last heartbeat is older than that, readiness becomes Unhealthy.
The check currently flags a task as stale when:
Status == Running- and
StartedAtis older than 15 minutes or missing
This is used only for health reporting. It does not itself recover the task.
The worker’s actual stale-task recovery timeout is configurable through BackgroundTaskSettings.RunningTaskTimeoutMinutes, but the health check currently uses its own hardcoded 15-minute threshold.
That means the health signal can drift out of sync with real worker recovery behavior if a project changes the worker settings.
StuckEmailsHealthCheck monitors queued emails that appear not to be progressing.
It queries EmailLogs for records where:
Status == QueuedScheduledDate < now - 30 minutes
Degradedif one or more queued emails are older than the thresholdHealthyif none are staleUnhealthyif the query throws
The check currently includes:
stuckCountthresholdMinutes
The check uses ScheduledDate, not CreateDate.
That matters because requeued emails update ScheduledDate, so an old email that was just manually requeued does not immediately show up as stuck.
ASP.NET Core health checks aggregate the final endpoint status from the underlying checks.
Operationally:
- if any executed check is
Unhealthy, the overall endpoint status becomesUnhealthy - if no executed check is
Unhealthybut at least one isDegraded, the overall endpoint status becomesDegraded - otherwise the endpoint is
Healthy
That means /health/ready should be treated as the true deployment readiness surface, not just a debug endpoint.
Use /health when you only need to know whether the process is up and able to answer HTTP requests.
This is the safer endpoint for:
- basic uptime checks
- restart policies
- simple process liveness probes
This endpoint is the one you can expose most safely if you need anonymous health visibility.
Use /health/ready when you want to know whether the application is operational enough to receive traffic.
This is the better endpoint for:
- Kubernetes readiness probes
- deployment gates
- dashboards and monitoring alerts
If the probe originates off-host, make sure it sends the configured readiness header.
The current health-check implementation is strong in a few important ways:
- it separates liveness from readiness
- it keeps detailed readiness data off the public path by default
- it returns structured JSON
- it includes useful diagnostic data in readiness responses
- it checks both local worker heartbeat and persisted queue state
- it catches stuck queued emails explicitly
For a starter codebase, that is a solid operational baseline.
The most important limitations in the current implementation are:
BackgroundWorkerHealthCheck uses a hardcoded 15-minute stale-running threshold instead of reading the same setting the worker uses for recovery.
This can create false alarms if worker settings are tuned for longer-running jobs.
The background-worker readiness check currently runs four separate count queries against BackgroundTasks on every probe.
That is acceptable for a starter or moderate traffic, but it is heavier than a minimal health path.
DatabaseHealthCheck does not tell you whether the schema is correct or whether important queries are succeeding quickly.
The readiness model includes explicit email backlog monitoring, but there are no equivalent stuck-item checks yet for:
- webhook deliveries
- import tasks
- other future background job types
The current readiness protection is intentionally simple:
- local-request allow
- shared secret for non-local access
That is practical for a starter, but teams with stricter requirements may prefer:
- network allowlists
- reverse-proxy enforcement
- authenticated ops endpoints
- platform-native private health probes
For a new project using this starter:
- use
/healthfor liveness probes - use
/health/readyfor readiness probes - configure
HealthEndpoints:ReadySharedSecretin non-local environments - if your orchestrator probes from another host, make sure it sends the configured readiness header
- monitor
Degradedresponses, not justUnhealthy - inspect the
checks[].datapayloads when diagnosing worker or email issues - keep probe frequency reasonable, especially because readiness currently performs DB work
- if you tune background worker timeouts, remember that health-check stale-task thresholds are not automatically kept in sync
If a project needs a more mature operational surface later, the next logical improvements would be:
- align
BackgroundWorkerHealthCheckwithBackgroundTaskSettings - reduce DB work per readiness probe
- add stuck-item checks for other task types
- add latency or timeout-oriented DB checks
- add environment-specific readiness behavior if some dependencies are optional
The health-check system is already useful and production-shaped for a starter project. It gives clear liveness and readiness endpoints, structured results, and meaningful operational checks around the database, background worker, and queued emails.
The main caveats are that the background-worker readiness check currently duplicates one timeout threshold, does more database work than an ideal lightweight probe, and protects non-local readiness access with a simple shared-secret model. Those are worth knowing, but they do not prevent the current health-check system from being a strong baseline for new projects built on DevCoreApp.