fix(aws-vault-proxy): add a cli flag to check the running service#307
fix(aws-vault-proxy): add a cli flag to check the running service#307gdamjan wants to merge 1 commit intoByteNess:mainfrom
Conversation
|
I can't see where the "Validate PR" fails now? |
|
I think double colon might be confusing the check, because strictly features go in brackets 🤔 |
ahh, thanks |
contrib/_aws-vault-proxy/main.go
Outdated
| req, err := http.NewRequest("HEAD", "http://127.0.0.1:80/", nil) | ||
| if err != nil { | ||
| os.Exit(1) | ||
| } | ||
| _, err = http.DefaultClient.Do(req) | ||
| if err != nil { | ||
| os.Exit(1) | ||
| } | ||
| } |
There was a problem hiding this comment.
Could I suggest an improvemnt here and add timeout without building a body:
| req, err := http.NewRequest("HEAD", "http://127.0.0.1:80/", nil) | |
| if err != nil { | |
| os.Exit(1) | |
| } | |
| _, err = http.DefaultClient.Do(req) | |
| if err != nil { | |
| os.Exit(1) | |
| } | |
| } | |
| client := &http.Client{ | |
| Timeout: 2 * time.Second, | |
| } | |
| resp, err := client.Get("http://127.0.0.1:8080/health") | |
| if err != nil { | |
| os.Exit(1) | |
| } | |
| defer resp.Body.Close() | |
| if resp.StatusCode != http.StatusOK { | |
| os.Exit(1) | |
| } | |
| os.Exit(0) |
There was a problem hiding this comment.
it'll need to be port 80 though, otherwise off-course
There was a problem hiding this comment.
Ah, of course - my typo here!
There was a problem hiding this comment.
some concerns,
- the timeout is actually handled by the docker runtime itself, so if the test command doesn't exit in time, it'll consider it unhealthy and stop it.
- setting up 2 seconds might not be what docker users would want?
- signaling unhealthiness on
resp.StatusCode != http.StatusOKis not great, since the proxy can be alive and well but theGET /healthwould return502or404depending what's outside the proxy. We don't need to ultimately restart the proxy in that case IMHO
There was a problem hiding this comment.
hm, I was actually thinking about that and I think the question is - is the proxy useful if noting is responding? Do we need a health check if this check is irrelevant or we don't want to restart.
On the 2s, that was just an example and happy to change whatever feel a better fit here.
There was a problem hiding this comment.
Thanks @gdamjan . Looks look, the only thing I wonder if we should still add timeout on the connection. I don't have strong opinion on what it should be, but I'd be interested in your thoughts.
now that we build the aws-vault-proxy docker image from **scratch**, the image doesn't contain pgrep, or curl or any other executables. implement a healthcheck command in the same executable, that sends a http request to itself on localhost, and assume healthy for any http response. if the proxy process is stuck, then it'll become unhealthy. can be tested by running: ``` docker compose kill --signal STOP aws-vault-proxy ``` which will make the container "stuck". docker will soon enough (after the default timeouts and retries) mark the service/container unhealthy. Unstuck it with: ``` docker compose kill --signal CONT aws-vault-proxy ```
9d976ef to
cba4ca5
Compare
now that we build the aws-vault-proxy docker image from scratch, the image doesn't contain pgrep, or curl or any other executables.
implement a healthcheck command in the same executable, that sends a http request to itself on localhost, and assume healthy for any http response. if the proxy process is stuck, then it'll become unhealthy.
can be tested by running:
which will make the container "stuck". docker will soon enough (after the default timeouts and retries) mark the service/container unhealthy.
Unstuck it with: