
chore(orch): log the process state #2372

Merged
jakubno merged 3 commits into main from chore/log-process-state on Apr 14, 2026

jakubno (Member) commented Apr 13, 2026

Log the process state even when the process is in a state other than D (uninterruptible sleep).

cursor bot commented Apr 13, 2026

PR Summary

Medium Risk
Touches Firecracker process shutdown/kill behavior and adds a new gopsutil-based status lookup; mistakes here could cause missed/extra SIGKILLs or reduced observability during sandbox teardown. Change is small but in a reliability-critical path.

Overview
Replaces the ps-based process state probe with a gopsutil-based getProcessStatus helper. Stop() now captures and logs the Firecracker process status right before issuing SIGKILL after the 10s SIGTERM grace period, and handles the "process already exited" case via process.ErrorProcessNotRunning.
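
The flow described in the overview can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the hooks struct, the stop function, and errNotRunning (a stand-in for gopsutil's process.ErrorProcessNotRunning) are hypothetical names, and the signal delivery and status lookup are injected as functions so the sketch runs without a real Firecracker process.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotRunning is a hypothetical stand-in for gopsutil's
// process.ErrorProcessNotRunning.
var errNotRunning = errors.New("process does not exist")

// hooks bundles the operations the flow needs. All fields are illustrative.
type hooks struct {
	term   func() error                     // deliver SIGTERM
	kill   func() error                     // deliver SIGKILL
	status func() (string, error)           // look up the current process status
	logf   func(format string, args ...any) // emit a log line
}

// stop sends SIGTERM and waits up to grace for exited to be closed.
// If the process is still around after the grace period, it looks up a
// fresh status, logs it, and sends SIGKILL. It reports whether SIGKILL
// was attempted.
func stop(h hooks, exited <-chan struct{}, grace time.Duration) (bool, error) {
	if err := h.term(); err != nil {
		return false, fmt.Errorf("send SIGTERM: %w", err)
	}

	timer := time.NewTimer(grace)
	defer timer.Stop()

	select {
	case <-exited:
		return false, nil // exited within the grace period
	case <-timer.C:
	}

	// A status captured before the wait would be stale, so look it up now.
	status, err := h.status()
	if err != nil && !errors.Is(err, errNotRunning) {
		h.logf("failed to get process status before SIGKILL: %v", err)
	}
	h.logf("sending SIGKILL, status=%q", status)

	if err := h.kill(); err != nil {
		return true, fmt.Errorf("send SIGKILL: %w", err)
	}
	return true, nil
}

func main() {
	var logs []string
	h := hooks{
		term:   func() error { return nil },
		kill:   func() error { return nil },
		status: func() (string, error) { return "sleep", nil }, // illustrative value
		logf:   func(f string, a ...any) { logs = append(logs, fmt.Sprintf(f, a...)) },
	}
	never := make(chan struct{}) // simulates a process that ignores SIGTERM
	killed, err := stop(h, never, 10*time.Millisecond)
	fmt.Println(killed, err, logs)
}
```

Treating the "not running" error as non-alarming keeps the log quiet when the process exits on its own just as the grace period ends. Whether to return early in that case instead of still sending SIGKILL is a separate design choice.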

Reviewed by Cursor Bugbot for commit e9e7190.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6dc71b4644

Comment thread packages/orchestrator/pkg/sandbox/fc/process.go
logger.L().Info(ctx, "fc process is in the D state after we call SIGKILL", logger.WithSandboxID(p.files.SandboxID))
}

logger.L().Info(ctx, "sent SIGKILL to fc process", logger.WithSandboxID(p.files.SandboxID), zap.String("state", state))
Contributor

I would guess that getProcessState() will most commonly return an error, because the process has been killed and reaped and no longer exists (the kill syscall returns once the signal has been delivered). I would therefore expect state to be "" in most cases. I suggest changing it to something like this:

state, err := getProcessState(ctx, p.cmd.Process.Pid)
if err != nil {
	// Undecided what exactly to do here: `err` is mostly a good sign at this point,
	// so the original `Warn()` is a bit misleading.
} else {
	logger.L().Info(ctx, "sent SIGKILL to fc process", logger.WithSandboxID(p.files.SandboxID), zap.String("state", state))
}

arkamar (Contributor) commented Apr 13, 2026

I don't like the getProcessState() function. Spawning an extra process to read one letter from the /proc/<pid>/stat file is overkill. gopsutil has StatusWithContext(), but it maps the Linux state letters to strings: https://github.com/shirou/gopsutil/blob/c2a1624b9f3ed0b38ad67134b93397142ed67a23/process/process.go#L609
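
For illustration, reading that one letter directly could look something like the sketch below. It is not code from the PR: parseStatState and readProcessState are hypothetical helper names, and the sketch assumes the Linux procfs layout, where the state is the first field after the parenthesized command name.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// parseStatState extracts the one-letter process state from the contents
// of /proc/<pid>/stat. The command name (second field) is wrapped in
// parentheses and may itself contain spaces or parentheses, so the state
// is located relative to the last ')' instead of by counting fields.
func parseStatState(stat string) (byte, error) {
	end := strings.LastIndexByte(stat, ')')
	if end < 0 {
		return 0, errors.New("malformed stat line: no closing parenthesis")
	}
	fields := strings.Fields(stat[end+1:])
	if len(fields) == 0 || len(fields[0]) != 1 {
		return 0, errors.New("malformed stat line: missing state field")
	}
	return fields[0][0], nil
}

// readProcessState reads the state of pid from procfs (Linux only).
// An error satisfying errors.Is(err, os.ErrNotExist) means the process
// no longer exists.
func readProcessState(pid int) (byte, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return 0, err
	}
	return parseStatState(string(data))
}

func main() {
	state, err := readProcessState(os.Getpid())
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("state: %c\n", state)
}
```

This avoids spawning ps and returns the raw kernel letter (for example D), at the cost of being Linux-specific.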

jakubno requested a review from arkamar on April 13, 2026, 16:56
arkamar (Contributor) left a comment

It is probably OK to leave it as it is, but I dropped the comment for the record. The chance that the process exits literally right after the timer fires should be close to zero.

// captured above is 10s stale and no longer useful here.
status, stateErr := getProcessStatus(p.cmd.Process.Pid)
if stateErr != nil && !errors.Is(stateErr, process.ErrorProcessNotRunning) {
logger.L().Warn(ctx, "failed to get fc process status before SIGKILL", zap.Error(stateErr), logger.WithSandboxID(p.files.SandboxID))
Contributor

Should we return here? The kill will fail as well, because the process is not running anymore.

@jakubno jakubno merged commit da90c8f into main Apr 14, 2026
45 checks passed
@jakubno jakubno deleted the chore/log-process-state branch April 14, 2026 12:31