
Conversation

@Sanchit2662

Summary

This PR fixes a silent container state corruption issue in urunc where a container could be marked as running even though the VMM process never started.

The root cause was an ordering issue in the IPC protocol: StartSuccess was sent to urunc start before vmm.Execve() was executed. If Execve() failed afterward, the failure could no longer be reported, leaving the container in a ghost running state with no underlying process.

This change introduces a small, bounded error-checking window to ensure fast Execve() failures are correctly surfaced without redesigning the IPC protocol.
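
For illustration only, here is a minimal, self-contained Go sketch of the problematic ordering. This is not the actual urunc code: the message send is replaced by a print and the VMM path is a placeholder; it only shows why a failure after the success message can no longer be surfaced.

package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	// Stand-in for SendMessage(StartSuccess): the parent now treats the container as running.
	fmt.Println("StartSuccess")

	// Stand-in for vmm.Execve(): syscall.Exec only returns if it fails.
	err := syscall.Exec("/path/to/vmm", []string{"vmm"}, os.Environ())

	// Only reached on failure, and by then success has already been reported.
	fmt.Fprintln(os.Stderr, "exec failed after success was sent:", err)
}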


Fix

Instead of immediately proceeding after receiving StartSuccess, urunc start now performs a short error-check window.

A new helper waits briefly after success to catch a fast-following StartErr, which reliably indicates an Execve() failure.

Key addition:

err := AwaitMessageWithErrorCheck(
    listener,
    StartSuccess,
    StartErr,
    execveErrorCheckTimeout, // 100ms
)

How it works

  1. urunc start receives StartSuccess
  2. It waits up to 100ms for a potential StartErr
  3. If StartErr arrives → start fails and reports the error
  4. If the timeout expires or the connection closes → proceed normally (successful exec)

This preserves the existing process model while closing the false-success window.
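
For readers unfamiliar with the helper, here is a rough, self-contained sketch of how such a two-phase wait could be structured. The actual PR implementation is not reproduced here; the package name, listener type, acceptAndRead helper, and message framing are assumptions chosen only to illustrate the idea of blocking for the first message and then applying a short deadline for a possible second one.

package ipcsketch // hypothetical package, illustration only

import (
	"errors"
	"fmt"
	"io"
	"net"
	"strings"
	"time"
)

// AwaitMessageWithErrorCheck blocks for the first message, then waits up to
// `window` for a fast-following error message. Hypothetical sketch only; the
// real urunc message handling may differ.
func AwaitMessageWithErrorCheck(l *net.UnixListener, success, failure string, window time.Duration) error {
	// Phase 1: block until the first message (expected StartSuccess) arrives.
	msg, err := acceptAndRead(l)
	if err != nil {
		return err
	}
	if msg != success {
		return fmt.Errorf("received unexpected message: %s", msg)
	}

	// Phase 2: give a possible StartErr a short window to arrive.
	if err := l.SetDeadline(time.Now().Add(window)); err != nil {
		return err
	}
	defer l.SetDeadline(time.Time{}) // clear the deadline again

	msg, err = acceptAndRead(l)
	var nerr net.Error
	if errors.As(err, &nerr) && nerr.Timeout() {
		return nil // window expired with no error message: treat the exec as successful
	}
	if err != nil {
		return nil // connection closed or nothing more to read: proceed normally
	}
	if msg == failure {
		return fmt.Errorf("vmm exec failed: received %s", msg)
	}
	return nil
}

// acceptAndRead accepts one connection and reads a single message from it
// (illustrative helper, not part of urunc).
func acceptAndRead(l *net.UnixListener) (string, error) {
	conn, err := l.Accept()
	if err != nil {
		return "", err
	}
	defer conn.Close()
	data, err := io.ReadAll(conn)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(data)), nil
}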


Impact

Before

  • Containers could be marked running with no VMM
  • Failures were silent and delayed
  • Kubernetes Pods appeared healthy but never executed
  • Manual intervention was required to clean up corrupted state

After

  • Container state accurately reflects VMM startup reality
  • Fast Execve() failures are reliably reported
  • No more ghost “running” containers
  • No observable impact on successful startups (timeout only matters on failure paths)

Signed-off-by: Sanchit2662 <sanchit2662@gmail.com>
@netlify

netlify bot commented Jan 25, 2026

Deploy Preview for urunc canceled.

Name                  Link
🔨 Latest commit      f736d72
🔍 Latest deploy log  https://app.netlify.com/projects/urunc/deploys/697563564e3ed1000843bf52

Contributor

@cmainas left a comment


Hello @Sanchit2662,

Thank you for this contribution. However, I am not really sure I understand how your proposal works. How does a process catch execve failures from another process through a socket?

return fmt.Errorf("received unexpected message: %s", msg)
}

// Wait briefly for potential error after success (catches Execve failures)
Contributor


How exactly does this work? Which process writes the message in case of an execve failure?

Author


The reexec process is responsible for writing the error.

When syscall.Exec() fails and returns, the reexec process sends StartErr over the socket before exiting:

err = unikontainer.Exec(metrics)
if err != nil {
    unikontainer.SendMessage(unikontainers.StartErr)
}

The parent process (urunc start) does not detect the failure directly; it only observes the error via the socket message sent by the reexec process.
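
For concreteness, a minimal sketch of that write on the reexec side could look like the following. The package and function names, socket-path handling, and message framing are assumptions for illustration, not the actual urunc helpers; the point is that the failure report is an ordinary connect-and-write from the still-alive reexec process.

package ipcsketch // hypothetical package, illustration only

import "net"

// reportStartErr connects to the socket that urunc start is listening on and
// writes the StartErr message. The reexec process can only reach this point
// because syscall.Exec returned an error instead of replacing the process image.
func reportStartErr(socketPath string) error {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return err
	}
	defer conn.Close()

	_, err = conn.Write([]byte("StartErr"))
	return err
}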

@Sanchit2662
Author

> Hello @Sanchit2662,
>
> Thank you for this contribution. However, I am not really sure I understand how your proposal works. How does a process catch execve failures from another process through a socket?

Hello @cmainas,
This is handled through explicit signaling from the reexec process over the existing Unix socket between the two processes.

The reexec process sends a StartSuccess message before calling syscall.Exec(). If execve succeeds, the process image is replaced, no further Go code executes, and the socket is closed by the kernel. If execve fails, syscall.Exec() returns an error to the reexec process, which then sends a StartErr message over the socket.

On the parent side, urunc start uses AwaitMessageWithErrorCheck, which first receives StartSuccess and then waits briefly (with a timeout) for a possible StartErr. If StartErr is received during this window, it is treated as an execve failure.

@cmainas
Contributor

cmainas commented Jan 27, 2026

Hello @Sanchit2662

> The reexec process sends a StartSuccess message before calling syscall.Exec(). If execve succeeds, the process image is replaced, no further Go code executes, and the socket is closed by the kernel. If execve fails, syscall.Exec() returns an error to the reexec process, which then sends a StartErr message over the socket.

I do not think this is correct. First of all, the socket has been opened by urunc start, which waits for connections. The reexec process simply connects to the socket, sends a message, and closes the connection. Whatever happens to the reexec process after that does not affect the socket at all.

> On the parent side, urunc start uses AwaitMessageWithErrorCheck, which first receives StartSuccess and then waits briefly (with a timeout) for a possible StartErr. If StartErr is received during this window, it is treated as an execve failure.

This is also incorrect. urunc start waits for a specific message that no other process sends. After a successful execve there is no process left that can send the StartErr message.
