
Multi-Process support #797

@wdcui

Description


This document describes the minimal changes to the shared litebox core
(litebox/) needed to support multiple guest processes. The design is
platform-agnostic: kernel-mode platforms (separate page tables per process)
and userland platforms (single host address space) implement the same trait
contract. POSIX-specific semantics (process groups, sessions, signals,
waitpid flags) belong in the shim layer, not the core.


1. New North Interface: Process Registry

The core introduces a process module that provides process identity and
lifecycle management. Shims build OS-specific semantics (POSIX sessions,
NT job objects, etc.) on top of these primitives.

1.1 Identity

```rust
/// Process identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct ProcessId(u32);

impl ProcessId {
    /// The first process created in every LiteBox instance.
    pub const INIT: Self = Self(1);
    pub fn new(raw: u32) -> Option<Self>;  // None if raw == 0
    pub fn as_u32(self) -> u32;
}
```
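
As a sketch of the intended semantics (an assumed implementation filling in the bodies implied by the comments; the real crate may differ):

```rust
/// Sketch: ProcessId with the bodies the declaration above implies.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct ProcessId(u32);

impl ProcessId {
    /// The first process created in every LiteBox instance.
    pub const INIT: Self = Self(1);

    /// Zero is reserved and never a valid PID.
    pub fn new(raw: u32) -> Option<Self> {
        if raw == 0 { None } else { Some(Self(raw)) }
    }

    pub fn as_u32(self) -> u32 {
        self.0
    }
}

fn main() {
    assert_eq!(ProcessId::new(0), None);
    assert_eq!(ProcessId::new(1), Some(ProcessId::INIT));
    assert_eq!(ProcessId::INIT.as_u32(), 1);
    println!("ok");
}
```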

1.2 Process context and lifecycle

```rust
/// Per-process state tracked by the core.
pub struct ProcessContext {
    pub id: ProcessId,
    /// Parent process. `None` only for the init process.
    pub parent: Option<ProcessId>,
    pub state: ProcessState,
}

pub enum ProcessState {
    Running,
    /// The process has exited. The `u32` is opaque to the core;
    /// shims assign platform-specific meaning (POSIX: waitstatus
    /// encoding; NT: NTSTATUS / DWORD exit code, etc.).
    Exited(u32),
}

/// Returned by `exit_process` so the shim can notify the parent
/// through whatever mechanism is appropriate (SIGCHLD, handle
/// signaling, etc.). The `exit_status` is the same opaque value
/// passed to `exit_process`.
pub struct ExitNotification {
    pub parent_pid: ProcessId,
    pub child_pid: ProcessId,
    pub exit_status: u32,
}

/// Errors from `create_process`.
pub enum CreateProcessError {
    /// The specified parent PID does not exist in the registry.
    NoSuchParent,
    /// A root (init) process already exists; only one is allowed.
    InitAlreadyExists,
}
```

Note: the core's ProcessContext is intentionally minimal. Shims
maintain their own per-process state alongside it (POSIX: pgid, sid,
umask, credentials, signal mask; NT: job object handle, token, etc.).

1.3 ProcessRegistry API

ProcessRegistry<M> is a concrete struct parameterized on a mutex type
(M: RawMutex). It owns a process table and an atomic PID counter.

Creation and teardown

| Method | Signature | Description |
| --- | --- | --- |
| `create_process` | `(&self, parent: Option<ProcessId>) -> Result<ProcessId, CreateProcessError>` | Allocate a PID and register the parent-child relationship. `parent = None` creates the init process (PID 1). |
| `abort_process` | `(&self, id: ProcessId)` | Remove a process that was never started (e.g., child-process setup failed after PID allocation). The process must have no children and must still be in `Running` state; panics otherwise. |
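
The creation/teardown contract can be illustrated with a minimal stand-in registry. This is a sketch using `std::sync::Mutex` and a `HashMap`, not the real `ProcessRegistry<M>`; `Entry` and the panic messages are hypothetical:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ProcessId(u32);

#[derive(Debug, PartialEq)]
enum CreateProcessError { NoSuchParent, InitAlreadyExists }

struct Entry { parent: Option<ProcessId> }

struct ProcessRegistry {
    table: Mutex<HashMap<ProcessId, Entry>>,
    next: Mutex<u32>, // stand-in for the atomic PID counter
}

impl ProcessRegistry {
    fn new() -> Self {
        Self { table: Mutex::new(HashMap::new()), next: Mutex::new(1) }
    }

    fn create_process(&self, parent: Option<ProcessId>)
        -> Result<ProcessId, CreateProcessError>
    {
        let mut table = self.table.lock().unwrap();
        match parent {
            // Only one root (init) process is allowed.
            None if !table.is_empty() => return Err(CreateProcessError::InitAlreadyExists),
            Some(p) if !table.contains_key(&p) => return Err(CreateProcessError::NoSuchParent),
            _ => {}
        }
        let mut next = self.next.lock().unwrap();
        let id = ProcessId(*next);
        *next += 1;
        table.insert(id, Entry { parent });
        Ok(id)
    }

    fn abort_process(&self, id: ProcessId) {
        let mut table = self.table.lock().unwrap();
        // The process must have no children; panic otherwise.
        let has_children = table.values().any(|e| e.parent == Some(id));
        assert!(!has_children, "abort_process: process has children");
        table.remove(&id).expect("abort_process: no such process");
    }
}

fn main() {
    let reg = ProcessRegistry::new();
    let init = reg.create_process(None).unwrap();
    assert_eq!(init, ProcessId(1));
    assert_eq!(reg.create_process(None), Err(CreateProcessError::InitAlreadyExists));
    let child = reg.create_process(Some(init)).unwrap();
    // Roll back a child whose setup failed after PID allocation.
    reg.abort_process(child);
    assert_eq!(reg.create_process(Some(child)), Err(CreateProcessError::NoSuchParent));
    println!("ok");
}
```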

Exit

| Method | Signature | Description |
| --- | --- | --- |
| `exit_process` | `(&self, id: ProcessId, status: u32, orphan_handler: impl FnMut(ProcessId)) -> Option<ExitNotification>` | Record the exit status. For each orphaned child (children whose parent is the exiting process), calls `orphan_handler` so the shim can decide the reparenting policy. Returns `Some(ExitNotification)` if the parent is still alive, `None` otherwise. |
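
A sketch of the `exit_process` semantics described above, using minimal stand-in types rather than the real core types:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ProcessId(u32);

#[derive(Debug, Clone, Copy, PartialEq)]
enum ProcessState { Running, Exited(u32) }

struct Entry { parent: Option<ProcessId>, state: ProcessState }

struct ExitNotification {
    parent_pid: ProcessId,
    child_pid: ProcessId,
    exit_status: u32,
}

struct ProcessRegistry { table: Mutex<HashMap<ProcessId, Entry>> }

impl ProcessRegistry {
    fn exit_process(
        &self,
        id: ProcessId,
        status: u32,
        mut orphan_handler: impl FnMut(ProcessId),
    ) -> Option<ExitNotification> {
        let mut table = self.table.lock().unwrap();
        // Tell the shim about every child being orphaned; the shim
        // decides the reparenting policy.
        let orphans: Vec<ProcessId> = table
            .iter()
            .filter(|(_, e)| e.parent == Some(id))
            .map(|(&pid, _)| pid)
            .collect();
        for pid in &orphans {
            orphan_handler(*pid);
        }
        let parent = table.get(&id)?.parent;
        table.get_mut(&id).unwrap().state = ProcessState::Exited(status);
        // Notify only if the parent is still alive.
        match parent {
            Some(p) if matches!(table.get(&p), Some(e) if e.state == ProcessState::Running) => {
                Some(ExitNotification { parent_pid: p, child_pid: id, exit_status: status })
            }
            _ => None,
        }
    }
}

fn main() {
    let reg = ProcessRegistry { table: Mutex::new(HashMap::new()) };
    {
        let mut t = reg.table.lock().unwrap();
        t.insert(ProcessId(1), Entry { parent: None, state: ProcessState::Running });
        t.insert(ProcessId(2), Entry { parent: Some(ProcessId(1)), state: ProcessState::Running });
        t.insert(ProcessId(3), Entry { parent: Some(ProcessId(2)), state: ProcessState::Running });
    }
    let mut orphans = Vec::new();
    let note = reg.exit_process(ProcessId(2), 0, |pid| orphans.push(pid));
    assert_eq!(orphans, vec![ProcessId(3)]); // PID 3 was orphaned
    let note = note.expect("parent (init) is alive");
    assert_eq!(note.parent_pid, ProcessId(1));
    assert_eq!(note.child_pid, ProcessId(2));
    println!("ok");
}
```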

Queries

| Method | Signature | Description |
| --- | --- | --- |
| `with_context` | `(&self, id: ProcessId, f: FnOnce(&ProcessContext) -> R) -> Option<R>` | Read process context through a closure (avoids exposing the internal lock). Returns `None` if the process does not exist. |
| `is_alive` | `(&self, id: ProcessId) -> bool` | Convenience: returns `true` if the process exists and is in `Running` state. |
| `get_parent` | `(&self, id: ProcessId) -> Option<ProcessId>` | Parent PID. |
| `get_children` | `(&self, id: ProcessId) -> Option<Vec<ProcessId>>` | Child PIDs. |
| `process_count` | `(&self) -> usize` | Total running processes. |
| `remove_process` | `(&self, id: ProcessId)` | Remove an exited process from the table. Panics if the process is still running. |
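
The closure-based `with_context` pattern can be sketched as follows (stand-in types; the point is that the internal lock guard never escapes to the caller):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ProcessId(u32);

#[derive(Debug, Clone, Copy, PartialEq)]
enum ProcessState { Running, Exited(u32) }

struct ProcessContext {
    id: ProcessId,
    parent: Option<ProcessId>,
    state: ProcessState,
}

struct ProcessRegistry { table: Mutex<HashMap<ProcessId, ProcessContext>> }

impl ProcessRegistry {
    /// Run `f` against the context while holding the internal lock,
    /// so no lock guard ever escapes to the caller.
    fn with_context<R>(&self, id: ProcessId, f: impl FnOnce(&ProcessContext) -> R) -> Option<R> {
        let table = self.table.lock().unwrap();
        table.get(&id).map(f)
    }

    /// Convenience built on with_context.
    fn is_alive(&self, id: ProcessId) -> bool {
        self.with_context(id, |ctx| ctx.state == ProcessState::Running)
            .unwrap_or(false)
    }
}

fn main() {
    let reg = ProcessRegistry { table: Mutex::new(HashMap::new()) };
    reg.table.lock().unwrap().insert(
        ProcessId(1),
        ProcessContext { id: ProcessId(1), parent: None, state: ProcessState::Running },
    );
    assert_eq!(reg.with_context(ProcessId(1), |ctx| ctx.parent), Some(None));
    assert!(reg.is_alive(ProcessId(1)));
    assert!(!reg.is_alive(ProcessId(9))); // nonexistent process
    println!("ok");
}
```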

Exit observation

```rust
/// Shared handle for observing a process's exit.
///
/// `exited` becomes `true` when the process exits. `subject` is
/// notified with readiness events so shims can integrate with their
/// event loop. `Subject` and `Events` are existing litebox core
/// abstractions for event-driven readiness notification; they are
/// not tied to any specific OS event model.
///
/// If `remove_process` is called while an observer is held, the
/// `AtomicBool` and `Subject` remain valid (they are `Arc`-backed)
/// but no further events will be delivered.
pub struct ProcessExitObserver<M: RawMutex> {
    pub exited: Arc<AtomicBool>,
    pub subject: Arc<Subject<Events, Events, M>>,
}
```
| Method | Signature | Description |
| --- | --- | --- |
| `exit_observer` | `(&self, id: ProcessId) -> Option<ProcessExitObserver<M>>` | Obtain a shared exit-observation handle for the given process. |
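
A sketch of the `Arc`-backed lifetime guarantee, with the `Subject` half omitted since it is an existing litebox abstraction:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Stand-in for the observer, keeping only the flag half.
struct ProcessExitObserver { exited: Arc<AtomicBool> }

fn main() {
    // The registry and the observer share one flag.
    let registry_copy = Arc::new(AtomicBool::new(false));
    let observer = ProcessExitObserver { exited: Arc::clone(&registry_copy) };
    assert!(!observer.exited.load(Ordering::Acquire));

    // exit_process would flip the flag (and notify the subject).
    registry_copy.store(true, Ordering::Release);
    assert!(observer.exited.load(Ordering::Acquire));

    // remove_process can drop the registry's Arc; the observer's
    // copy keeps the flag alive.
    drop(registry_copy);
    assert!(observer.exited.load(Ordering::Acquire));
    println!("ok");
}
```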

1.4 LiteBox integration

LiteBox owns a ProcessRegistry and creates the init process (PID 1)
during construction.

```rust
impl LiteBox<Platform> {
    pub fn process_registry(&self) -> &ProcessRegistry<Platform::RawMutex>;
}
```

2. New South Interface: AddressSpaceProvider

The core requires platforms to implement address-space management via the
AddressSpaceProvider trait, added to the Provider supertrait.

2.1 Address space kind

```rust
/// Platform-wide property: are address spaces isolated or shared?
pub enum AddressSpaceKind {
    /// Each address space has independent memory (e.g., kernel page
    /// tables, separate host processes). The platform handles memory
    /// isolation; the shim does not need to manage CoW.
    Isolated,
    /// Address spaces share the same host memory (e.g., VA partitions
    /// in a single userland process). The shim is responsible for
    /// copy-on-write or other memory separation.
    SharedMemory,
}
```

2.2 Trait definition

```rust
pub trait AddressSpaceProvider {
    type AddressSpaceId: Copy + Eq + Send + Sync + Hash + Debug;

    /// Platform-wide: are address spaces isolated or shared?
    const ADDRESS_SPACE_KIND: AddressSpaceKind;

    /// Create a new, empty address space.
    fn create_address_space(&self)
        -> Result<Self::AddressSpaceId, AddressSpaceError>;

    /// Destroy an address space, releasing all resources.
    fn destroy_address_space(&self, id: Self::AddressSpaceId)
        -> Result<(), AddressSpaceError>;

    /// Make `id` the active address space for the current thread.
    ///
    /// Activation is thread-local: each thread independently tracks
    /// its active address space. Multiple threads may be active in
    /// different address spaces concurrently.
    ///
    /// On kernel platforms this switches page tables (e.g., CR3).
    /// On userland platforms this may be a no-op if all address spaces
    /// are accessible from any thread.
    ///
    /// The caller is responsible for eventually switching to a
    /// different address space (there is no separate "deactivate"
    /// operation -- deactivation is simply activating another space).
    /// Prefer `with_address_space` for scoped activation.
    fn activate_address_space(&self, id: Self::AddressSpaceId)
        -> Result<(), AddressSpaceError>;

    /// Execute `f` with the given address space active, then restore
    /// the previously active address space. Implementations must
    /// restore the prior state even if `f` panics.
    fn with_address_space<R>(
        &self,
        id: Self::AddressSpaceId,
        f: impl FnOnce() -> R,
    ) -> Result<R, AddressSpaceError>;

    /// Return the VA range available to the given address space.
    ///
    /// Used by the shim to scope memory operations (e.g., mmap, brk)
    /// to the correct region for this process.
    fn address_space_range(&self, id: Self::AddressSpaceId)
        -> Result<Range<usize>, AddressSpaceError>;
}
```

activate_address_space exists separately from with_address_space
because some call sites need to switch address spaces for an extended
period (e.g., entering guest execution) where scoped RAII is impractical.
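
A minimal userland-style sketch of the scoped-activation contract, assuming a thread-local "active space" and a drop guard so the prior space is restored even if `f` panics. `DummyProvider`, the `u64` id, and the infallible signatures are stand-ins, not the trait above:

```rust
use std::cell::Cell;

thread_local! {
    // 0 = "no address space active" in this sketch.
    static ACTIVE: Cell<u64> = Cell::new(0);
}

struct DummyProvider;

impl DummyProvider {
    /// Unscoped switch, for extended periods (e.g., guest execution).
    fn activate_address_space(&self, id: u64) {
        ACTIVE.with(|a| a.set(id));
    }

    /// Scoped activation: restore the previous space even if `f`
    /// unwinds, via a drop guard.
    fn with_address_space<R>(&self, id: u64, f: impl FnOnce() -> R) -> R {
        struct Restore(u64);
        impl Drop for Restore {
            fn drop(&mut self) {
                ACTIVE.with(|a| a.set(self.0));
            }
        }
        let prev = ACTIVE.with(|a| a.get());
        let _restore = Restore(prev); // runs on normal return and on panic
        ACTIVE.with(|a| a.set(id));
        f()
    }
}

fn main() {
    let p = DummyProvider;
    p.activate_address_space(7);
    // Inside the closure, space 42 is active...
    let seen = p.with_address_space(42, || ACTIVE.with(|a| a.get()));
    assert_eq!(seen, 42);
    // ...and the previously active space is restored afterwards.
    assert_eq!(ACTIVE.with(|a| a.get()), 7);
    println!("ok");
}
```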

2.3 Errors

```rust
pub enum AddressSpaceError {
    NoSpace,
    InvalidId,
    NotSupported,
}
```

AddressSpaceProvider is added to the existing Provider supertrait
so all platforms must implement it.


3. Existing Core Internals Made Multi-Process Friendly

The following existing core subsystems require targeted changes to support
multiple processes. These are internal adaptations, not new public
interfaces.

3.1 File descriptors

Each process gets its own RawDescriptorStorage mapping guest descriptor
numbers to entries in the global Descriptors table. Multiple processes
can share the same underlying descriptor entry (via Arc) when a
descriptor is duplicated across process boundaries.

  • Single-descriptor duplication -- Descriptors::duplicate_descriptor()
    (new method) creates a new slot sharing the same Arc<DescriptorEntry>
    as the source. This is the primitive that shims use to pass descriptors
    between processes. Which descriptors are duplicated, and when, is a
    shim policy decision.
  • Ref-counting hooks -- FdEnabledSubsystemEntry gains on_dup()
    and on_close() callbacks so subsystems can track how many descriptor
    references exist across all processes. These fire on any
    duplication/close regardless of the reason (dup, inheritance, explicit
    close, process exit).

3.2 Pipes

Pipe write ends gain a reference count (AtomicUsize), incremented by
on_dup() and decremented by on_close(). This lets the pipe subsystem
detect when all writers across all processes have closed, triggering EOF
on the read end. Without this, a reader in one process could block
forever waiting for data from a writer that was only held open by a
now-exited sibling process.
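
The writer reference count can be sketched like this. `PipeWriteEnd` is a stand-in, not the real pipe subsystem; the hook names mirror the description above:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for a pipe's write end with a cross-process writer count.
struct PipeWriteEnd { writers: AtomicUsize }

impl PipeWriteEnd {
    /// Created with one writer (the creating process).
    fn new() -> Self {
        Self { writers: AtomicUsize::new(1) }
    }

    /// Called from the subsystem's on_dup() hook.
    fn on_dup(&self) {
        self.writers.fetch_add(1, Ordering::Relaxed);
    }

    /// Called from the on_close() hook. Returns true when the last
    /// writer closed, i.e., the read end should now observe EOF.
    fn on_close(&self) -> bool {
        self.writers.fetch_sub(1, Ordering::AcqRel) == 1
    }
}

fn main() {
    let w = PipeWriteEnd::new();
    w.on_dup();             // write end duplicated into a child process
    assert!(!w.on_close()); // parent closes its copy: not EOF yet
    assert!(w.on_close());  // child exits and closes: EOF on read end
    println!("ok");
}
```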

3.3 Futex

FutexManager::wait() and wake() gain an address_space_id: u64
parameter. FutexManager is not generic over the platform provider (it
is a self-contained synchronization primitive), so it cannot use the
platform's AddressSpaceId associated type directly. Callers convert
their AddressSpaceId to u64 (e.g., via a numeric cast or by using
the Hash impl). The conversion must be injective -- distinct address
spaces must produce distinct u64 values.

The bucket hash and entry matching include this discriminator to prevent
false aliasing when a kernel-mode platform has overlapping VA ranges
across processes. Userland platforms where VA ranges never overlap pass
a constant 0.
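
A sketch of why the discriminator prevents false aliasing: keying wait queues by `(address_space_id, va)` keeps same-VA waiters in different processes on separate queues. `FutexKey` is a hypothetical stand-in for the internal bucket key:

```rust
use std::collections::HashMap;

/// Stand-in futex key: address-space discriminator plus guest VA.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct FutexKey { address_space_id: u64, va: usize }

fn main() {
    let mut queues: HashMap<FutexKey, Vec<&str>> = HashMap::new();

    // Kernel-mode platform: two processes with overlapping VA ranges.
    let a = FutexKey { address_space_id: 1, va: 0x1000 };
    let b = FutexKey { address_space_id: 2, va: 0x1000 };
    queues.entry(a).or_default().push("waiter-in-proc-1");
    queues.entry(b).or_default().push("waiter-in-proc-2");

    // Same VA, different address space: distinct queues, no aliasing.
    assert_ne!(a, b);
    assert_eq!(queues.len(), 2);

    // Userland platform with non-overlapping VAs: constant 0 suffices,
    // because the VA alone already distinguishes the queues.
    let c = FutexKey { address_space_id: 0, va: 0x2000 };
    let d = FutexKey { address_space_id: 0, va: 0x3000 };
    assert_ne!(c, d);
    println!("ok");
}
```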


4. Guidance for Shim Implementors

This section collects expectations and responsibilities that fall on
the shim layer rather than the core.

4.1 Process creation is a shim-level composition

The core does not provide a single "fork" or "spawn" operation.
Creating a child process is a shim-level composition of core primitives:

  1. ProcessRegistry::create_process(Some(parent)) -- allocate a PID
  2. AddressSpaceProvider::create_address_space() -- create memory context
  3. Duplicate descriptors as needed via Descriptors::duplicate_descriptor()
  4. Populate memory (platform-specific: CoW, copy, or load from executable)
  5. Associate the AddressSpaceId with the ProcessId in the shim's own
    per-process state

If any step fails, the shim calls abort_process to roll back step 1
and destroy_address_space to roll back step 2.

The binding between ProcessId and AddressSpaceId is owned by the
shim, not the core. Different shims may store this association
differently (e.g., in a per-process struct, a side table, thread-local
state).
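
The composition and rollback above can be sketched as follows. `Core`, `spawn_child`, and every method body here are hypothetical stand-ins for the core primitives listed in the steps, collapsed into one struct for brevity:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ProcessId(u32);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AddressSpaceId(u32);

/// Stand-in for ProcessRegistry + AddressSpaceProvider + Descriptors.
struct Core;

impl Core {
    fn create_process(&self, _parent: Option<ProcessId>) -> Result<ProcessId, ()> { Ok(ProcessId(2)) }
    fn abort_process(&self, _id: ProcessId) {}
    fn create_address_space(&self) -> Result<AddressSpaceId, ()> { Ok(AddressSpaceId(1)) }
    fn destroy_address_space(&self, _id: AddressSpaceId) {}
    fn duplicate_descriptors(&self, _from: ProcessId, _to: ProcessId) -> Result<(), ()> { Ok(()) }
    fn populate_memory(&self, _aspace: AddressSpaceId) -> Result<(), ()> { Ok(()) }
}

/// Shim-level child creation: compose the core primitives, rolling
/// back earlier steps if a later one fails.
fn spawn_child(core: &Core, parent: ProcessId) -> Result<(ProcessId, AddressSpaceId), ()> {
    let pid = core.create_process(Some(parent))?;            // step 1
    let aspace = match core.create_address_space() {         // step 2
        Ok(a) => a,
        Err(e) => {
            core.abort_process(pid); // roll back step 1
            return Err(e);
        }
    };
    let later_steps = core
        .duplicate_descriptors(parent, pid)                  // step 3
        .and_then(|_| core.populate_memory(aspace));         // step 4
    if let Err(e) = later_steps {
        core.destroy_address_space(aspace); // roll back step 2
        core.abort_process(pid);            // roll back step 1
        return Err(e);
    }
    // Step 5: the shim records the (pid, aspace) binding in its own
    // per-process state; returning the pair stands in for that here.
    Ok((pid, aspace))
}

fn main() {
    let core = Core;
    let (pid, aspace) = spawn_child(&core, ProcessId(1)).unwrap();
    assert_eq!(pid, ProcessId(2));
    assert_eq!(aspace, AddressSpaceId(1));
    println!("ok");
}
```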

4.2 Descriptor cleanup on process exit

The core does not automatically close a process's descriptors when
exit_process is called. The shim is responsible for closing all
descriptors belonging to an exiting process (triggering on_close()
hooks for proper ref-count bookkeeping) either before or after calling
exit_process.

4.3 Orphan reparenting policy

When a process exits, the core calls the shim-provided orphan_handler
for each orphaned child. The shim decides what to do:

  • POSIX shim: reparent orphans to PID 1 (the init process)
  • NT shim: detach orphans (no parent)
  • Other shims may implement alternative policies
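
For example, a POSIX-style shim might pass an `orphan_handler` that rewrites its own parent table. `shim_parent` is a hypothetical shim-side structure (the core keeps its own parent links; this is the shim's view for wait()/SIGCHLD bookkeeping):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ProcessId(u32);
const INIT: ProcessId = ProcessId(1);

fn main() {
    // Shim-side parent table: PID 3 is currently a child of PID 2.
    let mut shim_parent: HashMap<ProcessId, ProcessId> = HashMap::new();
    shim_parent.insert(ProcessId(3), ProcessId(2));

    // Closure the shim would pass to exit_process(ProcessId(2), ...):
    let mut orphan_handler = |orphan: ProcessId| {
        shim_parent.insert(orphan, INIT); // POSIX policy: reparent to init
    };
    // The core invokes the handler once per orphaned child.
    orphan_handler(ProcessId(3));

    assert_eq!(shim_parent[&ProcessId(3)], INIT);
    println!("ok");
}
```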

4.4 Threading model

The core's process registry tracks processes only. Each process may have
one or more execution contexts (threads), but thread identity and
scheduling are managed by the shim and platform layers, not by the
process registry.

Labels: layer-litebox (focusing on the main litebox crate itself), rfc (request for comments on design)