Merged
6 changes: 6 additions & 0 deletions AGENTS.md
@@ -32,18 +32,21 @@ pup/
Pup is built with Deno. Key commands are defined in `deno.json`:

### Formatting and Linting

```bash
deno fmt # Format code
deno fmt --check # Check formatting
deno lint # Lint code
```

### Testing

```bash
deno test --allow-read --allow-write --allow-env --allow-net --allow-sys --allow-run --coverage=cov_profile
```

### Build Tasks

```bash
deno task check # Run format, lint, and tests
deno task build-schema # Generate JSON schema
@@ -54,6 +57,7 @@ deno task build # Complete build process
## Pre-commit Checks

The project uses GitHub Actions for CI (`.github/workflows/deno.yaml`):

- Format checking (`deno fmt --check`)
- Linting (`deno lint`)
- Full test suite with coverage
@@ -66,13 +70,15 @@ Before submitting PRs, run `deno task check` locally to ensure all checks pass.
Pup is part of an ecosystem of packages available on JSR:

### Core Dependencies

- **[@pup/api-definitions](https://github.com/hexagon/pup-api-definitions)** - API type definitions shared across the ecosystem
- **[@pup/api-client](https://github.com/hexagon/pup-api-client)** - REST API client for CLI, plugins, and telemetry
- **[@pup/telemetry](https://github.com/hexagon/pup-telemetry)** - Runtime-agnostic library for process telemetry and IPC
- **[@pup/common](https://github.com/hexagon/pup-common)** - Common utilities shared across Pup packages
- **[@pup/plugin](https://github.com/hexagon/pup-plugin)** - Base library for creating Pup plugins

### Official Plugins

- **[pup-plugin-web-interface](https://github.com/hexagon/pup-plugin-web-interface)** - Web-based UI for managing Pup

## Key Concepts
4 changes: 4 additions & 0 deletions docs/src/changelog.md
@@ -9,6 +9,10 @@ nav_order: 13

All notable changes to this project will be documented in this section.

## [Unreleased]

- feat(core): Add exponential backoff for process restarts via `restartBackoffMs` configuration option to prevent rapid restart loops

## [1.0.4] - 2024-11-19

- fix(core): Fix service auto start after install on Windows by upgrading dependency @cross/service
76 changes: 76 additions & 0 deletions docs/src/examples/max-restarts/README.md
@@ -0,0 +1,76 @@
# Max Restarts and Exponential Backoff Example

This example demonstrates how to configure restart limits and exponential backoff for processes that may fail.

## Overview

The configuration shows two processes:

1. **max-3-times**: Uses a fixed restart delay of 3 seconds and allows up to 3 restarts
2. **with-exponential-backoff**: Uses exponential backoff starting at 1 second and capping at 30 seconds, allowing up to 5 restarts

## How Exponential Backoff Works

When a process fails repeatedly, exponential backoff prevents rapid restart loops by increasing the delay between each restart attempt:

- **1st restart**: 1 second delay (restartDelayMs)
- **2nd restart**: 2 seconds delay (1s × 2¹)
- **3rd restart**: 4 seconds delay (1s × 2²)
- **4th restart**: 8 seconds delay (1s × 2³)
- **5th restart**: 16 seconds delay (1s × 2⁴)
- **Further restarts**: 30 seconds delay (1s × 2⁵ = 32s, capped at `restartBackoffMs`); not reached in this example, because `restartLimit` is 5

This approach:

- Gives transient issues time to resolve
- Prevents resource exhaustion from rapid restart loops
- Allows more restart attempts while being system-friendly
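
The schedule listed above can be reproduced with a small TypeScript sketch. The helper names `backoffDelayMs` and `backoffSchedule` are hypothetical and not part of Pup; the sketch assumes the delay before the n-th restart is `restartDelayMs` doubled n-1 times and then capped at `restartBackoffMs`, as the list describes.

```typescript
// Hypothetical helper (not part of Pup): delay before the n-th restart attempt (1-based).
function backoffDelayMs(restartDelayMs: number, restartBackoffMs: number, attempt: number): number {
  // Double the base delay once per previous attempt, then apply the cap
  const uncapped = restartDelayMs * Math.pow(2, attempt - 1)
  return Math.min(uncapped, restartBackoffMs)
}

// Hypothetical helper: delays for the first `count` restart attempts.
function backoffSchedule(restartDelayMs: number, restartBackoffMs: number, count: number): number[] {
  const delays: number[] = []
  for (let attempt = 1; attempt <= count; attempt++) {
    delays.push(backoffDelayMs(restartDelayMs, restartBackoffMs, attempt))
  }
  return delays
}

// Values from the with-exponential-backoff process: 1s base, 30s cap
const schedule = backoffSchedule(1000, 30000, 6)
console.log(schedule) // [1000, 2000, 4000, 8000, 16000, 30000]
```

With the example's `restartLimit` of 5, only the first five values apply, which adds up to 31 seconds of waiting across all restarts.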

## Running the Example

```bash
pup run
```

The `server.js` script exits immediately, so both processes will restart according to their configured policies until they reach their restart limits.

## Configuration

```jsonc
{
  "processes": [
    {
      "id": "max-3-times",
      "cmd": "deno run server.js",
      "autostart": true,
      "restart": "always",
      "restartLimit": 3,
      "restartDelayMs": 3000
    },
    {
      "id": "with-exponential-backoff",
      "cmd": "deno run server.js",
      "autostart": true,
      "restart": "always",
      "restartLimit": 5,
      "restartDelayMs": 1000,
      "restartBackoffMs": 30000
    }
  ]
}
```

## When to Use Exponential Backoff

Exponential backoff is particularly useful for:

- Services that may experience temporary network issues
- Processes that depend on external resources (databases, APIs)
- Applications that might fail during deployment or updates
- Any process where rapid restart loops could cause problems

## Notes

- If `restartBackoffMs` is not set, the process will use a fixed `restartDelayMs` between all restarts
- The backoff resets when the process exits successfully (status: FINISHED) or is manually stopped
- Combined with `restartLimit`, exponential backoff provides robust process recovery while preventing runaway restart loops
9 changes: 9 additions & 0 deletions docs/src/examples/max-restarts/pup.jsonc
@@ -7,6 +7,15 @@
"restart": "always",
"restartLimit": 3,
"restartDelayMs": 3000
},
{
"id": "with-exponential-backoff",
"cmd": "deno run server.js",
"autostart": true,
"restart": "always",
"restartLimit": 5,
"restartDelayMs": 1000,
"restartBackoffMs": 30000
}
]
}
4 changes: 3 additions & 1 deletion docs/src/usage/configuration.md
@@ -62,7 +62,9 @@ You need to specify one of these for each process, else the process will never s
### Restart policy

- `restart` (optional): A string specifying when the process should be restarted. Allowed values: "always" or "error".
- `restartDelayMs` (optional): A number specifying the delay (in milliseconds) before restarting the process.
- `restartDelayMs` (optional): A number specifying the initial delay (in milliseconds) before restarting the process. Default: 10000ms (10 seconds), or 500ms when watching files.
- `restartBackoffMs` (optional): A number specifying the maximum delay (in milliseconds) for exponential backoff when a process fails repeatedly. When set, the restart delay will double after each
consecutive failure, starting from `restartDelayMs` and capping at `restartBackoffMs`. This prevents rapid restart loops that can exhaust system resources. If not set, restarts use a fixed delay.
- `restartLimit` (optional): A number specifying the maximum number of restarts allowed for the process.
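
As an illustration of how these options relate, here is a minimal TypeScript sketch of a sanity check for a restart policy. The `RestartPolicy` interface and `checkRestartPolicy` function are hypothetical and not part of Pup. The one-day upper bound mirrors the schema in `lib/core/configuration.ts`; flagging a cap that is lower than the base delay is an assumption of this sketch, not something Pup is stated to enforce.

```typescript
// Hypothetical shape mirroring the restart options documented above; not Pup's own type.
interface RestartPolicy {
  restart?: "always" | "error"
  restartDelayMs?: number
  restartBackoffMs?: number
  restartLimit?: number
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000

// Returns a list of human-readable problems; an empty list means nothing was flagged.
function checkRestartPolicy(policy: RestartPolicy): string[] {
  const problems: string[] = []
  const delay = policy.restartDelayMs
  const cap = policy.restartBackoffMs
  if (delay !== undefined && (delay < 0 || delay > ONE_DAY_MS)) {
    problems.push("restartDelayMs must be between 0 and one day")
  }
  if (cap !== undefined && (cap < 0 || cap > ONE_DAY_MS)) {
    problems.push("restartBackoffMs must be between 0 and one day")
  }
  // Assumption of this sketch: a cap below the base delay is probably unintended
  if (delay !== undefined && cap !== undefined && cap < delay) {
    problems.push("restartBackoffMs is lower than restartDelayMs")
  }
  if (policy.restartLimit !== undefined && policy.restartLimit < 0) {
    problems.push("restartLimit must not be negative")
  }
  return problems
}

const policy: RestartPolicy = { restart: "always", restartDelayMs: 1000, restartBackoffMs: 30000, restartLimit: 5 }
console.log(checkRestartPolicy(policy)) // []
```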

### Stop/restart policy
2 changes: 2 additions & 0 deletions lib/core/configuration.ts
@@ -100,6 +100,7 @@ interface ProcessConfiguration {
logger?: ProcessLoggerConfiguration
restart?: string
restartDelayMs?: number
restartBackoffMs?: number
restartLimit?: number
}

@@ -157,6 +158,7 @@ const ConfigurationSchema = z.object({
terminateGracePeriod: z.number().min(0).default(0),
restart: z.optional(z.enum(["always", "error"])),
restartDelayMs: z.number().min(0).max(24 * 60 * 60 * 1000 * 1).default(10000), // Max one day
restartBackoffMs: z.optional(z.number().min(0).max(24 * 60 * 60 * 1000 * 1)), // Max one day - exponential backoff cap
overrun: z.optional(z.boolean()),
restartLimit: z.optional(z.number().min(0)),
timeout: z.optional(z.number().min(1)),
10 changes: 9 additions & 1 deletion lib/core/pup.ts
@@ -279,7 +279,15 @@ class Pup {
const msSinceExited = status.exited ? (new Date().getTime() - status.exited?.getTime()) : Infinity

// Default restart delay to 10000ms, except when watching
const restartDelay = config.restartDelayMs ?? config.watch ? 500 : 10000
const baseRestartDelay = config.restartDelayMs ?? (config.watch ? 500 : 10000)

// Calculate exponential backoff if restartBackoffMs is configured
let restartDelay = baseRestartDelay
if (config.restartBackoffMs !== undefined && status.restarts && status.restarts > 0) {
// Exponential backoff: delay = baseDelay * 2 ^ (restarts - 1), capped at restartBackoffMs
const exponentialDelay = baseRestartDelay * Math.pow(2, status.restarts - 1)
restartDelay = Math.min(exponentialDelay, config.restartBackoffMs)
}

// Always restart if restartpolicy is undefined and autostart is true
const restartPolicy = config.restart ?? ((config.autostart || config.watch) ? "always" : undefined)
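
A note on the default-delay expression in this hunk: in JavaScript and TypeScript, `??` binds more tightly than the conditional operator, so `a ?? b ? x : y` is evaluated as `(a ?? b) ? x : y`. The sketch below illustrates how that grouping affects the result. The function names are hypothetical, and plain parameters stand in for the `config` fields.

```typescript
// Hypothetical stand-in for the unparenthesized form of the default-delay expression.
function unparenthesized(restartDelayMs: number | undefined, watch: boolean | undefined): number {
  // Parsed as (restartDelayMs ?? watch) ? 500 : 10000
  return restartDelayMs ?? watch ? 500 : 10000
}

// Hypothetical stand-in with explicit grouping: a configured delay is used as-is.
function parenthesized(restartDelayMs: number | undefined, watch: boolean | undefined): number {
  return restartDelayMs ?? (watch ? 500 : 10000)
}

console.log(unparenthesized(3000, false)) // 500, the configured 3000 only acts as a truthy condition
console.log(parenthesized(3000, false)) // 3000
console.log(unparenthesized(undefined, true)) // 500
console.log(parenthesized(undefined, undefined)) // 10000
```

Both forms agree when `restartDelayMs` is undefined; they differ as soon as a delay is configured.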
125 changes: 125 additions & 0 deletions test/core/restart-backoff.test.ts
@@ -0,0 +1,125 @@
/*
* Test exponential backoff for process restarts
*
* @file test/core/restart-backoff.test.ts
*/

import type { Configuration } from "../../lib/core/configuration.ts"
import { ApiProcessState } from "@pup/api-definitions"
import { Pup } from "../../lib/core/pup.ts"
import { assertEquals, assertGreaterOrEqual, assertLessOrEqual } from "@std/assert"
import { test } from "@cross/test"

test("Process restart with exponential backoff", async () => {
const TEST_PROCESS_ID = "restart-backoff-test"
// Command that exits immediately with error
const TEST_PROCESS_COMMAND = "deno eval 'Deno.exit(1)'"

const config: Configuration = {
processes: [
{
"id": TEST_PROCESS_ID,
"cmd": TEST_PROCESS_COMMAND,
"restart": "error",
"restartDelayMs": 100, // 100ms base delay
"restartBackoffMs": 2000, // Cap at 2 seconds
"restartLimit": 5,
},
],
}
const pup = new Pup(config)
await pup.init()

// Find process
const testProcess = pup.processes.findLast((p) => p.getConfig().id === TEST_PROCESS_ID)
assertEquals(testProcess !== undefined, true)

// Start process
pup.start(TEST_PROCESS_ID, "test")

// Wait for first failure (process exits immediately)
await new Promise((resolve) => setTimeout(resolve, 500))

let status = testProcess!.getStatus()
assertEquals(status.status, ApiProcessState.ERRORED)
assertEquals(status.restarts, 0) // First run, no restarts yet

// Wait for first restart
// Watchdog runs every 1s, then waits for restartDelay (100ms)
await new Promise((resolve) => setTimeout(resolve, 1500))

status = testProcess!.getStatus()
assertGreaterOrEqual(status.restarts || 0, 1) // At least one restart

// Wait for potential second restart
// Watchdog 1s + exponential backoff delay (200ms for 2nd restart)
await new Promise((resolve) => setTimeout(resolve, 1500))

status = testProcess!.getStatus()
assertGreaterOrEqual(status.restarts || 0, 2) // At least two restarts

// Wait for potential third restart
// Watchdog 1s + exponential backoff delay (400ms for 3rd restart)
await new Promise((resolve) => setTimeout(resolve, 1500))

status = testProcess!.getStatus()
assertGreaterOrEqual(status.restarts || 0, 3) // At least three restarts

// Verify that restarts are limited by restartLimit
await new Promise((resolve) => setTimeout(resolve, 3000))

status = testProcess!.getStatus()
assertLessOrEqual(status.restarts || 0, 5) // Should not exceed restartLimit

// If limit reached, status should be EXHAUSTED
if (status.restarts === 5) {
assertEquals(status.status, ApiProcessState.EXHAUSTED)
}

// Terminate pup
await pup.terminate(500)
})

test("Process restart without backoff (default behavior)", async () => {
const TEST_PROCESS_ID = "restart-no-backoff-test"
// Command that exits immediately with error
const TEST_PROCESS_COMMAND = "deno eval 'Deno.exit(1)'"

const config: Configuration = {
processes: [
{
"id": TEST_PROCESS_ID,
"cmd": TEST_PROCESS_COMMAND,
"restart": "error",
"restartDelayMs": 100, // 100ms fixed delay
// No restartBackoffMs - should use fixed delay
"restartLimit": 3,
},
],
}
const pup = new Pup(config)
await pup.init()

// Find process
const testProcess = pup.processes.findLast((p) => p.getConfig().id === TEST_PROCESS_ID)
assertEquals(testProcess !== undefined, true)

// Start process
pup.start(TEST_PROCESS_ID, "test")

// Wait for first failure and restart
// Watchdog runs every 1s, then waits for restartDelay (100ms)
await new Promise((resolve) => setTimeout(resolve, 1500))

let status = testProcess!.getStatus()
assertGreaterOrEqual(status.restarts || 0, 1)

// With fixed 100ms delay, should get second restart after another 1.1s
await new Promise((resolve) => setTimeout(resolve, 1500))

status = testProcess!.getStatus()
assertGreaterOrEqual(status.restarts || 0, 2)

// Terminate pup
await pup.terminate(500)
})