Merged
18 changes: 9 additions & 9 deletions docs/designs/production-units.md
@@ -162,7 +162,6 @@ This document does not restate full table schemas; it only defines key field sem
|------|------|------|
| GET | `/tasks` | List (filters: `workstation_id/status/limit/offset`) |
| GET | `/tasks/:id` | Detail (includes `episode` if linked) |
| PUT | `/tasks/:id` | Status update (restricted transitions; see §6.2) |
| GET | `/tasks/:id/config` | Generate recorder config (requires workstation robot + collector bindings) |

---
@@ -184,12 +183,14 @@ This document does not restate full table schemas; it only defines key field sem

### 6.2 Task states

- **State set**: `pending` | `ready` | `in_progress` | `completed` | `failed` | `cancelled`
- **Prepare (pending→ready)**: triggered by UI/scheduler (currently via `PUT /tasks/:id`).
- **Run (ready→in_progress)**: triggered by UI/device workflow (currently via `PUT /tasks/:id`).
- **State set**: `pending` | `ready` | `in_progress` | `uploading` | `completed` | `failed` | `cancelled`
- **Prepare (pending→ready)**: triggered by recorder config application (`config_applied` / ready state snapshot).
- **Run (ready→in_progress)**: triggered by recorder start callback or recording state snapshot.
- **Finish (in_progress→uploading)**: triggered by recorder finish callback; Keystone sends `upload_request` to Transfer when connected.
- **Transfer ACK**:
  - On verified upload ACK, Keystone marks task `in_progress -> completed` (only if currently `in_progress`).
  - On `upload_failed`, Keystone marks task `in_progress -> failed`.
  - On verified upload ACK, Keystone sends `upload_ack`, then marks task `pending/ready/in_progress/uploading -> completed`.
  - Duplicate ACKs for already `completed` tasks are allowed but must not re-advance batch/order state.
  - On `upload_failed`, Keystone marks task `in_progress/uploading -> failed`.
- **Revert to pending (ready/in_progress→pending)**: used for recovery when Transfer disconnects (to avoid stuck runnable tasks).
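
The event-driven transitions in the updated bullets (the ones that include `uploading`) can be read as a lookup table. The following is a minimal sketch under stated assumptions, not Keystone's implementation: the event names (`config_applied`, `recording_started`, `recording_finished`, `upload_ack_verified`, `upload_failed`, `transfer_disconnected`) and the `apply_event` helper are hypothetical labels chosen for illustration, and events that do not apply to the current status are assumed to leave it unchanged.

```python
from enum import Enum


class TaskStatus(str, Enum):
    PENDING = "pending"
    READY = "ready"
    IN_PROGRESS = "in_progress"
    UPLOADING = "uploading"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


S = TaskStatus

# event name (hypothetical) -> {current status: next status}
TRANSITIONS = {
    "config_applied": {S.PENDING: S.READY},
    "recording_started": {S.READY: S.IN_PROGRESS},
    "recording_finished": {S.IN_PROGRESS: S.UPLOADING},
    "upload_ack_verified": {
        S.PENDING: S.COMPLETED,
        S.READY: S.COMPLETED,
        S.IN_PROGRESS: S.COMPLETED,
        S.UPLOADING: S.COMPLETED,
    },
    "upload_failed": {S.IN_PROGRESS: S.FAILED, S.UPLOADING: S.FAILED},
    "transfer_disconnected": {S.READY: S.PENDING, S.IN_PROGRESS: S.PENDING},
}


def apply_event(status: TaskStatus, event: str) -> TaskStatus:
    """Return the next status, or the unchanged status if the event does not apply."""
    return TRANSITIONS.get(event, {}).get(status, status)
```

Under this reading, a duplicate verified ACK on an already `completed` task is a no-op, and terminal states never move because no event lists them as a source.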

### 6.3 Batch states
@@ -227,15 +228,14 @@ When the device reports `upload_complete`, Keystone runs the Verified ACK flow:
- **Idempotent**: if an Episode already exists for this `task_id`, do not insert again
- Insert into `episodes` (persist denormalized fields such as `batch_id/order_id/scene_id/...`)
- `batches.episode_count += 1` (only when a new Episode is inserted)
- Update `tasks.status` to **`completed`** (and set `completed_at`) **only when current status is `in_progress`**
3. **Send `upload_ack`** to the device
4. **Mark task completed**: update `tasks.status` to **`completed`** (and set `completed_at`) when current status is `pending/ready/in_progress/uploading`; already completed tasks remain idempotent
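
The idempotency rules in the steps above can be sketched with an in-memory store. This is an illustration under stated assumptions, not Keystone's code: `Store`, `handle_verified_upload`, and every field name are hypothetical, the function is assumed to run only after object verification has already succeeded, and re-sending `upload_ack` on a duplicate report is an assumption rather than something the document states.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Statuses from which a verified ACK may complete the task.
ACKABLE = {"pending", "ready", "in_progress", "uploading"}


@dataclass
class Store:
    task_status: dict                                   # task_id -> status
    task_batch: dict                                    # task_id -> batch_id
    episodes: dict = field(default_factory=dict)        # task_id -> episode row
    episode_count: dict = field(default_factory=dict)   # batch_id -> count
    completed_at: dict = field(default_factory=dict)    # task_id -> timestamp
    acks_sent: list = field(default_factory=list)       # stand-in for the device channel


def handle_verified_upload(store: Store, task_id: int) -> bool:
    """Apply the post-verification steps; return True if a new Episode was inserted."""
    inserted = task_id not in store.episodes
    if inserted:
        batch_id = store.task_batch[task_id]
        store.episodes[task_id] = {"task_id": task_id, "batch_id": batch_id}
        # The counter only moves when a new Episode row is inserted.
        store.episode_count[batch_id] = store.episode_count.get(batch_id, 0) + 1
    store.acks_sent.append(task_id)  # stands in for sending `upload_ack`
    if store.task_status[task_id] in ACKABLE:
        store.task_status[task_id] = "completed"
        store.completed_at[task_id] = datetime.now(timezone.utc)
    return inserted
```

Calling the function twice for the same task leaves `episode_count` at 1 and `completed_at` at its first value, which is the behaviour the duplicate-ACK rule asks for. In a real database these steps would presumably share one transaction, which this sketch does not model.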

---

## 8. Known gaps and evolution

- **In-recording state**: `callbacks/start` does not persist state today; `ready -> in_progress` validation/persistence is not implemented yet.
- **Failure path**: an end-to-end `failed` terminal state and error attribution are not fully implemented (callbacks/transfer need to be extended).
- **Failure path**: terminal `failed` handling exists for transfer failures; recorder callback failure attribution still needs tighter end-to-end coverage.
- **Quota consistency**:
  - Dispatch quota is based on non-deleted task rows, not completed rows.
  - New/bulk batch creation uses `remaining_assignable = target_count - order_task_count`.
44 changes: 27 additions & 17 deletions docs/designs/task-manage.md
@@ -50,24 +50,26 @@ The Task state machine defines the complete lifecycle of a Task, with state tran
| State | Description | Valid Previous States | Valid Next States |
|-------|-------------|----------------------|-------------------|
| `pending` | Task created, awaiting workstation preparation | `ready`(cancel), `*`(new) | `ready`, `cancelled` |
| `ready` | Data collector clicked "Make Ready", awaiting recording start | `pending` | `in_progress`, `pending`, `cancelled` |
| `in_progress` | Recording in progress | `ready` | `completed`, `failed` |
| `completed` | Recording completed successfully | `in_progress` | *(terminal)* |
| `failed` | Recording failed | `in_progress` | *(terminal)* |
| `ready` | Recorder applied TaskConfig, awaiting recording start | `pending` | `in_progress`, `pending`, `cancelled` |
| `in_progress` | Recording in progress | `pending`, `ready` | `uploading`, `failed` |
| `uploading` | Recording finished, waiting for Transfer upload ACK | `pending`, `ready`, `in_progress` | `completed`, `failed` |
| `completed` | Recording uploaded and acknowledged successfully | `pending`, `ready`, `in_progress`, `uploading` | *(terminal)* |
| `failed` | Recording or upload failed | `in_progress`, `uploading` | *(terminal)* |
| `cancelled` | Task cancelled | `pending`, `ready` | *(terminal)* |

### 2.2 State Transition Diagram

```mermaid
stateDiagram-v2
[*] --> pending: Order created/Task generated
pending --> ready: Data Collector clicks "Make Ready"
pending --> ready: Recorder config_applied
pending --> cancelled: Task cancelled
ready --> in_progress: Axon POST /callbacks/start
ready --> pending: Data Collector clicks "Unready"
ready --> pending: Transfer disconnect recovery
ready --> cancelled: Task cancelled
in_progress --> completed: Axon POST /callbacks/finish (success)
in_progress --> failed: Axon POST /callbacks/finish (failure)
in_progress --> uploading: Axon POST /callbacks/finish
uploading --> completed: Transfer upload_complete + upload_ack
uploading --> failed: Transfer upload_failed
completed --> [*]
failed --> [*]
cancelled --> [*]
@@ -77,7 +79,7 @@ stateDiagram-v2

#### pending → ready

- **Trigger**: Data collector clicks "Make Ready" in Synapse UI
- **Trigger**: Recorder confirms TaskConfig application (`config_applied` or ready state snapshot)
- **Validation**:
- Task status is `pending`
- Task is assigned to a Workstation
@@ -96,12 +98,21 @@ stateDiagram-v2
- Record `started_at` timestamp
- Record active ROS Topics

#### in_progress → completed
#### in_progress → uploading

- **Trigger**: Axon calls [`POST /callbacks/finish`](implementation/axon_teleoperation.md:1107) with `error == null`
- **Validation**: Task status is `in_progress`
- **Side Effects**:
- Mark Task `uploading`
- Trigger Transfer `upload_request` when the device is connected
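
The trigger, validation, and side effects for `in_progress → uploading` can be sketched as a small handler. This is a hedged illustration, not the actual callback code: `handle_finish_callback`, its parameters, and the `send_upload_request` callable are hypothetical names, and a finish callback that carries an error is simply left unhandled here because failure attribution is outside what this section specifies.

```python
from typing import Callable, Optional


def handle_finish_callback(
    status: str,
    error: Optional[str],
    device_connected: bool,
    send_upload_request: Callable[[], None],
) -> str:
    """Return the task status after a `POST /callbacks/finish` call."""
    # Validation: only a successful finish on an in_progress task advances.
    if error is not None or status != "in_progress":
        return status
    # Side effect: ask Transfer to upload, but only when the device is connected.
    if device_connected:
        send_upload_request()
    return "uploading"
```

A task whose device is disconnected at finish time still moves to `uploading` in this sketch; when the deferred `upload_request` gets sent is left open.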

#### uploading → completed

- **Trigger**: Transfer reports `upload_complete`, Keystone verifies S3 objects, then sends `upload_ack`
- **Validation**: Task status is `pending`, `ready`, `in_progress`, or `uploading`
- **Side Effects**:
- Record `completed_at` timestamp
- Clear upload error message
- Create Episode (`qa_status: pending_qa`)
- Trigger Edge Dagster QA Job

@@ -147,13 +158,13 @@ Batch-scoped Tasks are created for selected Workstations (status: pending)
### Phase 1: Task Preparation and TaskConfig Distribution

```
Data Collector → Synapse UI clicks "Make Ready"
Keystone → Dispatch TaskConfig to recorder
Synapse → PATCH /tasks/{id} {status: "ready"}
Axon recorder applies config
Keystone → Task: pending → ready
Trigger notification for Axon to pull config
Axon keeps TaskConfig locally
Axon → GET /tasks/{id}/config (Device token)
@@ -253,7 +264,6 @@ Four weighted checks:
| `GET` | `/tasks` | List tasks (with filtering) | Synapse UI |
| `GET` | `/tasks/{id}` | Get task details | Synapse UI |
| `GET` | `/tasks/{id}/config` | Get task config (Axon pull) | Axon |
| `PATCH` | `/tasks/{id}` | Update task status | Synapse UI |
| `POST` | `/tasks` | Create task (internal use) | Order Manager |

### 4.2 Callback Interfaces
@@ -388,8 +398,8 @@ Response 500:

**Operations**:

- "Make Ready": `pending` → `ready`
- "Unready": `ready` → `pending` (timeout or reset)
- Synapse displays task status and creates tasks.
- Task status transitions are driven by recorder callbacks/state snapshots and Transfer upload events.

### 5.3 Task and Dagster QA Interaction

@@ -444,7 +454,7 @@ CREATE TABLE tasks (
sop_id BIGINT NOT NULL,

-- Status
status VARCHAR(32) NOT NULL DEFAULT 'pending' COMMENT 'pending|ready|in_progress|completed|failed|cancelled',
status VARCHAR(32) NOT NULL DEFAULT 'pending' COMMENT 'pending|ready|in_progress|uploading|completed|failed|cancelled',

-- Timestamps
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,