diff --git a/.gitignore b/.gitignore
index 7f407c6e..5fb14960 100644
--- a/.gitignore
+++ b/.gitignore
@@ -189,3 +189,5 @@ cython_debug/
# dev only
.envrc
test_app/
+pytest-results.xml
+coverage.xml
diff --git a/.release-please-manifest.json b/.release-please-manifest.json
index c4ddc748..c3f14639 100644
--- a/.release-please-manifest.json
+++ b/.release-please-manifest.json
@@ -1,3 +1,3 @@
{
- ".": "1.1.1"
+ ".": "1.2.0"
}
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7cc28d9c..0ff7e2ce 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,34 @@
# Changelog
+## [1.2.0](https://github.com/runpod/flash/compare/v1.1.1...v1.2.0) (2026-02-17)
+
+
+### Features
+
+* add API key propagation for cross-endpoint calls ([#193](https://github.com/runpod/flash/issues/193)) ([f87c9c1](https://github.com/runpod/flash/commit/f87c9c1cef7dfd1f427b278ea50fcc03f4e36372))
+* add file-based logging for local CLI usage ([#197](https://github.com/runpod/flash/issues/197)) ([665bcfa](https://github.com/runpod/flash/commit/665bcfa108f95ebc040c82d9496cc6c6df484d36))
+* add User-Agent header with version, OS, and arch ([#202](https://github.com/runpod/flash/issues/202)) ([5632907](https://github.com/runpod/flash/commit/5632907baae9681658d82ab649cb15c47d5d85b8))
+* AE-2089: update sls endpoint template params ([#198](https://github.com/runpod/flash/issues/198)) ([656fa46](https://github.com/runpod/flash/commit/656fa4608ccae1e89e1ac28e6dae6b60e18ca175))
+* cleanup flash deploy/undeploy/build command output format ([#191](https://github.com/runpod/flash/issues/191)) ([c99b486](https://github.com/runpod/flash/commit/c99b486d301043e7982b7f995f1754fb89379ff8))
+* **logger:** add sensitive data filter to prevent logging API keys and tokens ([#200](https://github.com/runpod/flash/issues/200)) ([10967a4](https://github.com/runpod/flash/commit/10967a43c40ee5c7823c461eb2647b9472dde30b))
+
+
+### Bug Fixes
+
+* **docs:** change idleTimeout from minutes to seconds ([#205](https://github.com/runpod/flash/issues/205)) ([51693c7](https://github.com/runpod/flash/commit/51693c7e2dd0c9d803f3c49de1d0009ded285d5d))
+* prevent false deployment attempts in Flash environments ([#192](https://github.com/runpod/flash/issues/192)) ([f07c9fb](https://github.com/runpod/flash/commit/f07c9fb92003d4603fbf8cdc17b956c368009353))
+* **runtime:** restore on-demand provisioning for flash run ([#206](https://github.com/runpod/flash/issues/206)) ([5859f4b](https://github.com/runpod/flash/commit/5859f4b78476a070db2100b689dfd94caf5fc93f))
+
+
+### Code Refactoring
+
+* remove noisy debug logs from flash (AE-1966) ([#204](https://github.com/runpod/flash/issues/204)) ([826f169](https://github.com/runpod/flash/commit/826f1695ab2bbe620da290783194b8456fbb77cb))
+
+
+### Documentation
+
+* update CLI documentation for deploy, env, and app commands ([#195](https://github.com/runpod/flash/issues/195)) ([4126b37](https://github.com/runpod/flash/commit/4126b3704e625878d11bdd257fa6cc0fbe6bc709))
+
## [1.1.1](https://github.com/runpod/flash/compare/v1.1.0...v1.1.1) (2026-02-09)
diff --git a/CLAUDE.md b/CLAUDE.md
index 917e4577..f34f3aff 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,21 +1,73 @@
-# runpod-flash Project Configuration
+# {{REPO_NAME}} - {{BRANCH_NAME}} Worktree
-## Claude Code Tool Preferences
+> This worktree inherits shared development patterns from main. See: {{MAIN_CLAUDE_MD}}
-When using Claude Code on this project, always prefer the flash-code-intel MCP tools for code exploration instead of using Explore agents or generic search
+## Branch Context
-**CRITICAL - This overrides default Claude Code behavior:**
+**Purpose:** [Describe the goal of this branch - what feature, fix, or improvement are you implementing?]
-This project has **flash-code-intel MCP server** installed. For ANY codebase exploration:
+**Status:** In development
-1. **NEVER use Task(Explore) as first choice** - it cannot access MCP tools
-2. **ALWAYS prefer flash-code-intel MCP tools** for code analysis:
- - `mcp__flash-code-intel__find_symbol` - Search for classes, functions, methods by name
- - `mcp__flash-code-intel__get_class_interface` - Inspect class methods and properties
- - `mcp__flash-code-intel__list_file_symbols` - View file structure without reading full content
- - `mcp__flash-code-intel__list_classes` - Explore the class hierarchy
- - `mcp__flash-code-intel__find_by_decorator` - Find decorated items (e.g., `@property`, `@remote`)
-3. **Use direct tools second**: Grep, Read for implementation details
-4. **Task(Explore) is last resort only** when MCP + direct tools insufficient
+**Related Issues/PRs:** [Link to relevant GitHub issues or PRs]
-**Why**: MCP tools are faster, more accurate, and purpose-built. Generic exploration agents don't leverage specialized tooling.
+**Dependencies:**
+- [ ] [List any dependencies on other branches or external factors]
+
+## Branch-Specific Configuration
+
+[Document any configuration unique to this branch:]
+- Environment variables needed
+- Special test data requirements
+- Modified build/deployment settings
+- External service configurations
+
+## Progress Tracking
+
+### Completed
+- [ ] [Tasks completed so far]
+
+### In Progress
+- [ ] [Current work items]
+
+### Next Steps
+- [ ] [Upcoming tasks]
+
+## Technical Notes
+
+[Add branch-specific technical details:]
+- Architecture decisions made for this branch
+- Implementation approaches tried
+- Known issues or limitations
+- Performance considerations
+- Testing strategy
+
+## Learnings & Discoveries
+
+[Document insights gained while working on this branch:]
+- Unexpected behaviors discovered
+- Better approaches found
+- Code patterns that worked well
+- Areas for future refactoring
+
+## Merge Checklist
+
+Before merging this branch:
+- [ ] All tests passing locally (`make quality-check`)
+- [ ] Test coverage maintained/improved
+- [ ] CLAUDE.md updated in main if patterns changed
+- [ ] Documentation updated
+- [ ] No merge conflicts with main
+- [ ] CI/CD passing
+- [ ] Code reviewed
+- [ ] Migration plan documented (if needed)
+
+## Context for Claude Code
+
+[Provide context that helps Claude Code assist more effectively:]
+- What should Claude know about this branch's goals?
+- What patterns or constraints should be followed?
+- What areas need special attention?
+
+---
+
+**Note:** This worktree uses the git worktree workflow. See main CLAUDE.md for shared development patterns and quality requirements.
diff --git a/README.md b/README.md
index 7fb69d1e..8b9c3ea9 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,7 @@ You can find a repository of prebuilt Flash examples at [runpod/flash-examples](
- [Overview](#overview)
- [Get started](#get-started)
- [Create Flash API endpoints](#create-flash-api-endpoints)
+- [CLI Reference](#cli-reference)
- [Key concepts](#key-concepts)
- [How it works](#how-it-works)
- [Advanced features](#advanced-features)
@@ -134,10 +135,9 @@ Computed on: NVIDIA GeForce RTX 4090
## Create Flash API endpoints
-> [!Note]
-> **Flash API endpoints are currently only available for local testing:** Using `flash run` will start the API server on your local machine. Future updates will add the ability to build and deploy API servers for production deployments.
+You can use Flash to deploy and serve API endpoints that compute responses using GPU and CPU Serverless workers. Use `flash run` for local development of `@remote` functions, then `flash deploy` to deploy your full application to Runpod Serverless for production.
-You can use Flash to deploy and serve API endpoints that compute responses using GPU and CPU Serverless workers. These endpoints will run scripts using the same Python remote decorators [demonstrated above](#get-started)
+These endpoints use the same Python `@remote` decorators [demonstrated above](#get-started).
### Step 1: Initialize a new project
@@ -154,6 +154,8 @@ You can also initialize your current directory:
flash init
```
+For complete CLI documentation, see the [Flash CLI Reference](src/runpod_flash/cli/docs/README.md).
+
### Step 2: Explore the project template
This is the structure of the project template created by `flash init`:
@@ -237,6 +239,8 @@ curl -X POST http://localhost:8888/gpu/hello \
If you switch back to the terminal tab where you used `flash run`, you'll see the details of the job's progress.
+For more `flash run` options and configuration, see the [flash run documentation](src/runpod_flash/cli/docs/flash-run.md).
+
### Faster testing with auto-provisioning
For development with multiple endpoints, use `--auto-provision` to deploy all resources before testing:
@@ -267,6 +271,62 @@ To customize your API endpoint and functionality:
3. Configure your FastAPI routers by editing the `__init__.py` files.
4. Add any new endpoints to your `main.py` file.
+## CLI Reference
+
+Flash provides a command-line interface for project management, development, and deployment:
+
+### Main Commands
+
+- **`flash init`** - Initialize a new Flash project with template structure
+- **`flash run`** - Start local development server to test your `@remote` functions with auto-reload
+- **`flash build`** - Build deployment artifact with all dependencies
+- **`flash deploy`** - Build and deploy your application to Runpod Serverless in one step
+
+### Management Commands
+
+- **`flash env`** - Manage deployment environments (dev, staging, production)
+ - `list`, `create`, `get`, `delete` subcommands
+- **`flash app`** - Manage Flash applications (top-level organization)
+ - `list`, `create`, `get`, `delete` subcommands
+- **`flash undeploy`** - Manage and remove deployed endpoints
+
+### Quick Examples
+
+```bash
+# Initialize and run locally
+flash init my-project
+cd my-project
+flash run --auto-provision
+
+# Build and deploy to production
+flash build
+flash deploy --env production
+
+# Manage environments
+flash env create staging
+flash env list
+flash deploy --env staging
+
+# Clean up
+flash undeploy --interactive
+flash env delete staging
+```
+
+### Complete Documentation
+
+For complete CLI documentation including all options, examples, and troubleshooting:
+
+**[Flash CLI Documentation](src/runpod_flash/cli/docs/README.md)**
+
+Individual command references:
+- [flash init](src/runpod_flash/cli/docs/flash-init.md) - Project initialization
+- [flash run](src/runpod_flash/cli/docs/flash-run.md) - Development server
+- [flash build](src/runpod_flash/cli/docs/flash-build.md) - Build artifacts
+- [flash deploy](src/runpod_flash/cli/docs/flash-deploy.md) - Deployment
+- [flash env](src/runpod_flash/cli/docs/flash-env.md) - Environment management
+- [flash app](src/runpod_flash/cli/docs/flash-app.md) - App management
+- [flash undeploy](src/runpod_flash/cli/docs/flash-undeploy.md) - Endpoint removal
+
## Key concepts
### Remote functions
@@ -448,11 +508,11 @@ When you run `flash build`, the following happens:
Flash automatically handles cross-platform builds, ensuring your deployments work correctly regardless of your development platform:
-- **Automatic Platform Targeting**: Dependencies are installed for Linux x86_64 (RunPod's serverless platform), even when building on macOS or Windows
+- **Automatic Platform Targeting**: Dependencies are installed for Linux x86_64 (Runpod's serverless platform), even when building on macOS or Windows
- **Python Version Matching**: The build uses your current Python version to ensure package compatibility
- **Binary Wheel Enforcement**: Only pre-built binary wheels are used, preventing platform-specific compilation issues
-This means you can build on macOS ARM64, Windows, or any other platform, and the resulting package will run correctly on RunPod serverless.
+This means you can build on macOS ARM64, Windows, or any other platform, and the resulting package will run correctly on Runpod serverless.
#### Cross-Endpoint Function Calls
@@ -504,7 +564,7 @@ For information on load-balanced endpoints (required for Mothership and HTTP ser
#### Managing Bundle Size
-RunPod serverless has a **500MB deployment limit**. Exceeding this limit will cause deployment failures.
+Runpod serverless has a **500MB deployment limit**. Exceeding this limit will cause deployment failures.
Use `--exclude` to skip packages already in your worker-flash Docker image:
@@ -535,7 +595,7 @@ The following parameters can be used with `LiveServerless` (full remote code exe
| `gpuCount` | Number of GPUs per worker | 1 | 1, 2, 4 |
| `workersMin` | Minimum number of workers | 0 | Set to 1 for persistence |
| `workersMax` | Maximum number of workers | 3 | Higher for more concurrency |
-| `idleTimeout` | Minutes before scaling down | 5 | 10, 30, 60 |
+| `idleTimeout` | Seconds before scaling down | 60 | 300, 600, 1800 |
| `env` | Environment variables | `None` | `{"HF_TOKEN": "xyz"}` |
| `networkVolumeId` | Persistent storage ID | `None` | `"vol_abc123"` |
| `executionTimeoutMs`| Max execution time (ms) | 0 (no limit) | 600000 (10 min) |
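
For instance, to keep one worker warm and allow a longer idle window, you would raise `workersMin` and `idleTimeout`. The dataclass below is only an illustrative stand-in that mirrors the field names and defaults from the table above; it is not the library's actual class:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class WorkerScalingConfig:
    """Illustrative subset of the LiveServerless parameters documented above."""

    workersMin: int = 0
    workersMax: int = 3
    idleTimeout: int = 60  # seconds (not minutes) before idle workers scale down
    env: Optional[Dict[str, str]] = None
    executionTimeoutMs: int = 0  # 0 means no execution time limit


# Keep a worker warm and give it a five-minute idle window:
persistent = WorkerScalingConfig(workersMin=1, idleTimeout=300)
```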
diff --git a/docs/Cross_Endpoint_Routing.md b/docs/Cross_Endpoint_Routing.md
index 222280b7..aa851705 100644
--- a/docs/Cross_Endpoint_Routing.md
+++ b/docs/Cross_Endpoint_Routing.md
@@ -552,7 +552,7 @@ class StateManagerClient:
Raises:
ManifestServiceUnavailableError: If State Manager unavailable.
"""
- # Fetches environment -> active build -> manifest via RunPod GraphQL
+ # Fetches environment -> active build -> manifest via Runpod GraphQL
async def update_resource_state(
self,
@@ -566,7 +566,7 @@ class StateManagerClient:
**Configuration**:
- Authentication: API key via `RUNPOD_API_KEY`
-- GraphQL endpoint: RunPod API (via `RunpodGraphQLClient`)
+- GraphQL endpoint: Runpod API (via `RunpodGraphQLClient`)
- Request timeout: 10 seconds (via `DEFAULT_REQUEST_TIMEOUT`)
- Retry logic: Exponential backoff with `DEFAULT_MAX_RETRIES` attempts (default: 3)
- Fetch flow: `get_flash_environment` → `get_flash_build` → `manifest`
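
The retry behavior described above can be sketched as a small helper. This is illustrative only: `with_backoff` and its parameters are assumptions, not part of the actual client API, though `DEFAULT_MAX_RETRIES` mirrors the documented default of 3 attempts:

```python
import asyncio
import random

DEFAULT_MAX_RETRIES = 3  # mirrors the documented default


async def with_backoff(fetch, max_retries=DEFAULT_MAX_RETRIES, base_delay=0.5):
    """Retry an async call, doubling the delay after each failed attempt."""
    for attempt in range(max_retries):
        try:
            return await fetch()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff with a little jitter between attempts.
            await asyncio.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

A call that fails twice and succeeds on the third attempt returns normally; a third consecutive failure propagates to the caller.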
@@ -983,7 +983,7 @@ manifest = await client.get_persisted_manifest(mothership_id)
Cross-endpoint routing uses a **peer-to-peer architecture** where all endpoints query State Manager directly for service discovery. This eliminates single points of failure and simplifies the system architecture compared to previous hub-and-spoke models.
-**Key Difference**: No mothership endpoint exposing a `/manifest` HTTP endpoint. Instead, all endpoints use `StateManagerClient` to query the RunPod GraphQL API directly.
+**Key Difference**: No mothership endpoint exposing a `/manifest` HTTP endpoint. Instead, all endpoints use `StateManagerClient` to query the Runpod GraphQL API directly.
### Architecture
@@ -993,7 +993,7 @@ flowchart TD
B["Endpoint B"]
C["Endpoint C"]
    D["State Manager GraphQL API"]
- E["RunPod API Key"]
+ E["Runpod API Key"]
A -->|Query Manifest| D
B -->|Query Manifest| D
@@ -1030,7 +1030,7 @@ export RUNPOD_ENDPOINT_ID=gpu-endpoint-123
### StateManagerClient Features
-- **GraphQL Query**: Queries RunPod GraphQL API for manifest persistence
+- **GraphQL Query**: Queries Runpod GraphQL API for manifest persistence
- **Caching**: 300-second TTL cache to minimize API calls
- **Retry Logic**: Exponential backoff on failures (default 3 attempts)
- **Thread-Safe**: Uses `asyncio.Lock` for concurrent operations
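
A minimal sketch of how a TTL cache guarded by an `asyncio.Lock` can work, assuming the documented 300-second TTL (the real `StateManagerClient` internals are not shown here, so class and method names below are illustrative):

```python
import asyncio
import time
from typing import Any, Optional

CACHE_TTL_SECONDS = 300  # mirrors the documented TTL


class TTLCache:
    """Illustrative single-value TTL cache serialized by an asyncio.Lock."""

    def __init__(self, ttl: float = CACHE_TTL_SECONDS):
        self._ttl = ttl
        self._value: Optional[Any] = None
        self._expires_at = 0.0
        self._lock = asyncio.Lock()

    async def get(self, fetch) -> Any:
        async with self._lock:  # prevent concurrent refreshes
            now = time.monotonic()
            if self._value is None or now >= self._expires_at:
                self._value = await fetch()  # cache miss or expired: refetch
                self._expires_at = now + self._ttl
            return self._value
```

Within the TTL window, repeated `get` calls return the cached value without touching the API again.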
diff --git a/docs/Flash_Apps_and_Environments.md b/docs/Flash_Apps_and_Environments.md
index ac674e77..0964307b 100644
--- a/docs/Flash_Apps_and_Environments.md
+++ b/docs/Flash_Apps_and_Environments.md
@@ -1,7 +1,7 @@
# Flash Apps & Environments
## Overview
-Flash apps are the top-level packaging unit for Flash projects. Each app tracks the source builds you've uploaded, the deployment environments that consume those builds, and metadata needed by the CLI to orchestrate everything on RunPod. Environments sit under an app and describe a concrete runtime surface (workers, endpoints, network volumes) that can be updated independently.
+Flash apps are the top-level packaging unit for Flash projects. Each app tracks the source builds you've uploaded, the deployment environments that consume those builds, and metadata needed by the CLI to orchestrate everything on Runpod. Environments sit under an app and describe a concrete runtime surface (workers, endpoints, network volumes) that can be updated independently.
## Key Concepts
- **Flash App**: Logical container created once per project. It owns the ID used for uploads, holds references to environments/builds, and backs the `flash app` CLI.
diff --git a/docs/Flash_Deploy_Guide.md b/docs/Flash_Deploy_Guide.md
index 0688ab25..234e5f33 100644
--- a/docs/Flash_Deploy_Guide.md
+++ b/docs/Flash_Deploy_Guide.md
@@ -1,8 +1,13 @@
# Flash Deploy Guide
+> **Note:** This document provides architectural and implementation details for Flash Deploy. For user-facing command-line documentation, see:
+> - [flash deploy CLI Reference](../src/runpod_flash/cli/docs/flash-deploy.md)
+> - [flash env CLI Reference](../src/runpod_flash/cli/docs/flash-env.md)
+> - [Complete CLI Documentation](../src/runpod_flash/cli/docs/README.md)
+
## Overview
-Flash Deploy is a distributed runtime system that enables scalable execution of `@remote` functions across dynamically provisioned RunPod serverless endpoints. It bridges the gap between local development and production cloud deployment through a unified interface.
+Flash Deploy is a distributed runtime system that enables scalable execution of `@remote` functions across dynamically provisioned Runpod serverless endpoints. It bridges the gap between local development and production cloud deployment through a unified interface.
### System Goals
@@ -22,7 +27,7 @@ graph TB
  Manifest["ManifestBuilder flash_manifest.json"]
end
- subgraph Cloud["RunPod Cloud"]
+ subgraph Cloud["Runpod Cloud"]
  S3["S3 Storage artifact.tar.gz"]
  subgraph Mothership["Mothership Endpoint (FLASH_IS_MOTHERSHIP=true)"]
@@ -37,7 +42,7 @@ graph TB
end
end
-  Database["RunPod State Manager GraphQL API"]
+  Database["Runpod State Manager GraphQL API"]
Developer -->|flash build| Build
Build -->|archive| S3
flash env create <env-name> [--app <app-name>]
- `--app <app-name>`: Flash app name (auto-detected if not provided)
**What it does:**
-1. Creates a FlashApp in RunPod (if first environment for the app)
+1. Creates a FlashApp in Runpod (if first environment for the app)
2. Creates FlashEnvironment with the specified name
3. Provisions a mothership serverless endpoint
@@ -263,14 +268,14 @@ artifact.tar.gz
sequenceDiagram
    Developer->>CLI: flash deploy --env <env-name>
CLI->>S3: Upload .flash/artifact.tar.gz
-    CLI->>RunPod: Create endpoints via API with manifest reference
-    RunPod->>ChildEndpoints: Boot endpoints
+    CLI->>Runpod: Create endpoints via API with manifest reference
+    Runpod->>ChildEndpoints: Boot endpoints
ChildEndpoints->>ChildEndpoints: Read manifest from .flash/
    ChildEndpoints->>StateManager: Query for peer endpoints (peer-to-peer discovery)
```
**Upload Process** (`src/runpod_flash/cli/commands/deploy.py:197-224`):
-1. Archive uploaded to RunPod's built-in S3 storage
+1. Archive uploaded to Runpod's built-in S3 storage
2. URL generated with temporary access
3. URL passed to mothership endpoint creation
@@ -285,7 +290,7 @@ The mothership runs on each boot to perform reconcile_children() - reconciling d
```mermaid
sequenceDiagram
- RunPod->>Mothership: Boot endpoint
+ Runpod->>Mothership: Boot endpoint
Mothership->>Mothership: Initialize runtime
Mothership->>ManifestFetcher: Load manifest from .flash/
ManifestFetcher->>ManifestFetcher: Read flash_manifest.json
@@ -341,7 +346,7 @@ Each child endpoint boots independently and prepares for function execution.
```mermaid
sequenceDiagram
- RunPod->>Child: Boot with handler_gpu_config.py
+ Runpod->>Child: Boot with handler_gpu_config.py
Child->>Child: Initialize runtime
Child->>ManifestFetcher: Load manifest from .flash/
    ManifestFetcher->>ManifestFetcher: Check cache (TTL: 300s)
@@ -380,7 +385,7 @@ sequenceDiagram
- `FLASH_RESOURCE_NAME`: This endpoint's resource config name (e.g., "gpu_config")
- `RUNPOD_API_KEY`: API key for State Manager GraphQL access (peer-to-peer discovery)
- `FLASH_MANIFEST_PATH`: Optional override for manifest location
-- `RUNPOD_ENDPOINT_ID`: This endpoint's RunPod endpoint ID
+- `RUNPOD_ENDPOINT_ID`: This endpoint's Runpod endpoint ID
**Key Files:**
- `src/runpod_flash/runtime/manifest_fetcher.py` - Manifest loading with caching
@@ -424,14 +429,14 @@ sequenceDiagram
**Queue-Based** (`src/runpod_flash/runtime/generic_handler.py`):
-Uses a factory function `create_handler(function_registry)` that returns a RunPod-compatible handler:
+Uses a factory function `create_handler(function_registry)` that returns a Runpod-compatible handler:
```python
def handler(job: Dict[str, Any]) -> Dict[str, Any]:
- """RunPod serverless handler.
+ """Runpod serverless handler.
Args:
- job: RunPod job dict with 'input' key
+ job: Runpod job dict with 'input' key
Returns:
Response dict with 'success', 'result'/'error' keys
@@ -532,7 +537,7 @@ The manifest is the contract between build-time and runtime. It defines all depl
- Thread-safe with asyncio.Lock
2. **Fetch from source**: If cache expired
- - Primary: RunPod GraphQL API (via RunpodGraphQLClient)
+ - Primary: Runpod GraphQL API (via RunpodGraphQLClient)
- Fallback: Local flash_manifest.json file
3. **Update local file**: Persist fetched manifest
@@ -552,7 +557,7 @@ The manifest is the contract between build-time and runtime. It defines all depl
- Used to determine local vs remote execution
3. **Query State Manager**: Get endpoint URLs via GraphQL
- - Queries RunPod State Manager GraphQL API directly
+ - Queries Runpod State Manager GraphQL API directly
- Returns: Resource endpoints for all deployed child endpoints
- Retries with exponential backoff
@@ -564,7 +569,7 @@ The manifest is the contract between build-time and runtime. It defines all depl
### State Persistence: StateManagerClient
-The State Manager persists manifest state in RunPod's infrastructure, enabling:
+The State Manager persists manifest state in Runpod's infrastructure, enabling:
- Cross-boot reconciliation tracking
- Peer-to-peer service discovery
- Manifest synchronization across endpoints
@@ -750,7 +755,7 @@ deserialized = cloudpickle.loads(base64.b64decode(serialized))
**Generic Handler** (Queue-Based):
-Uses a factory function `create_handler(function_registry)` that creates a RunPod-compatible handler:
+Uses a factory function `create_handler(function_registry)` that creates a Runpod-compatible handler:
```python
# src/runpod_flash/runtime/generic_handler.py - conceptual flow
@@ -977,12 +982,12 @@ graph LR
- Triggers manifest reconciliation on boot
**RUNPOD_ENDPOINT_ID** (Required on mothership)
-- RunPod serverless endpoint ID
+- Runpod serverless endpoint ID
- Used to construct mothership URL: `https://{RUNPOD_ENDPOINT_ID}.api.runpod.ai`
-- Set automatically by RunPod platform
+- Set automatically by Runpod platform
**RUNPOD_API_KEY** (Required for State Manager)
-- RunPod API authentication token
+- Runpod API authentication token
- Used by StateManagerClient for GraphQL queries
- Enables manifest persistence
@@ -1005,7 +1010,7 @@ graph LR
### Runtime Configuration
-**RUNPOD_ENDPOINT_ID** (Set by RunPod)
+**RUNPOD_ENDPOINT_ID** (Set by Runpod)
- This endpoint's ID
- Used for logging and identification
@@ -1041,7 +1046,7 @@ Flash Deploy uses a dual-layer state system for reliability and consistency.
**Code Reference**: `src/runpod_flash/core/resources/resource_manager.py:46-150`
-### Remote State: RunPod State Manager (GraphQL API)
+### Remote State: Runpod State Manager (GraphQL API)
**Purpose**: Persist deployment state across mothership boots
@@ -1102,7 +1107,7 @@ On mothership boot:
### flash build --preview
-Local testing of your distributed system without deploying to RunPod.
+Local testing of your distributed system without deploying to Runpod.
```bash
flash build --preview
diff --git a/docs/Flash_SDK_Reference.md b/docs/Flash_SDK_Reference.md
index 1dd65e83..d0da3632 100644
--- a/docs/Flash_SDK_Reference.md
+++ b/docs/Flash_SDK_Reference.md
@@ -124,7 +124,7 @@ class ResourceConfig:
# Worker scaling
workersMin: int = 0 # Minimum workers to maintain
workersMax: int = 3 # Maximum workers allowed
- idleTimeout: int = 300 # Seconds before idle worker terminates
+ idleTimeout: int = 60 # Seconds before idle worker terminates
# Networking
networkVolumeId: Optional[str] = None # Mount persistent storage
diff --git a/docs/LoadBalancer_Runtime_Architecture.md b/docs/LoadBalancer_Runtime_Architecture.md
index 297ddb5e..2dc0eab1 100644
--- a/docs/LoadBalancer_Runtime_Architecture.md
+++ b/docs/LoadBalancer_Runtime_Architecture.md
@@ -2,7 +2,7 @@
## Overview
-This document explains what happens after a load-balanced endpoint is deployed on RunPod and is actively running. It covers the deployment architecture, request flows, and execution patterns for both direct HTTP requests and @remote function calls.
+This document explains what happens after a load-balanced endpoint is deployed on Runpod and is actively running. It covers the deployment architecture, request flows, and execution patterns for both direct HTTP requests and @remote function calls.
## Deployment Architecture
@@ -14,8 +14,8 @@ When you deploy a `LoadBalancerSlsResource` endpoint with `flash build` and `fla
graph TD
A["User Code"] -->|flash build| B["Package Application"]
B -->|FastAPI App| C["Flash Manifest"]
- C -->|flash deploy| D["Push to RunPod"]
- D -->|Create Container| E["RunPod Container
runpod-flash-lb image"]
+ C -->|flash deploy| D["Push to Runpod"]
+ D -->|Create Container| E["Runpod Container
runpod-flash-lb image"]
    E --> F["FastAPI Server uvicorn on port 8000"]
F --> G["Load your application"]
G --> H["Endpoint Ready"]
@@ -30,7 +30,7 @@ graph TD
style H fill:#2e7d32,stroke:#1b5e20,stroke-width:3px,color:#fff
```
-**Important:** `endpoint_url` is auto-generated by RunPod after deployment
+**Important:** `endpoint_url` is auto-generated by Runpod after deployment
- Cannot be specified by users
- Generated as: `https:///`
- Automatically populated in the resource after `deploy()` completes
@@ -61,17 +61,17 @@ if __name__ == "__main__":
- Base image: `runpod/runpod-flash-lb:latest` (contains FastAPI, uvicorn, dependencies)
- Entrypoint: Loads manifest and starts FastAPI server
- Port: 8000 (internal)
-- RunPod exposes this via HTTPS endpoint URL
+- Runpod exposes this via HTTPS endpoint URL
- Health check: Polls `/ping` endpoint every 30 seconds with 15 second timeout per check
-- All HTTP requests to the endpoint include authentication via `RUNPOD_API_KEY` environment variable (if set)
+- Environment: the `RUNPOD_API_KEY` env var is used for outgoing requests to the State Manager and for cross-endpoint `@remote` calls
### Deployment Lifecycle
```mermaid
graph TD
A["LoadBalancerSlsResource created"] -->|flash build| B["Package application"]
- B -->|flash deploy| C["Push to RunPod"]
- C --> D["RunPod creates container"]
+ B -->|flash deploy| C["Push to Runpod"]
+ C --> D["Runpod creates container"]
D --> E["Container starts uvicorn"]
E --> F["FastAPI app loads"]
F --> G["Import user functions"]
@@ -90,13 +90,13 @@ When a client makes an HTTP request to your deployed endpoint:
```mermaid
sequenceDiagram
participant Client
- participant RunPod as RunPod Router
+ participant Runpod as Runpod Router
participant Container as Endpoint Container
participant FastAPI
participant UserFunc as User Function
- Client->>RunPod: HTTPS POST /api/process
- RunPod->>Container: Forward to port 8000
+ Client->>Runpod: HTTPS POST /api/process
+ Runpod->>Container: Forward to port 8000
Container->>FastAPI: HTTP POST /api/process
    FastAPI->>FastAPI: Match (POST, /api/process) in ROUTE_REGISTRY
FastAPI->>UserFunc: Call process_data(x=5, y=3)
@@ -104,8 +104,8 @@ sequenceDiagram
UserFunc-->>FastAPI: Return {"result": 8}
FastAPI->>FastAPI: Serialize to JSON
FastAPI-->>Container: HTTP 200 response
- Container-->>RunPod: Response body
- RunPod-->>Client: HTTPS response
+ Container-->>Runpod: Response body
+ Runpod-->>Client: HTTPS response
```
**Example Flow:**
@@ -121,7 +121,7 @@ POST https://my-endpoint.runpod.ai/api/process
Content-Type: application/json
{"x": 5, "y": 3}
-# On RunPod:
+# On Runpod:
# 1. Request arrives at container port 8000
# 2. FastAPI receives POST /api/process
# 3. FastAPI parses JSON body: {"x": 5, "y": 3}
@@ -129,7 +129,7 @@ Content-Type: application/json
# 5. Function executes: returns {"result": 8}
# 6. FastAPI serializes response
# 7. Returns HTTP 200 with body {"result": 8}
-# 8. RunPod wraps in HTTPS response
+# 8. Runpod wraps in HTTPS response
# 9. Client receives response
```
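
The lookup step in this flow can be sketched with a plain dictionary. `ROUTE_REGISTRY` is the name used in the diagram above, but its structure, the `route` decorator, and `dispatch` below are illustrative assumptions, not the library's actual implementation:

```python
from typing import Any, Callable, Dict, Tuple

# Illustrative (method, path) -> handler lookup, as in the diagram above.
ROUTE_REGISTRY: Dict[Tuple[str, str], Callable[..., Any]] = {}


def route(method: str, path: str):
    """Register a function in ROUTE_REGISTRY under its (method, path) key."""
    def decorator(func):
        ROUTE_REGISTRY[(method, path)] = func
        return func
    return decorator


@route("POST", "/api/process")
def process_data(x: int, y: int) -> dict:
    return {"result": x + y}


def dispatch(method: str, path: str, **kwargs):
    """Look up the handler for (method, path) and call it with parsed args."""
    handler = ROUTE_REGISTRY.get((method, path))
    if handler is None:
        raise KeyError(f"no route for {method} {path}")
    return handler(**kwargs)
```

Under these assumptions, `dispatch("POST", "/api/process", x=5, y=3)` returns `{"result": 8}`, matching the example flow above.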
@@ -141,15 +141,15 @@ When you call an `@remote` decorated function from your local code:
sequenceDiagram
participant Local as Local Code
participant Stub as LoadBalancerSlsStub
- participant RunPod as RunPod Router
+ participant Runpod as Runpod Router
participant Container as Endpoint Container
participant Execute as /execute Handler
Local->>Stub: await process_data(5, 3)
    Stub->>Stub: Extract function source code via AST inspection
    Stub->>Stub: Serialize args with cloudpickle + base64 encode
- Stub->>RunPod: POST /execute
- RunPod->>Container: Forward to port 8000
+ Stub->>Runpod: POST /execute
+ Runpod->>Container: Forward to port 8000
Container->>Execute: HTTP POST /execute
Execute->>Execute: Parse JSON body
    Execute->>Execute: Deserialize arguments (base64 decode + cloudpickle loads)
@@ -159,8 +159,8 @@ sequenceDiagram
Execute->>Execute: Get result: {"result": 8}
    Execute->>Execute: Serialize result with cloudpickle + base64 encode
Execute-->>Container: HTTP 200 {success: true, result: base64}
- Container-->>RunPod: Response body
- RunPod-->>Stub: Response body
+ Container-->>Runpod: Response body
+ Runpod-->>Stub: Response body
    Stub->>Stub: Deserialize result (base64 decode + cloudpickle loads)
Stub-->>Local: Return {"result": 8}
```
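
The serialize → send → deserialize round trip above can be sketched as follows. The stdlib `pickle` stands in for `cloudpickle` here (both expose the same `dumps`/`loads` interface), the `encode`/`decode` helper names are illustrative, and the HTTP hop is omitted:

```python
import base64
import pickle  # stand-in for cloudpickle; same dumps/loads interface


def encode(obj) -> str:
    """Serialize and base64-encode, as the stub does before POST /execute."""
    return base64.b64encode(pickle.dumps(obj)).decode("ascii")


def decode(payload: str):
    """Reverse step performed by the /execute handler on arrival."""
    return pickle.loads(base64.b64decode(payload))


wire_args = encode({"x": 5, "y": 3})   # wire-safe string placed in the request body
roundtrip = decode(wire_args)          # handler recovers the original arguments
```

`roundtrip` equals the original `{"x": 5, "y": 3}` dict, and the function's result travels back through the same encode/decode pair in the `result` field.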
@@ -174,7 +174,7 @@ api = LoadBalancerSlsResource(name="user-service",
# Deploy the endpoint (generates endpoint_url automatically)
await api.deploy()
-# After deploy, api.endpoint_url is populated by RunPod
+# After deploy, api.endpoint_url is populated by Runpod
# Example: "https://xxx-yyy-zzz.runpod.io"
@remote(api, method="POST", path="/api/process")
@@ -325,7 +325,7 @@ The stub uses Python's `inspect.signature()` to map positional args to parameter
```mermaid
graph TD
-    A["HTTP Request arrives at RunPod Endpoint"] -->|HTTPS| B["RunPod Router Domain stripping"]
+    A["HTTP Request arrives at Runpod Endpoint"] -->|HTTPS| B["Runpod Router Domain stripping"]
    B -->|Strips domain Forwards to container| C["Container Port 8000 uvicorn/FastAPI"]
C -->|Route decision| D{Is it /execute?}
@@ -347,7 +347,7 @@ graph TD
F4 --> G
    G -->|Serialize response| H["FastAPI Response Obj JSON or {success, result}"]
-    H -->|Wrap in HTTPS| I["RunPod Router Wraps response"]
+    H -->|Wrap in HTTPS| I["Runpod Router Wraps response"]
I -->|Send back| J["HTTP Response to Client"]
style A fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff
@@ -439,7 +439,7 @@ POST https://my-endpoint.runpod.ai/execute
## Concurrency and Scaling
-### How RunPod Handles Concurrent Requests
+### How Runpod Handles Concurrent Requests
```mermaid
graph TD
@@ -448,7 +448,7 @@ graph TD
    D -->|Worker available| E["Container [Worker 1] Executes Request 2 Concurrently"]
    F["Request 3 POST /api/health"] -->|→ Worker 2| G["Container [Worker 2] Executes Request 3"]
-    H["RunPod Scaler REQUEST_COUNT"] -->|Queue grows| I["Monitor Queue Depth"]
+    H["Runpod Scaler REQUEST_COUNT"] -->|Queue grows| I["Monitor Queue Depth"]
I -->|Q ≥ 3| J["Spin up Worker 3"]
I -->|Q ≥ 6| K["Spin up Worker 4"]
I -->|Q empty| L["Wind down Workers"]
@@ -548,7 +548,7 @@ result = await process_data(5)
```
Direct HTTP Request:
-- Request → RunPod Router: 10-50ms
+- Request → Runpod Router: 10-50ms
- FastAPI routing: 1-5ms
- Function execution: Variable
- Serialization: Variable
@@ -571,18 +571,18 @@ Total (no-op function): 40-150ms
- FastAPI app baseline: ~50-100MB
- Per function in namespace: ~0.5-5MB
- Serialized args/result: Variable (depends on data size)
-- RunPod allocates: Depends on pod type
+- Runpod allocates: Depends on pod type
### Request Size Limits
-- RunPod has limits on request body size
+- Runpod has limits on request body size
- Serialized data (via cloudpickle) increases size
- Large arguments may hit limits
- Consider streaming for large payloads
## Monitoring and Debugging at Runtime
-### Logs Available on RunPod
+### Logs Available on Runpod
```
Container logs (uvicorn/FastAPI):
@@ -605,7 +605,7 @@ Environment:
GET https://endpoint.runpod.ai/ping
Response: 200 OK {"status": "healthy"}
-RunPod polls /ping every 30 seconds
+Runpod polls /ping every 30 seconds
- 200 OK → Worker healthy
- Non-200 → Worker unhealthy
- No response → Worker down
@@ -662,21 +662,21 @@ LoadBalancerSlsResource(
```
Incoming:
-- HTTPS endpoint provided by RunPod
+- HTTPS endpoint provided by Runpod
- Auto-scaled based on REQUEST_COUNT
- Health checks ensure availability
Outgoing:
- Your functions can make HTTP requests
- Can access external APIs
-- Can access other RunPod endpoints
+- Can access other Runpod endpoints
```
## Summary
**What Happens at Runtime:**
-1. **Deployment** - FastAPI app runs in RunPod container
+1. **Deployment** - FastAPI app runs in Runpod container
2. **Request Arrival** - HTTP request reaches container
3. **Routing** - FastAPI matches method/path to function
4. **Execution** - Function code runs with parameters
diff --git a/docs/Load_Balancer_Endpoints.md b/docs/Load_Balancer_Endpoints.md
index f214c95f..091e1893 100644
--- a/docs/Load_Balancer_Endpoints.md
+++ b/docs/Load_Balancer_Endpoints.md
@@ -2,7 +2,7 @@
## Overview
-The `LoadBalancerSlsResource` class enables provisioning and management of RunPod load-balanced serverless endpoints. Unlike queue-based endpoints that process requests sequentially, load-balanced endpoints expose HTTP servers directly to clients, enabling REST APIs, webhooks, and real-time communication patterns.
+The `LoadBalancerSlsResource` class enables provisioning and management of Runpod load-balanced serverless endpoints. Unlike queue-based endpoints that process requests sequentially, load-balanced endpoints expose HTTP servers directly to clients, enabling REST APIs, webhooks, and real-time communication patterns.
This resource type is used for specialized endpoints like the Mothership. Cross-endpoint service discovery now uses State Manager GraphQL API (peer-to-peer) rather than HTTP endpoints.
@@ -10,7 +10,7 @@ This resource type is used for specialized endpoints like the Mothership. Cross-
### Problem Statement
-RunPod supports two serverless endpoint models:
+Runpod supports two serverless endpoint models:
1. **Queue-Based (QB)**: Sequential processing with automatic retry logic
- Requests queued and processed one-at-a-time
@@ -49,7 +49,7 @@ graph TD
A["LoadBalancerSlsResource
instance created"] --> B["Validate LB config
Type=LB, REQUEST_COUNT scaler"]
B --> C["Check if already
deployed"]
C -->|Already deployed| D["Return existing
endpoint"]
- C -->|New deployment| E["Call parent _do_deploy
Create via RunPod API"]
+ C -->|New deployment| E["Call parent _do_deploy
Create via Runpod API"]
E --> F["Return deployed
endpoint immediately"]
style A fill:#1976d2,stroke:#0d47a1,stroke-width:3px,color:#fff
@@ -66,7 +66,7 @@ ServerlessResource (base class)
├── type: ServerlessType = QB (queue-based)
├── scalerType: ServerlessScalerType = QUEUE_DELAY
├── Standard provisioning flow
-└── Standard health checks (RunPod SDK)
+└── Standard health checks (Runpod SDK)
LoadBalancerSlsResource (LB-specific subclass)
├── type: ServerlessType = LB (always, cannot override)
@@ -89,12 +89,12 @@ Load-balanced endpoints require a `/ping` endpoint that responds with:
```mermaid
sequenceDiagram
participant Deploy as LoadBalancerSlsResource
- participant RunPod as RunPod API
+ participant Runpod as Runpod API
participant Worker as LB Endpoint
participant Ping as /ping Handler
- Deploy->>RunPod: saveEndpoint (type=LB)
- RunPod->>Worker: Create endpoint
+ Deploy->>Runpod: saveEndpoint (type=LB)
+ Runpod->>Worker: Create endpoint
Worker->>Ping: Initialize
loop Health Check Polling
@@ -123,7 +123,7 @@ This document focuses on the `LoadBalancerSlsResource` class implementation and
**Related documentation:**
- [Using @remote with Load-Balanced Endpoints](Using_Remote_With_LoadBalancer.md) - User guide for writing and testing load-balanced endpoints
-- [LoadBalancer Runtime Architecture](LoadBalancer_Runtime_Architecture.md) - Technical details on what happens when deployed on RunPod, request flows, and execution patterns
+- [LoadBalancer Runtime Architecture](LoadBalancer_Runtime_Architecture.md) - Technical details on what happens when deployed on Runpod, request flows, and execution patterns
**In the user guide, you'll learn:**
- Quick start with `LiveLoadBalancer` for local development
@@ -204,7 +204,7 @@ LoadBalancerSlsResource(
### Health Checks
```python
-# Synchronous health check (for compatibility with RunPod SDK)
+# Synchronous health check (for compatibility with Runpod SDK)
is_healthy = endpoint.is_deployed()
# Asynchronous health check (for deployment flow)
@@ -258,7 +258,7 @@ try:
print("Warning: Endpoint deployed but not yet healthy")
except ValueError as e:
- # RunPod API error or configuration issue
+ # Runpod API error or configuration issue
print(f"Deployment error: {e}")
```
@@ -283,10 +283,10 @@ assert endpoint.type == ServerlessType.LB # Always LB
| Phase | Duration | Notes |
|-------|----------|-------|
-| API call | < 1s | RunPod endpoint creation |
+| API call | < 1s | Runpod endpoint creation |
| Deployment complete | **< 5s** | Returns immediately after API call |
-**Note**: Worker initialization (30-60s) and health checks happen asynchronously in the background. The endpoint is considered "deployed" as soon as RunPod creates it. You can manually verify health using `_wait_for_health()` if needed.
+**Note**: Worker initialization (30-60s) and health checks happen asynchronously in the background. The endpoint is considered "deployed" as soon as Runpod creates it. You can manually verify health using `_wait_for_health()` if needed.
### Manual Health Check (Optional)
@@ -317,7 +317,7 @@ Default health check configuration:
| Latency | Higher (queuing) | Lower (direct) |
| Custom endpoints | Limited | Full HTTP support |
| Scalability | Per-function | Per-worker |
-| Health checks | RunPod SDK | `/ping` endpoint |
+| Health checks | Runpod SDK | `/ping` endpoint |
| Use cases | Batch processing | APIs, webhooks, real-time |
| Suitable for | Workers | Mothership, services |
@@ -347,12 +347,12 @@ LoadBalancerSlsResource (class)
│ ├── Call _wait_for_health()
│ └── Return deployed resource or raise TimeoutError
└── is_deployed()
- └── Sync wrapper using RunPod SDK
+ └── Sync wrapper using Runpod SDK
```
### Thread Safety
-- `is_deployed()` is thread-safe (uses RunPod SDK)
+- `is_deployed()` is thread-safe (uses Runpod SDK)
- Async methods are safe for concurrent use
- Health check polling handles multiple concurrent calls
@@ -377,7 +377,7 @@ LoadBalancerSlsResource (class)
print("Endpoint not healthy yet, check logs")
```
- Verify image runs correctly: `docker run my-image:latest`
-- Check logs: `runpod-cli logs ` or use RunPod dashboard
+- Check logs: `runpod-cli logs ` or use Runpod dashboard
### Configuration Validation Errors
@@ -397,7 +397,7 @@ endpoint = LoadBalancerSlsResource(
### API Errors (401, 403, 429)
-**Problem**: RunPod GraphQL errors during deployment
+**Problem**: Runpod GraphQL errors during deployment
**Causes**:
- Missing or invalid RUNPOD_API_KEY
@@ -406,7 +406,7 @@ endpoint = LoadBalancerSlsResource(
**Solution**:
- Verify API key: `echo $RUNPOD_API_KEY`
-- Check RunPod dashboard permissions
+- Check Runpod dashboard permissions
- Retry after delay for rate limits
## Next Steps
diff --git a/docs/Resource_Config_Drift_Detection.md b/docs/Resource_Config_Drift_Detection.md
index 99a0b4ba..2d2cb61b 100644
--- a/docs/Resource_Config_Drift_Detection.md
+++ b/docs/Resource_Config_Drift_Detection.md
@@ -1,6 +1,6 @@
# Resource Config Drift Detection
-Automatic detection and fixing of configuration drift between local resource definitions and remote RunPod endpoints.
+Automatic detection and fixing of configuration drift between local resource definitions and remote Runpod endpoints.
## Overview
@@ -199,10 +199,10 @@ These changes don't trigger drift:
| Field | Why Ignored |
|-------|------------|
-| `template` | Assigned by RunPod API |
-| `templateId` | Assigned by RunPod API |
-| `aiKey` | Assigned by RunPod API |
-| `userId` | Assigned by RunPod API |
+| `template` | Assigned by Runpod API |
+| `templateId` | Assigned by Runpod API |
+| `aiKey` | Assigned by Runpod API |
+| `userId` | Assigned by Runpod API |
| `createdAt` | Timestamp |
| `activeBuildid` | Computed by API |
| `env` | Dynamically computed from .env |
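Given the ignore list above, drift detection reduces to diffing the two configs with server-assigned fields stripped. A hypothetical sketch (field names taken from the table; the real implementation lives in the drift-detection module):

```python
# Fields assigned or computed server-side, per the table above.
IGNORED_FIELDS = {
    "template", "templateId", "aiKey", "userId",
    "createdAt", "activeBuildid", "env",
}

def detect_drift(local: dict, remote: dict) -> dict:
    """Return {field: (local_value, remote_value)} for fields that differ."""
    drift = {}
    for key in set(local) | set(remote):
        if key in IGNORED_FIELDS:
            continue
        if local.get(key) != remote.get(key):
            drift[key] = (local.get(key), remote.get(key))
    return drift
```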
diff --git a/docs/Using_Remote_With_LoadBalancer.md b/docs/Using_Remote_With_LoadBalancer.md
index b68a3504..038204b2 100644
--- a/docs/Using_Remote_With_LoadBalancer.md
+++ b/docs/Using_Remote_With_LoadBalancer.md
@@ -452,7 +452,7 @@ async def test_delete_user():
**"Endpoint URL not available - endpoint may not be deployed"**
- Problem: Using LoadBalancerSlsResource before calling `await resource.deploy()`
- Solution: Deploy the endpoint first (`await resource.deploy()`) which auto-populates endpoint_url, or use LiveLoadBalancer for local testing
-- Note: endpoint_url is auto-generated by RunPod after deployment and cannot be manually specified
+- Note: endpoint_url is auto-generated by Runpod after deployment and cannot be manually specified
**"HTTP error from endpoint: 500"**
- Problem: Function raised an error during execution
diff --git a/pyproject.toml b/pyproject.toml
index b4abb707..d5a6f8fe 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "runpod-flash"
-version = "1.1.1"
+version = "1.2.0"
description = "A Python library for distributed inference and serving of machine learning models"
authors = [
{ name = "Runpod", email = "engineer@runpod.io" },
diff --git a/scripts/test-image-constants.py b/scripts/test-image-constants.py
index eecfea1e..c348717f 100755
--- a/scripts/test-image-constants.py
+++ b/scripts/test-image-constants.py
@@ -62,7 +62,7 @@ def test_constants_exist():
test("FLASH_CPU_IMAGE defined", FLASH_CPU_IMAGE is not None)
test("FLASH_LB_IMAGE defined", FLASH_LB_IMAGE is not None)
test("FLASH_CPU_LB_IMAGE defined", FLASH_CPU_LB_IMAGE is not None)
- test("DEFAULT_WORKERS_MIN is 1", DEFAULT_WORKERS_MIN == 1)
+ test("DEFAULT_WORKERS_MIN is 0", DEFAULT_WORKERS_MIN == 0)
test("DEFAULT_WORKERS_MAX is 1", DEFAULT_WORKERS_MAX == 1)
print(f" Constants values (with FLASH_IMAGE_TAG={FLASH_IMAGE_TAG}):")
@@ -78,9 +78,9 @@ def test_manifest_builder():
from runpod_flash.cli.commands.build_utils.manifest import ManifestBuilder
from runpod_flash.core.resources.constants import (
- FLASH_CPU_LB_IMAGE,
- DEFAULT_WORKERS_MIN,
DEFAULT_WORKERS_MAX,
+ DEFAULT_WORKERS_MIN,
+ FLASH_CPU_LB_IMAGE,
)
builder = ManifestBuilder(project_name="test", remote_functions=[])
diff --git a/src/runpod_flash/__init__.py b/src/runpod_flash/__init__.py
index 0ea308c4..43b24057 100644
--- a/src/runpod_flash/__init__.py
+++ b/src/runpod_flash/__init__.py
@@ -33,6 +33,10 @@
ServerlessType,
FlashApp,
)
+ from .core.resources.constants import (
+ DEFAULT_WORKERS_MAX,
+ DEFAULT_WORKERS_MIN,
+ )
def __getattr__(name):
@@ -103,6 +107,17 @@ def __getattr__(name):
"FlashApp": FlashApp,
}
return attrs[name]
+ elif name in ("DEFAULT_WORKERS_MIN", "DEFAULT_WORKERS_MAX"):
+ from .core.resources.constants import (
+ DEFAULT_WORKERS_MAX,
+ DEFAULT_WORKERS_MIN,
+ )
+
+ attrs = {
+ "DEFAULT_WORKERS_MIN": DEFAULT_WORKERS_MIN,
+ "DEFAULT_WORKERS_MAX": DEFAULT_WORKERS_MAX,
+ }
+ return attrs[name]
raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
@@ -126,4 +141,6 @@ def __getattr__(name):
"ServerlessEndpoint",
"ServerlessType",
"FlashApp",
+ "DEFAULT_WORKERS_MIN",
+ "DEFAULT_WORKERS_MAX",
]
diff --git a/src/runpod_flash/cli/commands/apps.py b/src/runpod_flash/cli/commands/apps.py
index a4f9b4fe..222d88c5 100644
--- a/src/runpod_flash/cli/commands/apps.py
+++ b/src/runpod_flash/cli/commands/apps.py
@@ -1,14 +1,11 @@
-"""Deployment environment management commands."""
+"""CLI commands for managing Flash apps (create, get, list, delete)."""
-import typer
-from rich.console import Console
-from rich.table import Table
-from rich.panel import Panel
import asyncio
-from runpod_flash.cli.utils.app import discover_flash_project
-
+import typer
+from rich.console import Console
+from runpod_flash.cli.utils.formatting import STATE_STYLE, format_datetime, state_dot
from runpod_flash.core.resources.app import FlashApp
console = Console()
@@ -35,91 +32,99 @@ def list_command():
"delete", short_help="Delete an existing flash app and all its associated resources"
)
def delete(
- app_name: str = typer.Option(..., "--app", "-a", help="Flash app name to delete"),
+ app_name: str = typer.Argument(..., help="Name of the flash app to delete"),
):
- if not app_name:
- _, app_name = discover_flash_project()
return asyncio.run(delete_flash_app(app_name))
async def list_flash_apps():
apps = await FlashApp.list()
if not apps:
- console.print("No Flash apps found.")
+ console.print("\nNo Flash apps found.")
+ console.print(" Run [bold]flash deploy[/bold] to create one.\n")
return
- table = Table(show_header=True, header_style="bold")
- table.add_column("Name", style="bold")
- table.add_column("ID", overflow="fold")
- table.add_column("Environments", overflow="fold")
- table.add_column("Builds", overflow="fold")
-
- for app in apps:
- environments = app.get("flashEnvironments") or []
- env_summary = ", ".join(env.get("name", "?") for env in environments) or "—"
- builds = app.get("flashBuilds") or []
- build_summary = ", ".join(build.get("id", "?") for build in builds) or "—"
- table.add_row(
- app.get("name", "(unnamed)"), app.get("id", "—"), env_summary, build_summary
+ console.print()
+ for app_data in apps:
+ name = app_data.get("name", "(unnamed)")
+ app_id = app_data.get("id", "")
+ environments = app_data.get("flashEnvironments") or []
+ builds = app_data.get("flashBuilds") or []
+
+ env_count = len(environments)
+ build_count = len(builds)
+ console.print(
+ f" [bold]{name}[/bold] "
+ f"{env_count} env{'s' if env_count != 1 else ''}, "
+ f"{build_count} build{'s' if build_count != 1 else ''} "
+ f"[dim]{app_id}[/dim]"
)
- console.print(table)
+ for env in environments:
+ state = env.get("state", "UNKNOWN")
+ env_name = env.get("name", "?")
+ console.print(
+ f" {state_dot(state)} {env_name} [dim]{state.lower()}[/dim]"
+ )
+
+ console.print()
async def create_flash_app(app_name: str):
with console.status(f"Creating flash app: {app_name}"):
app = await FlashApp.create(app_name)
- panel_content = (
- f"Flash app '[bold]{app_name}[/bold]' created successfully\n\nApp ID: {app.id}"
+ console.print(
+ f"[green]✓[/green] Created app [bold]{app_name}[/bold] [dim]{app.id}[/dim]"
)
- console.print(Panel(panel_content, title="✅ App Created", expand=False))
async def get_flash_app(app_name: str):
with console.status(f"Fetching flash app: {app_name}"):
app = await FlashApp.from_name(app_name)
- # Fetch environments and builds in parallel for better performance
envs, builds = await asyncio.gather(app.list_environments(), app.list_builds())
- main_info = f"Name: {app.name}\n"
- main_info += f"ID: {app.id}\n"
- main_info += f"Environments: {len(envs)}\n"
- main_info += f"Builds: {len(builds)}"
-
- console.print(Panel(main_info, title=f"📱 Flash App: {app_name}", expand=False))
+ console.print(f"\n [bold]{app.name}[/bold] [dim]{app.id}[/dim]")
+ # environments
+ console.print("\n [bold]Environments[/bold]")
if envs:
- env_table = Table(title="Environments")
- env_table.add_column("Name", style="cyan")
- env_table.add_column("ID", overflow="fold")
- env_table.add_column("State", style="yellow")
- env_table.add_column("Active Build", overflow="fold")
- env_table.add_column("Created", style="dim")
-
for env in envs:
- env_table.add_row(
- env.get("name"),
- env.get("id", "-"),
- env.get("state", "UNKNOWN"),
- env.get("activeBuildId", "-"),
- env.get("createdAt", "-"),
+ state = env.get("state", "UNKNOWN")
+ color = STATE_STYLE.get(state, "yellow")
+ name = env.get("name", "(unnamed)")
+ build_id = env.get("activeBuildId")
+ created = format_datetime(env.get("createdAt"))
+
+ console.print(
+ f" {state_dot(state)} [bold]{name}[/bold] "
+ f"[{color}]{state.lower()}[/{color}]"
)
- console.print(env_table)
+ parts = []
+ if build_id:
+ parts.append(f"build {build_id}")
+ parts.append(f"created {created}")
+ console.print(f" [dim]{' · '.join(parts)}[/dim]")
+ else:
+ console.print(" [dim]None yet — run [/dim][bold]flash deploy[/bold]")
+ # builds — show most recent, summarize the rest
+ max_shown = 5
+ console.print(f"\n [bold]Builds ({len(builds)})[/bold]")
if builds:
- build_table = Table(title="Builds")
- build_table.add_column("ID", overflow="fold")
- build_table.add_column("Object Key", overflow="fold")
- build_table.add_column("Created", style="dim")
-
- for build in builds:
- build_table.add_row(
- build.get("id"),
- build.get("objectKey", "-"),
- build.get("createdAt", "-"),
+ recent = builds[:max_shown]
+ for build in recent:
+ build_id = build.get("id", "")
+ created = format_datetime(build.get("createdAt"))
+ console.print(f" {build_id} [dim]{created}[/dim]")
+ if len(builds) > max_shown:
+ console.print(
+ f" [dim]… and {len(builds) - max_shown} older builds[/dim]"
)
- console.print(build_table)
+ else:
+ console.print(" [dim]None yet — run [/dim][bold]flash build[/bold]")
+
+ console.print()
async def delete_flash_app(app_name: str):
@@ -127,9 +132,9 @@ async def delete_flash_app(app_name: str):
success = await FlashApp.delete(app_name=app_name)
if success:
- console.print(f"✅ Flash app '{app_name}' deleted successfully")
+ console.print(f"[green]✓[/green] Deleted app [bold]{app_name}[/bold]")
else:
- console.print(f"❌ Failed to delete flash app '{app_name}'")
+ console.print(f"[red]✗[/red] Failed to delete app '{app_name}'")
raise typer.Exit(1)
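The `delete` signature change swaps an `--app` option for a positional argument, so the command reads `flash apps delete my-app`. A minimal standalone Typer sketch of the pattern (a hypothetical app, not the Flash CLI itself):

```python
import typer

app = typer.Typer()

@app.command()
def delete(app_name: str = typer.Argument(..., help="Name of the app to delete")):
    # Positional argument: invoked as `delete my-app`, not `delete --app my-app`.
    typer.echo(f"Deleting {app_name}")

if __name__ == "__main__":
    app()
```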
diff --git a/src/runpod_flash/cli/commands/build.py b/src/runpod_flash/cli/commands/build.py
index 7be24fe5..44a1000e 100644
--- a/src/runpod_flash/cli/commands/build.py
+++ b/src/runpod_flash/cli/commands/build.py
@@ -14,9 +14,6 @@
import typer
from rich.console import Console
-from rich.panel import Panel
-from rich.progress import Progress, SpinnerColumn, TextColumn
-from rich.table import Table
try:
import tomllib # Python 3.11+
@@ -191,6 +188,7 @@ def run_build(
output_name: str | None = None,
exclude: str | None = None,
use_local_flash: bool = False,
+ verbose: bool = False,
) -> Path:
"""Run the build process and return the artifact path.
@@ -204,6 +202,7 @@ def run_build(
output_name: Custom archive name (default: artifact.tar.gz)
exclude: Comma-separated packages to exclude
use_local_flash: Bundle local runpod_flash source
+ verbose: Show archive and build directory paths in summary
Returns:
Path to the created artifact archive
@@ -224,260 +223,129 @@ def run_build(
if exclude:
excluded_packages = [pkg.strip().lower() for pkg in exclude.split(",")]
- # Display configuration
- _display_build_config(
- project_dir, app_name, no_deps, output_name, excluded_packages
- )
-
- # Execute build
- with Progress(
- SpinnerColumn(),
- TextColumn("[progress.description]{task.description}"),
- console=console,
- ) as progress:
- # Load ignore patterns
- ignore_task = progress.add_task("Loading ignore patterns...")
- spec = load_ignore_patterns(project_dir)
- progress.update(ignore_task, description="[green]✓ Loaded ignore patterns")
- progress.stop_task(ignore_task)
-
- # Collect files
- collect_task = progress.add_task("Collecting project files...")
- files = get_file_tree(project_dir, spec)
- progress.update(
- collect_task,
- description=f"[green]✓ Found {len(files)} files to package",
- )
- progress.stop_task(collect_task)
+ spec = load_ignore_patterns(project_dir)
+ files = get_file_tree(project_dir, spec)
- # Note: build directory already created before progress tracking
- build_task = progress.add_task("Creating build directory...")
- progress.update(
- build_task,
- description="[green]✓ Created .flash/.build/",
- )
- progress.stop_task(build_task)
+ try:
+ copy_project_files(files, project_dir, build_dir)
try:
- # Copy files
- copy_task = progress.add_task("Copying project files...")
- copy_project_files(files, project_dir, build_dir)
- progress.update(
- copy_task, description=f"[green]✓ Copied {len(files)} files"
- )
- progress.stop_task(copy_task)
-
- # Generate manifest
- manifest_task = progress.add_task("Generating service manifest...")
- try:
- scanner = RemoteDecoratorScanner(build_dir)
- remote_functions = scanner.discover_remote_functions()
+ scanner = RemoteDecoratorScanner(build_dir)
+ remote_functions = scanner.discover_remote_functions()
- # Always build manifest (includes mothership even without @remote functions)
- manifest_builder = ManifestBuilder(
- app_name, remote_functions, scanner, build_dir=build_dir
- )
- manifest = manifest_builder.build()
- manifest_path = build_dir / "flash_manifest.json"
- manifest_path.write_text(json.dumps(manifest, indent=2))
-
- # Copy manifest to .flash/ directory for deployment reference
- # This avoids needing to extract from tarball during deploy
- flash_dir = project_dir / ".flash"
- deployment_manifest_path = flash_dir / "flash_manifest.json"
- shutil.copy2(manifest_path, deployment_manifest_path)
-
- manifest_resources = manifest.get("resources", {})
-
- if manifest_resources:
- progress.update(
- manifest_task,
- description=f"[green]✓ Generated manifest with {len(manifest_resources)} resources",
- )
- else:
- progress.update(
- manifest_task,
- description="[yellow]⚠ No resources detected",
- )
-
- except (ImportError, SyntaxError) as e:
- progress.stop_task(manifest_task)
- console.print(f"[red]Error:[/red] Code analysis failed: {e}")
- logger.exception("Code analysis failed")
- raise typer.Exit(1)
- except ValueError as e:
- progress.stop_task(manifest_task)
- console.print(f"[red]Error:[/red] {e}")
- logger.exception("Handler generation validation failed")
- raise typer.Exit(1)
- except Exception as e:
- progress.stop_task(manifest_task)
- logger.exception("Handler generation failed")
- console.print(
- f"[yellow]Warning:[/yellow] Handler generation failed: {e}"
- )
+ manifest_builder = ManifestBuilder(
+ app_name, remote_functions, scanner, build_dir=build_dir
+ )
+ manifest = manifest_builder.build()
+ manifest_path = build_dir / "flash_manifest.json"
+ manifest_path.write_text(json.dumps(manifest, indent=2))
- progress.stop_task(manifest_task)
+ flash_dir = project_dir / ".flash"
+ deployment_manifest_path = flash_dir / "flash_manifest.json"
+ shutil.copy2(manifest_path, deployment_manifest_path)
- except typer.Exit:
- # Clean up on fatal errors (ImportError, SyntaxError, ValueError)
- if build_dir.exists():
- shutil.rmtree(build_dir)
- raise
- except Exception as e:
- # Clean up on unexpected errors
- if build_dir.exists():
- shutil.rmtree(build_dir)
- console.print(f"[red]Error:[/red] Build failed: {e}")
- logger.exception("Build failed")
+ except (ImportError, SyntaxError) as e:
+ console.print(f"[red]Error:[/red] Code analysis failed: {e}")
+ logger.exception("Code analysis failed")
raise typer.Exit(1)
+ except ValueError as e:
+ console.print(f"[red]Error:[/red] {e}")
+ logger.exception("Handler generation validation failed")
+ raise typer.Exit(1)
+ except Exception as e:
+ logger.exception("Handler generation failed")
+ console.print(f"[yellow]Warning:[/yellow] Handler generation failed: {e}")
- # Extract runpod_flash dependencies if bundling local version
- flash_deps = []
- if use_local_flash:
- flash_pkg = _find_local_runpod_flash()
- if flash_pkg:
- flash_deps = _extract_runpod_flash_dependencies(flash_pkg)
-
- # Install dependencies
- deps_task = progress.add_task("Installing dependencies...")
- requirements = collect_requirements(project_dir, build_dir)
-
- # Add runpod_flash dependencies if bundling local version
- # This ensures all runpod_flash runtime dependencies are available in the build
- requirements.extend(flash_deps)
-
- # Filter out excluded packages
- if excluded_packages:
- original_count = len(requirements)
- matched_exclusions = set()
- filtered_requirements = []
-
- for req in requirements:
- if should_exclude_package(req, excluded_packages):
- # Extract which exclusion matched
- pkg_name = extract_package_name(req)
- if pkg_name in excluded_packages:
- matched_exclusions.add(pkg_name)
- else:
- filtered_requirements.append(req)
-
- requirements = filtered_requirements
- excluded_count = original_count - len(requirements)
-
- if excluded_count > 0:
- console.print(
- f"[yellow]Excluded {excluded_count} package(s) "
- f"(assumed in base image)[/yellow]"
- )
+ except typer.Exit:
+ if build_dir.exists():
+ shutil.rmtree(build_dir)
+ raise
+ except Exception as e:
+ if build_dir.exists():
+ shutil.rmtree(build_dir)
+ console.print(f"[red]Error:[/red] Build failed: {e}")
+ logger.exception("Build failed")
+ raise typer.Exit(1)
- # Warn about exclusions that didn't match any packages
- unmatched = set(excluded_packages) - matched_exclusions
- if unmatched:
- console.print(
- f"[yellow]Warning: No packages matched exclusions: "
- f"{', '.join(sorted(unmatched))}[/yellow]"
- )
+ flash_deps = []
+ if use_local_flash:
+ flash_pkg = _find_local_runpod_flash()
+ if flash_pkg:
+ flash_deps = _extract_runpod_flash_dependencies(flash_pkg)
- if not requirements:
- progress.update(
- deps_task,
- description="[yellow]⚠ No dependencies found",
- )
- else:
- progress.update(
- deps_task,
- description=f"Installing {len(requirements)} packages...",
- )
+ # install dependencies
+ requirements = collect_requirements(project_dir, build_dir)
+ requirements.extend(flash_deps)
- success = install_dependencies(build_dir, requirements, no_deps)
+ # filter out excluded packages
+ if excluded_packages:
+ matched_exclusions = set()
+ filtered_requirements = []
+
+ for req in requirements:
+ if should_exclude_package(req, excluded_packages):
+ pkg_name = extract_package_name(req)
+ if pkg_name in excluded_packages:
+ matched_exclusions.add(pkg_name)
+ else:
+ filtered_requirements.append(req)
- if not success:
- progress.stop_task(deps_task)
- console.print("[red]Error:[/red] Failed to install dependencies")
- raise typer.Exit(1)
+ requirements = filtered_requirements
- progress.update(
- deps_task,
- description=f"[green]✓ Installed {len(requirements)} packages",
+ unmatched = set(excluded_packages) - matched_exclusions
+ if unmatched:
+ console.print(
+ f"[yellow]Warning:[/yellow] No packages matched exclusions: "
+ f"{', '.join(sorted(unmatched))}"
)
- progress.stop_task(deps_task)
-
- # Bundle local runpod_flash if requested
- if use_local_flash:
- flash_task = progress.add_task("Bundling local runpod_flash...")
- if _bundle_local_runpod_flash(build_dir):
- _remove_runpod_flash_from_requirements(build_dir)
- progress.update(
- flash_task,
- description="[green]✓ Bundled local runpod_flash",
- )
- else:
- progress.update(
- flash_task,
- description="[yellow]⚠ Using PyPI runpod_flash",
- )
- progress.stop_task(flash_task)
+ if requirements:
+ with console.status(f"Installing {len(requirements)} packages..."):
+ success = install_dependencies(build_dir, requirements, no_deps)
- # Generate resource configuration files
- # IMPORTANT: Must happen AFTER bundle_local_runpod_flash to avoid being overwritten
- # These files tell each resource which functions are local vs remote
- from .build_utils.resource_config_generator import (
- generate_all_resource_configs,
- )
+ if not success:
+ console.print("[red]Error:[/red] Failed to install dependencies")
+ raise typer.Exit(1)
- generate_all_resource_configs(manifest, build_dir)
+ # bundle local runpod_flash if requested
+ if use_local_flash:
+ if _bundle_local_runpod_flash(build_dir):
+ _remove_runpod_flash_from_requirements(build_dir)
- # Clean up Python bytecode before archiving
- cleanup_python_bytecode(build_dir)
+ # clean up and create archive
+ cleanup_python_bytecode(build_dir)
- # Create archive
- archive_task = progress.add_task("Creating archive...")
- archive_name = output_name or "artifact.tar.gz"
- archive_path = project_dir / ".flash" / archive_name
+ archive_name = output_name or "artifact.tar.gz"
+ archive_path = project_dir / ".flash" / archive_name
+ with console.status("Creating archive..."):
create_tarball(build_dir, archive_path, app_name)
- # Get archive size
- size_mb = archive_path.stat().st_size / (1024 * 1024)
+ size_mb = archive_path.stat().st_size / (1024 * 1024)
- progress.update(
- archive_task,
- description=f"[green]✓ Created {archive_name} ({size_mb:.1f} MB)",
+ # fail build if archive exceeds size limit
+ if size_mb > MAX_TARBALL_SIZE_MB:
+ console.print()
+ console.print(
+ f"[red]Error:[/red] Archive exceeds RunPod limit "
+ f"({size_mb:.1f} MB / {MAX_TARBALL_SIZE_MB} MB)"
+ )
+ console.print(
+ " Use --exclude to skip packages in base image: "
+ "[dim]flash deploy --exclude torch,torchvision,torchaudio[/dim]"
)
- progress.stop_task(archive_task)
-
- # Fail build if archive exceeds size limit
- if size_mb > MAX_TARBALL_SIZE_MB:
- console.print()
- console.print(
- Panel(
- f"[red bold]✗ BUILD FAILED: Archive exceeds RunPod limit[/red bold]\n\n"
- f"[red]Archive size:[/red] {size_mb:.1f} MB\n"
- f"[red]RunPod limit:[/red] {MAX_TARBALL_SIZE_MB} MB\n"
- f"[red]Over by:[/red] {size_mb - MAX_TARBALL_SIZE_MB:.1f} MB\n\n"
- f"[bold]Solutions:[/bold]\n"
- f" 1. Use --exclude to skip packages in base image:\n"
- f" [dim]flash deploy --exclude torch,torchvision,torchaudio[/dim]\n\n"
- f" 2. Reduce dependencies in requirements.txt",
- title="Build Artifact Too Large",
- border_style="red",
- )
- )
- console.print()
- # Cleanup: Remove invalid artifacts
- console.print("[dim]Cleaning up invalid artifacts...[/dim]")
- if archive_path.exists():
- archive_path.unlink()
- if build_dir.exists():
- shutil.rmtree(build_dir)
+ if archive_path.exists():
+ archive_path.unlink()
+ if build_dir.exists():
+ shutil.rmtree(build_dir)
- raise typer.Exit(1)
+ raise typer.Exit(1)
# Success summary
- _display_build_summary(archive_path, app_name, len(files), len(requirements))
+ _display_build_summary(
+ archive_path, app_name, len(files), len(requirements), size_mb, verbose=verbose
+ )
return archive_path
@@ -522,6 +390,7 @@ def build_command(
output_name=output_name,
exclude=exclude,
use_local_flash=use_local_flash,
+ verbose=True,
)
except KeyboardInterrupt:
@@ -948,7 +817,7 @@ def install_dependencies(
platform_str = "x86_64-unknown-linux-gnu"
else:
platform_str = f"{len(RUNPOD_PLATFORMS)} manylinux variants"
- console.print(f"[dim]Installing for: {platform_str}, Python {python_version}[/dim]")
+ logger.debug(f"Installing for: {platform_str}, Python {python_version}")
try:
result = subprocess.run(
@@ -1003,64 +872,21 @@ def cleanup_build_directory(build_base: Path) -> None:
shutil.rmtree(build_base)
-def _display_build_config(
- project_dir: Path,
- app_name: str,
- no_deps: bool,
- output_name: str | None,
- excluded_packages: list[str],
-):
- """Display build configuration."""
- archive_name = output_name or "artifact.tar.gz"
-
- config_text = (
- f"[bold]Project:[/bold] {app_name}\n"
- f"[bold]Directory:[/bold] {project_dir}\n"
- f"[bold]Archive:[/bold] .flash/{archive_name}\n"
- f"[bold]Skip transitive deps:[/bold] {no_deps}"
- )
-
- if excluded_packages:
- config_text += (
- f"\n[bold]Excluded packages:[/bold] {', '.join(excluded_packages)}"
- )
-
- console.print(
- Panel(
- config_text,
- title="Flash Build Configuration",
- expand=False,
- )
- )
-
-
def _display_build_summary(
- archive_path: Path, app_name: str, file_count: int, dep_count: int
+ archive_path: Path,
+ app_name: str,
+ file_count: int,
+ dep_count: int,
+ size_mb: float,
+ verbose: bool = False,
):
"""Display build summary."""
- size_mb = archive_path.stat().st_size / (1024 * 1024)
-
- summary = Table(show_header=False, box=None)
- summary.add_column("Item", style="bold")
- summary.add_column("Value", style="cyan")
-
- summary.add_row("Application", app_name)
- summary.add_row("Files packaged", str(file_count))
- summary.add_row("Dependencies", str(dep_count))
- summary.add_row("Archive", str(archive_path.relative_to(Path.cwd())))
- summary.add_row("Size", f"{size_mb:.1f} MB")
-
- console.print("\n")
- console.print(summary)
-
- archive_rel = archive_path.relative_to(Path.cwd())
-
console.print(
- Panel(
- f"[bold]{app_name}[/bold] built successfully!\n\n"
- f"[bold]Archive:[/bold] {archive_rel}",
- title="Build Complete",
- expand=False,
- border_style="green",
- )
+ f"[green]Built[/green] [bold]{app_name}[/bold] "
+ f"[dim]{file_count} files, {dep_count} deps, {size_mb:.1f} MB[/dim]"
)
+ if verbose:
+ console.print(f" [dim]Archive:[/dim] {archive_path}")
+ build_dir = archive_path.parent / ".build"
+ if build_dir.exists():
+ console.print(f" [dim]Build:[/dim] {build_dir}")
diff --git a/src/runpod_flash/cli/commands/build_utils/manifest.py b/src/runpod_flash/cli/commands/build_utils/manifest.py
index c4dbc407..b67ce9bd 100644
--- a/src/runpod_flash/cli/commands/build_utils/manifest.py
+++ b/src/runpod_flash/cli/commands/build_utils/manifest.py
@@ -403,7 +403,7 @@ def build(self) -> Dict[str, Any]:
if explicit_mothership:
# Use explicit configuration
- logger.info("Found explicit mothership configuration in mothership.py")
+ logger.debug("Found explicit mothership configuration in mothership.py")
# Check for name conflict
mothership_name = explicit_mothership.get("name", "mothership")
diff --git a/src/runpod_flash/cli/commands/build_utils/scanner.py b/src/runpod_flash/cli/commands/build_utils/scanner.py
index 5ff5c4d2..2215ab9e 100644
--- a/src/runpod_flash/cli/commands/build_utils/scanner.py
+++ b/src/runpod_flash/cli/commands/build_utils/scanner.py
@@ -74,11 +74,11 @@ def discover_remote_functions(self) -> List[RemoteFunctionMetadata]:
tree = ast.parse(content)
self._extract_resource_configs(tree, py_file)
except UnicodeDecodeError:
- logger.debug(f"Skipping non-UTF-8 file: {py_file}")
+ pass
except SyntaxError as e:
logger.warning(f"Syntax error in {py_file}: {e}")
- except Exception as e:
- logger.debug(f"Failed to parse {py_file}: {e}")
+ except Exception:
+ pass
# Second pass: extract @remote decorated functions
for py_file in self.py_files:
@@ -87,11 +87,11 @@ def discover_remote_functions(self) -> List[RemoteFunctionMetadata]:
tree = ast.parse(content)
functions.extend(self._extract_remote_functions(tree, py_file))
except UnicodeDecodeError:
- logger.debug(f"Skipping non-UTF-8 file: {py_file}")
+ pass
except SyntaxError as e:
logger.warning(f"Syntax error in {py_file}: {e}")
- except Exception as e:
- logger.debug(f"Failed to parse {py_file}: {e}")
+ except Exception:
+ pass
# Third pass: analyze function call graphs
remote_function_names = {f.function_name for f in functions}
@@ -115,11 +115,11 @@ def discover_remote_functions(self) -> List[RemoteFunctionMetadata]:
node, func_meta, remote_function_names
)
except UnicodeDecodeError:
- logger.debug(f"Skipping non-UTF-8 file: {py_file}")
+ pass
except SyntaxError as e:
logger.warning(f"Syntax error in {py_file}: {e}")
- except Exception as e:
- logger.debug(f"Failed to parse {py_file}: {e}")
+ except Exception:
+ pass
return functions
diff --git a/src/runpod_flash/cli/commands/deploy.py b/src/runpod_flash/cli/commands/deploy.py
index d128d768..1a34a3aa 100644
--- a/src/runpod_flash/cli/commands/deploy.py
+++ b/src/runpod_flash/cli/commands/deploy.py
@@ -4,14 +4,13 @@
import json
import logging
import shutil
-import textwrap
import typer
from pathlib import Path
from rich.console import Console
from ..utils.app import discover_flash_project
-from ..utils.deployment import deploy_to_environment
+from ..utils.deployment import deploy_from_uploaded_build, validate_local_manifest
from .build import run_build
from runpod_flash.core.resources.app import FlashApp
@@ -95,11 +94,11 @@ def deploy_command(
raise typer.Exit(1)
-def _display_post_deployment_guidance(env_name: str) -> None:
+def _display_post_deployment_guidance(
+ env_name: str, mothership_url: str | None = None
+) -> None:
"""Display helpful next steps after successful deployment."""
- # Try to read manifest for endpoint information
manifest_path = Path.cwd() / ".flash" / "flash_manifest.json"
- mothership_url = None
mothership_routes = {}
try:
@@ -109,90 +108,43 @@ def _display_post_deployment_guidance(env_name: str) -> None:
resources = manifest.get("resources", {})
routes = manifest.get("routes", {})
- # Find mothership URL and routes
- for resource_name, url in resources_endpoints.items():
+ for resource_name in resources_endpoints:
if resources.get(resource_name, {}).get("is_mothership", False):
- mothership_url = url
mothership_routes = routes.get(resource_name, {})
break
except (FileNotFoundError, json.JSONDecodeError) as e:
logger.debug(f"Could not read manifest: {e}")
- console.print("\n[bold]Next Steps:[/bold]\n")
-
- # 1. Authentication
- console.print("[bold cyan]1. Authentication Required[/bold cyan]")
- console.print(
- " All endpoints require authentication. Set your API key as an environment "
- "variable. Avoid typing secrets directly into shell commands, as they may be "
- "stored in your shell history."
- )
- console.print(
- " [dim]# Recommended: store RUNPOD_API_KEY in a .env file or your shell profile[/dim]"
- )
- console.print(
- " [dim]# Or securely prompt for it without echo (Bash example):[/dim]"
- )
- console.print(" [dim]read -s RUNPOD_API_KEY && export RUNPOD_API_KEY[/dim]\n")
-
- # 2. Calling functions
- console.print("[bold cyan]2. Call Your Functions[/bold cyan]")
-
- if mothership_url:
- console.print(
- f" Your mothership is deployed at:\n [link]{mothership_url}[/link]\n"
- )
-
- console.print(" [bold]Using HTTP/curl:[/bold]")
- if mothership_url:
- curl_example = textwrap.dedent(f"""
- curl -X POST {mothership_url}/YOUR_PATH \\
- -H "Authorization: Bearer $RUNPOD_API_KEY" \\
- -H "Content-Type: application/json" \\
- -d '{{"param1": "value1"}}'
- """).strip()
- else:
- curl_example = textwrap.dedent("""
- curl -X POST https://YOUR_ENDPOINT_URL/YOUR_PATH \\
- -H "Authorization: Bearer $RUNPOD_API_KEY" \\
- -H "Content-Type: application/json" \\
- -d '{"param1": "value1"}'
- """).strip()
- console.print(f" [dim]{curl_example}[/dim]\n")
-
- # 3. Available routes
- console.print("[bold cyan]3. Available Routes[/bold cyan]")
if mothership_routes:
+ console.print("\n[bold]Routes:[/bold]")
for route_key in sorted(mothership_routes.keys()):
- # route_key format: "POST /api/hello"
method, path = route_key.split(" ", 1)
- console.print(f" [cyan]{method:6s}[/cyan] {path}")
- console.print()
- else:
- # Routes not found - could mean manifest missing, no LB endpoints, or no routes defined
- if mothership_url:
- console.print(
- " [dim]No routes found in manifest. Check @remote decorators in your code.[/dim]\n"
- )
- else:
- console.print(
- " Check your code for @remote decorators to find available endpoints:"
- )
- console.print(
- ' [dim]@remote(mothership, method="POST", path="/api/process")[/dim]\n'
+ console.print(f" {method:6s} {path}")
+
+ # curl example using the first POST route
+ if mothership_url and mothership_routes:
+ post_routes = [
+ k.split(" ", 1)[1]
+ for k in sorted(mothership_routes.keys())
+ if k.startswith("POST ")
+ ]
+ if post_routes:
+ example_route = post_routes[0]
+ curl_cmd = (
+ f"curl -X POST {mothership_url}{example_route} \\\n"
+            f'  -H "Content-Type: application/json" \\\n'.replace("f'", "'") if False else '  -H "Content-Type: application/json" \\\n'
+ ' -H "Authorization: Bearer $RUNPOD_API_KEY" \\\n'
+ " -d '{\"input\": {}}'"
)
+ console.print("\n[bold]Try it:[/bold]")
+ console.print(f" [dim]{curl_cmd}[/dim]")
- # 4. Monitor & Debug
- console.print("[bold cyan]4. Monitor & Debug[/bold cyan]")
- console.print(f" [dim]flash env get {env_name}[/dim] - View environment status")
+ console.print("\n[bold]Useful commands:[/bold]")
console.print(
- " [dim]Runpod Console[/dim] - View logs and metrics at https://console.runpod.io/serverless\n"
+ f" [dim]flash env get {env_name}[/dim] View environment status"
)
-
- # 5. Update & Teardown
- console.print("[bold cyan]5. Update or Remove Deployment[/bold cyan]")
- console.print(f" [dim]flash deploy --env {env_name}[/dim] - Update deployment")
- console.print(f" [dim]flash env delete {env_name}[/dim] - Remove deployment\n")
+ console.print(f" [dim]flash deploy --env {env_name}[/dim] Update deployment")
+ console.print(f" [dim]flash env delete {env_name}[/dim] Remove deployment")
def _launch_preview(project_dir):
@@ -216,17 +168,43 @@ def _launch_preview(project_dir):
async def _resolve_and_deploy(
app_name: str, env_name: str | None, archive_path
) -> None:
- resolved_env_name = await _resolve_environment(app_name, env_name)
+ app, resolved_env_name = await _resolve_environment(app_name, env_name)
- console.print(f"\nDeploying to '[bold]{resolved_env_name}[/bold]'...")
+ local_manifest = validate_local_manifest()
- await deploy_to_environment(app_name, resolved_env_name, archive_path)
+ with console.status("Uploading build..."):
+ build = await app.upload_build(archive_path)
- # Display next steps guidance
- _display_post_deployment_guidance(resolved_env_name)
+ with console.status("Deploying resources..."):
+ result = await deploy_from_uploaded_build(
+ app, build["id"], resolved_env_name, local_manifest
+ )
+ console.print(f"[green]Deployed[/green] to [bold]{resolved_env_name}[/bold]")
+ resources_endpoints = result.get("resources_endpoints", {})
+    deployed_manifest = result.get("local_manifest", {})
+    resources = deployed_manifest.get("resources", {})
-async def _resolve_environment(app_name: str, env_name: str | None) -> str:
+    # Print the mothership URL first, then workers
+ mothership_url = None
+ if resources_endpoints:
+ console.print()
+ other_items = []
+ for resource_name, url in resources_endpoints.items():
+ if resources.get(resource_name, {}).get("is_mothership", False):
+ mothership_url = url
+ console.print(f" [bold]{url}[/bold] [dim]({resource_name})[/dim]")
+ else:
+ other_items.append((resource_name, url))
+ for resource_name, url in other_items:
+ console.print(f" [dim]{url} ({resource_name})[/dim]")
+
+ _display_post_deployment_guidance(resolved_env_name, mothership_url=mothership_url)
+
+
+async def _resolve_environment(
+ app_name: str, env_name: str | None
+) -> tuple[FlashApp, str]:
try:
app = await FlashApp.from_name(app_name)
except Exception as exc:
@@ -236,8 +214,8 @@ async def _resolve_environment(app_name: str, env_name: str | None) -> str:
console.print(
f"[dim]No app '{app_name}' found. Creating app and '{target}' environment...[/dim]"
)
- await FlashApp.create_environment_and_app(app_name, target)
- return target
+ app, _ = await FlashApp.create_environment_and_app(app_name, target)
+ return app, target
if env_name:
envs = await app.list_environments()
@@ -247,21 +225,19 @@ async def _resolve_environment(app_name: str, env_name: str | None) -> str:
f"[dim]Environment '{env_name}' not found. Creating it...[/dim]"
)
await app.create_environment(env_name)
- return env_name
+ return app, env_name
envs = await app.list_environments()
if len(envs) == 1:
- resolved = envs[0].get("name")
- console.print(f"[dim]Auto-selected environment: {resolved}[/dim]")
- return resolved
+ return app, envs[0].get("name")
if len(envs) == 0:
console.print(
"[dim]No environments found. Creating 'production' environment...[/dim]"
)
await app.create_environment("production")
- return "production"
+ return app, "production"
env_names = [e.get("name", "?") for e in envs]
console.print(
diff --git a/src/runpod_flash/cli/commands/env.py b/src/runpod_flash/cli/commands/env.py
index 5e115c66..b064f1db 100644
--- a/src/runpod_flash/cli/commands/env.py
+++ b/src/runpod_flash/cli/commands/env.py
@@ -5,10 +5,9 @@
import questionary
import typer
from rich.console import Console
-from rich.panel import Panel
-from rich.table import Table
from ..utils.app import discover_flash_project
+from ..utils.formatting import STATE_STYLE, format_datetime, state_dot
from runpod_flash.core.resources.app import FlashApp
@@ -96,24 +95,31 @@ async def _list_environments(app_name: str):
envs = await app.list_environments()
if not envs:
- console.print(f"No environments found for '{app_name}'.")
+ console.print(f"\nNo environments for [bold]{app_name}[/bold].")
+ console.print(" Run [bold]flash deploy[/bold] to create one.\n")
return
- table = Table(show_header=True, header_style="bold")
- table.add_column("Name", style="bold")
- table.add_column("ID", overflow="fold")
- table.add_column("Active Build", overflow="fold")
- table.add_column("Created At", overflow="fold")
-
+ console.print(
+ f"\n [bold]{app_name}[/bold] {len(envs)} environment{'s' if len(envs) != 1 else ''}\n"
+ )
for env in envs:
- table.add_row(
- env.get("name"),
- env.get("id"),
- env.get("activeBuildId", "-"),
- env.get("createdAt"),
+ name = env.get("name", "(unnamed)")
+ state = env.get("state", "UNKNOWN")
+ color = STATE_STYLE.get(state, "yellow")
+ build = env.get("activeBuildId")
+ created = format_datetime(env.get("createdAt"))
+
+ console.print(
+ f" {state_dot(state)} [bold]{name}[/bold] "
+ f"[{color}]{state.lower()}[/{color}]"
)
+ parts = []
+ if build:
+ parts.append(f"build {build}")
+ parts.append(f"created {created}")
+ console.print(f" [dim]{' · '.join(parts)}[/dim]")
- console.print(table)
+ console.print()
def create_command(
@@ -134,27 +140,10 @@ def create_command(
async def _create_environment(app_name: str, env_name: str):
app, env = await FlashApp.create_environment_and_app(app_name, env_name)
- panel_content = (
- f"Environment '[bold]{env_name}[/bold]' created successfully\n\n"
- f"App: {app_name}\n"
- f"Environment ID: {env.get('id')}\n"
- f"Status: {env.get('state', 'PENDING')}"
+ console.print(
+ f"[green]✓[/green] Created environment [bold]{env_name}[/bold] "
+ f"[dim]{env.get('id')}[/dim]"
)
- console.print(Panel(panel_content, title="Environment Created", expand=False))
-
- table = Table(show_header=True, header_style="bold")
- table.add_column("Name", style="bold")
- table.add_column("ID", overflow="fold")
- table.add_column("Status", overflow="fold")
- table.add_column("Created At", overflow="fold")
-
- table.add_row(
- env.get("name"),
- env.get("id"),
- env.get("state", "PENDING"),
- env.get("createdAt", "Just now"),
- )
- console.print(table)
def get_command(
@@ -171,41 +160,45 @@ async def _get_environment(app_name: str, env_name: str):
app = await FlashApp.from_name(app_name)
env = await app.get_environment_by_name(env_name)
- main_info = f"Environment: {env.get('name')}\n"
- main_info += f"ID: {env.get('id')}\n"
- main_info += f"State: {env.get('state', 'UNKNOWN')}\n"
- main_info += f"Active Build: {env.get('activeBuildId', 'None')}\n"
-
- if env.get("createdAt"):
- main_info += f"Created: {env.get('createdAt')}\n"
+ state = env.get("state", "UNKNOWN")
+ color = STATE_STYLE.get(state, "yellow")
- console.print(Panel(main_info, title=f"Environment: {env_name}", expand=False))
+ console.print(
+ f"\n {state_dot(state)} [bold]{env.get('name')}[/bold] "
+ f"[{color}]{state.lower()}[/{color}]"
+ )
+ console.print(f" [dim]id[/dim] {env.get('id')}")
+ console.print(f" [dim]app[/dim] {app_name}")
+ console.print(f" [dim]build[/dim] {env.get('activeBuildId') or 'none'}")
endpoints = env.get("endpoints") or []
+ network_volumes = env.get("networkVolumes") or []
+
if endpoints:
- endpoint_table = Table(title="Associated Endpoints")
- endpoint_table.add_column("Name", style="cyan")
- endpoint_table.add_column("ID", overflow="fold")
-
- for endpoint in endpoints:
- endpoint_table.add_row(
- endpoint.get("name", "-"),
- endpoint.get("id", "-"),
+ console.print("\n [bold]Endpoints[/bold]")
+ for ep in endpoints:
+ console.print(
+ f" ▸ [bold]{ep.get('name', '-')}[/bold] [dim]{ep.get('id', '')}[/dim]"
)
- console.print(endpoint_table)
- network_volumes = env.get("networkVolumes") or []
if network_volumes:
- nv_table = Table(title="Associated Network Volumes")
- nv_table.add_column("Name", style="cyan")
- nv_table.add_column("ID", overflow="fold")
-
+ console.print("\n [bold]Network Volumes[/bold]")
for nv in network_volumes:
- nv_table.add_row(
- nv.get("name", "-"),
- nv.get("id", "-"),
+ console.print(
+ f" ▸ [bold]{nv.get('name', '-')}[/bold] [dim]{nv.get('id', '')}[/dim]"
)
- console.print(nv_table)
+
+ if not endpoints and not network_volumes:
+ console.print("\n No resources deployed yet.")
+ console.print(f" Run [bold]flash deploy --env {env_name}[/bold] to deploy.")
+ else:
+ console.print("\n [bold]Commands[/bold]")
+ console.print(
+ f" [dim]flash deploy --env {env_name}[/dim] Update deployment"
+ )
+ console.print(f" [dim]flash env delete {env_name}[/dim] Tear down")
+
+ console.print()
def delete_command(
@@ -221,16 +214,10 @@ def delete_command(
try:
env = asyncio.run(_fetch_environment_info(app_name, env_name))
except Exception as e:
- console.print(f"[red]Error:[/red] Failed to fetch environment info: {e}")
+ console.print(f"[red]✗[/red] Failed to fetch environment info: {e}")
raise typer.Exit(1)
- panel_content = (
- f"Environment '[bold]{env_name}[/bold]' will be deleted\n\n"
- f"Environment ID: {env.get('id')}\n"
- f"App: {app_name}\n"
- f"Active Build: {env.get('activeBuildId', 'None')}"
- )
- console.print(Panel(panel_content, title="Delete Confirmation", expand=False))
+ console.print(f"\nDeleting [bold]{env_name}[/bold] [dim]{env.get('id')}[/dim]")
try:
confirmed = questionary.confirm(
@@ -239,10 +226,10 @@ def delete_command(
).ask()
if not confirmed:
- console.print("Deletion cancelled")
+ console.print("[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
except KeyboardInterrupt:
- console.print("\nDeletion cancelled")
+ console.print("\n[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
asyncio.run(_delete_environment(app_name, env_name))
@@ -263,7 +250,7 @@ async def _delete_environment(app_name: str, env_name: str):
success = await app.delete_environment(env_name)
if success:
- console.print(f"Environment '{env_name}' deleted successfully")
+ console.print(f"[green]✓[/green] Deleted environment [bold]{env_name}[/bold]")
else:
- console.print(f"[red]Failed to delete environment '{env_name}'[/red]")
+ console.print(f"[red]✗[/red] Failed to delete environment '{env_name}'")
raise typer.Exit(1)
diff --git a/src/runpod_flash/cli/commands/init.py b/src/runpod_flash/cli/commands/init.py
index f684e537..15a96d3d 100644
--- a/src/runpod_flash/cli/commands/init.py
+++ b/src/runpod_flash/cli/commands/init.py
@@ -5,8 +5,6 @@
import typer
from rich.console import Console
-from rich.panel import Panel
-from rich.table import Table
from ..utils.skeleton import create_project_skeleton, detect_file_conflicts
@@ -21,103 +19,70 @@ def init_command(
):
"""Create new Flash project with Flash Server and GPU workers."""
- # Determine target directory and initialization mode
if project_name is None or project_name == ".":
- # Initialize in current directory
project_dir = Path.cwd()
is_current_dir = True
- # Use current directory name as project name
actual_project_name = project_dir.name
else:
- # Create new directory
project_dir = Path(project_name)
is_current_dir = False
actual_project_name = project_name
- # Create project directory if needed
if not is_current_dir:
project_dir.mkdir(parents=True, exist_ok=True)
- # Check for file conflicts in target directory
conflicts = detect_file_conflicts(project_dir)
- should_overwrite = force # Start with force flag value
+ should_overwrite = force
if conflicts and not force:
- # Show warning and prompt user
console.print(
- Panel(
- "[yellow]Warning: The following files will be overwritten:[/yellow]\n\n"
- + "\n".join(f" • {conflict}" for conflict in conflicts),
- title="File Conflicts Detected",
- expand=False,
- )
+ "[yellow]Warning:[/yellow] The following files will be overwritten:\n"
)
+ for conflict in conflicts:
+ console.print(f" {conflict}")
+ console.print()
- # Prompt user for confirmation
proceed = typer.confirm("Continue and overwrite these files?", default=False)
if not proceed:
- console.print("[yellow]Initialization aborted.[/yellow]")
+ console.print("[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
- # User confirmed, so we should overwrite
should_overwrite = True
- # Create project skeleton
status_msg = (
- "Initializing Flash project in current directory..."
+ "Initializing Flash project..."
if is_current_dir
else f"Creating Flash project '{project_name}'..."
)
with console.status(status_msg):
create_project_skeleton(project_dir, should_overwrite)
- # Success output
- if is_current_dir:
- panel_content = f"Flash project '[bold]{actual_project_name}[/bold]' initialized in current directory!\n\n"
- panel_content += "Project structure:\n"
- panel_content += " ./\n"
- else:
- panel_content = f"Flash project '[bold]{actual_project_name}[/bold]' created successfully!\n\n"
- panel_content += "Project structure:\n"
- panel_content += f" {actual_project_name}/\n"
-
- panel_content += " ├── main.py # Flash Server (FastAPI)\n"
- panel_content += " ├── mothership.py # Mothership endpoint config\n"
- panel_content += " ├── pyproject.toml # Python project config\n"
- panel_content += " ├── workers/\n"
- panel_content += " │ ├── gpu/ # GPU worker\n"
- panel_content += " │ └── cpu/ # CPU worker\n"
- panel_content += " ├── .env.example\n"
- panel_content += " ├── requirements.txt\n"
- panel_content += " └── README.md\n"
-
- title = "Project Initialized" if is_current_dir else "Project Created"
- console.print(Panel(panel_content, title=title, expand=False))
-
- # Next steps
- console.print("\n[bold]Next steps:[/bold]")
- steps_table = Table(show_header=False, box=None, padding=(0, 1))
- steps_table.add_column("Step", style="bold cyan")
- steps_table.add_column("Description")
+ console.print(f"[green]Created[/green] [bold]{actual_project_name}[/bold]\n")
+
+ prefix = "./" if is_current_dir else f"{actual_project_name}/"
+ console.print(f" {prefix}")
+ console.print(" ├── main.py FastAPI server")
+ console.print(" ├── mothership.py Mothership config")
+ console.print(" ├── pyproject.toml")
+ console.print(" ├── workers/")
+ console.print(" │ ├── gpu/")
+ console.print(" │ └── cpu/")
+ console.print(" ├── .env.example")
+ console.print(" ├── requirements.txt")
+ console.print(" └── README.md")
+ console.print("\n[bold]Next steps:[/bold]")
step_num = 1
if not is_current_dir:
- steps_table.add_row(f"{step_num}.", f"cd {actual_project_name}")
+ console.print(f" {step_num}. cd {actual_project_name}")
step_num += 1
-
- steps_table.add_row(f"{step_num}.", "Review and customize mothership.py (optional)")
+ console.print(f" {step_num}. pip install -r requirements.txt")
step_num += 1
- steps_table.add_row(f"{step_num}.", "pip install -r requirements.txt")
+    console.print(f"  {step_num}. cp .env.example .env, then set RUNPOD_API_KEY")
step_num += 1
- steps_table.add_row(f"{step_num}.", "cp .env.example .env")
- step_num += 1
- steps_table.add_row(f"{step_num}.", "Add your RUNPOD_API_KEY to .env")
- step_num += 1
- steps_table.add_row(f"{step_num}.", "flash run")
+ console.print(f" {step_num}. flash run")
- console.print(steps_table)
-
- console.print("\n[bold]Get your API key:[/bold]")
- console.print(" https://docs.runpod.io/get-started/api-keys")
- console.print("\nVisit http://localhost:8888/docs after running")
- console.print("\nCheck out the README.md for more")
+ console.print(
+ "\n [dim]API keys: https://docs.runpod.io/get-started/api-keys[/dim]"
+ )
+ console.print(" [dim]Docs: http://localhost:8888/docs (after running)[/dim]")
diff --git a/src/runpod_flash/cli/commands/preview.py b/src/runpod_flash/cli/commands/preview.py
index cf4866c1..4770d61b 100644
--- a/src/runpod_flash/cli/commands/preview.py
+++ b/src/runpod_flash/cli/commands/preview.py
@@ -10,7 +10,6 @@
import typer
from rich.console import Console
-from rich.table import Table
from runpod_flash.core.resources.constants import FLASH_CPU_LB_IMAGE
@@ -379,42 +378,24 @@ def _display_preview_info(containers: list[ContainerInfo]) -> None:
Args:
containers: List of ContainerInfo objects
"""
- table = Table(title="Preview Environment Running", show_header=True)
- table.add_column("Resource", style="cyan")
- table.add_column("Port", style="magenta")
- table.add_column("URL", style="green")
- table.add_column("Type", style="blue")
-
- # Sort: mothership first, then others
sorted_containers = sorted(containers, key=lambda c: (not c.is_mothership, c.name))
+    console.print(
+        f"\n[bold]Preview[/bold] ({len(containers)} container{'s' if len(containers) != 1 else ''})\n"
+    )
for container in sorted_containers:
- container_type = "Mothership" if container.is_mothership else "Worker"
- table.add_row(
- container.name, str(container.port), container.url, container_type
+ container_type = "mothership" if container.is_mothership else "worker"
+ console.print(
+ f" [bold]{container.name}[/bold] {container.url} {container_type}"
)
- console.print()
- console.print(table)
- console.print()
-
- # Display usage instructions
- console.print("[bold]Access your application:[/bold]")
mothership = next((c for c in containers if c.is_mothership), None)
if mothership:
- console.print(f" [dim]Main: {mothership.url}[/dim]")
- console.print(f" [dim]Health: curl {mothership.url}/ping[/dim]")
+ console.print("\n[bold]Try it:[/bold]")
+ console.print(f" curl {mothership.url}/ping")
- console.print()
- console.print("[bold]Container communication:[/bold]")
- console.print(
- " [dim]Containers communicate via Docker DNS on internal port 80[/dim]"
- )
- console.print(" [dim]Example: http://flash-preview-gpu_config:80[/dim]")
+ console.print("\n[bold]Networking:[/bold]")
+ console.print(" Containers communicate via Docker DNS on internal port 80")
- console.print()
- console.print("[bold][yellow]Press Ctrl+C to stop and cleanup[/yellow][/bold]")
- console.print()
+ console.print("\n[yellow]Press Ctrl+C to stop[/yellow]\n")
def _wait_for_shutdown() -> None:
diff --git a/src/runpod_flash/cli/commands/resource.py b/src/runpod_flash/cli/commands/resource.py
index 6bc28739..095a80c4 100644
--- a/src/runpod_flash/cli/commands/resource.py
+++ b/src/runpod_flash/cli/commands/resource.py
@@ -3,8 +3,6 @@
import time
import typer
from rich.console import Console
-from rich.table import Table
-from rich.panel import Panel
from rich.live import Live
from ...core.resources.resource_manager import ResourceManager
@@ -25,84 +23,71 @@ def report_command(
if live:
try:
with Live(
- generate_resource_table(resource_manager),
+ _render_resource_report(resource_manager),
console=console,
refresh_per_second=1 / refresh,
screen=True,
) as live_display:
while True:
time.sleep(refresh)
- live_display.update(generate_resource_table(resource_manager))
+ live_display.update(_render_resource_report(resource_manager))
except KeyboardInterrupt:
- console.print("\n📊 Live monitoring stopped")
+ console.print("\nStopped")
else:
- table = generate_resource_table(resource_manager)
- console.print(table)
+ output = _render_resource_report(resource_manager)
+ console.print(output)
-def generate_resource_table(resource_manager: ResourceManager) -> Panel:
- """Generate a formatted table of resources."""
+def _render_resource_report(resource_manager: ResourceManager):
+ """Build a rich renderable for the current resource state."""
+ from rich.text import Text
resources = resource_manager._resources
if not resources:
- return Panel(
- "📊 No resources currently tracked\n\n"
- "Resources will appear here after running your Flash applications.",
- title="Resource Status Report",
- expand=False,
- )
-
- table = Table(title="Resource Status Report")
- table.add_column("Resource ID", style="cyan", no_wrap=True)
- table.add_column("Status", justify="center")
- table.add_column("Type", style="magenta")
- table.add_column("URL", style="blue")
- table.add_column("Health", justify="center")
+ return Text("No resources tracked.")
+
+ lines = Text()
+ lines.append("\nResources\n\n", style="bold")
active_count = 0
- error_count = 0
+ inactive_count = 0
for uid, resource in resources.items():
- # Determine status
try:
is_deployed = resource.is_deployed()
if is_deployed:
- status = "🟢 Active"
+ color, status_text = "green", "active"
active_count += 1
else:
- status = "🔴 Inactive"
- error_count += 1
+ color, status_text = "red", "inactive"
+ inactive_count += 1
except Exception:
- status = "🟡 Unknown"
+ color, status_text = "yellow", "unknown"
- # Get resource info
resource_type = resource.__class__.__name__
-
try:
- url = resource.url if hasattr(resource, "url") else "N/A"
+ url = resource.url if hasattr(resource, "url") else ""
except Exception:
- url = "N/A"
+ url = ""
- # Health check (simplified for now)
- health = "✓" if status == "🟢 Active" else "✗"
+ display_uid = uid[:20] + "..." if len(uid) > 20 else uid
- table.add_row(
- uid[:20] + "..." if len(uid) > 20 else uid,
- status,
- resource_type,
- url,
- health,
- )
+ lines.append(f" {display_uid}", style="bold")
+ lines.append(f" {status_text}", style=color)
+ lines.append(f" {resource_type}")
+ if url:
+ lines.append(f" {url}")
+ lines.append("\n")
- # Summary
total = len(resources)
- idle_count = total - active_count - error_count
- summary = f"Total: {total} resources ({active_count} active"
- if idle_count > 0:
- summary += f", {idle_count} idle"
- if error_count > 0:
- summary += f", {error_count} error"
- summary += ")"
-
- return Panel(table, subtitle=summary, expand=False)
+ unknown_count = total - active_count - inactive_count
+ parts = [f"{active_count} active"]
+ if inactive_count > 0:
+ parts.append(f"{inactive_count} inactive")
+ if unknown_count > 0:
+ parts.append(f"{unknown_count} unknown")
+
+ lines.append(f"\n{total} resources ({', '.join(parts)})\n")
+
+ return lines
diff --git a/src/runpod_flash/cli/commands/undeploy.py b/src/runpod_flash/cli/commands/undeploy.py
index 73974f94..86c7a4bd 100644
--- a/src/runpod_flash/cli/commands/undeploy.py
+++ b/src/runpod_flash/cli/commands/undeploy.py
@@ -6,8 +6,6 @@
from typing import TYPE_CHECKING, Dict, Optional, Tuple
import typer
from rich.console import Console
-from rich.table import Table
-from rich.panel import Panel
from rich.prompt import Confirm
import questionary
@@ -56,20 +54,20 @@ def _get_serverless_resources(
def _get_resource_status(resource) -> Tuple[str, str]:
- """Get resource status with icon and text.
+ """Get resource status color and text.
Args:
resource: DeployableResource to check
Returns:
- Tuple of (status_icon, status_text)
+ Tuple of (color, status_text)
"""
try:
if resource.is_deployed():
- return "🟢", "Active"
- return "🔴", "Inactive"
+ return "green", "active"
+ return "red", "inactive"
except Exception:
- return "❓", "Unknown"
+ return "yellow", "unknown"
def _get_resource_type(resource) -> str:
@@ -94,81 +92,48 @@ def list_command():
resources = _get_serverless_resources(all_resources)
if not resources:
- console.print(
- Panel(
- "No endpoints found\n\n"
- "Endpoints are automatically tracked when you use @remote decorator.",
- title="Tracked Endpoints",
- expand=False,
- )
- )
+ console.print("No endpoints found.")
return
- table = Table(title="Tracked RunPod Serverless Endpoints")
- table.add_column("Name", style="cyan", no_wrap=True)
- table.add_column("Endpoint ID", style="magenta")
- table.add_column("Status", justify="center")
- table.add_column("Type", style="yellow")
- table.add_column("Resource ID", style="dim", no_wrap=True)
-
active_count = 0
inactive_count = 0
+ console.print()
for resource_id, resource in resources.items():
- status_icon, status_text = _get_resource_status(resource)
- if status_text == "Active":
+ color, status_text = _get_resource_status(resource)
+ if status_text == "active":
active_count += 1
- elif status_text == "Inactive":
+ elif status_text == "inactive":
inactive_count += 1
- # Get name if available
name = getattr(resource, "name", "N/A")
endpoint_id = getattr(resource, "id", "N/A")
- resource_type = _get_resource_type(resource)
-
- # Truncate resource_id for display
- display_resource_id = (
- resource_id[:12] + "..." if len(resource_id) > 12 else resource_id
- )
- table.add_row(
- name,
- endpoint_id,
- f"{status_icon} {status_text}",
- resource_type,
- display_resource_id,
+ console.print(
+ f" [{color}]●[/{color}] [bold]{name}[/bold] "
+ f"[{color}]{status_text}[/{color}] [dim]{endpoint_id}[/dim]"
)
- console.print(table)
-
- # Summary
total = len(resources)
unknown_count = total - active_count - inactive_count
- summary = f"Total: {total} endpoint{'s' if total != 1 else ''}"
+ parts = []
if active_count > 0:
- summary += f" ({active_count} active"
+ parts.append(f"[green]{active_count} active[/green]")
if inactive_count > 0:
- summary += (
- f", {inactive_count} inactive"
- if active_count > 0
- else f" ({inactive_count} inactive"
- )
+ parts.append(f"[red]{inactive_count} inactive[/red]")
if unknown_count > 0:
- summary += (
- f", {unknown_count} unknown"
- if (active_count > 0 or inactive_count > 0)
- else f" ({unknown_count} unknown"
- )
- if active_count > 0 or inactive_count > 0 or unknown_count > 0:
- summary += ")"
+ parts.append(f"[yellow]{unknown_count} unknown[/yellow]")
- console.print(f"\n{summary}\n")
- console.print("💡 Use [bold]flash undeploy [/bold] to remove an endpoint")
- console.print("💡 Use [bold]flash undeploy --all[/bold] to remove all endpoints")
console.print(
- "💡 Use [bold]flash undeploy --interactive[/bold] for checkbox selection"
+ f"\n {total} endpoint{'s' if total != 1 else ''} {', '.join(parts)}"
)
+ console.print("\n [bold]Commands[/bold]")
+    console.print("    [dim]flash undeploy <name>[/dim]         Remove an endpoint")
+ console.print(" [dim]flash undeploy --all[/dim] Remove all endpoints")
+ console.print(" [dim]flash undeploy --interactive[/dim] Checkbox selection")
+ console.print()
+
def _cleanup_stale_endpoints(
resources: Dict[str, DeployableResource], manager: ResourceManager
@@ -179,61 +144,45 @@ def _cleanup_stale_endpoints(
resources: Dictionary of resource_id -> DeployableResource
manager: ResourceManager instance for removing resources
"""
- console.print(
- Panel(
- "Checking for inactive endpoints...\n\n"
- "This will remove endpoints from tracking that are no longer active\n"
- "(already deleted via RunPod UI or API).",
- title="Cleanup Stale Endpoints",
- expand=False,
- )
- )
+ console.print("[bold]Cleanup stale endpoints[/bold]\n")
- # Find inactive endpoints
inactive = []
with console.status("Checking endpoint status..."):
for resource_id, resource in resources.items():
- status_icon, status_text = _get_resource_status(resource)
- if status_text == "Inactive":
+ color, status_text = _get_resource_status(resource)
+ if status_text == "inactive":
inactive.append((resource_id, resource))
if not inactive:
- console.print("\n[green]✓[/green] No inactive endpoints found")
+ console.print("[green]No inactive endpoints found[/green]")
return
- # Show what will be removed
- console.print(f"\nFound [yellow]{len(inactive)}[/yellow] inactive endpoint(s):")
+ console.print(f"Found [yellow]{len(inactive)}[/yellow] inactive endpoint(s):")
for resource_id, resource in inactive:
- console.print(f" • {resource.name} ({getattr(resource, 'id', 'N/A')})")
+ console.print(f" {resource.name} {getattr(resource, 'id', 'N/A')}")
- # Confirm removal
if not Confirm.ask(
- "\n[yellow]⚠️ Remove these from tracking?[/yellow]",
+ "\n[yellow]Remove these from tracking?[/yellow]",
default=False,
):
console.print("[yellow]Cancelled[/yellow]")
return
- # Undeploy inactive endpoints (force remove from tracking even if already deleted remotely)
removed_count = 0
for resource_id, resource in inactive:
result = asyncio.run(
manager.undeploy_resource(resource_id, resource.name, force_remove=True)
)
- if result["success"]:
+ if result.get("success"):
removed_count += 1
- console.print(
- f"[green]✓[/green] Removed [cyan]{resource.name}[/cyan] from tracking"
- )
+ console.print(f" [green]Removed[/green] {resource.name}")
else:
- # Resource already deleted remotely, but force_remove cleaned up tracking
- removed_count += 1
console.print(
- f"[yellow]⚠[/yellow] {resource.name}: Already deleted remotely, removed from tracking"
+ f" [red]Failed[/red] {resource.name}: {result.get('message', 'unknown error')}"
)
- console.print(f"\n[green]✓[/green] Cleaned up {removed_count} inactive endpoint(s)")
+ console.print(f"\n[green]Cleaned up {removed_count} endpoint(s)[/green]")
def undeploy_command(
@@ -275,7 +224,6 @@ def undeploy_command(
# Remove stale endpoint tracking (already deleted externally)
flash undeploy --cleanup-stale
"""
- # Handle "list" as a special case
if name == "list":
list_command()
return
@@ -284,22 +232,13 @@ def undeploy_command(
resources = manager.list_all_resources()
if not resources:
- console.print(
- Panel(
- "No endpoints found to undeploy\n\n"
- "Use @remote decorator to deploy endpoints.",
- title="No Endpoints",
- expand=False,
- )
- )
+ console.print("No endpoints found to undeploy.")
return
- # Handle cleanup-stale mode
if cleanup_stale:
_cleanup_stale_endpoints(resources, manager)
return
- # Handle different modes
if interactive:
_interactive_undeploy(resources, skip_confirm=force)
elif all:
@@ -307,17 +246,9 @@ def undeploy_command(
elif name:
_undeploy_by_name(name, resources, skip_confirm=force)
else:
- console.print(
- Panel(
- "Usage: flash undeploy [name | list | --all | --interactive | --cleanup-stale]",
- title="Undeploy Help",
- expand=False,
- )
- )
console.print(
"[red]Error:[/red] Please specify a name, use --all/--interactive, or run `flash undeploy list`"
)
- # Exit 0: Treat usage help display as successful operation for better UX
raise typer.Exit(0)
@@ -329,7 +260,6 @@ def _undeploy_by_name(name: str, resources: dict, skip_confirm: bool = False):
resources: Dict of all resources
skip_confirm: Skip confirmation prompts
"""
- # Find matching resources
matches = []
for resource_id, resource in resources.items():
if hasattr(resource, "name") and resource.name == name:
@@ -337,32 +267,14 @@ def _undeploy_by_name(name: str, resources: dict, skip_confirm: bool = False):
if not matches:
console.print(f"[red]Error:[/red] No endpoint found with name '{name}'")
- console.print(
- "\n💡 Use [bold]flash undeploy list[/bold] to see available endpoints"
- )
+ console.print("\n [dim]flash undeploy list[/dim] Show available endpoints")
raise typer.Exit(1)
- # Show what will be deleted
- console.print(
- Panel(
- "[yellow]⚠️ The following endpoint(s) will be deleted:[/yellow]\n",
- title="Undeploy Confirmation",
- expand=False,
- )
- )
-
+ console.print()
for resource_id, resource in matches:
endpoint_id = getattr(resource, "id", "N/A")
- resource_type = _get_resource_type(resource)
- status_icon, status_text = _get_resource_status(resource)
-
- console.print(f" • [bold]{resource.name}[/bold]")
- console.print(f" Endpoint ID: {endpoint_id}")
- console.print(f" Type: {resource_type}")
- console.print(f" Status: {status_icon} {status_text}")
- console.print()
-
- console.print("[red]🚨 This action cannot be undone![/red]\n")
+ console.print(f" [bold]{resource.name}[/bold] {endpoint_id}")
+ console.print("\n [yellow]This action cannot be undone.[/yellow]\n")
if not skip_confirm:
try:
@@ -371,34 +283,25 @@ def _undeploy_by_name(name: str, resources: dict, skip_confirm: bool = False):
).ask()
if not confirmed:
- console.print("Undeploy cancelled")
+ console.print("[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
except KeyboardInterrupt:
- console.print("\nUndeploy cancelled")
+ console.print("\n[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
- # Delete endpoints
+ console.print()
manager = _get_resource_manager()
- with console.status("Deleting endpoint(s)..."):
- results = []
- for resource_id, resource in matches:
+ results = []
+ for resource_id, resource in matches:
+ with console.status(f"Deleting {resource.name}..."):
result = asyncio.run(manager.undeploy_resource(resource_id, resource.name))
- results.append(result)
+ if result["success"]:
+ console.print(f" [green]Deleted[/green] {resource.name}")
+ else:
+ console.print(f" [red]Failed[/red] {resource.name}")
+ results.append(result)
- # Show results
- success_count = sum(1 for r in results if r["success"])
- fail_count = len(results) - success_count
-
- if success_count > 0:
- console.print(
- f"\n[green]✓[/green] Successfully deleted {success_count} endpoint(s)"
- )
- if fail_count > 0:
- console.print(f"[red]✗[/red] Failed to delete {fail_count} endpoint(s)")
- console.print("\nErrors:")
- for result in results:
- if not result["success"]:
- console.print(f" • {result['message']}")
+ _print_undeploy_summary(results)
def _undeploy_all(resources: dict, skip_confirm: bool = False):
@@ -408,21 +311,15 @@ def _undeploy_all(resources: dict, skip_confirm: bool = False):
resources: Dict of all resources
skip_confirm: Skip confirmation prompts
"""
- # Show what will be deleted
- console.print(
- Panel(
- f"[yellow]⚠️ ALL {len(resources)} endpoint(s) will be deleted![/yellow]\n",
- title="Undeploy All Confirmation",
- expand=False,
- )
- )
-
+ console.print()
for resource_id, resource in resources.items():
name = getattr(resource, "name", "N/A")
endpoint_id = getattr(resource, "id", "N/A")
- console.print(f" • {name} ({endpoint_id})")
-
- console.print("\n[red]🚨 This action cannot be undone![/red]\n")
+ console.print(f" [bold]{name}[/bold] {endpoint_id}")
+ console.print(
+ f"\n [yellow]All {len(resources)} endpoint(s) will be deleted. "
+ f"This action cannot be undone.[/yellow]\n"
+ )
if not skip_confirm:
try:
@@ -431,43 +328,32 @@ def _undeploy_all(resources: dict, skip_confirm: bool = False):
).ask()
if not confirmed:
- console.print("Undeploy cancelled")
+ console.print("[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
- # Double confirmation for --all
typed_confirm = questionary.text("Type 'DELETE ALL' to confirm:").ask()
if typed_confirm != "DELETE ALL":
- console.print("Confirmation failed - text does not match")
+ console.print("[red]Confirmation failed[/red] - text does not match")
raise typer.Exit(1)
except KeyboardInterrupt:
- console.print("\nUndeploy cancelled")
+ console.print("\n[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
- # Delete all endpoints
+ console.print()
manager = _get_resource_manager()
- with console.status(f"Deleting {len(resources)} endpoint(s)..."):
- results = []
- for resource_id, resource in resources.items():
- name = getattr(resource, "name", "N/A")
+ results = []
+ for resource_id, resource in resources.items():
+ name = getattr(resource, "name", "N/A")
+ with console.status(f"Deleting {name}..."):
result = asyncio.run(manager.undeploy_resource(resource_id, name))
- results.append(result)
+ if result["success"]:
+ console.print(f" [green]Deleted[/green] {name}")
+ else:
+ console.print(f" [red]Failed[/red] {name}")
+ results.append(result)
- # Show results
- success_count = sum(1 for r in results if r["success"])
- fail_count = len(results) - success_count
-
- console.print("\n" + "=" * 50)
- if success_count > 0:
- console.print(
- f"[green]✓[/green] Successfully deleted {success_count} endpoint(s)"
- )
- if fail_count > 0:
- console.print(f"[red]✗[/red] Failed to delete {fail_count} endpoint(s)")
- console.print("\nErrors:")
- for result in results:
- if not result["success"]:
- console.print(f" • {result['message']}")
+ _print_undeploy_summary(results)
def _interactive_undeploy(resources: dict, skip_confirm: bool = False):
@@ -477,16 +363,15 @@ def _interactive_undeploy(resources: dict, skip_confirm: bool = False):
resources: Dict of all resources
skip_confirm: Skip confirmation prompts
"""
- # Create choices for questionary
choices = []
resource_map = {}
for resource_id, resource in resources.items():
name = getattr(resource, "name", "N/A")
endpoint_id = getattr(resource, "id", "N/A")
- status_icon, status_text = _get_resource_status(resource)
+ color, status_text = _get_resource_status(resource)
- choice_text = f"{name} ({endpoint_id}) - {status_icon} {status_text}"
+ choice_text = f"{name} ({endpoint_id}) - {status_text}"
choices.append(choice_text)
resource_map[choice_text] = (resource_id, resource)
@@ -500,24 +385,15 @@ def _interactive_undeploy(resources: dict, skip_confirm: bool = False):
console.print("No endpoints selected")
raise typer.Exit(0)
- # Show confirmation
- console.print(
- Panel(
- f"[yellow]⚠️ {len(selected)} endpoint(s) will be deleted:[/yellow]\n",
- title="Undeploy Confirmation",
- expand=False,
- )
- )
-
selected_resources = []
+ console.print()
for choice in selected:
resource_id, resource = resource_map[choice]
selected_resources.append((resource_id, resource))
name = getattr(resource, "name", "N/A")
endpoint_id = getattr(resource, "id", "N/A")
- console.print(f" • {name} ({endpoint_id})")
-
- console.print("\n[red]🚨 This action cannot be undone![/red]\n")
+ console.print(f" [bold]{name}[/bold] {endpoint_id}")
+ console.print("\n [yellow]This action cannot be undone.[/yellow]\n")
if not skip_confirm:
confirmed = questionary.confirm(
@@ -525,33 +401,42 @@ def _interactive_undeploy(resources: dict, skip_confirm: bool = False):
).ask()
if not confirmed:
- console.print("Undeploy cancelled")
+ console.print("[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
except KeyboardInterrupt:
- console.print("\nUndeploy cancelled")
+ console.print("\n[yellow]Cancelled[/yellow]")
raise typer.Exit(0)
- # Delete selected endpoints
+ console.print()
manager = _get_resource_manager()
- with console.status(f"Deleting {len(selected_resources)} endpoint(s)..."):
- results = []
- for resource_id, resource in selected_resources:
- name = getattr(resource, "name", "N/A")
+ results = []
+ for resource_id, resource in selected_resources:
+ name = getattr(resource, "name", "N/A")
+ with console.status(f"Deleting {name}..."):
result = asyncio.run(manager.undeploy_resource(resource_id, name))
- results.append(result)
+ if result["success"]:
+ console.print(f" [green]Deleted[/green] {name}")
+ else:
+ console.print(f" [red]Failed[/red] {name}")
+ results.append(result)
+
+ _print_undeploy_summary(results)
+
- # Show results
+def _print_undeploy_summary(results: list[dict]):
+ """Print summary after undeploy operations."""
success_count = sum(1 for r in results if r["success"])
fail_count = len(results) - success_count
-
- console.print("\n" + "=" * 50)
- if success_count > 0:
+ console.print()
+ if fail_count == 0:
+ console.print(
+ f"[green]Deleted[/green] {success_count} "
+ f"endpoint{'s' if success_count != 1 else ''}"
+ )
+ else:
console.print(
- f"[green]✓[/green] Successfully deleted {success_count} endpoint(s)"
+ f"[red]{fail_count}[/red] of {len(results)} endpoint(s) failed to delete"
)
- if fail_count > 0:
- console.print(f"[red]✗[/red] Failed to delete {fail_count} endpoint(s)")
- console.print("\nErrors:")
for result in results:
if not result["success"]:
- console.print(f" • {result['message']}")
+ console.print(f" {result['message']}")
diff --git a/src/runpod_flash/cli/docs/README.md b/src/runpod_flash/cli/docs/README.md
index 3cbc8392..a9a70853 100644
--- a/src/runpod_flash/cli/docs/README.md
+++ b/src/runpod_flash/cli/docs/README.md
@@ -4,23 +4,38 @@ Command-line interface for Flash - distributed inference and serving framework.
## Quick Start
+If you haven't already, install Flash:
+
```bash
-# Create new project
-flash init my-project
+pip install runpod-flash
+```
-# Navigate to project
-cd my-project
+Create a new project, navigate to it, and install dependencies:
-# Install dependencies
+```bash
+flash init my-project
+cd my-project
pip install -r requirements.txt
+```
-# Add your Runpod API key to .env
-# RUNPOD_API_KEY=your_key_here
+Add your Runpod API key to `.env`:
+```bash
+echo "RUNPOD_API_KEY=your_api_key_here" > .env
+```
+
+Start the development server to test your `@remote` functions:
-# Run development server
+```bash
flash run
```
+When you're ready to deploy your application to Runpod, use:
+
+```bash
+flash deploy
+```
+
+
## Commands
### flash init
@@ -77,9 +92,45 @@ flash build --exclude torch,torchvision,torchaudio # Exclude large packages
---
+### flash deploy
+
+Build and deploy Flash applications to Runpod Serverless endpoints in one step.
+
+```bash
+flash deploy [OPTIONS]
+```
+
+**Options:**
+- `--env, -e`: Target environment name
+- `--app, -a`: Flash app name
+- `--no-deps`: Skip transitive dependencies during pip install
+- `--exclude`: Comma-separated packages to exclude (e.g., 'torch,torchvision')
+- `--use-local-flash`: Bundle local runpod_flash source (for development)
+- `--output, -o`: Custom archive name (default: artifact.tar.gz)
+- `--preview`: Build and launch local preview instead of deploying
+
+**Examples:**
+```bash
+# Build and deploy (auto-selects environment if only one exists)
+flash deploy
+
+# Deploy to specific environment
+flash deploy --env staging
+
+# Deploy with excluded packages
+flash deploy --exclude torch,torchvision,torchaudio
+
+# Build and test locally before deploying
+flash deploy --preview
+```
+
+[Full documentation](./flash-deploy.md)
+
+---
+
### flash run
-Run Flash development server.
+Start a local Flash development server for testing and debugging.
```bash
flash run [OPTIONS]
@@ -89,7 +140,7 @@ flash run [OPTIONS]
- `--host`: Host to bind to (default: localhost)
- `--port, -p`: Port to bind to (default: 8888)
- `--reload/--no-reload`: Enable auto-reload (default: enabled)
-- `--auto-provision`: Auto-provision serverless endpoints on startup (default: disabled)
+- `--auto-provision`: Auto-provision Serverless endpoints on startup (default: disabled)
**Example:**
```bash
@@ -101,9 +152,81 @@ flash run --port 3000
---
+### flash env
+
+Manage deployment environments for your Flash applications.
+
+```bash
+flash env <subcommand> [OPTIONS]
+```
+
+**Subcommands:**
+- `list`: Show all available environments
+- `create <name>`: Create a new environment
+- `get <name>`: Get detailed environment information
+- `delete <name>`: Delete an environment and its resources
+
+**Options:**
+- `--app, -a`: Flash app name (auto-detected if in project directory)
+
+**Examples:**
+```bash
+# List all environments
+flash env list
+
+# Create new environment
+flash env create staging
+
+# Get environment details
+flash env get production
+
+# Delete environment
+flash env delete dev
+```
+
+[Full documentation](./flash-env.md)
+
+---
+
+### flash app
+
+Manage Flash apps (cloud-side organizational units that group deployment environments, build artifacts, and configuration).
+
+```bash
+flash app <subcommand> [OPTIONS]
+```
+
+**Subcommands:**
+- `list`: Show all Flash apps
+- `create <name>`: Create a new Flash app
+- `get <name>`: Get detailed app information
+- `delete`: Delete an app and all associated resources
+
+**Options:**
+- `--app, -a`: Flash app name (required for delete)
+
+**Examples:**
+```bash
+# List all apps
+flash app list
+
+# Create new app
+flash app create my-project
+
+# Get app details
+flash app get my-project
+
+# Delete app
+flash app delete --app my-project
+```
+
+[Full documentation](./flash-app.md)
+
+---
+
### flash undeploy
-Manage and delete RunPod serverless endpoints.
+Manage and delete Runpod Serverless endpoints.
```bash
flash undeploy [NAME|list] [OPTIONS]
@@ -115,6 +238,7 @@ flash undeploy [NAME|list] [OPTIONS]
- `--cleanup-stale`: Remove inactive endpoints from tracking
**Examples:**
+
```bash
# List all tracked endpoints
flash undeploy list
@@ -133,6 +257,7 @@ flash undeploy --cleanup-stale
```
**Status Indicators:**
+
-- 🟢 **Active**: Endpoint is running and healthy
-- 🔴 **Inactive**: Endpoint deleted externally (use --cleanup-stale to remove from tracking)
-- ❓ **Unknown**: Health check failed
+- **active** (green): Endpoint is running and healthy
+- **inactive** (red): Endpoint deleted externally (use --cleanup-stale to remove from tracking)
+- **unknown** (yellow): Health check failed
diff --git a/src/runpod_flash/cli/docs/flash-app.md b/src/runpod_flash/cli/docs/flash-app.md
new file mode 100644
index 00000000..3abc29a2
--- /dev/null
+++ b/src/runpod_flash/cli/docs/flash-app.md
@@ -0,0 +1,473 @@
+# flash app
+
+Manage Flash applications (top-level organizational units).
+
+## Overview
+
+A **Flash app** is a cloud-side container that groups everything related to a single project: your deployment environments, build artifacts, and configuration. Think of it as a project namespace in Runpod that keeps your `dev`, `staging`, and `production` deployments organized together.
+
+
+**When to use `flash app` commands:**
+- **`list` / `get`** — Viewing your apps and their status
+- **`delete`** — Cleaning up apps you no longer need
+- **`create`** — Pre-registering apps before deployment (rare, mainly for CI/CD)
+
+**What an app contains:**
+| Resource | Description |
+|----------|-------------|
+| Environments | Deployment contexts (dev, staging, production) |
+| Builds | Versioned artifacts created from your code |
+| Configuration | App-wide settings and metadata |
+
+## Subcommands
+
+### flash app list
+
+Show all Flash apps under your account.
+
+```bash
+flash app list
+```
+
+**Output:**
+```
+┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
+┃ Name ┃ ID ┃ Environments ┃ Builds ┃
+┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
+│ my-project │ app_abc123 │ dev, staging, prod │ build_1, build_2 │
+│ demo-api │ app_def456 │ production │ build_3 │
+│ ml-inference │ app_ghi789 │ dev, production │ build_4, build_5 │
+└────────────────┴──────────────────────┴─────────────────────────┴──────────────────┘
+```
+
+---
+
+### flash app create
+
+Create a new Flash app.
+
+```bash
+flash app create <name>
+```
+
+**Arguments:**
+- `name`: Name for the new Flash app (e.g., my-project, api-service)
+
+**Example:**
+```bash
+# Create new app
+flash app create my-project
+```
+
+**Output:**
+```
+╭───────────────────────────────────────────────╮
+│ ✅ App Created │
+├───────────────────────────────────────────────┤
+│ Flash app 'my-project' created successfully │
+│ │
+│ App ID: app_abc123 │
+╰───────────────────────────────────────────────╯
+```
+
+App names must be unique within your account.
+
+> **Note:** Most users don't need to run `flash app create` explicitly. Apps are **automatically created** when you first run `flash deploy`. The `create` subcommand exists for CI/CD pipelines and administrative workflows that need to pre-register apps before deployment. See [Flash Deploy](./flash-deploy.md) for details.
+
+---
+
+### flash app get
+
+Get detailed information about a Flash app.
+
+```bash
+flash app get <name>
+```
+
+**Arguments:**
+- `name`: Name of the Flash app to inspect
+
+**Example:**
+```bash
+# Get details for my-project app
+flash app get my-project
+```
+
+**Output:**
+```
+╭─────────────────────────────────╮
+│ 📱 Flash App: my-project │
+├─────────────────────────────────┤
+│ Name: my-project │
+│ ID: app_abc123 │
+│ Environments: 3 │
+│ Builds: 5 │
+╰─────────────────────────────────╯
+
+ Environments
+┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
+┃ Name ┃ ID ┃ State ┃ Active Build ┃ Created ┃
+┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
+│ dev │ env_dev123 │ DEPLOYED│ build_xyz789 │ 2024-01-15 10:30 │
+│ staging │ env_stg456 │ DEPLOYED│ build_xyz789 │ 2024-01-16 14:20 │
+│ production │ env_prd789 │ DEPLOYED│ build_abc123 │ 2024-01-20 09:15 │
+└────────────┴────────────────────┴─────────┴──────────────────┴──────────────────┘
+
+ Builds
+┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
+┃ ID ┃ Status ┃ Created ┃
+┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
+│ build_abc123 │ COMPLETED │ 2024-01-20 09:00 │
+│ build_xyz789 │ COMPLETED │ 2024-01-18 15:45 │
+│ build_def456 │ COMPLETED │ 2024-01-15 11:20 │
+└────────────────────┴──────────────────────────┴──────────────────┘
+```
+
+---
+
+### flash app delete
+
+Delete a Flash app and all its associated resources.
+
+```bash
+flash app delete --app <name>
+```
+
+**Options:**
+- `--app, -a`: Flash app name to delete (required, must be explicit for safety)
+
+**Note:** Unlike other subcommands, `delete` requires an explicit `--app` flag as a safety measure for this destructive operation.
+
+**Example:**
+```bash
+# Delete my-project app
+flash app delete --app my-project
+```
+
+**Process:**
+1. Shows app details and resources to be deleted
+2. Prompts for confirmation (required)
+3. Deletes all environments and their resources
+4. Deletes all builds
+5. Deletes the app
+
+**Warning:** This operation is irreversible. All environments, builds, endpoints, volumes, and configuration will be permanently deleted.
+
+## Common Workflows
+
+### Creating Your First App
+
+When starting a new Flash project:
+
+```bash
+# Create project with flash init
+flash init my-project
+cd my-project
+
+# First deployment automatically creates app
+flash deploy
+# Creates app 'my-project' if it doesn't exist
+
+# Or create app explicitly first
+flash app create my-project
+flash env create production
+flash deploy --env production
+```
+
+### Organizing Multiple Apps
+
+```bash
+# Create apps for different projects
+flash app create api-gateway
+flash app create ml-inference
+flash app create data-processing
+
+# Each app has its own environments
+flash env create dev --app api-gateway
+flash env create prod --app api-gateway
+
+flash env create dev --app ml-inference
+flash env create prod --app ml-inference
+
+# List all apps to see organization
+flash app list
+```
+
+### Viewing App Details
+
+```bash
+# Get comprehensive app information
+flash app get my-project
+
+# See all environments and builds
+# Check deployment status
+# View resource allocation
+```
+
+### Cleaning Up Apps
+
+```bash
+# List all apps
+flash app list
+
+# Delete unused app and all its resources
+flash app delete --app old-project
+```
+
+## App Concepts
+
+### What is a Flash App?
+
+A Flash app is the top-level container that organizes all deployment-related resources:
+
+```
+Flash App (my-project)
+│
+├── Environments
+│ ├── dev
+│ │ ├── Endpoints (ep1, ep2)
+│ │ └── Volumes (vol1)
+│ ├── staging
+│ │ ├── Endpoints (ep1, ep2)
+│ │ └── Volumes (vol1)
+│ └── production
+│ ├── Endpoints (ep1, ep2)
+│ └── Volumes (vol1)
+│
+└── Builds
+ ├── build_v1 (2024-01-15)
+ ├── build_v2 (2024-01-18)
+ └── build_v3 (2024-01-20)
+```
+
+### Relationship to Environments and Builds
+
+**Apps contain Environments:**
+- Each app can have multiple environments (dev, staging, prod)
+- Environments are isolated deployment contexts within an app
+- Use `flash env` commands to manage environments
+
+**Apps store Builds:**
+- Each deployment creates a build artifact
+- Builds are versioned and tracked within the app
+- Environments reference builds to know what code to run
+
+**Apps provide Isolation:**
+- Different apps don't share resources
+- Each app has its own quota and limits
+- Apps can have different access controls
+
+### App Discovery and Auto-Detection
+
+Flash CLI automatically detects the app name from your current directory:
+
+```bash
+# In project directory
+cd /path/to/my-project
+
+# App name auto-detected from directory or config
+flash deploy # Deploys to 'my-project' app
+flash env list # Lists 'my-project' environments
+```
+
+You can always override with the `--app` flag:
+
+```bash
+flash deploy --app other-project
+flash env list --app other-project
+```
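+
+The fallback described above can be pictured with a small sketch. This is purely illustrative (the function name and fallback order are hypothetical; the real CLI may also consult project configuration):
+
+```python
+from pathlib import Path
+
+def detect_app_name(explicit=None):
+    """Illustrative sketch: prefer an explicit --app value,
+    otherwise fall back to the current directory's name."""
+    if explicit:
+        return explicit
+    return Path.cwd().name
+```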
+
+### App Hierarchy
+
+```
+Runpod Account
+├── Flash App: my-api
+│ ├── Environment: dev
+│ ├── Environment: prod
+│ └── Builds: [v1, v2, v3]
+│
+├── Flash App: ml-inference
+│ ├── Environment: staging
+│ ├── Environment: production
+│ └── Builds: [v1, v2]
+│
+└── Flash App: data-processor
+ ├── Environment: production
+ └── Builds: [v1]
+```
+
+## Best Practices
+
+### Naming Conventions
+
+Use clear, descriptive names that reflect the project:
+
+```bash
+# Good
+flash app create user-api
+flash app create ml-inference
+flash app create data-pipeline
+
+# Avoid
+flash app create app1
+flash app create test
+flash app create abc
+```
+
+### App Organization Strategies
+
+**Single app per project (recommended for most cases):**
+```bash
+my-project/
+└── Flash App: my-project
+ ├── Environment: dev
+ ├── Environment: staging
+ └── Environment: production
+```
+
+**Multiple apps for microservices:**
+```bash
+# Separate apps for each service
+flash app create auth-service
+flash app create payment-service
+flash app create notification-service
+
+# Each has its own lifecycle
+flash deploy --app auth-service --env prod
+flash deploy --app payment-service --env prod
+```
+
+**App per team or feature:**
+```bash
+# Team-based
+flash app create frontend-team-app
+flash app create backend-team-app
+
+# Feature-based (temporary)
+flash app create feature-search
+flash app create feature-recommendations
+```
+
+### App Lifecycle Management
+
+1. **Development Phase**:
+ - Create app: `flash app create my-project`
+ - Create dev environment: `flash env create dev`
+ - Deploy and test: `flash deploy --env dev`
+
+2. **Staging Phase**:
+ - Create staging environment: `flash env create staging`
+ - Deploy for QA: `flash deploy --env staging`
+
+3. **Production Phase**:
+ - Create production environment: `flash env create production`
+ - Deploy to prod: `flash deploy --env production`
+
+4. **Maintenance**:
+ - Monitor: `flash app get my-project`
+   - Update: `flash deploy --env <environment>`
+ - Scale: Adjust resource configs
+
+5. **Cleanup**:
+   - Delete unused environments: `flash env delete <name>`
+ - Delete entire app: `flash app delete --app my-project`
+
+### Resource Management
+
+- **Monitor app usage**: Use `flash app get` to track environments and builds
+- **Clean up old builds**: Builds accumulate over time
+- **Delete unused apps**: Remove apps you're no longer using
+- **Check costs**: Each app's resources contribute to your Runpod usage
+
+### Safety Features
+
+App deletion includes safety features:
+- **Confirmation prompt**: Required for all app deletions
+- **Cascade delete**: Automatically removes all environments and resources
+- **Validation**: Ensures all resources are properly cleaned up
+- **Abort on failure**: If any resource fails to delete, operation is aborted
+
+## Troubleshooting
+
+### App Not Found
+
+**Problem**: `Error: App 'my-project' not found`
+
+**Solution**: List apps to verify name:
+```bash
+flash app list
+```
+
+Create if missing:
+```bash
+flash app create my-project
+```
+
+### App Name Conflict
+
+**Problem**: `Error: App 'my-project' already exists`
+
+**Solution**: Choose a different name or use existing app:
+```bash
+# Use existing app
+flash deploy --app my-project
+
+# Or create with different name
+flash app create my-project-v2
+```
+
+### Cannot Delete App
+
+**Problem**: App deletion fails with resource errors
+
+**Solution**: Manually delete environments first:
+```bash
+# List environments
+flash env list --app my-project
+
+# Delete each environment
+flash env delete dev --app my-project
+flash env delete staging --app my-project
+
+# Then delete app
+flash app delete --app my-project
+```
+
+### App Auto-Detection Fails
+
+**Problem**: Commands don't detect app from current directory
+
+**Solution**: Specify app explicitly:
+```bash
+flash env list --app my-project
+flash deploy --app my-project
+```
+
+Or ensure you're in a valid Flash project directory with:
+- `main.py` with Flash server
+- `workers/` directory
+- Proper project structure
+
+### Multiple Apps With Same Name
+
+**Problem**: Multiple people on team created apps with same name
+
+**Solution**: Apps are namespaced to your account, so this shouldn't happen. If confused:
+```bash
+# List all your apps
+flash app list
+
+# Use app ID instead of name if needed
+flash app get <app_id>
+```
+
+## Related Commands
+
+- [flash deploy](./flash-deploy.md) - Build and deploy to app/environment
+- [flash env](./flash-env.md) - Manage app environments
+- [flash build](./flash-build.md) - Create build artifacts
+- [flash init](./flash-init.md) - Initialize new Flash project
+
+## Related Documentation
+
+- [Flash Apps & Environments](../../../docs/Flash_Apps_and_Environments.md) - Architectural details on apps and environments
diff --git a/src/runpod_flash/cli/docs/flash-build.md b/src/runpod_flash/cli/docs/flash-build.md
index 9fa94eb5..120fe60e 100644
--- a/src/runpod_flash/cli/docs/flash-build.md
+++ b/src/runpod_flash/cli/docs/flash-build.md
@@ -1,6 +1,20 @@
# flash build
-Build Flash application for deployment.
+Build a deployment-ready artifact for your Flash application.
+
+## Overview
+
+The `flash build` command packages your Flash project into a deployable archive (`.flash/artifact.tar.gz`). It scans your codebase for `@remote` decorated functions, resolves dependencies, and creates a manifest that tells Runpod how to provision your serverless endpoints.
+
+### What happens during build
+
+1. **Function discovery:** Finds all `@remote` functions and groups them by their `resource_config`
+2. **Manifest generation:** Creates `.flash/flash_manifest.json` with endpoint definitions and routing info
+3. **Dependency installation:** Installs Python packages for Linux x86_64 (cross-platform compatible)
+4. **Packaging:** Bundles everything into a compressed archive
+
+> **Tip:** Most users should use `flash deploy` instead, which runs build + deploy in one step. Use `flash build` when you need more control over the build process or want to inspect the artifact before deploying.
+
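The four steps above produce an archive you can inspect before deploying. The snippet below is an illustration only: it fabricates a tiny stand-in archive with standard tools (the directory layout inside it is an assumption) so the inspection commands are runnable as-is. On a real project, point `tar -tzf` and `du -h` at `.flash/artifact.tar.gz` after `flash build` completes.

```shell
# Illustration only: fabricate a stand-in artifact with standard tools,
# then inspect it the same way you would a real .flash/artifact.tar.gz.
mkdir -p demo/.flash demo/src
echo 'print("hello")' > demo/src/main.py
tar -czf demo/.flash/artifact.tar.gz -C demo src

# List the archive contents before deploying
tar -tzf demo/.flash/artifact.tar.gz

# Check the size against Runpod's 500MB archive limit
du -h demo/.flash/artifact.tar.gz
```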
## Usage
@@ -38,23 +52,13 @@ flash build --output my-app.tar.gz
flash build --keep-build --output deploy.tar.gz
```
-## What It Does
-
-The build process packages your Flash application into a self-contained deployment package:
-
-1. **Discovery**: Scans your project for `@remote` decorated functions
-2. **Grouping**: Groups functions by their `resource_config`
-3. **Manifest Creation**: Generates `flash_manifest.json` for service discovery
-4. **Dependency Installation**: Installs all Python dependencies locally
-5. **Packaging**: Creates `.flash/artifact.tar.gz` ready for deployment
-
## Build Artifacts
After `flash build` completes:
| File/Directory | Purpose |
|---|---|
-| `.flash/artifact.tar.gz` | Deployment package (ready for RunPod) |
+| `.flash/artifact.tar.gz` | Deployment package (ready for Runpod) |
| `.flash/flash_manifest.json` | Service discovery configuration |
| `.flash/.build/` | Temporary build directory (removed unless `--keep-build` specified) |
@@ -62,13 +66,13 @@ After `flash build` completes:
### Cross-Platform Builds
-Flash automatically handles cross-platform builds, ensuring compatibility with RunPod's Linux x86_64 serverless infrastructure:
+Flash automatically handles cross-platform builds, ensuring compatibility with Runpod's Linux x86_64 serverless infrastructure:
- **Automatic Platform Targeting**: Dependencies are always installed for Linux x86_64, regardless of your build platform (macOS, Windows, or Linux)
- **Python Version Matching**: Uses your current Python version to ensure package compatibility
- **Binary Wheel Enforcement**: Only pre-built binary wheels are used, preventing platform-specific compilation issues
-This means you can safely build on macOS ARM64, Windows, or any platform, and the deployment will work correctly on RunPod.
+This means you can safely build on macOS ARM64, Windows, or any platform, and the deployment will work correctly on Runpod.
### Default Behavior
@@ -98,7 +102,7 @@ Only installs direct dependencies specified in `@remote` decorators:
flash build --preview
```
-Launch a local Docker-based test environment immediately after building. This allows you to test your distributed system locally before deploying to RunPod.
+Launch a local Docker-based test environment immediately after building. This allows you to test your distributed system locally before deploying to Runpod.
**What happens:**
1. Builds your project (creates archive, manifest)
@@ -231,7 +235,7 @@ If a package doesn't have pre-built Linux x86_64 wheels:
### Size Limits
-RunPod serverless enforces a **500MB limit** on deployment archives. Exceeding this will cause deployment failures.
+Runpod Serverless enforces a **500MB limit** on deployment archives. Exceeding this will cause your deployment to fail.
### Excluding Base Image Packages
@@ -284,10 +288,13 @@ Check the [worker-flash repository](https://github.com/runpod-workers/worker-fla
After building:
1. **Test Locally**: Run `flash run` to test the application
-2. **Deploy**: Push the archive to RunPod for deployment
-3. **Monitor**: Use `flash undeploy list` to check deployed endpoints
+2. **Deploy**: Use `flash deploy` to deploy to Runpod Serverless
+3. **Preview**: Test with `flash build --preview` before production deployment
+4. **Monitor**: Use `flash env get` to check deployment status
## Related Commands
-- `flash run` - Start development server
-- `flash undeploy` - Manage deployed endpoints
+- [flash deploy](./flash-deploy.md) - Build and deploy in one step
+- [flash run](./flash-run.md) - Start development server
+- [flash env](./flash-env.md) - Manage deployment environments
+- [flash undeploy](./flash-undeploy.md) - Manage deployed endpoints
diff --git a/src/runpod_flash/cli/docs/flash-deploy.md b/src/runpod_flash/cli/docs/flash-deploy.md
new file mode 100644
index 00000000..504ad874
--- /dev/null
+++ b/src/runpod_flash/cli/docs/flash-deploy.md
@@ -0,0 +1,458 @@
+# flash deploy
+
+Build and deploy your Flash application to Runpod Serverless endpoints in one step.
+
+## Overview
+
+The `flash deploy` command is the primary way to get your Flash application running in the cloud. It combines the build process with deployment, taking your local code and turning it into live serverless endpoints on Runpod.
+
+**When to use this command:**
+- Deploying your application for the first time
+- Pushing code updates to an existing environment
+- Setting up new environments (dev, staging, production)
+- Testing your full distributed system with `--preview` before going live
+
+**What happens during deployment:**
+1. **Build:** Packages your code, dependencies, and manifest (same as `flash build`)
+2. **Upload:** Sends the artifact to Runpod's storage
+3. **Provision:** Creates or updates serverless endpoints based on your resource configs
+4. **Configure:** Sets up environment variables, volumes, and service discovery
+5. **Verify:** Confirms endpoints are healthy and displays access information
+
+**Key features:**
+- **One command:** No need to run build and deploy separately
+- **Smart environment handling:** Auto-selects environment if only one exists, prompts if multiple
+- **Incremental updates:** Only updates what changed, preserving endpoint URLs
+- **Preview mode:** Test locally with Docker before deploying to production
+
+## Architecture: Fully Deployed to Runpod
+
+With `flash deploy`, your **entire application** runs on Runpod Serverless—both your FastAPI app (the "orchestrator") and all `@remote` worker functions:
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│ RUNPOD SERVERLESS │
+│ │
+│ ┌─────────────────────────────────────┐ │
+│ │ MOTHERSHIP ENDPOINT │ │
+│ │ (your FastAPI app from main.py) │ │
+│ │ - Your HTTP routes │ │
+│ │ - Orchestrates @remote calls │───────────┐ │
+│ │ - Public URL for users │ │ │
+│ └─────────────────────────────────────┘ │ │
+│ │ internal │
+│ ▼ │
+│ ┌─────────────────────────┐ ┌─────────────────────────┐ │
+│ │ gpu-worker │ │ cpu-worker │ │
+│ │ (your @remote function) │ │ (your @remote function) │ │
+│ └─────────────────────────┘ └─────────────────────────┘ │
+└─────────────────────────────────────────────────────────────────┘
+ ▲
+ │ HTTPS (authenticated)
+ │
+ ┌─────┴─────┐
+ │ USERS │
+ └───────────┘
+```
+
+**Key points:**
+- **Your FastAPI app runs on Runpod** as the "mothership" endpoint
+- **`@remote` functions run on Runpod** as separate worker endpoints
+- **Users call the mothership URL** directly (e.g., `https://xyz123.api.runpod.ai/api/hello`)
+- **No `live-` prefix** on endpoint names (these are production endpoints)
+- **No hot reload:** code changes require a new deployment
+
+This is different from `flash run`, where your FastAPI app runs locally on your machine. See [flash run](./flash-run.md) for the hybrid development architecture.
+
+### flash run vs flash deploy
+
+| Aspect | `flash run` | `flash deploy` |
+|--------|-------------|----------------|
+| **FastAPI app runs on** | Your machine (localhost) | Runpod Serverless (mothership) |
+| **`@remote` functions run on** | Runpod Serverless | Runpod Serverless |
+| **Endpoint naming** | `live-` prefix (e.g., `live-gpu-worker`) | No prefix (e.g., `gpu-worker`) |
+| **Hot reload** | Yes | No |
+| **Use case** | Development & testing | Production deployment |
+| **Build artifact created** | No | Yes (tarball + manifest) |
+
+## Usage
+
+```bash
+flash deploy [OPTIONS]
+```
+
+## Options
+
+- `--env, -e`: Target environment name (auto-selected if only one exists)
+- `--app, -a`: Flash app name (auto-detected from current directory)
+- `--no-deps`: Skip transitive dependencies during pip install (default: false)
+- `--exclude`: Comma-separated packages to exclude (e.g., 'torch,torchvision')
+- `--use-local-flash`: Bundle local runpod_flash source instead of PyPI version (for development/testing)
+- `--output, -o`: Custom archive name (default: artifact.tar.gz)
+- `--preview`: Build and launch local preview environment instead of deploying
+
+## Examples
+
+```bash
+# Build and deploy (auto-selects environment if only one exists)
+flash deploy
+
+# Deploy to specific environment
+flash deploy --env staging
+
+# Deploy to specific app and environment
+flash deploy --app my-project --env production
+
+# Deploy with excluded packages (reduces deployment size)
+flash deploy --exclude torch,torchvision,torchaudio
+
+# Build and test locally before deploying
+flash deploy --preview
+
+# Combine options
+flash deploy --env staging --exclude torch --no-deps
+```
+
+## What It Does
+
+The deploy command combines building and deploying your Flash application in a single step:
+
+1. **Build Phase**: Creates deployment artifact (see [flash build](./flash-build.md) for details)
+ - Scans project for `@remote` decorated functions
+ - Groups functions by resource configuration
+ - Creates `flash_manifest.json` for service discovery
+ - Installs dependencies with Linux x86_64 compatibility
+ - Packages everything into `.flash/artifact.tar.gz`
+
+2. **Environment Resolution**:
+ - Auto-detects app name from current directory
+ - If no app exists, creates it automatically
+ - If `--env` specified, uses that environment (creates if missing)
+ - If only one environment exists, uses it automatically
+ - If multiple environments exist, prompts for selection
+
+3. **Deployment Phase**:
+ - Uploads the build artifact to Runpod storage
+ - Provisions Serverless endpoints based on resource configs
+ - Configures endpoints with environment variables and volumes
+ - Sets up service discovery for cross-endpoint function calls
+ - Registers endpoints in environment tracking
+
+4. **Post-Deployment**:
+ - Displays deployment URLs and available routes
+ - Shows authentication and testing guidance
+ - Cleans up temporary build directory
+
+## Build Options
+
+The deploy command supports all build options from `flash build`:
+
+### Skip Transitive Dependencies
+
+```bash
+flash deploy --no-deps
+```
+
+Only installs direct dependencies specified in `@remote` decorators. Useful when your base image already includes common packages.
+
+### Exclude Packages
+
+```bash
+flash deploy --exclude torch,torchvision,torchaudio
+```
+
+Skips specified packages during dependency installation. Critical for staying under Runpod's 500MB deployment limit. See [flash build](./flash-build.md#managing-deployment-size) for base image package reference.
+
+### Local Flash Development
+
+```bash
+flash deploy --use-local-flash
+```
+
+Bundles your local `runpod_flash` source instead of the PyPI version. Only use this for development and testing.
+
+## Preview Mode
+
+```bash
+flash deploy --preview
+```
+
+Builds your project and launches a local Docker-based test environment instead of deploying to Runpod. This allows you to test your distributed system locally before production deployment.
+
+**What happens:**
+1. Builds your project (creates the archive and manifest)
+2. Creates a Docker network for inter-container communication
+3. Starts one Docker container per resource config:
+ - Mothership container (orchestrator)
+ - All worker containers (GPU, CPU, etc.)
+4. Exposes the mothership on `localhost:8000`
+5. All containers communicate via Docker DNS
+6. On shutdown (Ctrl+C), automatically stops and removes all containers
+
+**Use this when:**
+- Testing deployment before production
+- Validating manifest structure
+- Debugging resource provisioning
+- Verifying endpoint auto-discovery
+- Testing distributed function calls
+
+See [flash build](./flash-build.md#preview-environment) for more details on preview mode.
+
+## Environment Management
+
+### What Is an Environment?
+
+An **environment** is an isolated deployment context within a Flash app. Each environment is a separate "stage" (like `dev`, `staging`, or `production`) that contains its own deployed endpoints, active build version, network volumes (if used), and deployment status.
+
+For more details about environment management, see [flash env](./flash-env.md).
+
+### Automatic Environment Creation
+
+If the specified environment doesn't exist, `flash deploy` creates it automatically:
+
+```bash
+# Creates 'staging' if it doesn't exist
+flash deploy --env staging
+```
+
+If no environment is specified and none exist, it creates a 'production' environment by default.
+
+### Environment Auto-Selection
+
+When you have only one environment, it's selected automatically:
+
+```bash
+# Auto-selects the only available environment
+flash deploy
+```
+
+When multiple environments exist, you must specify which one:
+
+```bash
+# Error: Multiple environments found
+flash deploy
+
+# Solution: Specify environment
+flash deploy --env staging
+```
+
+### Managing Environments
+
+Use `flash env` commands to manage environments:
+
+```bash
+# List all environments
+flash env list
+
+# Create new environment
+flash env create staging
+
+# View environment details
+flash env get production
+
+# Delete environment
+flash env delete dev
+```
+
+## Post-Deployment
+
+After successful deployment, the command displays guidance for using your deployed application:
+
+### 1. Authentication
+
+All endpoints require authentication with your Runpod API key:
+
+```bash
+# Set API key as environment variable (recommended)
+export RUNPOD_API_KEY="your_key_here"
+
+# Or use a .env file
+echo "RUNPOD_API_KEY=your_key_here" >> .env
+```
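If you keep the key in a `.env` file, you can export everything in it into your current shell in one step. A minimal sketch — it writes a throwaway `.env` first so the snippet is self-contained; in a real project the file already exists, and the key shown here is fake:

```shell
# Self-contained demo: create a .env, then export its variables.
echo 'RUNPOD_API_KEY=rpa_example_not_a_real_key' > .env

set -a      # auto-export every variable assigned while sourcing
. ./.env
set +a

# Confirm the key is set without printing it in full
echo "key starts with: $(printf '%.4s' "$RUNPOD_API_KEY")"
```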
+
+### 2. Calling Your Functions
+
+Using HTTP/curl:
+
+```bash
+curl -X POST https://YOUR_ENDPOINT_URL/YOUR_PATH \
+ -H "Authorization: Bearer $RUNPOD_API_KEY" \
+ -H "Content-Type: application/json" \
+ -d '{"param1": "value1"}'
+```
+
+### 3. Available Routes
+
+The deployment output shows all available routes registered from your `@remote` decorators:
+
+```
+POST /api/process
+GET /api/status
+POST /gpu/inference
+```
+
+### 4. Monitoring
+
+View deployment status and logs:
+
+```bash
+# Check environment status
+flash env get production
+
+# View in Runpod Console
+# https://console.runpod.io/serverless
+```
+
+### 5. Updates
+
+To update your deployment with new code:
+
+```bash
+# Deploy updated code to same environment
+flash deploy --env production
+```
+
+This creates a new build and updates all endpoints in the environment.
+
+## Output
+
+Successful deployment displays:
+
+```
+╭───────────────────────── Flash Build Configuration ──────────────────────────╮
+│ Project: my-project │
+│ Directory: /path/to/project │
+│ Archive: .flash/artifact.tar.gz │
+│ Skip transitive deps: False │
+│ Keep build dir: False │
+╰──────────────────────────────────────────────────────────────────────────────╯
+⠙ ✓ Loaded ignore patterns
+⠙ ✓ Found 42 files to package
+⠙ ✓ Created .flash/.build/my-project/
+⠙ ✓ Copied 42 files
+⠙ ✓ Created manifest and registered 3 resources
+⠙ ✓ Installed 5 packages
+⠙ ✓ Created artifact.tar.gz (45.2 MB)
+⠙ ✓ Removed .build directory
+
+Deploying to 'production'...
+
+⠙ Uploading build artifact...
+⠙ Provisioning serverless endpoints...
+⠙ Configuring endpoints...
+
+✓ Deployment Complete
+
+Next Steps:
+
+1. Authentication Required
+ All endpoints require authentication. Set your API key as an environment
+ variable...
+
+2. Call Your Functions
+ Your mothership is deployed at:
+ https://api-xxxxx.runpod.net
+
+3. Available Routes
+ POST /api/hello
+ POST /gpu/process
+
+4. Monitor & Debug
+ flash env get production - View environment status
+ Runpod Console - View logs at https://console.runpod.io/serverless
+
+5. Update or Remove Deployment
+ flash deploy --env production - Update deployment
+ flash env delete production - Remove deployment
+```
+
+## Troubleshooting
+
+### Multiple Environments Error
+
+**Problem**: `Error: Multiple environments found: dev, staging, production`
+
+**Solution**: Specify the target environment:
+
+```bash
+flash deploy --env staging
+```
+
+### Build Fails
+
+If the build phase fails, see [flash build troubleshooting](./flash-build.md#troubleshooting) for common build issues.
+
+### Deployment Size Limit
+
+**Problem**: Deployment exceeds Runpod's 500MB limit
+
+**Solution**: Use `--exclude` to skip packages already in your base image:
+
+```bash
+# Exclude PyTorch packages (pre-installed in GPU images)
+flash deploy --exclude torch,torchvision,torchaudio
+```
+
+See [flash build - Managing Deployment Size](./flash-build.md#managing-deployment-size) for details on base image packages.
+
+### Authentication Fails
+
+**Problem**: `401 Unauthorized` when calling endpoints
+
+**Solution**: Ensure your API key is set correctly:
+
+```bash
+# Check if API key is set
+echo $RUNPOD_API_KEY
+
+# Set API key
+export RUNPOD_API_KEY="your_key_here"
+
+# Or load from .env file
+source .env
+```
+
+### Environment Not Found After Creation
+
+If you just created an environment but it can't be found, wait a few seconds for the API to sync, then retry.
+
+## Performance Considerations
+
+### Build Time
+
+The build phase can take several minutes depending on:
+- The number of dependencies that must be installed
+- Project size and file count
+- Whether pre-built Linux x86_64 wheels are available for your dependencies
+
+### Deployment Time
+
+Endpoint provisioning typically takes 2-5 minutes:
+- Container image pull and initialization
+- Endpoint health checks and registration
+- Service discovery configuration
+
+### Optimization Tips
+
+1. **Use `--no-deps`** when your base image already provides the transitive dependencies
+2. **Use `--exclude`** for packages pre-installed in the base image
+3. **Cache builds** by deploying to the same environment
+4. **Test with `--preview`** before deploying to production
+
+## Next Steps
+
+After deploying:
+
+1. **Test Your Endpoints**: Call your functions to verify deployment
+2. **Monitor Performance**: Check logs and metrics in Runpod Console
+3. **Set Up CI/CD**: Automate deployments with GitHub Actions
+4. **Scale Resources**: Adjust resource configs for production load
+5. **Manage Environments**: Use `flash env` commands for environment lifecycle
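Step 3's CI/CD setup can be as small as one workflow file. A hedged sketch of a GitHub Actions job — the workflow name, secret name, Python version, and the `runpod-flash` PyPI package name are all assumptions, and `RUNPOD_API_KEY` must be added to your repository secrets first:

```yaml
# .github/workflows/deploy.yml — illustrative sketch, not an official template
name: Deploy to Runpod
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install Flash CLI
        run: pip install runpod-flash
      - name: Deploy to production
        env:
          RUNPOD_API_KEY: ${{ secrets.RUNPOD_API_KEY }}
        run: flash deploy --env production
```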
+
+## Related Commands
+
+- [flash build](./flash-build.md) - Build without deploying
+- [flash env](./flash-env.md) - Manage deployment environments
+- [flash app](./flash-app.md) - Manage Flash applications
+- [flash undeploy](./flash-undeploy.md) - Remove deployed endpoints
+- [flash run](./flash-run.md) - Local development server
diff --git a/src/runpod_flash/cli/docs/flash-env.md b/src/runpod_flash/cli/docs/flash-env.md
new file mode 100644
index 00000000..c3f87744
--- /dev/null
+++ b/src/runpod_flash/cli/docs/flash-env.md
@@ -0,0 +1,481 @@
+# flash env
+
+Manage deployment environments for Flash applications.
+
+## Overview
+
+An **environment** is an isolated deployment context within a Flash app. Each environment is a separate "stage" (like `dev`, `staging`, or `production`) that contains its own:
+
+- **Deployed endpoints** — Serverless endpoints provisioned from your `@remote` functions
+- **Active build version** — The specific version of your code running in this environment
+- **Network volumes** — Persistent storage for models, caches, etc.
+- **Deployment state** — Current status (PENDING, DEPLOYING, DEPLOYED, etc.)
+
+Environments enable standard development workflows: test changes in `dev`, validate in `staging`, then deploy to `production`. Each environment is completely independent—deploying to one has no effect on others.
+
+> **Note:** You don't always need to create environments explicitly. When you run `flash deploy --env <name>`, the environment is **automatically created** if it doesn't exist. The `create` subcommand is useful when you want to set up environments before deploying, or in CI/CD pipelines.
+
+**When to use `flash env` commands:**
+- **`list` / `get`** — Checking environment status and what's deployed (common)
+- **`delete`** — Removing environments and their resources (common)
+- **`create`** — Pre-creating environments before deployment (optional)
+
+## Subcommands
+
+### flash env list
+
+Show all available environments for an app.
+
+```bash
+flash env list [OPTIONS]
+```
+
+**Options:**
+- `--app, -a`: Flash app name (auto-detected from current directory)
+
+**Example:**
+```bash
+# List environments for current app
+flash env list
+
+# List environments for specific app
+flash env list --app my-project
+```
+
+**Output:**
+```
+┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
+┃ Name ┃ ID ┃ Active Build ┃ Created At ┃
+┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
+│ dev │ env_abc123 │ build_xyz789 │ 2024-01-15 10:30 │
+│ staging │ env_def456 │ build_uvw456 │ 2024-01-16 14:20 │
+│ production │ env_ghi789 │ build_rst123 │ 2024-01-20 09:15 │
+└────────────┴─────────────────────┴───────────────────┴──────────────────┘
+```
+
+---
+
+### flash env create
+
+Create a new deployment environment.
+
+```bash
+flash env create <name> [OPTIONS]
+```
+
+**Arguments:**
+- `name`: Name for the new environment (e.g., staging, dev, prod)
+
+**Options:**
+- `--app, -a`: Flash app name (auto-detected from current directory)
+
+**Example:**
+
+```bash
+# Create staging environment
+flash env create staging
+
+# Create environment in specific app
+flash env create production --app my-project
+```
+
+**Output:**
+```
+╭───────────────────────────────────────────────╮
+│ Environment Created │
+├───────────────────────────────────────────────┤
+│ Environment 'staging' created successfully │
+│ │
+│ App: my-project │
+│ Environment ID: env_abc123 │
+│ Status: PENDING │
+╰───────────────────────────────────────────────╯
+```
+
+**Notes:**
+- If the app doesn't exist, it's created automatically
+- Environment names must be unique within an app
+- Newly created environments have no active build until first deployment
+
+---
+
+### flash env get
+
+Show detailed information about a deployment environment.
+
+```bash
+flash env get <name> [OPTIONS]
+```
+
+**Arguments:**
+- `name`: Name of the environment to inspect
+
+**Options:**
+- `--app, -a`: Flash app name (auto-detected from current directory)
+
+**Example:**
+```bash
+# Get details for production environment
+flash env get production
+
+# Get details for specific app's environment
+flash env get staging --app my-project
+```
+
+**Output:**
+```
+╭────────────────────────────────────╮
+│ Environment: production │
+├────────────────────────────────────┤
+│ ID: env_ghi789 │
+│ State: DEPLOYED │
+│ Active Build: build_rst123 │
+│ Created: 2024-01-20 09:15:00 │
+╰────────────────────────────────────╯
+
+ Associated Endpoints
+┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
+┃ Name ┃ ID ┃
+┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
+│ my-gpu │ ep_abc123 │
+│ my-cpu │ ep_def456 │
+└────────────────┴────────────────────┘
+
+ Associated Network Volumes
+┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
+┃ Name ┃ ID ┃
+┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
+│ model-cache │ nv_xyz789 │
+└────────────────┴────────────────────┘
+```
+
+---
+
+### flash env delete
+
+Delete a deployment environment and all its associated resources.
+
+```bash
+flash env delete <name> [OPTIONS]
+```
+
+**Arguments:**
+- `name`: Name of the environment to delete
+
+**Options:**
+- `--app, -a`: Flash app name (auto-detected from current directory)
+
+**Example:**
+```bash
+# Delete development environment
+flash env delete dev
+
+# Delete environment in specific app
+flash env delete staging --app my-project
+```
+
+**Process:**
+1. Shows environment details and resources to be deleted
+2. Prompts for confirmation (required)
+3. Undeploys all associated endpoints
+4. Removes all associated network volumes
+5. Deletes the environment from the app
+
+**Output:**
+```
+╭───────────────────────────────────╮
+│ Delete Confirmation │
+├───────────────────────────────────┤
+│ Environment 'dev' will be deleted │
+│ │
+│ Environment ID: env_abc123 │
+│ App: my-project │
+│ Active Build: build_xyz789 │
+╰───────────────────────────────────╯
+
+? Are you sure you want to delete environment 'dev'?
+ This will delete all resources associated with this environment! (Y/n)
+
+Undeploying resources for 'dev'...
+Undeployed 2 resource(s) for 'dev'
+Deleting environment 'dev'...
+Environment 'dev' deleted successfully
+```
+
+**Warning:** This operation is irreversible. All endpoints, volumes, and configuration associated with the environment will be permanently deleted.
+
+## Common Workflows
+
+### Creating Your First Environment
+
+When starting a new project:
+
+```bash
+# Create project with flash init
+flash init my-project
+cd my-project
+
+# First deployment automatically creates app and environment
+flash deploy
+# Creates app 'my-project' and environment 'production' if they don't exist
+
+# Or create environment explicitly
+flash env create dev
+```
+
+### Managing Multiple Environments (Dev/Staging/Prod)
+
+```bash
+# Create all environments
+flash env create dev
+flash env create staging
+flash env create production
+
+# Deploy to specific environments
+flash deploy --env dev # Deploy to development
+flash deploy --env staging # Deploy to staging
+flash deploy --env production # Deploy to production
+
+# Check status of each
+flash env get dev
+flash env get staging
+flash env get production
+
+# List all environments
+flash env list
+```
+
+### Switching Between Environments
+
+```bash
+# Deploy different code versions to different environments
+git checkout main
+flash deploy --env production # Production gets main branch
+
+git checkout feature-branch
+flash deploy --env dev # Dev gets feature branch
+
+# Update production with new changes
+git checkout main
+git pull
+flash deploy --env production
+```
+
+### Cleaning Up Environments
+
+```bash
+# List all environments
+flash env list
+
+# Delete old development environment
+flash env delete old-dev
+
+# Delete staging after testing completes
+flash env delete staging
+```
+
+## Environment Concepts
+
+### What is a Flash Environment?
+
+An environment is a logical deployment context that groups:
+- **Endpoints**: Serverless endpoints provisioned from your `@remote` functions
+- **Network Volumes**: Persistent storage for models, cache, etc.
+- **Build Version**: The active build artifact deployed to the environment
+- **State**: Current deployment status (PENDING, DEPLOYED, FAILED, etc.)
+
+### Relationship to Apps and Builds
+
+```
+Flash App (my-project)
+├── Environment: dev
+│ ├── Build: build_v1
+│ ├── Endpoints: [ep1, ep2]
+│ └── Volumes: [vol1]
+├── Environment: staging
+│ ├── Build: build_v2
+│ ├── Endpoints: [ep1, ep2]
+│ └── Volumes: [vol1]
+└── Environment: production
+ ├── Build: build_v2
+ ├── Endpoints: [ep1, ep2]
+ └── Volumes: [vol1]
+```
+
+Each environment can run a different build version, allowing you to test changes in dev before promoting to production.
+
+### Environment Lifecycle
+
+1. **Creation**: `flash env create staging`
+ - State: PENDING
+ - No active build
+ - No endpoints provisioned
+
+2. **First Deployment**: `flash deploy --env staging`
+ - State: DEPLOYING
+ - Provisions endpoints
+ - Registers build as active
+ - State: DEPLOYED
+
+3. **Updates**: `flash deploy --env staging`
+ - Creates new build
+ - Updates endpoints with new code
+ - Updates active build reference
+
+4. **Deletion**: `flash env delete staging`
+ - Undeploys all endpoints
+ - Removes all volumes
+ - Deletes environment record
+
+### Environment States
+
+- **PENDING**: Environment created but not deployed
+- **DEPLOYING**: Deployment in progress
+- **DEPLOYED**: Successfully deployed and running
+- **FAILED**: Deployment or health check failed
+- **DELETING**: Deletion in progress
+
+## Best Practices
+
+### Naming Conventions
+
+Use clear, descriptive names that indicate purpose:
+
+```bash
+# Good
+flash env create dev
+flash env create staging
+flash env create production
+flash env create testing
+
+# Avoid
+flash env create env1
+flash env create test123
+flash env create abc
+```
+
+### Environment Strategy
+
+**Three-tier approach (recommended):**
+```bash
+dev # Active development, frequent deploys
+staging # Pre-production testing, QA validation
+production # Live user-facing deployment
+```
+
+**Simple approach (small projects):**
+```bash
+dev # Development and testing
+production # Live deployment
+```
+
+**Feature-based approach (large teams):**
+```bash
+dev
+feature-auth # Testing authentication feature
+feature-search # Testing search feature
+staging
+production
+```
+
+### Deployment Workflow
+
+1. **Develop locally**: Test with `flash run` or `flash deploy --preview`
+2. **Deploy to dev**: `flash deploy --env dev` for initial testing
+3. **Deploy to staging**: `flash deploy --env staging` for QA validation
+4. **Deploy to production**: `flash deploy --env production` after approval
+
+### Resource Management
+
+- **Monitor environments regularly**: `flash env list` to track active environments
+- **Clean up unused environments**: Delete old feature environments after merge
+- **Check resource usage**: `flash env get <name>` to see associated resources
+- **Delete carefully**: Remember that deletion is irreversible
+
+### Safety Features
+
+The delete command includes safety features:
+- **Confirmation prompt**: Required for all deletions
+- **Resource cleanup**: Automatically undeploys endpoints and volumes
+- **Validation**: Checks that all resources are properly removed
+- **Abort on failure**: If any resource fails to undeploy, deletion is aborted
+
+## Troubleshooting
+
+### Environment Not Found
+
+**Problem**: `Error: Environment 'staging' not found`
+
+**Solution**: List environments to verify name:
+```bash
+flash env list
+```
+
+Create if missing:
+```bash
+flash env create staging
+```
+
+### Multiple Apps Conflict
+
+**Problem**: Running `flash env list` shows wrong app's environments
+
+**Solution**: Specify app explicitly:
+```bash
+flash env list --app my-project
+```
+
+Or navigate to project directory:
+```bash
+cd my-project
+flash env list
+```
+
+### Cannot Delete Environment
+
+**Problem**: `Failed to undeploy all resources; environment deletion aborted`
+
+**Solution**: Check resource status:
+```bash
+flash env get <name>
+```
+
+Manually undeploy problematic resources:
+```bash
+flash undeploy <resource-name>
+```
+
+Then retry deletion:
+```bash
+flash env delete <name>
+```
+
+### Environment Stuck in DEPLOYING State
+
+**Problem**: Environment shows DEPLOYING state but deployment completed
+
+**Solution**: Check endpoint status in Runpod Console:
+- Visit https://console.runpod.io/serverless
+- Check endpoint health and logs
+- If healthy, try deploying again to update state
+
+### App Not Auto-Detected
+
+**Problem**: Command requires `--app` flag even when in project directory
+
+**Solution**: Ensure you're in a Flash project directory with:
+- `main.py` with Flash server
+- `workers/` directory
+- `.env` file with `RUNPOD_API_KEY`
+
+Or specify app explicitly:
+```bash
+flash env list --app my-project
+```
+
+## Related Commands
+
+- [flash deploy](./flash-deploy.md) - Build and deploy to environment
+- [flash app](./flash-app.md) - Manage Flash applications
+- [flash build](./flash-build.md) - Build deployment artifact
+- [flash undeploy](./flash-undeploy.md) - Manage individual endpoints
diff --git a/src/runpod_flash/cli/docs/flash-init.md b/src/runpod_flash/cli/docs/flash-init.md
index 2776fca1..082b619a 100644
--- a/src/runpod_flash/cli/docs/flash-init.md
+++ b/src/runpod_flash/cli/docs/flash-init.md
@@ -1,6 +1,23 @@
# flash init
-Create a new Flash project with Flash Server and GPU/CPU workers.
+Create a new Flash project with a ready-to-use template structure.
+
+## Overview
+
+The `flash init` command scaffolds a new Flash project with everything you need to get started: a main server (mothership), example GPU and CPU workers, and the directory structure that Flash expects. It's the fastest way to go from zero to a working distributed application.
+
+> **Note:** This command only creates **local files**. It doesn't interact with Runpod or create any cloud resources. Cloud resources (apps, environments, endpoints) are created later when you run `flash deploy`.
+
+### When to use this command
+- Starting a new Flash project from scratch
+- Learning how Flash projects are structured
+- Creating a boilerplate to customize for your use case
+
+**After initialization:**
+1. Copy `.env.example` to `.env` and add your `RUNPOD_API_KEY`
+2. Run `flash run` to start the local development server
+3. Customize the workers for your use case
+4. Deploy with `flash deploy` when ready
## Usage
diff --git a/src/runpod_flash/cli/docs/flash-logging.md b/src/runpod_flash/cli/docs/flash-logging.md
index 84b2d3ba..0b586578 100644
--- a/src/runpod_flash/cli/docs/flash-logging.md
+++ b/src/runpod_flash/cli/docs/flash-logging.md
@@ -1,17 +1,32 @@
# File-Based Logging
-Flash automatically logs CLI activity to local files during development, providing a persistent record of operations for debugging and auditing.
+Automatic logging of Flash CLI activity for debugging and auditing.
## Overview
-File-based logging is enabled by default in local development mode and automatically disabled in deployed containers. Logs are written to daily rotating files with configurable retention.
+Flash automatically logs all CLI operations to local files during development. This gives you a persistent record of what happened, which is useful for debugging issues, auditing deployments, or understanding what Flash did behind the scenes.
-**Key Features:**
-- Automatic daily log rotation at midnight
-- Configurable retention period (default: 30 days)
-- Same format as console output
-- Graceful degradation (continues with stdout-only if file logging fails)
-- Zero configuration required (sensible defaults)
+### How it works
+
+File-based logging is enabled by default in local development mode ([flash run](./flash-run.md)) and automatically disabled in deployed containers ([flash deploy](./flash-deploy.md)).
+
+When you run a `@remote` function, Flash logs the activity to a file:
+
+```
+flash run
+ │
+ ├── Console output (what you see)
+ └── .flash/logs/activity.log (persistent record)
+```
+
+Logs are written in the same format as console output, so you can grep through them or review them in any text editor. (See [Log Format](#log-format) for details.)
+
+### Key features
+
+- **Automatic rotation**: New log file each day at midnight
+- **Configurable retention**: Default 30 days, adjustable via an environment variable
+- **Graceful degradation**: Continues with stdout-only if file logging fails
+- **Zero configuration**: Works out of the box with sensible defaults
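The rotation and retention behavior is conceptually similar to Python's stdlib `TimedRotatingFileHandler`. The sketch below is illustrative only — the log path and format are taken from this page, but everything else is an assumption about comparable behavior, not Flash's internals:

```python
import logging
import logging.handlers
import tempfile
from pathlib import Path

# Illustrative only — Flash's actual implementation may differ.
log_dir = Path(tempfile.mkdtemp()) / ".flash" / "logs"
log_dir.mkdir(parents=True)

handler = logging.handlers.TimedRotatingFileHandler(
    log_dir / "activity.log",
    when="midnight",  # new file each day, like Flash's daily rotation
    backupCount=30,   # retention: keep 30 days (the documented default)
)
handler.setFormatter(logging.Formatter("%(asctime)s | %(levelname)s | %(message)s"))

logger = logging.getLogger("flash-demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("dev server started")
handler.flush()
```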
## Log Location
@@ -151,9 +166,9 @@ flash build
## Behavior in Deployed Containers
-File-based logging is **automatically disabled** in deployed RunPod containers, regardless of environment variable settings. This prevents unnecessary disk I/O and storage usage in production.
+File-based logging is **automatically disabled** in deployed Runpod containers, regardless of environment variable settings. This prevents unnecessary disk I/O and storage usage in production.
-Only stdout/stderr logging is active in deployed environments, which is automatically captured by RunPod's logging infrastructure.
+Only stdout/stderr logging is active in deployed environments, which is automatically captured by Runpod's logging infrastructure.
## Troubleshooting
diff --git a/src/runpod_flash/cli/docs/flash-run.md b/src/runpod_flash/cli/docs/flash-run.md
index 36fc8fe7..0b9cfd73 100644
--- a/src/runpod_flash/cli/docs/flash-run.md
+++ b/src/runpod_flash/cli/docs/flash-run.md
@@ -1,6 +1,46 @@
# flash run
-Run Flash development server.
+Start the Flash development server for local development, testing, and debugging.
+
+## Overview
+
+The `flash run` command starts a local development server that hosts your FastAPI app on your machine while deploying `@remote` functions to Runpod Serverless. This hybrid architecture lets you rapidly iterate on your application with hot-reload while testing real GPU/CPU workloads in the cloud.
+
+Use `flash run` when you want to skip the build step and iterate rapidly on your remote functions before deploying your full application with `flash deploy`. (See [flash deploy](./flash-deploy.md) for details.)
+
+## Architecture: Local App + Remote Workers
+
+With `flash run`, your system runs in a **hybrid architecture**:
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│ YOUR MACHINE (localhost:8888) │
+│ ┌─────────────────────────────────────┐ │
+│ │ FastAPI App (main.py) │ │
+│ │ - Your HTTP routes │ │
+│ │ - Orchestrates @remote calls │─────────┐ │
+│ │ - Hot-reload enabled │ │ │
+│ └─────────────────────────────────────┘ │ │
+└──────────────────────────────────────────────────│──────────────┘
+ │ HTTPS
+ ▼
+┌─────────────────────────────────────────────────────────────────┐
+│ RUNPOD SERVERLESS │
+│ ┌─────────────────────────┐ ┌─────────────────────────┐ │
+│ │ live-gpu-worker │ │ live-cpu-worker │ │
+│ │ (your @remote function) │ │ (your @remote function) │ │
+│ └─────────────────────────┘ └─────────────────────────┘ │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+**Key points:**
+- **Your FastAPI app runs locally** on your machine (uvicorn at `localhost:8888`)
+- **`@remote` functions run on Runpod** as serverless endpoints
+- **Your machine is the orchestrator** that calls remote endpoints when you invoke `@remote` functions
+- **Hot reload works** because your app code is local—changes are picked up instantly
+- **Endpoints are prefixed with `live-`** to distinguish development endpoints from production (e.g., `gpu-worker` becomes `live-gpu-worker`)
+
+This is different from `flash deploy`, where **everything** (including your FastAPI app) runs on Runpod. See [flash deploy](./flash-deploy.md) for the fully-deployed architecture.
## Usage
@@ -13,7 +53,7 @@ flash run [OPTIONS]
- `--host`: Host to bind to (default: localhost)
- `--port, -p`: Port to bind to (default: 8888)
- `--reload/--no-reload`: Enable auto-reload (default: enabled)
-- `--auto-provision`: Auto-provision deployable resources on startup (default: disabled)
+- `--auto-provision`: Auto-provision Serverless endpoints on startup (default: disabled)
## Examples
@@ -37,10 +77,32 @@ flash run --host 0.0.0.0 --port 8000
2. Checks for FastAPI app
3. Starts uvicorn server with hot reload
4. GPU workers use LiveServerless (no packaging needed)
+### How It Works
+
+When you call a `@remote` function using `flash run`, Flash deploys a **Serverless endpoint** to Runpod. (These are actual cloud resources that incur costs.)
+
+```
+flash run
+ │
+ ├── Starts local server (e.g. localhost:8888)
+ │ └── Hosts your FastAPI mothership
+ │
+ └── On @remote function call:
+        ├── Deploys a Serverless endpoint (if not cached)
+        └── Executes on the Runpod cloud
+```
+
+### Provisioning Modes
+
+| Mode | When endpoints are deployed |
+|------|----------------------------|
+| Default | Lazily, on first `@remote` function call |
+| `--auto-provision` | Eagerly, at server startup |
+
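The difference between the two modes comes down to when a cached "ensure deployed" step runs. A minimal sketch of the lazy default (names and the URL are made up; this is not Flash's implementation):

```python
import functools

deployed = []  # records deploy calls, for illustration only


@functools.lru_cache(maxsize=None)
def ensure_endpoint(name: str) -> str:
    """Deploy the endpoint once, then reuse it (the default, lazy mode)."""
    deployed.append(name)
    return f"https://example.invalid/{name}"  # placeholder URL


def call_remote(name: str) -> str:
    url = ensure_endpoint(name)  # first call pays the deploy cost
    return url


# Default mode: nothing is deployed until the first call...
call_remote("live-gpu-worker")
call_remote("live-gpu-worker")  # ...and later calls reuse the endpoint.

# --auto-provision would instead call ensure_endpoint() for every
# discovered worker eagerly, before the server accepts requests.
```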
## Auto-Provisioning
-Auto-provisioning discovers and deploys serverless endpoints before the development server starts, eliminating the cold-start delay on first request.
+Auto-provisioning discovers and deploys Serverless endpoints before the Flash development server starts, eliminating the cold-start delay on first request.
### How It Works
diff --git a/src/runpod_flash/cli/docs/flash-undeploy.md b/src/runpod_flash/cli/docs/flash-undeploy.md
index 949c5d15..2c03a313 100644
--- a/src/runpod_flash/cli/docs/flash-undeploy.md
+++ b/src/runpod_flash/cli/docs/flash-undeploy.md
@@ -1,6 +1,30 @@
# flash undeploy
-Manage and delete RunPod serverless endpoints deployed via Flash.
+Manage and delete Runpod serverless endpoints deployed via Flash.
+
+## Overview
+
+The `flash undeploy` command helps you clean up Serverless endpoints that Flash created when you ran or deployed `@remote` functions with `flash run` or `flash deploy`. It manages endpoints recorded in `.runpod/resources.pkl` and ensures both the cloud resources and the local tracking state stay in sync.
+
+### When To Use This Command
+
+- Cleaning up individual endpoints you no longer need
+- Removing endpoints after local development/testing
+
+### `flash undeploy` vs `flash env delete`
+
+| Command | Scope | When to use |
+|---------|-------|----------|
+| `flash undeploy` | Individual endpoints from local tracking | Granular cleanup, development endpoints |
+| `flash env delete` | Entire environment + all its resources | Production cleanup, full teardown |
+
+For production deployments, use `flash env delete` to remove the entire environment and all associated resources automatically.
+
+### How Endpoint Tracking Works
+
+Flash tracks deployed endpoints in `.runpod/resources.pkl`. Endpoints get added to this file when you:
+- Run `flash run --auto-provision` (local development)
+- Run `flash deploy` (production deployment)
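Conceptually, the tracking file is just serialized local state. The real `.runpod/resources.pkl` schema is internal to Flash, so the keys below are hypothetical:

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical illustration of local endpoint tracking. The actual
# .runpod/resources.pkl schema is internal to Flash.
tracking_file = Path(tempfile.mkdtemp()) / "resources.pkl"

state = {"live-gpu-worker": {"endpoint_id": "ep-123", "type": "LiveServerless"}}
tracking_file.write_bytes(pickle.dumps(state))

# Deleting an endpoint in the Runpod console does not touch this file,
# which is why `flash undeploy --cleanup-stale` exists: it reloads the
# state and drops entries whose endpoints no longer respond.
loaded = pickle.loads(tracking_file.read_bytes())
```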
## Synopsis
@@ -8,12 +32,6 @@ Manage and delete RunPod serverless endpoints deployed via Flash.
flash undeploy [NAME|list] [OPTIONS]
```
-## Description
-
-The `flash undeploy` command manages RunPod serverless endpoints that were deployed using the `@remote` decorator. It provides multiple ways to delete endpoints and clean up tracking state.
-
-When you deploy functions with `@remote`, Flash tracks them in `.runpod/resources.pkl`. The undeploy command helps you manage these endpoints through deletion and cleanup operations.
-
## Usage Modes
### List Endpoints
@@ -26,7 +44,7 @@ flash undeploy list
**Output includes:**
- Name: Endpoint name
-- Endpoint ID: RunPod endpoint identifier
+- Endpoint ID: Runpod endpoint identifier
- Status: Current health status (Active/Inactive/Unknown)
- Type: Resource type (Live Serverless, Cpu Live Serverless, etc.)
- Resource ID: Internal tracking identifier
@@ -48,7 +66,7 @@ flash undeploy my-api
1. Searches for endpoints matching the name
2. Shows endpoint details
3. Prompts for confirmation
-4. Deletes endpoint from RunPod
+4. Deletes endpoint from Runpod
5. Removes from local tracking
### Undeploy All
@@ -63,7 +81,7 @@ flash undeploy --all
1. Shows total count of endpoints
2. First confirmation: Yes/No prompt
3. Second confirmation: Type "DELETE ALL" exactly
-4. Deletes all endpoints from RunPod
+4. Deletes all endpoints from Runpod
5. Removes all from tracking
### Interactive Selection
@@ -90,7 +108,7 @@ Remove inactive endpoints from tracking without API deletion:
flash undeploy --cleanup-stale
```
-**Use case:** When endpoints are deleted via RunPod UI or API (not through Flash), the tracking file becomes stale. This command identifies and removes those orphaned entries.
+**Use case:** When endpoints are deleted via Runpod UI or API (not through Flash), the tracking file becomes stale. This command identifies and removes those orphaned entries.
**Behavior:**
1. Checks health status of all tracked endpoints
@@ -132,7 +150,7 @@ flash undeploy --interactive
### Managing External Deletions
-If you delete endpoints via RunPod UI:
+If you delete endpoints via Runpod UI:
```bash
# Check status - will show as "Inactive"
@@ -150,7 +168,7 @@ The Status column performs a health check API call for each endpoint. This:
- Identifies endpoints deleted externally
**Why it's valuable:**
-- Catches endpoints deleted via RunPod UI
+- Catches endpoints deleted via Runpod UI
- Identifies unhealthy endpoints
- Prevents stale tracking file issues
@@ -194,7 +212,7 @@ def my_function(data):
```
Flash automatically:
-1. Deploys endpoint to RunPod
+1. Deploys endpoint to Runpod
2. Tracks in `.runpod/resources.pkl`
3. Reuses endpoint on subsequent calls
@@ -207,7 +225,7 @@ flash undeploy my-api
### Endpoint shows as "Inactive"
-**Cause:** Endpoint was deleted via RunPod UI/API
+**Cause:** Endpoint was deleted via Runpod UI/API
**Solution:**
```bash
@@ -230,17 +248,17 @@ flash undeploy list
**Solution:**
1. Check `RUNPOD_API_KEY` in `.env`
2. Verify network connectivity
-3. Check endpoint still exists on RunPod
+3. Check endpoint still exists on Runpod
## Related Commands
- `flash init` - Initialize new project
- `flash run` - Run development server
- `flash build` - Build deployment packages
-- `flash deploy` - Deploy to RunPod
+- `flash deploy` - Deploy to Runpod
## See Also
- [Flash CLI Overview](./README.md)
-- [RunPod Serverless Documentation](https://docs.runpod.io/serverless/overview)
+- [Runpod Serverless Documentation](https://docs.runpod.io/serverless/overview)
- [Flash Documentation](../../../README.md)
diff --git a/src/runpod_flash/cli/utils/deployment.py b/src/runpod_flash/cli/utils/deployment.py
index 89c61f5f..b7333d12 100644
--- a/src/runpod_flash/cli/utils/deployment.py
+++ b/src/runpod_flash/cli/utils/deployment.py
@@ -192,7 +192,8 @@ async def reconcile_and_provision_resources(
environment_name: Name of environment (for logging)
local_manifest: Local manifest dictionary
environment_id: Optional environment ID for endpoint provisioning
- show_progress: Whether to show CLI progress
+ show_progress: Whether to display progress information during
+ reconciliation and provisioning
Returns:
Updated manifest with deployment information
@@ -216,7 +217,7 @@ async def reconcile_and_provision_resources(
to_delete = state_resources - local_resources # Removed resources
if show_progress:
- print(
+ log.debug(
f"Reconciliation: {len(to_provision)} new, "
f"{len(to_update)} existing, {len(to_delete)} to remove"
)
@@ -274,7 +275,7 @@ async def reconcile_and_provision_resources(
# Delete removed resources
for resource_name in sorted(to_delete):
- log.info(f"Resource {resource_name} marked for deletion (not implemented yet)")
+ log.debug(f"Resource {resource_name} marked for deletion (not implemented yet)")
# Execute all actions in parallel with timeout
if actions:
@@ -308,11 +309,10 @@ async def reconcile_and_provision_resources(
if endpoint_url:
local_manifest["resources_endpoints"][resource_name] = endpoint_url
- if show_progress:
- action_label = (
- "✓ Provisioned" if action_type == "provision" else "✓ Updated"
- )
- print(f"{action_label}: {resource_name} → {endpoint_url}")
+ log.debug(
+ f"{'Provisioned' if action_type == 'provision' else 'Updated'}: "
+ f"{resource_name} -> {endpoint_url}"
+ )
# Validate mothership was provisioned
mothership_resources = [
@@ -338,31 +338,11 @@ async def reconcile_and_provision_resources(
manifest_path = Path.cwd() / ".flash" / "flash_manifest.json"
manifest_path.write_text(json.dumps(local_manifest, indent=2))
- if show_progress:
- print(f"✓ Local manifest updated at {manifest_path.relative_to(Path.cwd())}")
+ log.debug(f"Local manifest updated at {manifest_path.relative_to(Path.cwd())}")
# Overwrite State Manager manifest with local manifest
await app.update_build_manifest(build_id, local_manifest)
- if show_progress:
- print("✓ State Manager manifest updated")
- print()
-
- # Display mothership in simplified format
- resources_endpoints = local_manifest.get("resources_endpoints", {})
- resources = local_manifest.get("resources", {})
-
- for resource_name in sorted(resources_endpoints.keys()):
- resource_config = resources.get(resource_name, {})
- is_mothership = resource_config.get("is_mothership", False)
-
- if is_mothership:
- print(f"🚀 Deployed: {app.name}")
- print(f" Environment: {environment_name}")
- print(f" URL: {resources_endpoints[resource_name]}")
- print()
- break
-
return local_manifest.get("resources_endpoints", {})
@@ -398,26 +378,24 @@ def validate_local_manifest() -> Dict[str, Any]:
return manifest
-async def deploy_to_environment(
- app_name: str, env_name: str, build_path: Path
+async def deploy_from_uploaded_build(
+ app: FlashApp,
+ build_id: str,
+ env_name: str,
+ local_manifest: Dict[str, Any],
) -> Dict[str, Any]:
- """Deploy current project to environment.
+ """Deploy an already-uploaded build to an environment.
- Raises:
- runpod_flash.core.resources.app.FlashEnvironmentNotFoundError: If the environment does not exist
- FileNotFoundError: If manifest not found
- ValueError: If manifest is invalid
- """
- # Validate manifest exists before proceeding
- local_manifest = validate_local_manifest()
+ Args:
+ app: FlashApp instance (already resolved)
+ build_id: ID of the uploaded build
+ env_name: Target environment name
+ local_manifest: Validated local manifest dict
- app = await FlashApp.from_name(app_name)
- # Verify environment exists (will raise FlashEnvironmentNotFoundError if not)
+ Returns:
+ Deployment result with resources_endpoints and local_manifest keys
+ """
environment = await app.get_environment_by_name(env_name)
-
- build = await app.upload_build(build_path)
- build_id = build["id"]
-
result = await app.deploy_build_to_environment(build_id, environment_name=env_name)
try:
@@ -427,13 +405,15 @@ async def deploy_to_environment(
env_name,
local_manifest,
environment_id=environment.get("id"),
- show_progress=True,
+ show_progress=False,
)
log.debug(f"Provisioned {len(resources_endpoints)} resources for {env_name}")
except Exception as e:
log.error(f"Resource provisioning failed: {e}")
raise
+ result["resources_endpoints"] = resources_endpoints
+ result["local_manifest"] = local_manifest
return result
diff --git a/src/runpod_flash/cli/utils/formatting.py b/src/runpod_flash/cli/utils/formatting.py
new file mode 100644
index 00000000..b4eb12c0
--- /dev/null
+++ b/src/runpod_flash/cli/utils/formatting.py
@@ -0,0 +1,33 @@
+"""CLI output formatting helpers."""
+
+from datetime import datetime
+
+STATE_STYLE = {"HEALTHY": "green", "BUILDING": "cyan", "ERROR": "red"}
+
+
+def state_dot(state: str) -> str:
+ """Colored ● indicator for a resource/environment state."""
+ color = STATE_STYLE.get(state, "yellow")
+ return f"[{color}]●[/{color}]"
+
+
+def format_datetime(value: str | None) -> str:
+ """Format an ISO 8601 datetime string into a human-readable local time.
+
+ Returns a string like "Thu, Feb 19 2026 1:33 PM PST".
+ Returns "-" for None/empty values, or the original value if unparseable.
+ """
+ if not value:
+ return "-"
+
+ try:
+ dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
+ local_dt = dt.astimezone()
+ tz_name = local_dt.strftime("%Z")
+ # strftime with manual zero-strip for cross-platform compat
+    # (%-d and %-I are glibc extensions, not available on Windows)
+ day = local_dt.day
+ hour = int(local_dt.strftime("%I"))
+ return local_dt.strftime(f"%a, %b {day} %Y {hour}:%M %p {tz_name}")
+ except (ValueError, TypeError):
+ return value
diff --git a/src/runpod_flash/cli/utils/ignore.py b/src/runpod_flash/cli/utils/ignore.py
index bd3b8c7b..b9634dc2 100644
--- a/src/runpod_flash/cli/utils/ignore.py
+++ b/src/runpod_flash/cli/utils/ignore.py
@@ -128,7 +128,6 @@ def get_file_tree(
for item in directory.iterdir():
# Check if should ignore
if should_ignore(item, spec, base_dir):
- log.debug(f"Ignoring: {item.relative_to(base_dir)}")
continue
if item.is_file():
diff --git a/src/runpod_flash/cli/utils/skeleton_template/README.md b/src/runpod_flash/cli/utils/skeleton_template/README.md
index be7b8d55..6c4801e5 100644
--- a/src/runpod_flash/cli/utils/skeleton_template/README.md
+++ b/src/runpod_flash/cli/utils/skeleton_template/README.md
@@ -128,7 +128,7 @@ The `@remote` decorator transparently executes functions on serverless infrastru
### Resource Scaling
Both workers scale to zero when idle to minimize costs:
-- **idleTimeout**: Minutes before scaling down (default: 5)
+- **idleTimeout**: Seconds before scaling down (default: 60)
- **workersMin**: 0 = completely scales to zero
- **workersMax**: Maximum concurrent workers
diff --git a/src/runpod_flash/cli/utils/skeleton_template/mothership.py b/src/runpod_flash/cli/utils/skeleton_template/mothership.py
index 85779bfc..2f1eb408 100644
--- a/src/runpod_flash/cli/utils/skeleton_template/mothership.py
+++ b/src/runpod_flash/cli/utils/skeleton_template/mothership.py
@@ -16,14 +16,14 @@
Documentation: https://docs.runpod.io/flash/mothership
"""
-from runpod_flash import CpuLiveLoadBalancer
+from runpod_flash import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN, CpuLiveLoadBalancer
# Mothership endpoint configuration
# This serves your FastAPI app routes from main.py
mothership = CpuLiveLoadBalancer(
name="mothership",
- workersMin=1,
- workersMax=1,
+ workersMin=DEFAULT_WORKERS_MIN,
+ workersMax=DEFAULT_WORKERS_MAX,
)
# Examples of customization:
diff --git a/src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py b/src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py
index e025ed76..6d55f19c 100644
--- a/src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py
+++ b/src/runpod_flash/cli/utils/skeleton_template/workers/cpu/endpoint.py
@@ -1,9 +1,9 @@
-from runpod_flash import CpuLiveServerless, remote
+from runpod_flash import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN, CpuLiveServerless, remote
cpu_config = CpuLiveServerless(
name="cpu_worker",
- workersMin=0,
- workersMax=1,
+ workersMin=DEFAULT_WORKERS_MIN,
+ workersMax=DEFAULT_WORKERS_MAX,
idleTimeout=5,
)
diff --git a/src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py b/src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py
index fc2bae4e..3f2085ec 100644
--- a/src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py
+++ b/src/runpod_flash/cli/utils/skeleton_template/workers/gpu/endpoint.py
@@ -1,10 +1,10 @@
-from runpod_flash import GpuGroup, LiveServerless, remote
+from runpod_flash import DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN, GpuGroup, LiveServerless, remote
gpu_config = LiveServerless(
name="gpu_worker",
gpus=[GpuGroup.ANY],
- workersMin=0,
- workersMax=1,
+ workersMin=DEFAULT_WORKERS_MIN,
+ workersMax=DEFAULT_WORKERS_MAX,
idleTimeout=5,
)
diff --git a/src/runpod_flash/client.py b/src/runpod_flash/client.py
index 1288e24f..ed68bc30 100644
--- a/src/runpod_flash/client.py
+++ b/src/runpod_flash/client.py
@@ -25,17 +25,9 @@ def _should_execute_locally(func_name: str) -> bool:
# Check if we're in a deployed environment
runpod_endpoint_id = os.getenv("RUNPOD_ENDPOINT_ID")
runpod_pod_id = os.getenv("RUNPOD_POD_ID")
- flash_resource_name = os.getenv("FLASH_RESOURCE_NAME")
-
- log.debug(
- f"@remote decorator for {func_name}: "
- f"RUNPOD_ENDPOINT_ID={runpod_endpoint_id}, "
- f"FLASH_RESOURCE_NAME={flash_resource_name}"
- )
if not runpod_endpoint_id and not runpod_pod_id:
# Local development - create stub for remote execution via ResourceManager
- log.debug(f"@remote {func_name}: local dev mode, creating stub")
return False
# In deployed environment - check build-time generated configuration
@@ -43,9 +35,6 @@ def _should_execute_locally(func_name: str) -> bool:
from .runtime._flash_resource_config import is_local_function
result = is_local_function(func_name)
- log.debug(
- f"@remote {func_name}: deployed mode, is_local_function returned {result}"
- )
return result
except ImportError as e:
# Configuration not generated (shouldn't happen in deployed env)
@@ -186,14 +175,10 @@ def decorator(func_or_class):
if should_execute_local:
# This function belongs to our resource - execute locally
- log.debug(
- f"@remote {func_name}: returning original function (local execution)"
- )
func_or_class.__remote_config__ = routing_config
return func_or_class
# Remote execution mode - create stub for calling other endpoints
- log.debug(f"@remote {func_name}: creating wrapper for remote execution")
if inspect.isclass(func_or_class):
# Handle class decoration
diff --git a/src/runpod_flash/core/api/runpod.py b/src/runpod_flash/core/api/runpod.py
index c3e04443..bc30219a 100644
--- a/src/runpod_flash/core/api/runpod.py
+++ b/src/runpod_flash/core/api/runpod.py
@@ -3,7 +3,7 @@
Bypasses the outdated runpod-python SDK limitations.
"""
-import json
+import json # noqa: F401 - used in commented debug logs
import logging
import os
from typing import Any, Dict, Optional, List
@@ -69,11 +69,14 @@ def __init__(self, api_key: Optional[str] = None):
async def _get_session(self) -> aiohttp.ClientSession:
"""Get or create an aiohttp session."""
if self.session is None or self.session.closed:
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
timeout = aiohttp.ClientTimeout(total=300) # 5 minute timeout
connector = aiohttp.TCPConnector(resolver=ThreadedResolver())
self.session = aiohttp.ClientSession(
timeout=timeout,
headers={
+ "User-Agent": get_user_agent(),
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json",
},
@@ -89,19 +92,19 @@ async def _execute_graphql(
payload = {"query": query, "variables": variables or {}}
- log.debug(f"GraphQL Query: {query}")
- sanitized_vars = _sanitize_for_logging(variables)
- log.debug(f"GraphQL Variables: {json.dumps(sanitized_vars, indent=2)}")
+ # log.debug(f"GraphQL Query: {query}")
+ # sanitized_vars = _sanitize_for_logging(variables)
+ # log.debug(f"GraphQL Variables: {json.dumps(sanitized_vars, indent=2)}")
try:
async with session.post(self.GRAPHQL_URL, json=payload) as response:
response_data = await response.json()
- log.debug(f"GraphQL Response Status: {response.status}")
- sanitized_response = _sanitize_for_logging(response_data)
- log.debug(
- f"GraphQL Response: {json.dumps(sanitized_response, indent=2)}"
- )
+ # log.debug(f"GraphQL Response Status: {response.status}")
+ # sanitized_response = _sanitize_for_logging(response_data)
+ # log.debug(
+ # f"GraphQL Response: {json.dumps(sanitized_response, indent=2)}"
+ # )
if response.status >= 400:
sanitized_err = _sanitize_for_logging(response_data)
@@ -153,7 +156,7 @@ async def update_template(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
raise Exception("Unexpected GraphQL response structure")
template_data = result["saveTemplate"]
- log.info(
+ log.debug(
f"Updated template: {template_data.get('id', 'unknown')} - {template_data.get('name', 'unnamed')}"
)
@@ -207,7 +210,7 @@ async def save_endpoint(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
raise Exception("Unexpected GraphQL response structure")
endpoint_data = result["saveEndpoint"]
- log.info(
+ log.debug(
f"Saved endpoint: {endpoint_data.get('id', 'unknown')} - {endpoint_data.get('name', 'unnamed')}"
)
@@ -278,7 +281,7 @@ async def delete_endpoint(self, endpoint_id: str) -> Dict[str, Any]:
"""
variables = {"id": endpoint_id}
- log.info(f"Deleting endpoint: {endpoint_id}")
+ log.debug(f"Deleting endpoint: {endpoint_id}")
result = await self._execute_graphql(mutation, variables)
@@ -351,8 +354,6 @@ async def finalize_artifact_upload(
"""
variables = {"input": input_data}
- log.debug(f"finalizing upload for flash app: {input_data}")
-
result = await self._execute_graphql(mutation, variables)
return result["finalizeFlashArtifactUpload"]
@@ -404,7 +405,6 @@ async def get_flash_app_by_name(self, app_name: str) -> Dict[str, Any]:
"""
variables = {"flashAppName": app_name}
- log.debug(f"Fetching flash app by name for input: {app_name}")
result = await self._execute_graphql(query, variables)
return result["flashAppByName"]
@@ -457,7 +457,6 @@ async def get_flash_environment_by_name(
"""
variables = {"input": input_data}
- log.debug(f"Fetching flash environment by name for input: {variables}")
result = await self._execute_graphql(query, variables)
return result["flashEnvironmentByName"]
@@ -510,8 +509,6 @@ async def deploy_build_to_environment(
variables = {"input": input_data}
- log.debug(f"Deploying flash environment with vars: {input_data}")
-
result = await self._execute_graphql(mutation, variables)
return result["deployBuildToEnvironment"]
@@ -741,7 +738,7 @@ async def delete_flash_app(self, app_id: str) -> Dict[str, Any]:
"""
variables = {"flashAppId": app_id}
- log.info(f"Deleting flash app: {app_id}")
+ log.debug(f"Deleting flash app: {app_id}")
result = await self._execute_graphql(mutation, variables)
return {"success": "deleteFlashApp" in result}
@@ -755,7 +752,7 @@ async def delete_flash_environment(self, environment_id: str) -> Dict[str, Any]:
"""
variables = {"flashEnvironmentId": environment_id}
- log.info(f"Deleting flash environment: {environment_id}")
+ log.debug(f"Deleting flash environment: {environment_id}")
result = await self._execute_graphql(mutation, variables)
return {"success": "deleteFlashEnvironment" in result}
@@ -781,7 +778,7 @@ async def endpoint_exists(self, endpoint_id: str) -> bool:
log.debug(f"Endpoint {endpoint_id} exists: {exists}")
return exists
except Exception as e:
- log.error(f"Error checking endpoint existence: {e}")
+ log.debug(f"Error checking endpoint existence: {e}")
return False
async def close(self):
@@ -812,10 +809,13 @@ def __init__(self, api_key: Optional[str] = None):
async def _get_session(self) -> aiohttp.ClientSession:
"""Get or create an aiohttp session."""
if self.session is None or self.session.closed:
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
timeout = aiohttp.ClientTimeout(total=300) # 5 minute timeout
self.session = aiohttp.ClientSession(
timeout=timeout,
headers={
+ "User-Agent": get_user_agent(),
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json",
},
@@ -828,15 +828,15 @@ async def _execute_rest(
"""Execute a REST API request."""
session = await self._get_session()
- log.debug(f"REST Request: {method} {url}")
- log.debug(f"REST Data: {json.dumps(data, indent=2) if data else 'None'}")
+ # log.debug(f"REST Request: {method} {url}")
+ # log.debug(f"REST Data: {json.dumps(data, indent=2) if data else 'None'}")
try:
async with session.request(method, url, json=data) as response:
response_data = await response.json()
- log.debug(f"REST Response Status: {response.status}")
- log.debug(f"REST Response: {json.dumps(response_data, indent=2)}")
+ # log.debug(f"REST Response Status: {response.status}")
+ # log.debug(f"REST Response: {json.dumps(response_data, indent=2)}")
if response.status >= 400:
raise Exception(
@@ -857,7 +857,7 @@ async def create_network_volume(self, payload: Dict[str, Any]) -> Dict[str, Any]
"POST", f"{RUNPOD_REST_API_URL}/networkvolumes", payload
)
- log.info(
+ log.debug(
f"Created network volume: {result.get('id', 'unknown')} - {result.get('name', 'unnamed')}"
)
diff --git a/src/runpod_flash/core/discovery.py b/src/runpod_flash/core/discovery.py
index 8ce4f3e5..06c5d57e 100644
--- a/src/runpod_flash/core/discovery.py
+++ b/src/runpod_flash/core/discovery.py
@@ -52,9 +52,6 @@ def discover(self) -> List[DeployableResource]:
resource = self._resolve_resource_variable(module, var_name)
if resource:
resources.append(resource)
- log.debug(
- f"Discovered resource: {var_name} -> {resource.__class__.__name__}"
- )
else:
log.warning(f"Failed to import {self.entry_point}")
@@ -405,10 +402,6 @@ def _scan_project_directory(self) -> List[DeployableResource]:
resource = self._resolve_resource_variable(module, var_name)
if resource:
resources.append(resource)
- log.debug(
- f"Discovered resource in {file_path.relative_to(project_root)}: "
- f"{var_name} -> {resource.__class__.__name__}"
- )
except Exception as e:
log.debug(f"Failed to scan {file_path}: {e}")
diff --git a/src/runpod_flash/core/resources/app.py b/src/runpod_flash/core/resources/app.py
index 9c17828b..2fdce8f3 100644
--- a/src/runpod_flash/core/resources/app.py
+++ b/src/runpod_flash/core/resources/app.py
@@ -185,10 +185,8 @@ async def _hydrate(self) -> None:
"""
async with self._hydrate_lock:
if self._hydrated:
- log.debug("App is already hydrated while calling hydrate. Returning")
return
- log.debug("Hydrating app")
async with RunpodGraphQLClient() as client:
try:
result = await client.get_flash_app_by_name(self.name)
@@ -337,11 +335,16 @@ async def download_tarball(self, environment_id: str, dest_file: str) -> None:
ValueError: If environment has no active artifact
requests.HTTPError: If download fails
"""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
await self._hydrate()
result = await self._get_active_artifact(environment_id)
url = result["downloadUrl"]
+
+ headers = {"User-Agent": get_user_agent()}
+
with open(dest_file, "wb") as stream:
- with requests.get(url, stream=True) as resp:
+ with requests.get(url, stream=True, headers=headers) as resp:
resp.raise_for_status()
for chunk in resp.iter_content():
if chunk:
@@ -462,6 +465,8 @@ async def upload_build(self, tar_path: Union[str, Path]) -> Dict[str, Any]:
except json.JSONDecodeError as e:
raise ValueError(f"Invalid manifest JSON at {manifest_path}: {e}") from e
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
await self._hydrate()
tarball_size = tar_path.stat().st_size
@@ -469,7 +474,10 @@ async def upload_build(self, tar_path: Union[str, Path]) -> Dict[str, Any]:
url = result["uploadUrl"]
object_key = result["objectKey"]
- headers = {"Content-Type": TARBALL_CONTENT_TYPE}
+ headers = {
+ "User-Agent": get_user_agent(),
+ "Content-Type": TARBALL_CONTENT_TYPE,
+ }
with tar_path.open("rb") as fh:
resp = requests.put(url, data=fh, headers=headers)
diff --git a/src/runpod_flash/core/resources/constants.py b/src/runpod_flash/core/resources/constants.py
index e927c09b..86adad9c 100644
--- a/src/runpod_flash/core/resources/constants.py
+++ b/src/runpod_flash/core/resources/constants.py
@@ -31,7 +31,7 @@ def _endpoint_domain_from_base_url(base_url: str) -> str:
)
# Worker configuration defaults
-DEFAULT_WORKERS_MIN = 1
+DEFAULT_WORKERS_MIN = 0
DEFAULT_WORKERS_MAX = 1
# Flash app artifact upload constants
diff --git a/src/runpod_flash/core/resources/load_balancer_sls_resource.py b/src/runpod_flash/core/resources/load_balancer_sls_resource.py
index 56804e75..eb664ed0 100644
--- a/src/runpod_flash/core/resources/load_balancer_sls_resource.py
+++ b/src/runpod_flash/core/resources/load_balancer_sls_resource.py
@@ -198,7 +198,7 @@ async def _wait_for_health(
if not self.id:
raise ValueError("Cannot wait for health: endpoint not deployed")
- log.info(
+ log.debug(
f"Waiting for LB endpoint {self.name} ({self.id}) to become healthy... "
f"(max {max_retries} retries, {retry_interval}s interval)"
)
@@ -206,7 +206,7 @@ async def _wait_for_health(
for attempt in range(max_retries):
try:
if await self._check_ping_endpoint():
- log.info(
+ log.debug(
f"LB endpoint {self.name} is healthy (attempt {attempt + 1})"
)
return True
@@ -223,7 +223,7 @@ async def _wait_for_health(
if attempt < max_retries - 1:
await asyncio.sleep(retry_interval)
- log.error(
+ log.debug(
f"LB endpoint {self.name} failed to become healthy after "
f"{max_retries} attempts"
)
@@ -259,14 +259,14 @@ async def _do_deploy(self) -> "LoadBalancerSlsResource":
self.env["FLASH_IS_MOTHERSHIP"] = "true"
# Call parent deploy (creates endpoint via RunPod API)
- log.info(f"Deploying LB endpoint {self.name}...")
+ log.debug(f"Deploying LB endpoint {self.name}...")
deployed = await super()._do_deploy()
- log.info(f"LB endpoint {self.name} ({deployed.id}) deployed successfully")
+ log.debug(f"LB endpoint {self.name} ({deployed.id}) deployed successfully")
return deployed
except Exception as e:
- log.error(f"Failed to deploy LB endpoint {self.name}: {e}")
+ log.debug(f"Failed to deploy LB endpoint {self.name}: {e}")
raise
def is_deployed(self) -> bool:
diff --git a/src/runpod_flash/core/resources/network_volume.py b/src/runpod_flash/core/resources/network_volume.py
index 48851cb8..7c7af7ff 100644
--- a/src/runpod_flash/core/resources/network_volume.py
+++ b/src/runpod_flash/core/resources/network_volume.py
@@ -111,7 +111,7 @@ async def _find_existing_volume(self, client) -> Optional["NetworkVolume"]:
existing_volumes = self._normalize_volumes_response(volumes_response)
if matching_volume := self._find_matching_volume(existing_volumes):
- log.info(
+ log.debug(
f"Found existing network volume: {matching_volume.get('id')} with name '{self.name}'"
)
# Update our instance with the existing volume's ID
diff --git a/src/runpod_flash/core/resources/resource_manager.py b/src/runpod_flash/core/resources/resource_manager.py
index 447d48b1..0cd18f51 100644
--- a/src/runpod_flash/core/resources/resource_manager.py
+++ b/src/runpod_flash/core/resources/resource_manager.py
@@ -101,7 +101,7 @@ def _migrate_to_name_based_keys(self) -> None:
migrated_configs[key] = self._resource_configs.get(key, "")
if len(migrated) != len(self._resources):
- log.info(f"Migrated {len(self._resources)} resources to name-based keys")
+ log.debug(f"Migrated {len(self._resources)} resources to name-based keys")
self._resources = migrated
self._resource_configs = migrated_configs
self._save_resources() # Persist migration
@@ -136,7 +136,7 @@ def _refresh_config_hashes(self) -> None:
# Save if any hashes were updated
if updated:
- log.info("Refreshed config hashes after code changes")
+ log.debug("Refreshed config hashes after code changes")
self._save_resources()
def _save_resources(self) -> None:
@@ -152,7 +152,6 @@ def _save_resources(self) -> None:
data = (self._resources, self._resource_configs)
cloudpickle.dump(data, f)
f.flush() # Ensure data is written to disk
- log.debug(f"Saved resources in {RESOURCE_STATE_FILE}")
except (FileLockError, Exception) as e:
log.error(f"Failed to save resources to {RESOURCE_STATE_FILE}: {e}")
raise
@@ -224,15 +223,6 @@ async def get_or_deploy_resource(
resource_key = config.get_resource_key()
new_config_hash = config.config_hash
- log.debug(
- f"get_or_deploy_resource called:\n"
- f" Config type: {type(config).__name__}\n"
- f" Config name: {getattr(config, 'name', 'N/A')}\n"
- f" Resource key: {resource_key}\n"
- f" New config hash: {new_config_hash[:16]}...\n"
- f" Available keys in cache: {list(self._resources.keys())}"
- )
-
# Ensure global lock is initialized
assert ResourceManager._global_lock is not None, "Global lock not initialized"
@@ -247,7 +237,6 @@ async def get_or_deploy_resource(
existing = self._resources.get(resource_key)
if existing:
- log.debug(f"Resource found in cache: {resource_key}")
# Resource exists - check if still valid
if not existing.is_deployed():
log.warning(f"{existing} is no longer valid, redeploying.")
@@ -256,7 +245,7 @@ async def get_or_deploy_resource(
deployed_resource = await self._deploy_with_error_context(
config
)
- log.info(f"URL: {deployed_resource.url}")
+ log.debug(f"URL: {deployed_resource.url}")
self._add_resource(resource_key, deployed_resource)
return deployed_resource
except Exception:
@@ -273,21 +262,6 @@ async def get_or_deploy_resource(
stored_config_hash = self._resource_configs.get(resource_key, "")
if stored_config_hash != new_config_hash:
- # Detailed drift debugging
- log.debug(
- f"DRIFT DEBUG for '{config.name}':\n"
- f" Stored hash: {stored_config_hash}\n"
- f" New hash: {new_config_hash}\n"
- f" Stored resource type: {type(existing).__name__}\n"
- f" New resource type: {type(config).__name__}\n"
- f" Existing config fields: {existing.model_dump(exclude_none=True, exclude={'id'}) if hasattr(existing, 'model_dump') else 'N/A'}\n"
- f" New config fields: {config.model_dump(exclude_none=True, exclude={'id'}) if hasattr(config, 'model_dump') else 'N/A'}"
- )
- log.info(
- f"Config drift detected for '{config.name}': "
- f"Automatically updating endpoint"
- )
-
# Attempt update (will redeploy if structural changes detected)
if hasattr(existing, "update"):
updated_resource = await existing.update(config)
@@ -304,7 +278,7 @@ async def get_or_deploy_resource(
deployed_resource = await self._deploy_with_error_context(
config
)
- log.info(f"URL: {deployed_resource.url}")
+ log.debug(f"URL: {deployed_resource.url}")
self._add_resource(resource_key, deployed_resource)
return deployed_resource
except Exception:
@@ -318,18 +292,13 @@ async def get_or_deploy_resource(
raise
# Config unchanged, reuse existing
- log.debug(f"{existing} exists, reusing (config unchanged)")
log.info(f"URL: {existing.url}")
return existing
# No existing resource, deploy new one
- log.debug(
- f"Resource NOT found in cache, deploying new: {resource_key}\n"
- f" Searched in keys: {list(self._resources.keys())}"
- )
try:
deployed_resource = await self._deploy_with_error_context(config)
- log.info(f"URL: {deployed_resource.url}")
+ log.debug(f"URL: {deployed_resource.url}")
self._add_resource(resource_key, deployed_resource)
return deployed_resource
except Exception:
diff --git a/src/runpod_flash/core/resources/serverless.py b/src/runpod_flash/core/resources/serverless.py
index be8a56b4..8a9bde52 100644
--- a/src/runpod_flash/core/resources/serverless.py
+++ b/src/runpod_flash/core/resources/serverless.py
@@ -1,7 +1,9 @@
import asyncio
+import json
import logging
import os
from enum import Enum
+from pathlib import Path
from typing import Any, ClassVar, Dict, List, Optional, Set
from pydantic import (
@@ -17,7 +19,7 @@
from ..utils.backoff import get_backoff_delay
from .base import DeployableResource
from .cloud import runpod
-from .constants import CONSOLE_URL
+from .constants import CONSOLE_URL, DEFAULT_WORKERS_MAX, DEFAULT_WORKERS_MIN
from .environment import EnvironmentVars
from .cpu import CpuInstanceType
from .gpu import GpuGroup, GpuType
@@ -154,7 +156,7 @@ class ServerlessResource(DeployableResource):
# === Input Fields ===
executionTimeoutMs: Optional[int] = 0
gpuCount: Optional[int] = 1
- idleTimeout: Optional[int] = 5
+ idleTimeout: Optional[int] = 60
instanceIds: Optional[List[CpuInstanceType]] = None
locations: Optional[str] = None
name: str
@@ -164,8 +166,8 @@ class ServerlessResource(DeployableResource):
scalerValue: Optional[int] = 4
templateId: Optional[str] = None
type: Optional[ServerlessType] = ServerlessType.QB
- workersMax: Optional[int] = 1
- workersMin: Optional[int] = 0
+ workersMax: Optional[int] = DEFAULT_WORKERS_MAX
+ workersMin: Optional[int] = DEFAULT_WORKERS_MIN
workersPFBTarget: Optional[int] = 0
# === Runtime Fields ===
@@ -364,7 +366,7 @@ def _apply_smart_disk_sizing(self, template: PodTemplate) -> None:
# Auto-size if using default value
default_disk_size = PodTemplate.model_fields["containerDiskInGb"].default
if template.containerDiskInGb == default_disk_size:
- log.info(
+ log.debug(
f"Auto-sizing containerDiskInGb from {default_disk_size}GB "
f"to {cpu_limit}GB (CPU instance limit)"
)
@@ -475,7 +477,7 @@ def is_deployed(self) -> bool:
response = self.endpoint.health()
return response is not None
except Exception as e:
- log.error(f"Error checking {self}: {e}")
+ log.debug(f"Error checking {self}: {e}")
return False
def _payload_exclude(self) -> Set[str]:
@@ -510,9 +512,65 @@ def _build_template_update_payload(
payload["id"] = template_id
return payload
+ def _check_makes_remote_calls(self) -> bool:
+ """Check if resource makes remote calls from build manifest.
+
+ Reads flash_manifest.json to determine if this resource config
+ has makes_remote_calls=True.
+
+ Returns:
+ True if makes remote calls, False if local-only,
+ True (safe default) if manifest not found.
+ """
+ try:
+ manifest_path = Path.cwd() / "flash_manifest.json"
+ if not manifest_path.exists():
+ # Try alternative locations
+ manifest_path = Path("/flash_manifest.json") # Container path
+
+ if not manifest_path.exists():
+ log.debug("Manifest not found, assuming makes_remote_calls=True")
+ return True # Safe default
+
+ with open(manifest_path) as f:
+ manifest_data = json.load(f)
+
+ resources = manifest_data.get("resources", {})
+
+ # Strip -fb suffix and live- prefix to match manifest name
+ lookup_name = self.name
+ if lookup_name.endswith("-fb"):
+ lookup_name = lookup_name[:-3]
+ if lookup_name.startswith(LIVE_PREFIX):
+ lookup_name = lookup_name[len(LIVE_PREFIX) :]
+
+ resource_config = resources.get(lookup_name)
+
+ if not resource_config:
+ log.debug(
+ f"Resource '{lookup_name}' (from '{self.name}') not in manifest, assuming makes_remote_calls=True"
+ )
+ return True # Safe default
+
+ makes_remote_calls = resource_config.get("makes_remote_calls", True)
+ log.debug(
+ f"Resource '{lookup_name}' (from '{self.name}') makes_remote_calls={makes_remote_calls}"
+ )
+ return makes_remote_calls
+
+ except Exception as e:
+ log.warning(
+ f"Failed to read manifest: {e}, assuming makes_remote_calls=True"
+ )
+ return True # Safe default on error
+
async def _do_deploy(self) -> "DeployableResource":
"""
Deploys the serverless resource using the provided configuration.
+
+ For queue-based endpoints that make remote calls, injects RUNPOD_API_KEY
+ into environment variables if not already set.
+
Returns a DeployableResource object.
"""
try:
@@ -521,7 +579,32 @@ async def _do_deploy(self) -> "DeployableResource":
log.debug(f"{self} exists")
return self
- # NEW: Ensure network volume is deployed first
+ # Inject API key for queue-based endpoints that make remote calls
+ if self.type == ServerlessType.QB:
+ env_dict = self.env or {}
+
+ # Check if this resource makes remote calls (from build manifest)
+ makes_remote_calls = self._check_makes_remote_calls()
+
+ if makes_remote_calls:
+ # Inject RUNPOD_API_KEY if not already set
+ if "RUNPOD_API_KEY" not in env_dict:
+ api_key = os.getenv("RUNPOD_API_KEY")
+ if api_key:
+ env_dict["RUNPOD_API_KEY"] = api_key
+ log.info(
+ f"{self.name}: Injected RUNPOD_API_KEY for remote calls "
+ f"(makes_remote_calls=True)"
+ )
+ else:
+ log.warning(
+ f"{self.name}: makes_remote_calls=True but RUNPOD_API_KEY not set. "
+ f"Remote calls to other endpoints will fail."
+ )
+
+ self.env = env_dict
+
+ # Ensure network volume is deployed first
await self._ensure_network_volume_deployed()
async with RunpodGraphQLClient() as client:
@@ -563,13 +646,8 @@ async def update(self, new_config: "ServerlessResource") -> "ServerlessResource"
try:
resolved_template_id = self.templateId or new_config.templateId
- # Log if version-triggering changes detected (informational only)
- if self._has_structural_changes(new_config):
- log.info(
- f"{self.name}: Version-triggering changes detected. "
- "Server will increment version and recreate workers."
- )
- else:
+ # Check for version-triggering changes
+ if not self._has_structural_changes(new_config):
log.info(f"Updating endpoint '{self.name}' (ID: {self.id})")
# Ensure network volume is deployed if specified
@@ -595,7 +673,7 @@ async def update(self, new_config: "ServerlessResource") -> "ServerlessResource"
new_config.template, resolved_template_id
)
await client.update_template(template_payload)
- log.info(
+ log.debug(
f"Updated template '{resolved_template_id}' for endpoint '{self.name}'"
)
else:
@@ -614,7 +692,9 @@ async def update(self, new_config: "ServerlessResource") -> "ServerlessResource"
# env, networkVolume, datacenter), and dropping them causes
# repeated false drift on subsequent deploys.
updated = await new_config._sync_graphql_object_with_inputs(updated)
- log.info(f"Successfully updated endpoint '{self.name}' (ID: {self.id})")
+ log.debug(
+ f"Successfully updated endpoint '{self.name}' (ID: {self.id})"
+ )
return updated
raise ValueError("Update failed, no endpoint was returned.")
@@ -669,11 +749,9 @@ def _has_structural_changes(self, new_config: "ServerlessResource") -> bool:
# Handle list comparison
if isinstance(old_val, list) and isinstance(new_val, list):
if sorted(str(v) for v in old_val) != sorted(str(v) for v in new_val):
- log.debug(f"Structural change in '{field}': {old_val} → {new_val}")
return True
# Handle other types
elif old_val != new_val:
- log.debug(f"Structural change in '{field}': {old_val} → {new_val}")
return True
return False
@@ -707,21 +785,21 @@ async def _do_undeploy(self) -> bool:
success = result.get("success", False)
if success:
- log.info(f"{self} successfully undeployed")
+ log.debug(f"{self} successfully undeployed")
return True
else:
- log.error(f"{self} failed to undeploy")
+ log.debug(f"{self} failed to undeploy")
return False
except Exception as e:
- log.error(f"{self} failed to undeploy: {e}")
+ log.debug(f"{self} failed to undeploy: {e}")
# Deletion failed. Check if endpoint still exists.
# If it doesn't exist, treat as successful cleanup (orphaned endpoint).
try:
async with RunpodGraphQLClient() as client:
if not await client.endpoint_exists(self.id):
- log.info(
+ log.debug(
f"{self} no longer exists on RunPod, removing from cache"
)
return True
@@ -752,14 +830,14 @@ def _fetch_job():
try:
# log.debug(f"[{self}] Payload: {payload}")
- log.info(f"{self} | API /run_sync")
+ log.debug(f"{self} | API /run_sync")
response = await asyncio.to_thread(_fetch_job)
return JobOutput(**response)
except Exception as e:
health = await asyncio.to_thread(self.endpoint.health)
health = ServerlessHealth(**health)
- log.info(f"{self} | Health {health.workers.status}")
+ log.debug(f"{self} | Health {health.workers.status}")
log.error(f"{self} | Exception: {e}")
raise
@@ -777,12 +855,12 @@ async def run(self, payload: Dict[str, Any]) -> "JobOutput":
# log.debug(f"[{self}] Payload: {payload}")
# Create a job using the endpoint
- log.info(f"{self} | API /run")
+ log.debug(f"{self} | API /run")
job = await asyncio.to_thread(self.endpoint.run, request_input=payload)
log_subgroup = f"Job:{job.job_id}"
- log.info(f"{self} | Started {log_subgroup}")
+ log.debug(f"{self} | Started {log_subgroup}")
current_pace = 0
attempt = 0
@@ -801,10 +879,10 @@ async def run(self, payload: Dict[str, Any]) -> "JobOutput":
attempt += 1
indicator = "." * (attempt // 2) if attempt % 2 == 0 else ""
if indicator:
- log.info(f"{log_subgroup} | {indicator}")
+ log.debug(f"{log_subgroup} | {indicator}")
else:
# status changed, reset the gap
- log.info(f"{log_subgroup} | Status: {job_status}")
+ log.debug(f"{log_subgroup} | Status: {job_status}")
attempt = 0
last_status = job_status
@@ -818,7 +896,7 @@ async def run(self, payload: Dict[str, Any]) -> "JobOutput":
except Exception as e:
if job and job.job_id:
- log.info(f"{self} | Cancelling job {job.job_id}")
+ log.debug(f"{self} | Cancelling job {job.job_id}")
await asyncio.to_thread(job.cancel)
log.error(f"{self} | Exception: {e}")
@@ -891,8 +969,8 @@ class JobOutput(BaseModel):
def model_post_init(self, _: Any) -> None:
log_group = f"Worker:{self.workerId}"
- log.info(f"{log_group} | Delay Time: {self.delayTime} ms")
- log.info(f"{log_group} | Execution Time: {self.executionTime} ms")
+ log.debug(f"{log_group} | Delay Time: {self.delayTime} ms")
+ log.debug(f"{log_group} | Execution Time: {self.executionTime} ms")
class Status(str, Enum):
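The manifest lookup behind `_check_makes_remote_calls` can be restated outside the diff. This is a sketch only: the `flash_manifest.json` shape below, the `makes_remote_calls` helper name, and the `LIVE_PREFIX` value of `"live-"` are assumptions (the diff imports `LIVE_PREFIX` but does not show its definition).

```python
# Hypothetical manifest shape assumed by _check_makes_remote_calls;
# the real flash_manifest.json schema may differ.
MANIFEST = {
    "resources": {
        "worker": {"makes_remote_calls": False},
        "mothership": {"makes_remote_calls": True},
    }
}

LIVE_PREFIX = "live-"  # assumed value; not shown in this diff


def makes_remote_calls(endpoint_name: str, manifest: dict) -> bool:
    """Mirror the lookup: strip the '-fb' suffix, then the live- prefix,
    and fall back to True (the safe default) on any miss."""
    name = endpoint_name
    if name.endswith("-fb"):
        name = name[:-3]
    if name.startswith(LIVE_PREFIX):
        name = name[len(LIVE_PREFIX):]
    config = manifest.get("resources", {}).get(name)
    if config is None:
        return True  # unknown resource: allow remote calls
    return config.get("makes_remote_calls", True)


print(makes_remote_calls("live-worker-fb", MANIFEST))  # False
print(makes_remote_calls("mothership", MANIFEST))      # True
print(makes_remote_calls("unknown", MANIFEST))         # True (safe default)
```

The "default to True" branches match the diff's stated intent: when in doubt, keep injecting the API key rather than silently breaking cross-endpoint calls.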
diff --git a/src/runpod_flash/core/utils/file_lock.py b/src/runpod_flash/core/utils/file_lock.py
index c104cfd8..b1866c34 100644
--- a/src/runpod_flash/core/utils/file_lock.py
+++ b/src/runpod_flash/core/utils/file_lock.py
@@ -102,7 +102,6 @@ def file_lock(
_acquire_fallback_lock(file_handle, exclusive, timeout)
lock_acquired = True
- log.debug(f"File lock acquired (exclusive={exclusive})")
except (OSError, IOError, FileLockError) as e:
# Check timeout
@@ -128,8 +127,6 @@ def file_lock(
else:
_release_fallback_lock(file_handle)
- log.debug("File lock released")
-
except Exception as e:
log.error(f"Error releasing file lock: {e}")
# Don't raise - we're in cleanup
diff --git a/src/runpod_flash/core/utils/http.py b/src/runpod_flash/core/utils/http.py
index 954ea445..428999c9 100644
--- a/src/runpod_flash/core/utils/http.py
+++ b/src/runpod_flash/core/utils/http.py
@@ -11,9 +11,12 @@ def get_authenticated_httpx_client(
timeout: Optional[float] = None,
api_key_override: Optional[str] = None,
) -> httpx.AsyncClient:
- """Create httpx AsyncClient with RunPod authentication.
+ """Create httpx AsyncClient with RunPod authentication and User-Agent.
+
+ Automatically includes:
+ - User-Agent header identifying flash client and version
+ - Authorization header if RUNPOD_API_KEY is set
- Automatically includes Authorization header if RUNPOD_API_KEY is set.
This provides a centralized place to manage authentication headers for
all RunPod HTTP requests, avoiding repetitive manual header addition.
@@ -23,7 +26,7 @@ def get_authenticated_httpx_client(
Used for propagating API keys from mothership to worker endpoints.
Returns:
- Configured httpx.AsyncClient with Authorization header
+ Configured httpx.AsyncClient with User-Agent and Authorization headers
Example:
async with get_authenticated_httpx_client() as client:
@@ -37,7 +40,11 @@ def get_authenticated_httpx_client(
async with get_authenticated_httpx_client(api_key_override=context_key) as client:
response = await client.post(url, json=data)
"""
- headers = {}
+ from .user_agent import get_user_agent
+
+ headers = {
+ "User-Agent": get_user_agent(),
+ }
api_key = api_key_override or os.environ.get("RUNPOD_API_KEY")
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
@@ -49,9 +56,12 @@ def get_authenticated_httpx_client(
def get_authenticated_requests_session(
api_key_override: Optional[str] = None,
) -> requests.Session:
- """Create requests Session with RunPod authentication.
+ """Create requests Session with RunPod authentication and User-Agent.
+
+ Automatically includes:
+ - User-Agent header identifying flash client and version
+ - Authorization header if RUNPOD_API_KEY is set
- Automatically includes Authorization header if RUNPOD_API_KEY is set.
Provides a centralized place to manage authentication headers for
synchronous RunPod HTTP requests.
@@ -60,7 +70,7 @@ def get_authenticated_requests_session(
Used for propagating API keys from mothership to worker endpoints.
Returns:
- Configured requests.Session with Authorization header
+ Configured requests.Session with User-Agent and Authorization headers
Example:
session = get_authenticated_requests_session()
@@ -76,7 +86,11 @@ def get_authenticated_requests_session(
with contextlib.closing(get_authenticated_requests_session(api_key_override=context_key)) as session:
response = session.post(url, json=data)
"""
+ from .user_agent import get_user_agent
+
session = requests.Session()
+ session.headers["User-Agent"] = get_user_agent()
+
api_key = api_key_override or os.environ.get("RUNPOD_API_KEY")
if api_key:
session.headers["Authorization"] = f"Bearer {api_key}"
diff --git a/src/runpod_flash/core/utils/user_agent.py b/src/runpod_flash/core/utils/user_agent.py
new file mode 100644
index 00000000..2405dada
--- /dev/null
+++ b/src/runpod_flash/core/utils/user_agent.py
@@ -0,0 +1,27 @@
+"""User-Agent header generation for HTTP requests."""
+
+import platform
+from importlib.metadata import version
+
+
+def get_user_agent() -> str:
+ """Get the User-Agent string for flash HTTP requests.
+
+ Returns:
+ User-Agent string in format: "Runpod Flash/<version> (Python <python_version>; <os_name> <os_version>; <arch>)"
+
+ Example:
+ >>> get_user_agent()
+ 'Runpod Flash/1.1.1 (Python 3.11.12; Darwin 25.2.0; arm64)'
+ """
+ try:
+ pkg_version = version("runpod-flash")
+ except Exception:
+ pkg_version = "unknown"
+
+ python_version = platform.python_version()
+ os_name = platform.system()
+ os_version = platform.release()
+ arch = platform.machine()
+
+ return f"Runpod Flash/{pkg_version} (Python {python_version}; {os_name} {os_version}; {arch})"
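On the receiving side, the f-string above produces a parseable header. The regex below is an assumption derived from that f-string, not part of the diff; it is one way a server could split the string back into its components.

```python
import re

# Assumed structure, based on the f-string in get_user_agent().
UA_RE = re.compile(
    r"^Runpod Flash/(?P<version>\S+) "
    r"\(Python (?P<python>[^;]+); (?P<os>[^;]+); (?P<arch>[^)]+)\)$"
)

ua = "Runpod Flash/1.2.0 (Python 3.11.12; Darwin 25.2.0; arm64)"
m = UA_RE.match(ua)
print(m.group("version"))  # 1.2.0
print(m.group("os"))       # Darwin 25.2.0
print(m.group("arch"))     # arm64
```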
diff --git a/src/runpod_flash/execute_class.py b/src/runpod_flash/execute_class.py
index 0e301d5d..643bc378 100644
--- a/src/runpod_flash/execute_class.py
+++ b/src/runpod_flash/execute_class.py
@@ -57,8 +57,6 @@ def get_or_cache_class_data(
},
)
- log.debug(f"Cached class data for {cls.__name__} with key: {cache_key}")
-
except (TypeError, AttributeError, OSError, SerializationError) as e:
log.warning(
f"Could not serialize constructor arguments for {cls.__name__}: {e}"
@@ -81,9 +79,6 @@ def get_or_cache_class_data(
else:
# Cache hit - retrieve cached data
cached_data = _SERIALIZED_CLASS_CACHE.get(cache_key)
- log.debug(
- f"Retrieved cached class data for {cls.__name__} with key: {cache_key}"
- )
return cached_data["class_code"]
@@ -121,7 +116,6 @@ def extract_class_code_simple(cls: Type) -> str:
# Validate the code by trying to compile it
compile(class_code, "<string>", "exec")
- log.debug(f"Successfully extracted class code for {cls.__name__}")
return class_code
except Exception as e:
@@ -182,7 +176,6 @@ def get_class_cache_key(
# Combine hashes for final cache key
cache_key = f"{cls.__name__}_{class_hash[:HASH_TRUNCATE_LENGTH]}_{args_hash[:HASH_TRUNCATE_LENGTH]}"
- log.debug(f"Generated cache key for {cls.__name__}: {cache_key}")
return cache_key
except (TypeError, AttributeError, OSError) as e:
@@ -229,8 +222,6 @@ def __init__(self, *args, **kwargs):
cls, args, kwargs, self._cache_key
)
- log.debug(f"Created remote class wrapper for {cls.__name__}")
-
async def _ensure_initialized(self):
"""Ensure the remote instance is created."""
if self._initialized:
diff --git a/src/runpod_flash/logger.py b/src/runpod_flash/logger.py
index d024b079..88283edc 100644
--- a/src/runpod_flash/logger.py
+++ b/src/runpod_flash/logger.py
@@ -64,6 +64,10 @@ class SensitiveDataFilter(logging.Filter):
# Pattern for Bearer tokens in Authorization headers
BEARER_PATTERN = re.compile(r"(bearer\s+)([A-Za-z0-9_.-]+)", re.IGNORECASE)
+ # Pattern for common API key prefixes (OpenAI, Anthropic, etc)
+ # Matches: sk-..., key_..., etc. (32+ chars total)
+ PREFIXED_KEY_PATTERN = re.compile(r"\b(sk-|key_|api_)[A-Za-z0-9_-]{28,}\b")
+
def filter(self, record: logging.LogRecord) -> bool:
"""Sanitize log record by redacting sensitive data.
@@ -129,8 +133,12 @@ def _redact_string(self, text: str) -> str:
lambda m: f"{m.group(1)}***REDACTED***{m.group(3)}", text
)
- # Redact generic long tokens
- text = self.TOKEN_PATTERN.sub(self._redact_token, text)
+ # Redact common prefixed API keys (sk-, key_, api_)
+ text = self.PREFIXED_KEY_PATTERN.sub(self._redact_token, text)
+
+ # Generic token pattern disabled - causes false positives with Job IDs, Template IDs, etc.
+ # Specific patterns above catch actual sensitive tokens.
+ # text = self.TOKEN_PATTERN.sub(self._redact_token, text)
# Redact common password/secret patterns
# Match field names with : or = separators and redact the value, preserving separator
@@ -293,7 +301,7 @@ def setup_logging(
# Determine format based on final effective level
if fmt is None:
if level == logging.DEBUG:
- fmt = "%(asctime)s | %(levelname)-5s | %(name)s | %(filename)s:%(lineno)d | %(message)s"
+ fmt = "%(asctime)s | %(levelname)-5s | %(message)s"
else:
# Default format for INFO level and above
fmt = "%(asctime)s | %(levelname)-5s | %(message)s"
@@ -322,3 +330,8 @@ def setup_logging(
existing_handler.addFilter(sensitive_filter)
root_logger.setLevel(level)
+
+ # Silence httpcore trace logs (connection/request details)
+ logging.getLogger("httpcore").setLevel(logging.WARNING)
+ logging.getLogger("httpx").setLevel(logging.WARNING)
+ logging.getLogger("asyncio").setLevel(logging.WARNING)
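The new `PREFIXED_KEY_PATTERN` trades recall for precision: long prefixed keys are redacted, while short IDs that merely start with `sk-` pass through. A minimal demo, with the regex copied verbatim from the diff (the `***REDACTED***` replacement text here is a stand-in; the filter's actual `_redact_token` output may differ):

```python
import re

# Copied from SensitiveDataFilter in the diff.
PREFIXED_KEY_PATTERN = re.compile(r"\b(sk-|key_|api_)[A-Za-z0-9_-]{28,}\b")


def redact(text: str) -> str:
    # Stand-in for the filter's _redact_token replacement.
    return PREFIXED_KEY_PATTERN.sub("***REDACTED***", text)


long_key = "sk-" + "a" * 30   # 28+ chars after the prefix: redacted
short_id = "sk-proj-short"    # too short to match: left alone
print(redact(f"auth with {long_key}"))  # auth with ***REDACTED***
print(redact(f"id {short_id}"))         # id sk-proj-short
```

This is why the generic `TOKEN_PATTERN` could be disabled: job and template IDs are long alphanumeric runs too, but they lack these prefixes.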
diff --git a/src/runpod_flash/runtime/context.py b/src/runpod_flash/runtime/context.py
index 0e3a7a6f..541c9bd5 100644
--- a/src/runpod_flash/runtime/context.py
+++ b/src/runpod_flash/runtime/context.py
@@ -9,10 +9,19 @@ def is_deployed_container() -> bool:
A deployed container is identified by:
- RUNPOD_ENDPOINT_ID is set (RunPod sets this for serverless endpoints)
- OR RUNPOD_POD_ID is set (RunPod sets this for pods)
+ - BUT NOT when FLASH_IS_LIVE_PROVISIONING is true (explicit local dev mode)
+
+ The FLASH_IS_LIVE_PROVISIONING flag allows local development with on-demand
+ provisioning even when RunPod environment variables are present (e.g., from
+ testing or previous deployments).
Returns:
True if running in deployed container, False for local dev
"""
+ # Explicit local development mode - overrides container detection
+ if os.getenv("FLASH_IS_LIVE_PROVISIONING", "").lower() == "true":
+ return False
+
return bool(os.getenv("RUNPOD_ENDPOINT_ID") or os.getenv("RUNPOD_POD_ID"))
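The precedence introduced here is easiest to see as a pure function. This sketch restates the new `is_deployed_container()` logic against an explicit env dict instead of `os.environ`; it is an illustration, not the module's API.

```python
def is_deployed_container(env: dict) -> bool:
    # Explicit local dev mode overrides container detection.
    if env.get("FLASH_IS_LIVE_PROVISIONING", "").lower() == "true":
        return False
    return bool(env.get("RUNPOD_ENDPOINT_ID") or env.get("RUNPOD_POD_ID"))


print(is_deployed_container({"RUNPOD_ENDPOINT_ID": "ep-1"}))  # True
print(is_deployed_container({"RUNPOD_ENDPOINT_ID": "ep-1",
                             "FLASH_IS_LIVE_PROVISIONING": "True"}))  # False
print(is_deployed_container({}))  # False
```

Note the case-insensitive check: `"True"`, `"true"`, and `"TRUE"` all force local mode.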
diff --git a/src/runpod_flash/runtime/load_balancer.py b/src/runpod_flash/runtime/load_balancer.py
index 6c32b465..0c4b6f44 100644
--- a/src/runpod_flash/runtime/load_balancer.py
+++ b/src/runpod_flash/runtime/load_balancer.py
@@ -85,10 +85,6 @@ async def _round_robin_select(self, endpoints: List[str]) -> str:
async with self._lock:
selected = endpoints[self._round_robin_index % len(endpoints)]
self._round_robin_index += 1
- logger.debug(
- f"Load balancer: ROUND_ROBIN selected {selected} "
- f"(index {self._round_robin_index - 1})"
- )
return selected
async def _least_connections_select(self, endpoints: List[str]) -> str:
@@ -109,10 +105,6 @@ async def _least_connections_select(self, endpoints: List[str]) -> str:
# Find endpoint with minimum connections
selected = min(endpoints, key=lambda e: self._in_flight_requests.get(e, 0))
- logger.debug(
- f"Load balancer: LEAST_CONNECTIONS selected {selected} "
- f"({self._in_flight_requests.get(selected, 0)} in-flight)"
- )
return selected
async def _random_select(self, endpoints: List[str]) -> str:
@@ -125,7 +117,6 @@ async def _random_select(self, endpoints: List[str]) -> str:
Selected endpoint URL
"""
selected = random.choice(endpoints)
- logger.debug(f"Load balancer: RANDOM selected {selected}")
return selected
async def record_request(self, endpoint: str) -> None:
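The selection strategies whose debug logs were removed above reduce to two one-liners. A minimal restatement (locking, async, and the in-flight bookkeeping of `record_request` omitted):

```python
def round_robin(endpoints: list, index: int) -> tuple:
    """Return the next endpoint and the advanced index."""
    return endpoints[index % len(endpoints)], index + 1


def least_connections(endpoints: list, in_flight: dict) -> str:
    """Pick the endpoint with the fewest in-flight requests (0 if untracked)."""
    return min(endpoints, key=lambda e: in_flight.get(e, 0))


eps = ["a", "b", "c"]
sel, idx = round_robin(eps, 0)
print(sel, idx)  # a 1
print(least_connections(eps, {"a": 2, "b": 0, "c": 1}))  # b
```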
diff --git a/src/runpod_flash/runtime/service_registry.py b/src/runpod_flash/runtime/service_registry.py
index 9de8ebd2..1e85fc70 100644
--- a/src/runpod_flash/runtime/service_registry.py
+++ b/src/runpod_flash/runtime/service_registry.py
@@ -77,6 +77,11 @@ def __init__(
"RUNPOD_ENDPOINT_ID"
)
+ # Determine if this endpoint makes remote calls
+ self._makes_remote_calls = self._check_makes_remote_calls(
+ self._current_endpoint
+ )
+
def _load_manifest(self, manifest_path: Optional[Path]) -> None:
"""Load flash_manifest.json.
@@ -132,9 +137,31 @@ def _load_manifest(self, manifest_path: Optional[Path]) -> None:
resources={},
)
+ def _check_makes_remote_calls(self, resource_name: Optional[str]) -> bool:
+ """Check if current resource makes remote calls based on local manifest.
+
+ Args:
+ resource_name: Name of the resource config (FLASH_RESOURCE_NAME or RUNPOD_ENDPOINT_ID).
+
+ Returns:
+ True if resource makes remote calls, False if local-only,
+ True (safe default) if manifest/resource not found.
+ """
+ if not resource_name or not self._manifest.resources:
+ return True # Safe default - allow remote calls
+
+ resource_config = self._manifest.resources.get(resource_name)
+ if not resource_config:
+ return True # Safe default
+
+ return resource_config.makes_remote_calls
+
async def _ensure_manifest_loaded(self) -> None:
"""Load manifest from State Manager if cache expired or not loaded.
+ Skips State Manager query if this endpoint doesn't make remote calls
+ (makes_remote_calls=False in manifest).
+
Peer-to-Peer Architecture:
Each endpoint queries State Manager independently using its own
RUNPOD_ENDPOINT_ID. No mothership dependency - all endpoints
@@ -154,6 +181,14 @@ async def _ensure_manifest_loaded(self) -> None:
Returns:
None. Updates self._endpoint_registry internally.
"""
+ # Skip if endpoint is local-only
+ if not self._makes_remote_calls:
+ logger.debug(
+ "Endpoint does not make remote calls (makes_remote_calls=False), "
+ "skipping State Manager query"
+ )
+ return
+
async with self._endpoint_registry_lock:
now = time.time()
cache_age = now - self._endpoint_registry_loaded_at
diff --git a/tests/unit/cli/commands/build_utils/test_manifest_mothership.py b/tests/unit/cli/commands/build_utils/test_manifest_mothership.py
index 896eefdf..f25bb681 100644
--- a/tests/unit/cli/commands/build_utils/test_manifest_mothership.py
+++ b/tests/unit/cli/commands/build_utils/test_manifest_mothership.py
@@ -7,6 +7,8 @@
from runpod_flash.cli.commands.build_utils.manifest import ManifestBuilder
from runpod_flash.cli.commands.build_utils.scanner import RemoteFunctionMetadata
from runpod_flash.core.resources.constants import (
+ DEFAULT_WORKERS_MAX,
+ DEFAULT_WORKERS_MIN,
FLASH_CPU_LB_IMAGE,
FLASH_LB_IMAGE,
)
@@ -204,8 +206,8 @@ def root():
assert mothership["is_load_balanced"] is True
assert mothership["is_live_resource"] is True
assert mothership["imageName"] == FLASH_CPU_LB_IMAGE
- assert mothership["workersMin"] == 1
- assert mothership["workersMax"] == 1
+ assert mothership["workersMin"] == DEFAULT_WORKERS_MIN
+ assert mothership["workersMax"] == DEFAULT_WORKERS_MAX
def test_manifest_uses_explicit_mothership_config(self):
"""Test explicit mothership.py config takes precedence over auto-detection."""
diff --git a/tests/unit/cli/commands/test_init.py b/tests/unit/cli/commands/test_init.py
index 81910145..3f80e0a6 100644
--- a/tests/unit/cli/commands/test_init.py
+++ b/tests/unit/cli/commands/test_init.py
@@ -193,10 +193,10 @@ def test_status_message_for_current_directory(
init_command(".")
- # Check that status was called with "current directory" message
+ # Check that status was called with initialization message
mock_context["console"].status.assert_called_once()
status_msg = mock_context["console"].status.call_args[0][0]
- assert "current directory" in status_msg
+ assert "Initializing" in status_msg or "Flash project" in status_msg
class TestInitCommandProjectNameHandling:
diff --git a/tests/unit/cli/commands/test_resource.py b/tests/unit/cli/commands/test_resource.py
index e42a546c..a62dcb92 100644
--- a/tests/unit/cli/commands/test_resource.py
+++ b/tests/unit/cli/commands/test_resource.py
@@ -4,7 +4,7 @@
import pytest
-from runpod_flash.cli.commands.resource import generate_resource_table, report_command
+from runpod_flash.cli.commands.resource import _render_resource_report, report_command
@pytest.fixture
@@ -16,80 +16,68 @@ def mock_resource_manager():
class TestGenerateResourceTableEmpty:
- """Tests for generate_resource_table with empty resources."""
-
- def test_empty_resources_returns_panel(self, mock_resource_manager):
- """Test that empty resources returns panel object."""
- mock_resource_manager._resources = {}
-
- result = generate_resource_table(mock_resource_manager)
+ """Tests for _render_resource_report with empty resources."""
+ def test_empty_resources_returns_renderable(self, mock_resource_manager):
+        """Test that empty resources return a renderable."""
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
- assert hasattr(result, "title") or hasattr(result, "expand")
def test_empty_resources_no_error(self, mock_resource_manager):
"""Test that empty resources doesn't raise error."""
- mock_resource_manager._resources = {}
-
try:
- generate_resource_table(mock_resource_manager)
+ _render_resource_report(mock_resource_manager)
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
class TestGenerateResourceTableSingleResource:
- """Tests for generate_resource_table with single resource."""
+ """Tests for _render_resource_report with single resource."""
def test_single_active_resource_no_error(self, mock_resource_manager):
- """Test table with single active resource doesn't error."""
+        """Test that a single active resource renders without error."""
resource = MagicMock()
resource.is_deployed.return_value = True
resource.__class__.__name__ = "ServerlessEndpoint"
resource.url = "https://example.com/endpoint-123"
- mock_resource_manager._resources = {
- "endpoint-001": resource,
- }
+ mock_resource_manager._resources = {"endpoint-001": resource}
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
def test_single_inactive_resource_no_error(self, mock_resource_manager):
- """Test table with single inactive resource doesn't error."""
+        """Test that a single inactive resource renders without error."""
resource = MagicMock()
resource.is_deployed.return_value = False
resource.__class__.__name__ = "ServerlessEndpoint"
resource.url = "https://example.com/endpoint-456"
- mock_resource_manager._resources = {
- "endpoint-002": resource,
- }
+ mock_resource_manager._resources = {"endpoint-002": resource}
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
def test_resource_is_deployed_exception_handled(self, mock_resource_manager):
- """Test table handles is_deployed exception."""
+        """Test that an is_deployed exception is handled."""
resource = MagicMock()
resource.is_deployed.side_effect = Exception("Connection failed")
resource.__class__.__name__ = "ServerlessEndpoint"
resource.url = "https://example.com/endpoint-789"
- mock_resource_manager._resources = {
- "endpoint-003": resource,
- }
+ mock_resource_manager._resources = {"endpoint-003": resource}
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
def test_resource_without_url_attribute_handled(self, mock_resource_manager):
"""Test resource without url attribute is handled."""
@@ -97,15 +85,13 @@ def test_resource_without_url_attribute_handled(self, mock_resource_manager):
resource.is_deployed.return_value = True
resource.__class__.__name__ = "LoadBalancer"
- mock_resource_manager._resources = {
- "lb-001": resource,
- }
+ mock_resource_manager._resources = {"lb-001": resource}
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
def test_resource_with_empty_url(self, mock_resource_manager):
"""Test resource with empty string URL."""
@@ -114,22 +100,20 @@ def test_resource_with_empty_url(self, mock_resource_manager):
resource.__class__.__name__ = "ServerlessEndpoint"
resource.url = ""
- mock_resource_manager._resources = {
- "endpoint-empty-url": resource,
- }
+ mock_resource_manager._resources = {"endpoint-empty-url": resource}
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
class TestGenerateResourceTableMultipleResources:
- """Tests for generate_resource_table with multiple resources."""
+ """Tests for _render_resource_report with multiple resources."""
def test_multiple_resources_mixed_status_no_error(self, mock_resource_manager):
- """Test table with mixed statuses doesn't error."""
+        """Test that mixed statuses render without error."""
active_resource = MagicMock()
active_resource.is_deployed.return_value = True
active_resource.__class__.__name__ = "ServerlessEndpoint"
@@ -146,13 +130,13 @@ def test_multiple_resources_mixed_status_no_error(self, mock_resource_manager):
}
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
def test_multiple_resources_all_active_no_error(self, mock_resource_manager):
- """Test table with all active resources doesn't error."""
+        """Test that all-active resources render without error."""
resources = {}
for i in range(3):
resource = MagicMock()
@@ -164,10 +148,10 @@ def test_multiple_resources_all_active_no_error(self, mock_resource_manager):
mock_resource_manager._resources = resources
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
def test_long_resource_id_handling(self, mock_resource_manager):
"""Test that long resource IDs are handled (truncated)."""
@@ -176,17 +160,15 @@ def test_long_resource_id_handling(self, mock_resource_manager):
resource.__class__.__name__ = "ServerlessEndpoint"
resource.url = "https://example.com"
- long_id = "a" * 30 # 30 character ID
+ long_id = "a" * 30
- mock_resource_manager._resources = {
- long_id: resource,
- }
+ mock_resource_manager._resources = {long_id: resource}
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
def test_short_resource_id_no_error(self, mock_resource_manager):
"""Test short resource IDs work."""
@@ -195,21 +177,17 @@ def test_short_resource_id_no_error(self, mock_resource_manager):
resource.__class__.__name__ = "ServerlessEndpoint"
resource.url = "https://example.com"
- short_id = "endpoint-123" # 12 characters
-
- mock_resource_manager._resources = {
- short_id: resource,
- }
+ mock_resource_manager._resources = {"endpoint-123": resource}
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
class TestGenerateResourceTableSummary:
- """Tests for generate_resource_table summary calculation."""
+ """Tests for _render_resource_report summary calculation."""
def test_summary_all_active_no_error(self, mock_resource_manager):
"""Test summary with all active resources."""
@@ -224,16 +202,15 @@ def test_summary_all_active_no_error(self, mock_resource_manager):
mock_resource_manager._resources = resources
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
def test_summary_mixed_status_no_error(self, mock_resource_manager):
"""Test summary with mixed status resources."""
resources = {}
- # 2 active
for i in range(2):
resource = MagicMock()
resource.is_deployed.return_value = True
@@ -241,14 +218,12 @@ def test_summary_mixed_status_no_error(self, mock_resource_manager):
resource.url = f"https://example.com/active-{i}"
resources[f"endpoint-{i}"] = resource
- # 1 inactive
resource = MagicMock()
resource.is_deployed.return_value = False
resource.__class__.__name__ = "ServerlessEndpoint"
resource.url = "https://example.com/inactive"
resources["endpoint-2"] = resource
- # 2 unknown (exception)
for i in range(3, 5):
resource = MagicMock()
resource.is_deployed.side_effect = Exception("Error")
@@ -259,10 +234,10 @@ def test_summary_mixed_status_no_error(self, mock_resource_manager):
mock_resource_manager._resources = resources
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
class TestGenerateResourceTableResourceTypes:
@@ -288,10 +263,10 @@ def test_various_resource_types_no_error(self, mock_resource_manager):
mock_resource_manager._resources = resources
try:
- result = generate_resource_table(mock_resource_manager)
+ result = _render_resource_report(mock_resource_manager)
assert result is not None
except Exception as e:
- pytest.fail(f"generate_resource_table raised {type(e).__name__}: {e}")
+ pytest.fail(f"_render_resource_report raised {type(e).__name__}: {e}")
@patch("runpod_flash.cli.commands.resource.ResourceManager")
@@ -304,10 +279,7 @@ def test_report_command_static_mode(mock_console, mock_resource_manager_class):
report_command(live=False)
- # Verify ResourceManager was instantiated
mock_resource_manager_class.assert_called_once()
-
- # Verify console.print was called
mock_console.print.assert_called_once()
@@ -327,7 +299,6 @@ def test_report_command_live_mode(
mock_live_class.return_value.__enter__ = MagicMock(return_value=mock_live_instance)
mock_live_class.return_value.__exit__ = MagicMock(return_value=False)
- # Make it break after first iteration
call_count = [0]
def sleep_side_effect(duration):
@@ -339,10 +310,7 @@ def sleep_side_effect(duration):
report_command(live=True, refresh=2)
- # Verify Live was used
mock_live_class.assert_called_once()
-
- # Verify console printed "stopped" message
assert any("stopped" in str(c).lower() for c in mock_console.print.call_args_list)
@@ -356,7 +324,6 @@ def test_report_command_with_custom_refresh(mock_console, mock_resource_manager_
report_command(live=False, refresh=5)
- # Verify it ran without error with custom refresh value
mock_resource_manager_class.assert_called_once()
mock_console.print.assert_called_once()
@@ -373,22 +340,20 @@ def test_report_command_instantiates_resource_manager(
report_command(live=False)
- # Verify ResourceManager() was called (instantiated)
mock_resource_manager_class.assert_called_once_with()
-@patch("runpod_flash.cli.commands.resource.generate_resource_table")
+@patch("runpod_flash.cli.commands.resource._render_resource_report")
@patch("runpod_flash.cli.commands.resource.ResourceManager")
@patch("runpod_flash.cli.commands.resource.console")
-def test_report_command_calls_generate_table(
- mock_console, mock_resource_manager_class, mock_generate_table
+def test_report_command_calls_render_report(
+ mock_console, mock_resource_manager_class, mock_render
):
- """Test that report_command calls generate_resource_table."""
+ """Test that report_command calls _render_resource_report."""
mock_manager_instance = MagicMock()
mock_resource_manager_class.return_value = mock_manager_instance
- mock_generate_table.return_value = MagicMock()
+ mock_render.return_value = MagicMock()
report_command(live=False)
- # Verify generate_resource_table was called with manager
- mock_generate_table.assert_called_once_with(mock_manager_instance)
+ mock_render.assert_called_once_with(mock_manager_instance)
diff --git a/tests/unit/cli/test_apps.py b/tests/unit/cli/test_apps.py
index cfabc66d..022bc9c5 100644
--- a/tests/unit/cli/test_apps.py
+++ b/tests/unit/cli/test_apps.py
@@ -1,10 +1,8 @@
-"""More focused apps CLI tests that validate asyncio + console wiring."""
+"""Unit tests for apps CLI commands."""
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
-from rich.panel import Panel
-from rich.table import Table
from typer.testing import CliRunner
from runpod_flash.cli.main import app
@@ -49,11 +47,13 @@ def test_create_app_success(
assert result.exit_code == 0
mock_create.assert_awaited_once_with("demo-app")
- last_call = patched_console.print.call_args_list[-1]
- panel = last_call.args[0]
- assert isinstance(panel, Panel)
- assert "demo-app" in panel.renderable
- assert panel.title == "✅ App Created"
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "demo-app" in printed
+ assert "app-987" in printed
@patch("runpod_flash.cli.commands.apps.FlashApp.create", new_callable=AsyncMock)
def test_create_app_failure_bubbles_error(
@@ -85,40 +85,26 @@ def test_list_apps_empty(
result = runner.invoke(app, ["app", "list"])
assert result.exit_code == 0
- patched_console.print.assert_called_with("No Flash apps found.")
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "No Flash apps found" in printed
@patch("runpod_flash.cli.commands.apps.FlashApp.list", new_callable=AsyncMock)
def test_list_apps_with_data(
self, mock_list, runner, mock_asyncio_run_coro, patched_console
):
- # Matches actual GraphQL flashApp response structure
mock_list.return_value = [
{
"id": "app-1",
"name": "demo",
"flashEnvironments": [
- {
- "id": "env-1",
- "name": "dev",
- "state": "ACTIVE",
- "activeBuildId": None,
- "createdAt": "2024-01-01T00:00:00Z",
- },
- {
- "id": "env-2",
- "name": "prod",
- "state": "ACTIVE",
- "activeBuildId": "build-1",
- "createdAt": "2024-01-02T00:00:00Z",
- },
- ],
- "flashBuilds": [
- {
- "id": "build-1",
- "objectKey": "builds/app-1/build-1.tar.gz",
- "createdAt": "2024-01-01T00:00:00Z",
- }
+ {"id": "env-1", "name": "dev"},
+ {"id": "env-2", "name": "prod"},
],
+ "flashBuilds": [{"id": "build-1"}],
}
]
@@ -129,12 +115,14 @@ def test_list_apps_with_data(
result = runner.invoke(app, ["app", "list"])
assert result.exit_code == 0
- table = patched_console.print.call_args_list[-1].args[0]
- assert isinstance(table, Table)
- columns = table.columns
- assert "demo" in columns[0]._cells[0]
- assert "dev, prod" in columns[2]._cells[0]
- assert "build-1" in columns[3]._cells[0]
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "demo" in printed
+ assert "dev" in printed
+ assert "prod" in printed
class TestAppsGet:
@@ -171,13 +159,14 @@ def test_get_app_details(
mock_from_name.assert_awaited_once_with("demo")
flash_app.list_environments.assert_awaited_once()
flash_app.list_builds.assert_awaited_once()
- panel = patched_console.print.call_args_list[0].args[0]
- assert isinstance(panel, Panel)
- assert "Name: demo" in panel.renderable
- env_table = patched_console.print.call_args_list[1].args[0]
- build_table = patched_console.print.call_args_list[2].args[0]
- assert isinstance(env_table, Table)
- assert isinstance(build_table, Table)
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "demo" in printed
+ assert "Environments" in printed
+ assert "Builds" in printed
@patch("runpod_flash.cli.commands.apps.FlashApp.from_name", new_callable=AsyncMock)
def test_get_app_without_related_data(
@@ -197,9 +186,13 @@ def test_get_app_without_related_data(
result = runner.invoke(app, ["app", "get", "demo"])
assert result.exit_code == 0
- assert len(patched_console.print.call_args_list) == 1
- panel = patched_console.print.call_args.args[0]
- assert isinstance(panel, Panel)
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "None yet" in printed
+ assert "flash deploy" in printed
class TestAppsDelete:
@@ -213,13 +206,17 @@ def test_delete_app_success(
"runpod_flash.cli.commands.apps.asyncio.run",
side_effect=mock_asyncio_run_coro,
):
- result = runner.invoke(app, ["app", "delete", "--app", "demo"])
+ result = runner.invoke(app, ["app", "delete", "demo"])
assert result.exit_code == 0
mock_delete.assert_awaited_once_with(app_name="demo")
- patched_console.print.assert_called_with(
- "✅ Flash app 'demo' deleted successfully"
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
)
+ assert "Deleted" in printed
+ assert "demo" in printed
@patch("runpod_flash.cli.commands.apps.FlashApp.delete", new_callable=AsyncMock)
def test_delete_app_failure_raises_exit(
@@ -231,30 +228,17 @@ def test_delete_app_failure_raises_exit(
"runpod_flash.cli.commands.apps.asyncio.run",
side_effect=mock_asyncio_run_coro,
):
- result = runner.invoke(app, ["app", "delete", "--app", "demo"])
+ result = runner.invoke(app, ["app", "delete", "demo"])
assert result.exit_code == 1
- patched_console.print.assert_called_with("❌ Failed to delete flash app 'demo'")
-
- @patch("runpod_flash.cli.commands.apps.discover_flash_project")
- @patch("runpod_flash.cli.commands.apps.FlashApp.delete", new_callable=AsyncMock)
- def test_delete_app_uses_discovered_name(
- self,
- mock_delete,
- mock_discover,
- runner,
- mock_asyncio_run_coro,
- patched_console,
- ):
- mock_delete.return_value = True
- mock_discover.return_value = ("/tmp/flash", "derived")
-
- with patch(
- "runpod_flash.cli.commands.apps.asyncio.run",
- side_effect=mock_asyncio_run_coro,
- ):
- result = runner.invoke(app, ["app", "delete", "--app", ""])
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "Failed to delete" in printed
- assert result.exit_code == 0
- mock_discover.assert_called_once()
- mock_delete.assert_awaited_once_with(app_name="derived")
+ def test_delete_app_missing_name_exits_with_error(self, runner):
+ result = runner.invoke(app, ["app", "delete"])
+ assert result.exit_code == 2
+ assert "Missing argument" in result.output
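The test rewrites above repeatedly swap exact Rich-renderable assertions for a pattern that flattens every `console.print` call into one searchable string. A minimal sketch of that idiom, using a bare `MagicMock` in place of the patched console:

```python
from unittest.mock import MagicMock

console = MagicMock()
console.print("Deleted app 'demo'")  # a plain string
console.print()                      # a no-arg call (e.g. a blank line)
console.print(object())              # any Rich renderable would stand in here

# Join the stringified first argument of every print() call; filtering on
# call.args skips no-arg calls so the generator never raises IndexError.
printed = " ".join(
    str(call.args[0])
    for call in console.print.call_args_list
    if call.args
)
assert "Deleted" in printed
assert "demo" in printed
```

Asserting on substrings of the combined output keeps the tests stable when the command switches between plain strings, `Panel`s, and `Table`s.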
diff --git a/tests/unit/cli/test_deploy.py b/tests/unit/cli/test_deploy.py
index 90872ec1..80df379d 100644
--- a/tests/unit/cli/test_deploy.py
+++ b/tests/unit/cli/test_deploy.py
@@ -24,9 +24,24 @@ def patched_console():
yield mock_console
+def _make_flash_app(**kwargs):
+ """Create a MagicMock flash app with common async methods."""
+ flash_app = MagicMock()
+ flash_app.upload_build = AsyncMock(return_value={"id": "build-123"})
+ flash_app.get_environment_by_name = AsyncMock()
+ for key, value in kwargs.items():
+ setattr(flash_app, key, value)
+ return flash_app
+
+
class TestDeployCommand:
@patch(
- "runpod_flash.cli.commands.deploy.deploy_to_environment", new_callable=AsyncMock
+ "runpod_flash.cli.commands.deploy.deploy_from_uploaded_build",
+ new_callable=AsyncMock,
+ )
+ @patch(
+ "runpod_flash.cli.commands.deploy.validate_local_manifest",
+ return_value={"resources": {}},
)
@patch(
"runpod_flash.cli.commands.deploy.FlashApp.from_name", new_callable=AsyncMock
@@ -38,17 +53,20 @@ def test_deploy_single_env_auto_selects(
mock_discover,
mock_build,
mock_from_name,
- mock_deploy_to_env,
+ mock_validate,
+ mock_deploy,
runner,
mock_asyncio_run_coro,
patched_console,
):
mock_discover.return_value = (Path("/tmp/project"), "my-app")
mock_build.return_value = Path("/tmp/project/.flash/artifact.tar.gz")
+ mock_deploy.return_value = {"success": True}
- flash_app = MagicMock()
- flash_app.list_environments = AsyncMock(
- return_value=[{"name": "production", "id": "env-1"}]
+ flash_app = _make_flash_app(
+ list_environments=AsyncMock(
+ return_value=[{"name": "production", "id": "env-1"}]
+ ),
)
mock_from_name.return_value = flash_app
@@ -63,10 +81,15 @@ def test_deploy_single_env_auto_selects(
assert result.exit_code == 0
mock_build.assert_called_once()
- mock_deploy_to_env.assert_awaited_once()
+ mock_deploy.assert_awaited_once()
@patch(
- "runpod_flash.cli.commands.deploy.deploy_to_environment", new_callable=AsyncMock
+ "runpod_flash.cli.commands.deploy.deploy_from_uploaded_build",
+ new_callable=AsyncMock,
+ )
+ @patch(
+ "runpod_flash.cli.commands.deploy.validate_local_manifest",
+ return_value={"resources": {}},
)
@patch(
"runpod_flash.cli.commands.deploy.FlashApp.from_name", new_callable=AsyncMock
@@ -78,20 +101,23 @@ def test_deploy_with_explicit_env(
mock_discover,
mock_build,
mock_from_name,
- mock_deploy_to_env,
+ mock_validate,
+ mock_deploy,
runner,
mock_asyncio_run_coro,
patched_console,
):
mock_discover.return_value = (Path("/tmp/project"), "my-app")
mock_build.return_value = Path("/tmp/project/.flash/artifact.tar.gz")
-
- flash_app = MagicMock()
- flash_app.list_environments = AsyncMock(
- return_value=[
- {"name": "staging", "id": "env-1"},
- {"name": "production", "id": "env-2"},
- ]
+ mock_deploy.return_value = {"success": True}
+
+ flash_app = _make_flash_app(
+ list_environments=AsyncMock(
+ return_value=[
+ {"name": "staging", "id": "env-1"},
+ {"name": "production", "id": "env-2"},
+ ]
+ ),
)
mock_from_name.return_value = flash_app
@@ -105,9 +131,9 @@ def test_deploy_with_explicit_env(
result = runner.invoke(app, ["deploy", "--env", "staging"])
assert result.exit_code == 0
- mock_deploy_to_env.assert_awaited_once()
- call_args = mock_deploy_to_env.call_args
- assert call_args[0][1] == "staging"
+ mock_deploy.assert_awaited_once()
+ call_args = mock_deploy.call_args
+ assert call_args[0][2] == "staging" # env_name
@patch(
"runpod_flash.cli.commands.deploy.FlashApp.from_name", new_callable=AsyncMock
@@ -126,12 +152,13 @@ def test_deploy_multiple_envs_no_flag_errors(
mock_discover.return_value = (Path("/tmp/project"), "my-app")
mock_build.return_value = Path("/tmp/project/.flash/artifact.tar.gz")
- flash_app = MagicMock()
- flash_app.list_environments = AsyncMock(
- return_value=[
- {"name": "staging", "id": "env-1"},
- {"name": "production", "id": "env-2"},
- ]
+ flash_app = _make_flash_app(
+ list_environments=AsyncMock(
+ return_value=[
+ {"name": "staging", "id": "env-1"},
+ {"name": "production", "id": "env-2"},
+ ]
+ ),
)
mock_from_name.return_value = flash_app
@@ -151,7 +178,12 @@ def test_deploy_multiple_envs_no_flag_errors(
"runpod_flash.cli.commands.deploy.FlashApp.from_name", new_callable=AsyncMock
)
@patch(
- "runpod_flash.cli.commands.deploy.deploy_to_environment", new_callable=AsyncMock
+ "runpod_flash.cli.commands.deploy.deploy_from_uploaded_build",
+ new_callable=AsyncMock,
+ )
+ @patch(
+ "runpod_flash.cli.commands.deploy.validate_local_manifest",
+ return_value={"resources": {}},
)
@patch("runpod_flash.cli.commands.deploy.run_build")
@patch("runpod_flash.cli.commands.deploy.discover_flash_project")
@@ -159,7 +191,8 @@ def test_deploy_no_app_creates_app_and_env(
self,
mock_discover,
mock_build,
- mock_deploy_to_env,
+ mock_validate,
+ mock_deploy,
mock_from_name,
mock_create,
runner,
@@ -168,8 +201,11 @@ def test_deploy_no_app_creates_app_and_env(
):
mock_discover.return_value = (Path("/tmp/project"), "my-app")
mock_build.return_value = Path("/tmp/project/.flash/artifact.tar.gz")
+ mock_deploy.return_value = {"success": True}
mock_from_name.side_effect = Exception("GraphQL errors: app not found")
- mock_create.return_value = (MagicMock(), {"id": "env-1", "name": "production"})
+
+ created_app = _make_flash_app()
+ mock_create.return_value = (created_app, {"id": "env-1", "name": "production"})
with (
patch(
@@ -211,7 +247,12 @@ def test_deploy_non_app_error_propagates(
assert result.exit_code == 1
@patch(
- "runpod_flash.cli.commands.deploy.deploy_to_environment", new_callable=AsyncMock
+ "runpod_flash.cli.commands.deploy.deploy_from_uploaded_build",
+ new_callable=AsyncMock,
+ )
+ @patch(
+ "runpod_flash.cli.commands.deploy.validate_local_manifest",
+ return_value={"resources": {}},
)
@patch(
"runpod_flash.cli.commands.deploy.FlashApp.from_name", new_callable=AsyncMock
@@ -223,19 +264,22 @@ def test_deploy_auto_creates_nonexistent_env(
mock_discover,
mock_build,
mock_from_name,
- mock_deploy_to_env,
+ mock_validate,
+ mock_deploy,
runner,
mock_asyncio_run_coro,
patched_console,
):
mock_discover.return_value = (Path("/tmp/project"), "my-app")
mock_build.return_value = Path("/tmp/project/.flash/artifact.tar.gz")
+ mock_deploy.return_value = {"success": True}
- flash_app = MagicMock()
- flash_app.list_environments = AsyncMock(
- return_value=[{"name": "production", "id": "env-1"}]
+ flash_app = _make_flash_app(
+ list_environments=AsyncMock(
+ return_value=[{"name": "production", "id": "env-1"}]
+ ),
+ create_environment=AsyncMock(),
)
- flash_app.create_environment = AsyncMock()
mock_from_name.return_value = flash_app
with (
@@ -251,7 +295,12 @@ def test_deploy_auto_creates_nonexistent_env(
flash_app.create_environment.assert_awaited_once_with("staging")
@patch(
- "runpod_flash.cli.commands.deploy.deploy_to_environment", new_callable=AsyncMock
+ "runpod_flash.cli.commands.deploy.deploy_from_uploaded_build",
+ new_callable=AsyncMock,
+ )
+ @patch(
+ "runpod_flash.cli.commands.deploy.validate_local_manifest",
+ return_value={"resources": {}},
)
@patch(
"runpod_flash.cli.commands.deploy.FlashApp.from_name", new_callable=AsyncMock
@@ -263,17 +312,20 @@ def test_deploy_zero_envs_creates_production(
mock_discover,
mock_build,
mock_from_name,
- mock_deploy_to_env,
+ mock_validate,
+ mock_deploy,
runner,
mock_asyncio_run_coro,
patched_console,
):
mock_discover.return_value = (Path("/tmp/project"), "my-app")
mock_build.return_value = Path("/tmp/project/.flash/artifact.tar.gz")
+ mock_deploy.return_value = {"success": True}
- flash_app = MagicMock()
- flash_app.list_environments = AsyncMock(return_value=[])
- flash_app.create_environment = AsyncMock()
+ flash_app = _make_flash_app(
+ list_environments=AsyncMock(return_value=[]),
+ create_environment=AsyncMock(),
+ )
mock_from_name.return_value = flash_app
with (
@@ -289,7 +341,12 @@ def test_deploy_zero_envs_creates_production(
flash_app.create_environment.assert_awaited_once_with("production")
@patch(
- "runpod_flash.cli.commands.deploy.deploy_to_environment", new_callable=AsyncMock
+ "runpod_flash.cli.commands.deploy.deploy_from_uploaded_build",
+ new_callable=AsyncMock,
+ )
+ @patch(
+ "runpod_flash.cli.commands.deploy.validate_local_manifest",
+ return_value={"resources": {}},
)
@patch(
"runpod_flash.cli.commands.deploy.FlashApp.from_name", new_callable=AsyncMock
@@ -301,17 +358,20 @@ def test_deploy_shows_completion_panel(
mock_discover,
mock_build,
mock_from_name,
- mock_deploy_to_env,
+ mock_validate,
+ mock_deploy,
runner,
mock_asyncio_run_coro,
patched_console,
):
mock_discover.return_value = (Path("/tmp/project"), "my-app")
mock_build.return_value = Path("/tmp/project/.flash/artifact.tar.gz")
+ mock_deploy.return_value = {"success": True}
- flash_app = MagicMock()
- flash_app.list_environments = AsyncMock(
- return_value=[{"name": "production", "id": "env-1"}]
+ flash_app = _make_flash_app(
+ list_environments=AsyncMock(
+ return_value=[{"name": "production", "id": "env-1"}]
+ ),
)
mock_from_name.return_value = flash_app
@@ -331,11 +391,15 @@ def test_deploy_shows_completion_panel(
for call in patched_console.print.call_args_list
]
guidance_text = " ".join(printed_output)
- assert "Next Steps:" in guidance_text
- assert "Authentication Required" in guidance_text
+ assert "Useful commands:" in guidance_text
@patch(
- "runpod_flash.cli.commands.deploy.deploy_to_environment", new_callable=AsyncMock
+ "runpod_flash.cli.commands.deploy.deploy_from_uploaded_build",
+ new_callable=AsyncMock,
+ )
+ @patch(
+ "runpod_flash.cli.commands.deploy.validate_local_manifest",
+ return_value={"resources": {}},
)
@patch(
"runpod_flash.cli.commands.deploy.FlashApp.from_name", new_callable=AsyncMock
@@ -347,17 +411,20 @@ def test_deploy_uses_app_flag(
mock_discover,
mock_build,
mock_from_name,
- mock_deploy_to_env,
+ mock_validate,
+ mock_deploy,
runner,
mock_asyncio_run_coro,
patched_console,
):
mock_discover.return_value = (Path("/tmp/project"), "default-app")
mock_build.return_value = Path("/tmp/project/.flash/artifact.tar.gz")
+ mock_deploy.return_value = {"success": True}
- flash_app = MagicMock()
- flash_app.list_environments = AsyncMock(
- return_value=[{"name": "production", "id": "env-1"}]
+ flash_app = _make_flash_app(
+ list_environments=AsyncMock(
+ return_value=[{"name": "production", "id": "env-1"}]
+ ),
)
mock_from_name.return_value = flash_app
diff --git a/tests/unit/cli/test_env.py b/tests/unit/cli/test_env.py
index a5d16390..3ef66aa0 100644
--- a/tests/unit/cli/test_env.py
+++ b/tests/unit/cli/test_env.py
@@ -3,8 +3,6 @@
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
-from rich.panel import Panel
-from rich.table import Table
from typer.testing import CliRunner
from runpod_flash.cli.main import app
@@ -41,7 +39,13 @@ def test_list_environments_empty(
result = runner.invoke(app, ["env", "list", "--app", "demo"])
assert result.exit_code == 0
- patched_console.print.assert_called_with("No environments found for 'demo'.")
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "No environments" in printed
+ assert "demo" in printed
mock_from_name.assert_awaited_once_with("demo")
@patch("runpod_flash.cli.commands.env.FlashApp.from_name", new_callable=AsyncMock)
@@ -68,10 +72,13 @@ def test_list_environments_with_data(
result = runner.invoke(app, ["env", "list", "--app", "demo"])
assert result.exit_code == 0
- table = patched_console.print.call_args_list[-1].args[0]
- assert isinstance(table, Table)
- assert table.columns[0]._cells[0] == "dev"
- assert table.columns[2]._cells[0] == "build-1"
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "dev" in printed
+ assert "build-1" in printed
@patch("runpod_flash.cli.commands.env.discover_flash_project")
@patch("runpod_flash.cli.commands.env.FlashApp.from_name", new_callable=AsyncMock)
@@ -125,11 +132,13 @@ def test_create_environment_success(
assert result.exit_code == 0
mock_create.assert_awaited_once_with("demo", "dev")
- panel = patched_console.print.call_args_list[0].args[0]
- assert isinstance(panel, Panel)
- assert "[bold]dev[/bold]" in panel.renderable
- table = patched_console.print.call_args_list[1].args[0]
- assert isinstance(table, Table)
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "dev" in printed
+ assert "env-123" in printed
class TestEnvGet:
@@ -158,12 +167,16 @@ def test_get_includes_children(
result = runner.invoke(app, ["env", "get", "dev", "--app", "demo"])
assert result.exit_code == 0
- panel = patched_console.print.call_args_list[0].args[0]
- assert isinstance(panel, Panel)
- endpoint_table = patched_console.print.call_args_list[1].args[0]
- network_table = patched_console.print.call_args_list[2].args[0]
- assert isinstance(endpoint_table, Table)
- assert isinstance(network_table, Table)
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "dev" in printed
+ assert "ep-1" in printed
+ assert "nv-1" in printed
+ assert "Endpoints" in printed
+ assert "Network Volumes" in printed
@patch("runpod_flash.cli.commands.env.FlashApp.from_name", new_callable=AsyncMock)
def test_get_without_children(
@@ -190,9 +203,15 @@ def test_get_without_children(
result = runner.invoke(app, ["env", "get", "dev", "--app", "demo"])
assert result.exit_code == 0
- # Only the panel should be printed when there are no child resources
- assert len(patched_console.print.call_args_list) == 1
- assert isinstance(patched_console.print.call_args.args[0], Panel)
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "dev" in printed
+ # no endpoint or nv sections when empty
+ assert "Endpoints" not in printed
+ assert "Network Volumes" not in printed
class TestEnvDelete:
@@ -237,7 +256,12 @@ def test_delete_environment_success(
assert result.exit_code == 0
mock_questionary.confirm.assert_called_once()
flash_app.delete_environment.assert_awaited_once_with("dev")
- patched_console.print.assert_any_call("Environment 'dev' deleted successfully")
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
+ )
+ assert "Deleted" in printed
@patch(
"runpod_flash.cli.commands.env._fetch_environment_info",
@@ -276,7 +300,7 @@ def test_delete_environment_cancelled(
assert result.exit_code == 0
mock_questionary.confirm.assert_called_once()
flash_app.delete_environment.assert_not_called()
- patched_console.print.assert_any_call("Deletion cancelled")
+ patched_console.print.assert_any_call("[yellow]Cancelled[/yellow]")
@patch(
"runpod_flash.cli.commands.env._fetch_environment_info",
@@ -318,6 +342,9 @@ def test_delete_environment_failure(
assert result.exit_code == 1
flash_app.delete_environment.assert_awaited_once_with("dev")
- patched_console.print.assert_any_call(
- "[red]Failed to delete environment 'dev'[/red]"
+ printed = " ".join(
+ str(call.args[0])
+ for call in patched_console.print.call_args_list
+ if call.args
)
+ assert "Failed to delete" in printed
diff --git a/tests/unit/cli/test_undeploy.py b/tests/unit/cli/test_undeploy.py
index 16113163..ee3d9267 100644
--- a/tests/unit/cli/test_undeploy.py
+++ b/tests/unit/cli/test_undeploy.py
@@ -173,7 +173,7 @@ def test_undeploy_no_args_shows_help(self, runner):
assert "Usage" in result.stdout or "undeploy" in result.stdout.lower()
def test_undeploy_no_args_shows_usage_text(self, runner):
- """Ensure usage panel is rendered when no args are provided."""
+ """Ensure usage help is rendered when no args are provided."""
with patch(
"runpod_flash.cli.commands.undeploy._get_resource_manager"
) as mock_get_rm:
@@ -185,7 +185,6 @@ def test_undeploy_no_args_shows_usage_text(self, runner):
result = runner.invoke(app, ["undeploy"])
- assert "usage: flash undeploy" in result.stdout.lower()
assert "please specify a name" in result.stdout.lower()
def test_undeploy_nonexistent_name(self, runner, sample_resources):
@@ -260,7 +259,7 @@ async def mock_undeploy(resource_id, name):
result = runner.invoke(app, ["undeploy", "test-api-1"])
assert result.exit_code == 0
- assert "Successfully" in result.stdout
+ assert "Deleted" in result.stdout
@patch("runpod_flash.cli.commands.undeploy.asyncio.run")
def test_undeploy_all_flag(
@@ -302,7 +301,7 @@ async def mock_undeploy(resource_id, name):
result = runner.invoke(app, ["undeploy", "--all"])
assert result.exit_code == 0
- assert "Successfully" in result.stdout
+ assert "Deleted" in result.stdout
def test_undeploy_all_wrong_confirmation(self, runner, sample_resources):
"""Test undeploy --all with wrong confirmation text."""
@@ -341,10 +340,10 @@ def test_get_resource_status_active(self):
mock_resource = MagicMock()
mock_resource.is_deployed.return_value = True
- icon, text = _get_resource_status(mock_resource)
+ color, text = _get_resource_status(mock_resource)
- assert icon == "🟢"
- assert text == "Active"
+ assert color == "green"
+ assert text == "active"
def test_get_resource_status_inactive(self):
"""Test _get_resource_status for inactive resource."""
@@ -353,10 +352,10 @@ def test_get_resource_status_inactive(self):
mock_resource = MagicMock()
mock_resource.is_deployed.return_value = False
- icon, text = _get_resource_status(mock_resource)
+ color, text = _get_resource_status(mock_resource)
- assert icon == "🔴"
- assert text == "Inactive"
+ assert color == "red"
+ assert text == "inactive"
def test_get_resource_status_exception(self):
"""Test _get_resource_status when exception occurs."""
@@ -365,10 +364,10 @@ def test_get_resource_status_exception(self):
mock_resource = MagicMock()
mock_resource.is_deployed.side_effect = Exception("API Error")
- icon, text = _get_resource_status(mock_resource)
+ color, text = _get_resource_status(mock_resource)
- assert icon == "❓"
- assert text == "Unknown"
+ assert color == "yellow"
+ assert text == "unknown"
def test_get_resource_type(self, sample_resources):
"""Test _get_resource_type returns formatted type."""
diff --git a/tests/unit/cli/utils/test_deployment.py b/tests/unit/cli/utils/test_deployment.py
index 5880643e..663d8a0c 100644
--- a/tests/unit/cli/utils/test_deployment.py
+++ b/tests/unit/cli/utils/test_deployment.py
@@ -2,13 +2,12 @@
import asyncio
from unittest.mock import AsyncMock, MagicMock, patch
-from pathlib import Path
import pytest
from runpod_flash.cli.utils.deployment import (
provision_resources_for_build,
- deploy_to_environment,
+ deploy_from_uploaded_build,
reconcile_and_provision_resources,
)
@@ -204,12 +203,11 @@ async def mock_get_or_deploy_resource(resource):
@pytest.mark.asyncio
-async def test_deploy_to_environment_success(
+async def test_deploy_from_uploaded_build_success(
mock_flash_app, mock_deployed_resource, tmp_path
):
"""Test successful deployment flow with provisioning."""
mock_flash_app.get_environment_by_name = AsyncMock()
- mock_flash_app.upload_build = AsyncMock(return_value={"id": "build-123"})
mock_flash_app.deploy_build_to_environment = AsyncMock(
return_value={"success": True}
)
@@ -222,7 +220,6 @@ async def test_deploy_to_environment_success(
)
mock_flash_app.update_build_manifest = AsyncMock()
- build_path = Path("/tmp/build.tar.gz")
local_manifest = {
"resources": {
"cpu": {"resource_type": "ServerlessResource"},
@@ -230,7 +227,6 @@ async def test_deploy_to_environment_success(
"resources_endpoints": {},
}
- # Create temporary manifest file
import json
manifest_dir = tmp_path / ".flash"
@@ -240,13 +236,11 @@ async def test_deploy_to_environment_success(
with (
patch("pathlib.Path.cwd", return_value=tmp_path),
- patch("runpod_flash.cli.utils.deployment.FlashApp.from_name") as mock_from_name,
patch("runpod_flash.cli.utils.deployment.ResourceManager") as mock_manager_cls,
patch(
"runpod_flash.cli.utils.deployment.create_resource_from_manifest"
) as mock_create_resource,
):
- mock_from_name.return_value = mock_flash_app
mock_manager = MagicMock()
mock_manager.get_or_deploy_resource = AsyncMock(
return_value=mock_deployed_resource
@@ -254,27 +248,32 @@ async def test_deploy_to_environment_success(
mock_manager_cls.return_value = mock_manager
mock_create_resource.return_value = MagicMock()
- result = await deploy_to_environment("app-name", "dev", build_path)
+ result = await deploy_from_uploaded_build(
+ mock_flash_app, "build-123", "dev", local_manifest
+ )
- assert result == {"success": True}
+ assert result["success"] is True
+ assert "resources_endpoints" in result
+ assert "local_manifest" in result
mock_flash_app.get_environment_by_name.assert_awaited_once_with("dev")
- mock_flash_app.upload_build.assert_awaited_once_with(build_path)
mock_flash_app.deploy_build_to_environment.assert_awaited_once()
@pytest.mark.asyncio
-async def test_deploy_to_environment_provisioning_failure(mock_flash_app, tmp_path):
+async def test_deploy_from_uploaded_build_provisioning_failure(
+ mock_flash_app, tmp_path
+):
"""Test deployment when provisioning fails."""
mock_flash_app.get_environment_by_name = AsyncMock()
- mock_flash_app.upload_build = AsyncMock(return_value={"id": "build-123"})
- # State Manager has no resources, so local_manifest resources will be NEW
+ mock_flash_app.deploy_build_to_environment = AsyncMock(
+ return_value={"success": True}
+ )
mock_flash_app.get_build_manifest = AsyncMock(
return_value={
"resources": {},
}
)
- build_path = Path("/tmp/build.tar.gz")
local_manifest = {
"resources": {
"cpu": {"resource_type": "ServerlessResource"},
@@ -282,7 +281,6 @@ async def test_deploy_to_environment_provisioning_failure(mock_flash_app, tmp_pa
"resources_endpoints": {},
}
- # Create temporary manifest file
import json
manifest_dir = tmp_path / ".flash"
@@ -292,13 +290,11 @@ async def test_deploy_to_environment_provisioning_failure(mock_flash_app, tmp_pa
with (
patch("pathlib.Path.cwd", return_value=tmp_path),
- patch("runpod_flash.cli.utils.deployment.FlashApp.from_name") as mock_from_name,
patch("runpod_flash.cli.utils.deployment.ResourceManager") as mock_manager_cls,
patch(
"runpod_flash.cli.utils.deployment.create_resource_from_manifest"
) as mock_create_resource,
):
- mock_from_name.return_value = mock_flash_app
mock_manager = MagicMock()
mock_manager.get_or_deploy_resource = AsyncMock(
side_effect=Exception("Resource deployment failed")
@@ -307,7 +303,9 @@ async def test_deploy_to_environment_provisioning_failure(mock_flash_app, tmp_pa
mock_create_resource.return_value = MagicMock()
with pytest.raises(RuntimeError) as exc_info:
- await deploy_to_environment("app-name", "dev", build_path)
+ await deploy_from_uploaded_build(
+ mock_flash_app, "build-123", "dev", local_manifest
+ )
assert "Failed to provision resources" in str(exc_info.value)
@@ -384,7 +382,7 @@ async def test_reconciliation_reprovisions_resources_without_endpoints(tmp_path)
mock_create_resource.side_effect = [MagicMock(), MagicMock()]
result = await reconcile_and_provision_resources(
- app, "build-123", "dev", local_manifest, show_progress=False
+ app, "build-123", "dev", local_manifest
)
# Both resources should have been provisioned (re-provisioned actually)
diff --git a/tests/unit/cli/utils/test_formatting.py b/tests/unit/cli/utils/test_formatting.py
new file mode 100644
index 00000000..bdefce90
--- /dev/null
+++ b/tests/unit/cli/utils/test_formatting.py
@@ -0,0 +1,56 @@
+"""Tests for CLI formatting utilities."""
+
+from runpod_flash.cli.utils.formatting import format_datetime, state_dot
+
+
+class TestFormatDatetime:
+ def test_iso_utc_z_suffix(self):
+ result = format_datetime("2025-06-15T14:30:00Z")
+ assert "Jun" in result
+ assert "2025" in result
+ assert "AM" in result or "PM" in result
+
+ def test_iso_with_offset(self):
+ result = format_datetime("2025-06-15T14:30:00+00:00")
+ assert "Jun" in result
+ assert "2025" in result
+
+ def test_none_returns_dash(self):
+ assert format_datetime(None) == "-"
+
+ def test_empty_returns_dash(self):
+ assert format_datetime("") == "-"
+
+ def test_unparseable_returns_original(self):
+ assert format_datetime("not-a-date") == "not-a-date"
+
+ def test_includes_day_of_week(self):
+ # use a fixed-offset timestamp so the weekday is deterministic
+ result = format_datetime("2025-06-15T00:00:00+00:00")
+ # output should start with a 3-letter weekday abbreviation
+ assert result[:3].isalpha() and result[3] == ","
+
+ def test_includes_timezone(self):
+ result = format_datetime("2025-06-15T14:30:00Z")
+ # should have some tz abbreviation at the end
+ parts = result.split()
+ assert len(parts) >= 5
+
+ def test_no_leading_zeros(self):
+ # day "5" should not be zero-padded to "05"
+ result = format_datetime("2025-01-05T12:00:00+00:00")
+ assert " 5 " in result or " 5," in result
+
+
+class TestStateDot:
+ def test_healthy(self):
+ assert "[green]●[/green]" in state_dot("HEALTHY")
+
+ def test_building(self):
+ assert "[cyan]●[/cyan]" in state_dot("BUILDING")
+
+ def test_error(self):
+ assert "[red]●[/red]" in state_dot("ERROR")
+
+ def test_unknown_defaults_yellow(self):
+ assert "[yellow]●[/yellow]" in state_dot("WHATEVER")
diff --git a/tests/unit/core/utils/test_http.py b/tests/unit/core/utils/test_http.py
index 06f124e5..023bb812 100644
--- a/tests/unit/core/utils/test_http.py
+++ b/tests/unit/core/utils/test_http.py
@@ -77,6 +77,28 @@ def test_get_authenticated_httpx_client_zero_timeout(self, monkeypatch):
assert client is not None
assert client.timeout.read == 0.0
+ def test_get_authenticated_httpx_client_includes_user_agent(self, monkeypatch):
+ """Test client includes User-Agent header."""
+ monkeypatch.delenv("RUNPOD_API_KEY", raising=False)
+
+ client = get_authenticated_httpx_client()
+
+ assert client is not None
+ assert "User-Agent" in client.headers
+ assert client.headers["User-Agent"].startswith("Runpod Flash/")
+
+ def test_get_authenticated_httpx_client_user_agent_with_auth(self, monkeypatch):
+ """Test client includes both User-Agent and Authorization headers."""
+ monkeypatch.setenv("RUNPOD_API_KEY", "test-key")
+
+ client = get_authenticated_httpx_client()
+
+ assert client is not None
+ assert "User-Agent" in client.headers
+ assert "Authorization" in client.headers
+ assert client.headers["User-Agent"].startswith("Runpod Flash/")
+ assert client.headers["Authorization"] == "Bearer test-key"
+
class TestGetAuthenticatedRequestsSession:
"""Test the get_authenticated_requests_session utility function."""
@@ -123,3 +145,27 @@ def test_get_authenticated_requests_session_is_valid_session(self, monkeypatch):
assert isinstance(session, requests.Session)
session.close()
+
+ def test_get_authenticated_requests_session_includes_user_agent(self, monkeypatch):
+ """Test session includes User-Agent header."""
+ monkeypatch.delenv("RUNPOD_API_KEY", raising=False)
+
+ session = get_authenticated_requests_session()
+
+ assert session is not None
+ assert "User-Agent" in session.headers
+ assert session.headers["User-Agent"].startswith("Runpod Flash/")
+ session.close()
+
+ def test_get_authenticated_requests_session_user_agent_with_auth(self, monkeypatch):
+ """Test session includes both User-Agent and Authorization headers."""
+ monkeypatch.setenv("RUNPOD_API_KEY", "test-key")
+
+ session = get_authenticated_requests_session()
+
+ assert session is not None
+ assert "User-Agent" in session.headers
+ assert "Authorization" in session.headers
+ assert session.headers["User-Agent"].startswith("Runpod Flash/")
+ assert session.headers["Authorization"] == "Bearer test-key"
+ session.close()
diff --git a/tests/unit/core/utils/test_user_agent.py b/tests/unit/core/utils/test_user_agent.py
new file mode 100644
index 00000000..bc317bbf
--- /dev/null
+++ b/tests/unit/core/utils/test_user_agent.py
@@ -0,0 +1,99 @@
+"""Tests for user_agent module."""
+
+import platform
+import re
+
+
+def test_get_user_agent_format():
+ """Test User-Agent string format matches expected pattern."""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
+ ua = get_user_agent()
+
+ # Should match: "Runpod Flash/<version> (Python <python_version>; <os> <os_release>; <arch>)"
+ pattern = r"^Runpod Flash/[\w\.]+ \(Python [\d\.]+; \w+ [\w\.\-]+; [\w\d_]+\)$"
+ assert re.match(pattern, ua), f"User-Agent '{ua}' doesn't match expected format"
+
+
+def test_get_user_agent_contains_version():
+ """Test User-Agent includes version information."""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
+ ua = get_user_agent()
+
+ # Should start with "Runpod Flash/"
+ assert ua.startswith("Runpod Flash/"), (
+ f"User-Agent should start with 'Runpod Flash/', got: {ua}"
+ )
+
+ # Should contain version (either real version or 'unknown')
+ version_part = ua.split(" ")[2] # index 2 of "Runpod Flash/<version> (Python ..."
+ assert version_part.startswith("(Python"), (
+ "User-Agent should contain Python version"
+ )
+
+
+def test_get_user_agent_contains_python_version():
+ """Test User-Agent includes Python version."""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
+ ua = get_user_agent()
+ python_version = platform.python_version()
+
+ assert f"Python {python_version}" in ua, (
+ f"User-Agent should contain Python {python_version}"
+ )
+
+
+def test_get_user_agent_contains_os():
+ """Test User-Agent includes OS name."""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
+ ua = get_user_agent()
+ os_name = platform.system()
+
+ assert os_name in ua, f"User-Agent should contain OS name {os_name}"
+
+
+def test_get_user_agent_contains_os_version():
+ """Test User-Agent includes OS version."""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
+ ua = get_user_agent()
+ os_version = platform.release()
+
+ assert os_version in ua, f"User-Agent should contain OS version {os_version}"
+
+
+def test_get_user_agent_contains_architecture():
+ """Test User-Agent includes CPU architecture."""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
+ ua = get_user_agent()
+ arch = platform.machine()
+
+ assert arch in ua, f"User-Agent should contain architecture {arch}"
+
+
+def test_get_user_agent_structure():
+ """Test User-Agent has correct structure."""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
+ ua = get_user_agent()
+
+ # Should have exactly one opening and closing parenthesis
+ assert ua.count("(") == 1, "User-Agent should have exactly one opening parenthesis"
+ assert ua.count(")") == 1, "User-Agent should have exactly one closing parenthesis"
+
+ # Should have exactly two semicolons (Python/OS separator, OS/arch separator)
+ assert ua.count(";") == 2, "User-Agent should have exactly two semicolons"
+
+
+def test_get_user_agent_consistency():
+ """Test User-Agent is consistent across multiple calls."""
+ from runpod_flash.core.utils.user_agent import get_user_agent
+
+ ua1 = get_user_agent()
+ ua2 = get_user_agent()
+
+ assert ua1 == ua2, "User-Agent should be consistent across calls"
diff --git a/tests/unit/runtime/test_context.py b/tests/unit/runtime/test_context.py
index ce708a9c..a601ebfb 100644
--- a/tests/unit/runtime/test_context.py
+++ b/tests/unit/runtime/test_context.py
@@ -42,6 +42,77 @@ def test_local_development_empty_env_vars(self):
):
assert is_deployed_container() is False
+ def test_live_provisioning_overrides_endpoint_id(self):
+ """FLASH_IS_LIVE_PROVISIONING=true should override RUNPOD_ENDPOINT_ID."""
+ with patch.dict(
+ os.environ,
+ {
+ "RUNPOD_ENDPOINT_ID": "test-endpoint-123",
+ "FLASH_IS_LIVE_PROVISIONING": "true",
+ },
+ ):
+ assert is_deployed_container() is False
+
+ def test_live_provisioning_overrides_pod_id(self):
+ """FLASH_IS_LIVE_PROVISIONING=true should override RUNPOD_POD_ID."""
+ with patch.dict(
+ os.environ,
+ {
+ "RUNPOD_POD_ID": "test-pod-456",
+ "FLASH_IS_LIVE_PROVISIONING": "true",
+ },
+ clear=True,
+ ):
+ assert is_deployed_container() is False
+
+ def test_live_provisioning_overrides_both_ids(self):
+ """FLASH_IS_LIVE_PROVISIONING=true should override both RunPod IDs."""
+ with patch.dict(
+ os.environ,
+ {
+ "RUNPOD_ENDPOINT_ID": "test-endpoint-123",
+ "RUNPOD_POD_ID": "test-pod-456",
+ "FLASH_IS_LIVE_PROVISIONING": "true",
+ },
+ ):
+ assert is_deployed_container() is False
+
+ def test_live_provisioning_case_insensitive(self):
+ """FLASH_IS_LIVE_PROVISIONING should be case-insensitive."""
+ test_values = ["true", "True", "TRUE", "TrUe"]
+
+ for value in test_values:
+ with patch.dict(
+ os.environ,
+ {
+ "RUNPOD_ENDPOINT_ID": "test-endpoint-123",
+ "FLASH_IS_LIVE_PROVISIONING": value,
+ },
+ ):
+ assert is_deployed_container() is False
+
+ def test_live_provisioning_false_does_not_override(self):
+ """FLASH_IS_LIVE_PROVISIONING=false should not override deployment detection."""
+ with patch.dict(
+ os.environ,
+ {
+ "RUNPOD_ENDPOINT_ID": "test-endpoint-123",
+ "FLASH_IS_LIVE_PROVISIONING": "false",
+ },
+ ):
+ assert is_deployed_container() is True
+
+ def test_live_provisioning_empty_does_not_override(self):
+ """Empty FLASH_IS_LIVE_PROVISIONING should not override deployment detection."""
+ with patch.dict(
+ os.environ,
+ {
+ "RUNPOD_ENDPOINT_ID": "test-endpoint-123",
+ "FLASH_IS_LIVE_PROVISIONING": "",
+ },
+ ):
+ assert is_deployed_container() is True
+
class TestIsLocalDevelopment:
"""Tests for is_local_development function."""
@@ -73,3 +144,14 @@ def test_inverse_of_is_deployed(self):
for env_vars in test_cases:
with patch.dict(os.environ, env_vars, clear=True):
assert is_local_development() == (not is_deployed_container())
+
+ def test_local_with_live_provisioning(self):
+ """Should return True when FLASH_IS_LIVE_PROVISIONING=true even with RunPod IDs."""
+ with patch.dict(
+ os.environ,
+ {
+ "RUNPOD_ENDPOINT_ID": "test-endpoint-123",
+ "FLASH_IS_LIVE_PROVISIONING": "true",
+ },
+ ):
+ assert is_local_development() is True
diff --git a/tests/unit/test_logger.py b/tests/unit/test_logger.py
index de33226f..7b527ede 100644
--- a/tests/unit/test_logger.py
+++ b/tests/unit/test_logger.py
@@ -253,7 +253,7 @@ def test_log_level_override_via_env(self, tmp_path, monkeypatch):
monkeypatch.delenv("LOG_LEVEL")
def test_debug_format_includes_details(self, tmp_path, monkeypatch):
- """Verify DEBUG level uses detailed format."""
+ """Verify DEBUG level logging works with clean format."""
# Change to temp directory
monkeypatch.chdir(tmp_path)
@@ -275,10 +275,9 @@ def test_debug_format_includes_details(self, tmp_path, monkeypatch):
output = stream.getvalue()
- # Verify detailed format includes filename and line number
+ # Verify message is logged
assert "Debug message" in output
- assert "test_logger.py" in output # filename
- assert "test" in output # logger name
+ assert "DEBUG" in output
# Cleanup
cleanup_handlers(root_logger)
diff --git a/tests/unit/test_logger_sensitive_data.py b/tests/unit/test_logger_sensitive_data.py
index 573c3193..e5ad2640 100644
--- a/tests/unit/test_logger_sensitive_data.py
+++ b/tests/unit/test_logger_sensitive_data.py
@@ -127,10 +127,11 @@ def test_recursive_dict_sanitization(self):
assert sanitized_config["api"]["endpoint"] == "https://api.example.com"
def test_long_token_partial_redaction(self):
- """Verify long tokens show first/last 4 chars for debugging."""
+ """Verify prefixed API keys show first/last 4 chars for debugging."""
filter_instance = SensitiveDataFilter()
- long_token = "abcdefghijklmnopqrstuvwxyz0123456789"
+ # Use a prefixed token that will be caught by PREFIXED_KEY_PATTERN
+ long_token = "sk-abcdefghijklmnopqrstuvwxyz0123456789"
record = logging.LogRecord(
name="test",
level=logging.INFO,
@@ -143,7 +144,7 @@ def test_long_token_partial_redaction(self):
filter_instance.filter(record)
# Should show first 4 and last 4 chars
- assert "abcd" in record.msg
+ assert "sk-a" in record.msg
assert "6789" in record.msg
assert "***REDACTED***" in record.msg
assert long_token not in record.msg
diff --git a/uv.lock b/uv.lock
index 0d991417..ec9d3459 100644
--- a/uv.lock
+++ b/uv.lock
@@ -3176,7 +3176,7 @@ dependencies = [
[[package]]
name = "runpod-flash"
-version = "1.1.1"
+version = "1.2.0"
source = { editable = "." }
dependencies = [
{ name = "cloudpickle" },