Merged
33 changes: 23 additions & 10 deletions .env.example
@@ -5,22 +5,30 @@
# Generate with: python -c "import secrets; print(secrets.token_hex(32))"
SECRET_KEY=your-secure-random-key-here-replace-this

# Database configuration
# SQLite (default): sqlite:////app/instance/media_checker.db
# PostgreSQL: postgresql://user:password@host:port/database
# MySQL: mysql://user:password@host:port/database
DATABASE_URL=sqlite:////app/instance/media_checker.db
# Database configuration (PostgreSQL REQUIRED since v2.2.0)
# When using docker-compose.yml, the POSTGRES_* variables below are used instead.
# DATABASE_URL is only needed for standalone (non-Docker) deployments.
# DATABASE_URL=postgresql://pixelprobe:changeme@localhost:5432/pixelprobe

# PostgreSQL credentials (used by docker-compose.yml)
POSTGRES_PASSWORD=changeme

# Scanning configuration
# Comma-separated list of directories to monitor
# Comma-separated list of directories to monitor (inside container)
SCAN_PATHS=/media

# Optional: Path to your media files on the host system
# Path to your media files on the host system (mounted read-only into container)
MEDIA_PATH=/path/to/your/media

# Performance tuning
MAX_FILES_TO_SCAN=100
MAX_SCAN_WORKERS=4
# MAX_WORKERS: parallel file scanning threads per task (default: 10)
MAX_WORKERS=10
# BATCH_SIZE: files per database batch commit (default: 100)
BATCH_SIZE=100

# Celery configuration
# CELERY_CONCURRENCY: number of concurrent Celery tasks (default: 4)
CELERY_CONCURRENCY=4

# Timezone configuration
TZ=America/New_York
@@ -40,4 +40,9 @@ CLEANUP_SCHEDULE=
# Path exclusions (comma-separated)
EXCLUDED_PATHS=/media/temp,/media/cache
# Extension exclusions (comma-separated)
EXCLUDED_EXTENSIONS=.tmp,.temp,.cache
EXCLUDED_EXTENSIONS=.tmp,.temp,.cache

# SSRF trusted hosts (optional)
# Hostnames and/or CIDR ranges that bypass private-IP blocking for healthcheck pings.
# Comma-separated. Example: myhost.local,192.168.5.0/24
TRUSTED_INTERNAL_HOSTS=
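A minimal sketch of how a trusted-host allowlist like `TRUSTED_INTERNAL_HOSTS` could be evaluated, using only the standard library. The function names and the split between literal hostnames and CIDR networks are illustrative assumptions, not code from the PixelProbe source:

```python
import ipaddress

def parse_trusted_hosts(raw: str):
    """Split a comma-separated TRUSTED_INTERNAL_HOSTS value into
    literal hostnames and CIDR networks. Entries that fail to parse
    as networks are treated as hostnames."""
    hostnames, networks = set(), []
    for entry in (e.strip() for e in raw.split(",") if e.strip()):
        try:
            networks.append(ipaddress.ip_network(entry, strict=False))
        except ValueError:
            hostnames.add(entry.lower())
    return hostnames, networks

def is_trusted(host: str, hostnames, networks) -> bool:
    """Return True if `host` (hostname or IP literal) should bypass
    private-IP blocking for healthcheck pings."""
    if host.lower() in hostnames:
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # unknown hostname, not an IP: keep blocking
    return any(ip in net for net in networks)
```

Note that a real SSRF guard would also resolve hostnames and re-check the resulting IPs; this sketch only covers matching against the configured list.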
32 changes: 32 additions & 0 deletions CHANGELOG.MD
@@ -5,6 +5,38 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0).

## [2.5.67] - 2026-03-05

### Fixed

- **Fix stuck scan bug**: Scans could get permanently stuck as "active" after a Celery task crash (e.g., `psycopg2.DatabaseError`). The crash recovery handler in `scan_service.py` attempted to mark the scan as crashed but failed because the DB session was in a rolled-back state. Added `db.session.rollback()` before recovery writes and a re-query of the scan state with a fresh session.
- **Fix stuck scan detection for lost Celery tasks**: When Celery task state is `None` (task lost/unreachable), the `is_scan_running()` check previously assumed the scan was still running indefinitely. It now falls through to time-based detection: if there has been no update for over 1 hour and the task state is unknown, the scan is marked as crashed.
- **Fix scheduler stuck scan checker**: The `_check_stuck_scans` scheduler job now also verifies Celery task state. If a Celery task is gone AND no progress update for 5+ minutes, the scan is marked as crashed (previously only relied on 30-minute time threshold).
- **Fix Phase 3 progress display appearing frozen**: The scan-status API endpoint now reads real-time progress from Redis (instead of only PostgreSQL) when a scan is active. Redis is updated by the Celery scan worker on every file, while PostgreSQL lagged behind, causing the UI to show stale values like "97/397" or appear stuck.
- **Fix final progress not reflected in DB on scan completion**: All scan completion paths now write `files_processed` and `estimated_total` in the same SQL UPDATE that marks the scan as completed. Previously, the completion UPDATE only set `phase` and `is_active`, leaving stale progress values in PostgreSQL.
- **Fix UI worker exiting without final sync**: The `ui_progress_update_task` now performs a final Redis-to-PostgreSQL sync of progress values before exiting when it detects the scan is complete or inactive.
- **Fix ORM staleness in final Redis-to-DB sync**: Added `ui_session.refresh(scan_state)` before comparing Redis vs DB progress values in `_final_sync_redis_to_db()`. After `_mark_scan_completed()` writes via raw SQL, the ORM object could have stale values, causing unnecessary or missed updates.
- **Fix trailing whitespace in scan_routes.py**: Removed trailing whitespace on a blank line.
- **Move inline import to module level in tasks.py**: `get_scan_progress_redis` was imported inline at line 775 despite being available from the existing module-level import.
- Files affected: `pixelprobe/services/scan_service.py`, `pixelprobe/api/scan_routes.py`, `pixelprobe/scheduler.py`, `pixelprobe/tasks.py`, `pixelprobe/progress_utils.py`
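The time-based fallback described in the lost-task fix above can be sketched as follows. The threshold, function name, and signature are assumptions based on the changelog text, not the actual `is_scan_running()` implementation:

```python
from datetime import datetime, timedelta

# Assumed threshold mirroring the changelog: unknown task state plus
# one hour of silence means the scan is treated as crashed.
UNKNOWN_TASK_TIMEOUT = timedelta(hours=1)

def should_mark_crashed(task_state, last_update, now=None):
    """Fallback check for lost Celery tasks: if Celery reports no
    state for the task and the scan row has not been updated within
    the timeout, treat the scan as crashed."""
    now = now or datetime.utcnow()
    if task_state is not None:  # Celery still knows about the task
        return False
    return (now - last_update) > UNKNOWN_TASK_TIMEOUT
```

The key design point is that an unknown task state alone is not enough to declare a crash; it must be combined with a stale progress timestamp so that briefly unreachable tasks are not killed prematurely.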

### Added

- **`get_scan_progress_redis()` in `pixelprobe/progress_utils.py`**: Read function for fetching real-time scan progress from Redis, used by the scan-status API endpoint.
- **`_mark_scan_completed()` in `pixelprobe/services/scan_service.py`**: Extracted helper that consolidates all scan completion SQL into a single method, replacing 7 duplicated SQL blocks.
- **Unit tests for new functions**: Added `tests/unit/test_progress_utils.py` with 12 tests covering `get_scan_progress_redis()`, `_final_sync_redis_to_db()` sync logic, and `_mark_scan_completed()` SQL behavior.
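A rough sketch of what a Redis-backed progress reader like `get_scan_progress_redis()` might look like. The key format and JSON payload here are assumptions for illustration; the real implementation in `pixelprobe/progress_utils.py` may differ:

```python
import json

def get_scan_progress_redis(redis_client, scan_id):
    """Fetch real-time scan progress written by the Celery worker.
    Returns a dict of progress fields, or None if Redis has no entry
    (or the entry is unparseable), letting callers fall back to the
    PostgreSQL scan state."""
    raw = redis_client.get(f"scan_progress:{scan_id}")  # assumed key scheme
    if raw is None:
        return None
    try:
        return json.loads(raw)
    except (TypeError, ValueError):
        return None
```

Returning `None` on any miss or parse failure keeps the API endpoint safe: it can always fall back to the (possibly lagging) database values instead of erroring out.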

### Changed

- **Move all application modules into `pixelprobe/` package**: Moved `models.py`, `auth.py`, `config.py`, `media_checker.py`, `scheduler.py`, `celery_config.py`, `version.py` from root into the `pixelprobe/` package. Root now only contains entry points (`app.py`, `celery_worker.py`) and build/config files. Updated 73+ import statements across 43 files.
- **Move root `utils.py` into `pixelprobe/utils/helpers.py`**: Consolidated shared utilities (`ProgressTracker`, `create_state_dict`, `batch_process`, etc.) into the package.
- **Documentation cleanup**: Fixed port 5001 references (should be 5000), updated stale version references (v2.4.48/v2.4.93), fixed SQLite references (PostgreSQL-only since v2.2.0), rewrote PROJECT_STRUCTURE.md, fixed container name references.
- **Repository cleanup**: Removed legacy files (`database_migrations.py`, `init_db.py`, `operation_handlers.py`, `utils.py`, `migrations/` directory, broken shell scripts, obsolete patches, outdated development docs).
- Updated docker-compose.yml image tags from `pixelprobe:test-v2.4.93` to `ttlequals0/pixelprobe:2.5.67`.
- Updated `.env.example` with correct PostgreSQL defaults and added missing env vars.

---

## [2.5.66] - 2026-02-23

### Fixed
12 changes: 6 additions & 6 deletions README.md
@@ -42,7 +42,7 @@ PixelProbe is a comprehensive media file corruption detection tool with a modern

### Web Interface
- Modern responsive design with dark/light theme support
- Real-time scan progress with WebSocket updates
- Real-time scan progress with live polling updates
- Advanced filtering and search capabilities
- Bulk file selection and management with shift-click range selection
- Mobile-optimized touch interface
@@ -257,20 +257,20 @@ Fine-tune scanning behavior by excluding specific paths and file types:
```

4. **Access the web interface**:
Open http://localhost:5001 in your browser
Open http://localhost:5000 in your browser

5. **Initial Setup** (IMPORTANT - First Run Only):

On first run, you must create the admin account via the setup endpoint:

```bash
# Create admin user with your chosen password
curl -X POST http://localhost:5001/api/auth/setup \
curl -X POST http://localhost:5000/api/auth/setup \
-H "Content-Type: application/json" \
-d '{"password":"YourSecurePassword123"}'
```

Or visit http://localhost:5001/login and follow the first-run setup wizard.
Or visit http://localhost:5000/login and follow the first-run setup wizard.

**Security Note**: No default admin account exists. You must explicitly create it on first run.

@@ -341,7 +341,7 @@ export MEDIA_PATH=/mnt/all-media # Contains subdirs: movies/, tv/, backup/

### Web Interface

1. **Access the Dashboard**: Navigate to http://localhost:5001
1. **Access the Dashboard**: Navigate to http://localhost:5000
2. **Start a Scan**: Click "Scan All Files" to begin scanning your media directories
3. **View Results**: Results appear in the table below with corruption status
4. **Filter Results**: Use the filter buttons to show only corrupted or healthy files
@@ -474,7 +474,7 @@ curl http://localhost:5000/api/stats \
### Command Line Usage

```python
from media_checker import PixelProbe
from pixelprobe.media_checker import PixelProbe

checker = PixelProbe()

14 changes: 7 additions & 7 deletions app.py
@@ -19,9 +19,9 @@
import json

# Import database and models
from models import db
from version import __version__, __github_url__
from scheduler import MediaScheduler
from pixelprobe.models import db
from pixelprobe.version import __version__, __github_url__
from pixelprobe.scheduler import MediaScheduler

# Import blueprints from new modular structure
from pixelprobe.api.scan_routes import scan_bp
@@ -36,7 +36,7 @@
from pixelprobe.api.auth_routes import auth_api_bp, auth_ui_bp, auth_bp # auth_bp for backward compat

# Import authentication module
from auth import init_auth, auth_required
from pixelprobe.auth import init_auth, auth_required

# OpenAPI documentation is available as openapi.yaml in the project root

@@ -66,7 +66,7 @@
# Configure app
# Require SECRET_KEY in production - no insecure fallback
# Load configuration from config module
from config import get_config
from pixelprobe.config import get_config
config_name = os.getenv('FLASK_ENV', 'development')
config_class = get_config(config_name)
config_class.init_app(app)
@@ -128,7 +128,7 @@
init_auth(app)

# P1 Implementation: Initialize Celery task queue
from celery_config import create_celery, init_celery
from pixelprobe.celery_config import create_celery, init_celery
celery = create_celery(app)
init_celery(app, celery)
# CRITICAL: Attach celery to app so scan_routes can find it
@@ -247,7 +247,7 @@ def sync_scan_paths_to_db():
Uses INSERT ... ON CONFLICT DO NOTHING to handle race conditions when
multiple gunicorn workers start simultaneously.
"""
from models import ScanConfiguration
from pixelprobe.models import ScanConfiguration
from sqlalchemy.dialects.postgresql import insert

scan_paths = app.config.get('SCAN_PATHS', [])
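The race-safe insert pattern used by `sync_scan_paths_to_db()` can be demonstrated with SQLite, which supports the same `ON CONFLICT ... DO NOTHING` clause as PostgreSQL (since SQLite 3.24). The table and column names below are simplified stand-ins for the real `ScanConfiguration` model:

```python
import sqlite3

# In-memory stand-in for the scan_configuration table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_configuration (path TEXT PRIMARY KEY)")

def sync_scan_paths(conn, paths):
    """Insert each configured path, silently skipping rows that already
    exist. Mirrors the INSERT ... ON CONFLICT DO NOTHING approach in
    app.py, so two workers syncing concurrently cannot raise a
    duplicate-key error."""
    for p in paths:
        conn.execute(
            "INSERT INTO scan_configuration (path) VALUES (?) "
            "ON CONFLICT (path) DO NOTHING",
            (p,),
        )
    conn.commit()

# Two "gunicorn workers" syncing the same list is harmless:
sync_scan_paths(conn, ["/media", "/media/tv"])
sync_scan_paths(conn, ["/media", "/media/tv"])
```

The alternative, SELECT-then-INSERT, has a window between the check and the write where another worker can insert the same row; pushing the conflict handling into the database closes that window.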
94 changes: 0 additions & 94 deletions database_migrations.py

This file was deleted.

8 changes: 4 additions & 4 deletions docker-compose.yml
@@ -40,7 +40,7 @@ services:
networks:
- pixelprobe-network

# Redis for P1 Celery implementation
# Redis for Celery task broker
# Performance settings for large task queues (file changes check with 1M+ files):
# - REDIS_MAX_MEMORY: Total memory for task queue (default: 2gb, recommended: 1-4gb)
redis:
@@ -62,7 +62,7 @@

# PixelProbe application
pixelprobe:
image: pixelprobe:test-v2.4.93
image: ttlequals0/pixelprobe:2.5.67
container_name: pixelprobe-app
environment:
# Security
@@ -75,7 +75,7 @@
POSTGRES_USER: pixelprobe
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-changeme}

# P1 Celery configuration (scaffolded for future implementation)
# Celery configuration
CELERY_BROKER_URL: redis://redis:6379/0
CELERY_RESULT_BACKEND: redis://redis:6379/0

@@ -126,7 +126,7 @@

# P1 Celery Worker for distributed task processing
celery-worker:
image: pixelprobe:test-v2.4.93
image: ttlequals0/pixelprobe:2.5.67
container_name: pixelprobe-celery-worker
command: python celery_worker.py
environment: