feat: metrics-to-structured-jsonl with CLI fixes, observer refactoring, and test hardening by ElNiak · Pull Request #116 · ElNiak/PANTHER

ElNiak · 2026-03-27T11:45:25Z

Summary

Observer refactoring, CLI bug fixes, event system improvements, config deduplication, and test hardening (41 commits)
~117 files changed, net +108 lines

Changes

Observer system: Extracted _is_duplicate helper, fixed test isolation, wired up typed handlers, fixed broken dedup and is_interested substring bug
Event system: Flattened ServiceManagerEventMixin (881→700 lines), extracted _requires_emitter decorator
CLI fixes: Fixed crashes in plugins params/scan, check --format/--imports, create plugin/subplugin, validate command Path handling
Config: Deduplicated ValidationResult/ValidationError across config system, wired 8 orphaned CLI params
Reporting: Fixed O(n*m) dedup, log pattern errors, inaccurate docstrings
Error handling: Replaced bare except:pass with circuit-breaker logging
Logging: Added ConsoleFormatter with phase context, --verbose/--debug flags
Metrics: Write metrics to structured.jsonl, CLI query support
Tests: Fixed 17 pre-existing failures, 62 webapp test failures, early termination test timing issues

Test plan

Run pytest tests/ -n auto -m unit — verify no regressions
Run pytest tests/ -n auto -m integration — verify Docker-dependent tests
Verify CLI commands: panther plugins list, panther check, panther config validate
Run experiment end-to-end and verify metrics output

Replace the raw colorlog/logging.Formatter in LoggerFactory with a new ConsoleFormatter that provides: - Short HH:MM:SS timestamps (no date for live runs) - Phase/service context from LogContext: (phase|service_id) prefix - Abbreviated module names (strips panther.core. prefix) - Static banner() method for phase separators via click.echo() Wire phase banners into ExperimentManager at each transition: initialization, plugin loading, test execution, and cleanup.

Add LoggerFactory.set_console_level() classmethod that updates all StreamHandler instances (root + named loggers) to a given level while keeping FileHandlers at DEBUG for structured JSONL output. Wire --verbose (DEBUG) and --debug (TRACE) Click options through the run command into ExperimentManager via a new console_level parameter, giving CLI flags final precedence over YAML config. Includes 8 unit tests covering level updates, FileHandler preservation, TRACE support, and idempotency.

Verify format_test_header() and format_test_result() output formatting, emoji control, timing precision, and padding behavior.

Add format_experiment_summary() that displays a structured banner after all tests complete, showing pass/fail counts with percentage, duration, output directory, failed test details, and next-step CLI commands. Replaces the simple one-line completion message.

DockerOutputParser now adapts to console verbosity: concise one-liner step labels by default, full instruction text when --verbose/--debug is active. Download/extract progress is suppressed in concise mode. Build start, completion (with duration), and cache-skip messages use click.echo() for consistent structured output.

…pport Metrics are now emitted as JSONL entries (source="metrics") directly into structured.jsonl instead of separate metrics/ directory with CSV/JSON files. Adds `panther logs metrics` CLI command and metric_names/metric_types filters to LogFilter. - MetricsCollector writes to structured.jsonl via _write_metric_jsonl() - Removed metrics/ dir creation and CSV/JSON export from cleanup - Added metric_names/metric_types fields to LogFilter - Added `panther logs metrics` CLI subcommand - Removed stale empty outputs/metrics/ directory

…dispatch, and harden error handling Comprehensive logging cleanup across ~40 files: replace str(e) with e, f-strings with lazy %s format, add exc_info=True to error/warning logs, remove emojis from log messages, and replace print(stderr) with proper logger calls. Fix observer event naming to match ITypedObserver dispatch convention (handle_* → on_*) in CommandAuditObserver, update StateEventObserver event types, add test.failed handler to ExperimentObserver, and widen MetricsObserver interest to test/step events. Align FeatureLogLevelsConfig fields with feature_registry canonical names. Use LoggerFactory.get_logger() in TestCaseBase for feature-level filtering. Add atomic write for output index, build timeout for buildx, safer BuildKit fallback, and defensive try/except in RCA pattern matching.

…ssion Post-commit audit of a8f804d found several issues: - CommandAuditObserver.on_service_command_modified and on_service_config_generated were dead code — the event dispatch system resolves handler names from event.name (preparation_completed, deployment_started), not from the action field. Replaced with action-based sub-dispatch inside on_service_preparation_completed and new on_service_deployment_started handler. - get_supported_event_types() returned non-existent event names (service.command_modified, service.config_generated). Fixed to return actual dispatched names. - Structured JSONL file handler level was changed from DEBUG to console_level, losing post-mortem debug data. Reverted to DEBUG. - Removed 5 unused imports in test_case_base.py left over from _setup_logging refactoring. - Updated all 56 tests to use dispatch-aware routing; added 4 new on_event() integration tests verifying command_modified and config_generated dispatch end-to-end.

…patch tests - Fix dedup in ITypedObserver.on_event(): change event.event_id to event.id and add the actual membership check before appending (was append-only, never checked) - Fix is_interested() substring false positives: replace `in` containment with bidirectional prefix matching and add min-length guard to reject empty/"e" inputs - Fix type-handler class name matching in is_interested() to use startswith instead of substring containment - Update docstrings: event.uuid -> event.id in observer_interface.py and observer/__init__.py; fix handler naming convention note in typed_observer_interface.py - Add 23 tests covering _handler_name_for(), on_event() dispatch routing, dedup, is_interested() correctness, and error handling

…ault, use set for dedup - LoggerObserver._process_event now dispatches to typed handlers (on_experiment_failed, on_test_failed, on_service_error, etc.) before falling back to generic _log_event, so emoji-formatted ERROR-level logging is no longer dead code (H2) - ExperimentObserver._handle_test_execution_completed defaults success to True (M1: "completed" implies success) - ExperimentObserver.get_priority docstring corrected to match IObserver contract: higher number = higher priority (M12) - EventStreamRecorder.processed_events_uuids uses set instead of list for O(1) dedup lookups (M2) - base_execution_environment.handle_event uses event.name instead of dead class-name comparisons (H8)

…gging, prevent data loss

…cstring

…, LoggerObserver dedup+error handling - Change IObserver.processed_events_uuids from List to set for O(1) lookup - Fix ITypedObserver to use .add() instead of .append() for set - Add dedup guard to LoggerObserver._process_event (bypasses ITypedObserver.on_event) - Add try/except around LoggerObserver typed/named handler calls - Trim _logger_typed_handlers to name-based-only entries (type-based already handled) - Fix contradictory test docstring in test_observer_dispatch.py

…, and metrics tests - Fix is_interested() test expectations: MetricsObserver matches "test" entity prefix, StateEventObserver uses exact-match list (docker_build.started not in it) - Fix CommandAuditObserver tests: route through on_event() instead of calling non-existent public handle_* methods, fix fixture processed_events_uuids type (list→set), add _type_handlers - Fix StateEventObserver test: on_command_generation_started→on_service_preparation_started - Fix LoggerFactory test: replace sys.stdout mock with real StreamHandler capture - Fix ExperimentManager counter tests: add missing experiment_dir to _make_manager fixture - Fix cleanup test: assert finalize() called instead of expecting metrics.json export

…d method RC1: Add `web` to `dev` extras in pyproject.toml, install with `.[dev]` in panther_builder.py, add `web` extra to CI workflows, and wrap components/__init__.py imports in try/except for resilience. RC2: Replace all `update_gui()` calls with `on_event()` in test_web_observer.py to match the renamed WebObserver method.

Remove click.echo() from core code (4 files), replace print() with logger in service_builder, convert ~76 f-string debug calls to lazy %-style formatting, audit log levels (DEBUG↔INFO adjustments, add recovery context to warnings, operation context to errors, completion logs for orphaned starts), and gate LoggerFactory internal debug logging behind _DEBUG_FACTORY flag.

…lic API Follow established codebase pattern (tools/docs_gen/__init__.py) of defining __all__ on successful import and falling back to __all__ = [] when dependencies are missing.

Adds e2e to norecursedirs in both pyproject.toml and tests/pytest.ini so pytest never imports tests/e2e/conftest.py (which requires the uninstalled pytest_asyncio package) when running unit tests with -m unit.

… up logging - Remove CommandAuditObserver and PluginObserver (unused) - Simplify ExperimentObserver and StateObserver (~2100 lines removed) - Streamline ObserverFactory registration - Clean up logger_factory and logging_mixin - Update tests to match simplified observer system - Remove stale observer references from plugin_manager and experiment_manager

…oss observer system Fix 8 critical and 10 important documentation issues found in PR review: - Fix set/list mismatch in observer_interface docstrings (.append → .add) - Replace "compile-time type checking" with "runtime type dispatch" - Clarify multi-layer deduplication architecture - Replace all tqdm references with Click (rename tqdm_msg → build_msg) - Add missing event_stream type to factory docstring - Fix misleading --json help text in CLI commands - Fix O(log n) claim → O(n) in event_manager - Trim bloated logger_factory docstring, remove unverified perf claims - Remove dead ColoredFormatter import and duplicate docstrings - Fix priority description in EventStreamRecorder

… clean up dead code - Extract dedup check into ITypedObserver._is_duplicate() reused by all observers - EventStreamRecorder: keep file handle open instead of re-opening per event - ExperimentObserver: remove redundant _should_terminate_early flag - LoggerObserver: remove unused Click progress coordination and priority methods, add bounded collections to prevent unbounded memory growth - StorageObserver: initialize _instance_lock at class level (not lazily) - Fix 2 pre-existing test failures: mock DockerBuilder.get_instance and reset PluginManager singleton for xdist isolation, fix wrong PluginFormInfo attribute names in test assertions

…lt_observer_set functions These two factory functions were exported but never called by any production or test code. Removing them and their re-exports.

Remove _try_forward_to_cli() and dead argparse choices (package, package-test, install-local, docs, serve-docs, deploy-docs, zip-outputs). Keep only the two bootstrap commands: package-dev and clean.

… config system - Remove duplicate classes from validation_ops.py, import from canonical components/validators.py - Add __str__ to canonical ValidationResult for feature parity - Rename command_processor's ValidationResult → CommandValidationResult to disambiguate (different API: 1-arg add_error vs 2-arg)

- Add docker user mapping params (run_as_host, uid, gid, user_name) to CLI_CONFIG_MAP so they flow through to config - Pass plugin directory params (exec_env_dir, net_env_dir, iut_dir, testers_dir) to ConfigurationManager constructor

Extract two helpers to eliminate repeated boilerplate in 12 notify_* methods: - _get_service_context(): builds (service_id, service_name) tuple - _emit_with_fallback(): handles the validated→service_emitter→direct triple-fallback emission pattern All public method signatures preserved. notify_service_event() dispatch table and emit_test_*/handle_event kept as-is (unique logic per branch).

…methods Replace repeated inline guard clause (hasattr + truthiness check on event_emitter) with a module-level @_requires_emitter decorator. Also replaces dead _emit_with_fallback triple-fallback with direct _emit_event, and removes no-value debug logs from emit_test_* guards. Net: ~30 lines removed, single source of truth for the guard logic.

Replace fragile conditional-then-remove command building with clean append-based construction.

…plugin create_plugin() and create_subplugin() do not accept force_overwrite.

discover_plugins() returns Dict[str, PluginMetadata], not Dict[str, List[str]]. Fix params to use direct dict lookup and scan to group by metadata.type for display.

click.Path() returns str but validate() calls .is_dir() and .is_file(). Add Path() conversion matching the existing pattern in check_deps().

… observer comment - Add missing FastFailHandler import (2 tests crashed with NameError) - Remove module-level DockerBuilder.reset_singleton() (unsafe for xdist) - Add step.* prefix tests for MetricsObserver.is_interested() - Fix stale comment about MetricsObserver event scope

…s, metrics filter) 1.1: Fix Jinja2 whitespace stripping that merged success summary lines 1.2: Silence ~38 unhandled event types with noop handlers + configurable logging 1.3: Broaden MetricsObserver.is_interested() to accept test.* and step.* events 1.4: Group service health tables by test_name in both Jinja2 and fallback reports 1.5: Remove duplicate TestCompleted/TestFailed emissions from ExperimentManager 1.6: Add required/optional pattern distinction for output completeness calculation

TestServiceHealthTestName and TestErrorCategoryMatching were defined twice in test_status_collector.py due to the production merge.

…create commands These options were accepted by `panther create plugin` and `panther create subplugin` but silently ignored — never forwarded to create_plugin()/create_subplugin(). Removes them to eliminate user confusion and clean up the public API surface.

- Change [tool:pytest] to [pytest] in tests/cli/pytest.ini (fixes PytestUnknownMarkWarning for @pytest.mark.unit in CLI tests) - Fix validate() and check_deps() type annotations: click.Path() returns str, not Path

- Add pytestmark = pytest.mark.integration at module level - Fix Mock identity comparison (== -> is) for fast_fail_handler - Replace DockerConnection test: builder enters cache-only mode (client=None), not exception - Remove overly specific message/context assertions from docker build failure tests - Add mock_docker_client fixture to test_docker_client_none_raises_exception (singleton isolation) - Fix test_docker_client_none: catches PantherException (validate_build_prerequisites fires first) - Fix test_docker_build_failure_stops_experiment: test handler directly instead of via build_image - Fix test_configuration_based_fast_fail_behavior: remove critical_only (nonexistent), test real behavior - Fix mock_event_manager: configure cleanup_none_observers() to return 0 (not MagicMock) - Fix TestPluginManagerFastFail: use plugin_factory/create_service_manager (service_factory doesn't exist) - Fix test_experiment_initialization_error: assign experiment_config; catch TestCaseInitializationError

…ed API, mock setup) - Replace removed _deploy_services_non_blocking() / deploy_services() tests with test_deploy_services_monitoring() and test_monitoring_not_active_before_deployment_event() - Fix thread-based tests to use polling with deadline instead of fixed sleeps - Mock execute_docker_command (not _is_service_ready) to match _is_service_ready_with_timeout() internals - Fix stop_monitoring() assertion: now sets monitor_thread=None internally - Fix docker_compose_env fixture: env_type must be "network_environment" not "network" - Fix test_test_case_checks_early_termination: use MagicMock for configs, steps.wait attribute - Add pytestmark = pytest.mark.integration

…ed API, mock setup) - Add pytestmark = pytest.mark.integration - Fix thread timing: use stop_monitoring() internal join instead of sleep - Replace removed _deploy_services_non_blocking/deploy_services API calls - Fix BackgroundServiceMonitor mock: mock execute_docker_command not _is_service_ready - Fix TestCase mock setup with proper patches - Clean up unused imports (threading, TestConfig) - Fix BackgroundServiceMonitor import path (moved to own module)

sourcery-ai

Sorry @ElNiak, your pull request is larger than the review limit of 150000 diff characters

Copilot

Pull request overview

This PR modernizes PANTHER’s logging/observer/metrics pipeline by moving metrics into the shared structured.jsonl stream, refactoring typed observer dispatch/dedup behavior, and hardening multiple CLI commands and tests to be more robust under parallel execution and optional dependencies.

Changes:

Added structured JSONL metric emission and query support (engine filters + CLI logs metrics).
Refactored observer/event plumbing (typed observer dispatch, state observer routing, recorder persistence/dedup).
Fixed and hardened CLI commands and tests (path handling, plugin metadata handling, logging verbosity controls, xdist isolation).

Reviewed changes

Copilot reviewed 114 out of 115 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
tests/unit/test_webapp/test_web_observer.py	Update observer API usage (`on_event`)
tests/unit/test_webapp/test_plugin_forms.py	Patch DockerBuilder singleton in tests; adjust form info assertions
tests/unit/test_plugins/test_modern_plugin_system.py	Reset singletons for xdist; patch DockerBuilder
tests/unit/test_core/test_utils/test_logger_factory_console_level.py	New tests for console-level updates
tests/unit/test_core/test_utils/test_console_formatter.py	New tests for ConsoleFormatter/context tagging
tests/unit/test_core/test_status_collector.py	Remove redundant/obsolete unit tests
tests/unit/test_core/test_plugin_manager.py	Align with plugin event emitter changes
tests/unit/test_core/test_observer_dispatch.py	New typed observer routing/dedup/is_interested tests
tests/unit/test_core/test_metrics_structured_jsonl.py	New tests for metrics JSONL + query filters
tests/unit/test_core/test_metrics_export_lifecycle.py	Cleanup now verifies collector finalization
tests/unit/test_core/test_metrics_counter_increments.py	Ensure experiment_dir exists in test harness
tests/unit/test_core/test_logging_mixin.py	Update assertions for lazy logging formatting
tests/unit/test_core/test_logger_factory.py	Fix log capture strategy for formatting test
tests/unit/test_config/test_environment_hierarchy.py	Update expected default for `output_format`
tests/unit/plugins/environments/execution_environment/conftest.py	Update strace fixture output_format
tests/pytest.ini	Exclude `e2e/` from recursion
tests/integration/test_fast_fail_integration.py	Update expectations for fast-fail behavior and APIs
tests/cli/unit/commands/test_plugins_validate_path.py	New test for validate path conversion
tests/cli/unit/commands/test_plugins_metadata.py	New tests for PluginMetadata handling in CLI
tests/cli/unit/commands/test_create_command_api.py	New tests to prevent invalid kwargs to creators
tests/cli/unit/commands/test_check_command_building.py	New tests for check command subprocess args
tests/cli/pytest.ini	Fix pytest section header
pyproject.toml	Add web deps to dev; set pytest basetemp; norecursedirs
panther_builder.py	Simplify bootstrap commands; use `.[dev]` installs
panther/webapp/components/init.py	Make webapp components import optional
panther/tools/plugins/services/tutorials/tutorial.py	Switch to lazy logging formatting
panther/tools/plugins/environments/tutorials/tutorial.py	Improve logging; docstring cleanup; add `noqa` where needed
panther/tools/docs_gen/mkdocs/fix_markdown_links.py	Add module docstring; docstring cleanup
panther/tools/docs_gen/link_rewriting.py	Add clarifying fallback comments
panther/tools/docs_gen/generate_plugin_inventory.py	Replace bare except with debug logging
panther/tools/config/panther_config_validator.py	Add comments; improve error strings
panther/plugins/services/service_manager_mixin.py	Docstring formatting cleanup; adjust log levels
panther/plugins/plugin_manager.py	Remove PluginObserver; emitter-only event integration
panther/plugins/environments/network_environment/shadow_ns/shadow_network_resolver.py	Docstring cleanup; error message formatting
panther/plugins/environments/network_environment/placeholder_parser.py	Docstring cleanup; error message formatting
panther/plugins/environments/network_environment/mixins/subprocess_executor.py	Docstring cleanup; adjust logging verbosity
panther/plugins/environments/network_environment/localhost_single_container/localhost_network_resolver.py	Docstring cleanup; error message formatting
panther/plugins/environments/network_environment/docker_compose/docker_network_resolver.py	Docstring cleanup; error message formatting
panther/plugins/environments/network_environment/base_network_resolver.py	Docstring cleanup; safer exception messages
panther/plugins/environments/execution_environment/base_execution_environment.py	Update event handling to name-based logic
panther/plugins/core/plugin_factory.py	Improve error logging; strip None config fields
panther/core/utils/logging_mixin.py	Avoid eager work; use lazy formatting + debug gating
panther/core/utils/console_formatter.py	New console formatter with context tags + banners
panther/core/utils/init.py	Export ConsoleFormatter
panther/core/test_cases/test_case_impl.py	Remove emoji glyphs from dry-run logs
panther/core/test_cases/mixins/service_management.py	Improve error messages with context
panther/core/test_cases/mixins/environment_management.py	Make execution env setup failures fatal
panther/core/test_cases/execution/test_executor.py	Use lazy logging formatting
panther/core/test_cases/base/test_case_base.py	Switch test-case logging to LoggerFactory + file handler
panther/core/test_cases/analysis/output_analyzer.py	Remove emoji glyphs; improve error strings
panther/core/reporting/root_cause_analyzer.py	Improve dedup + isolate pattern matcher failures
panther/core/reporting/log_query_engine.py	Add metric filters; normalize datetime comparisons
panther/core/reporting/failure_patterns.py	Remove outdated comment block
panther/core/reporting/experiment_reporter.py	Log with tracebacks; group service health by test
panther/core/outputs/output_index.py	Atomic writes for output index
panther/core/outputs/output_aggregator.py	Reduce log verbosity; improve warnings
panther/core/outputs/init.py	Clarify internal-only interfaces in docs
panther/core/observer/management/event_manager.py	Fix performance notes in docstring
panther/core/observer/impl/storage_observer.py	Make instance lock eager; keep failed pending events; lazy formatting
panther/core/observer/impl/state_observer.py	Table-driven routing for workflow state transitions
panther/core/observer/impl/plugin_observer.py	Removed legacy PluginObserver implementation
panther/core/observer/impl/metrics_observer.py	Normalize log_level type; add completion log
panther/core/observer/impl/event_stream_recorder.py	Cache file handle; reuse dedup helper; warn on write failures
panther/core/observer/impl/command_audit_observer.py	Removed command audit observer
panther/core/observer/impl/init.py	Remove PluginObserver export
panther/core/observer/base/typed_observer_interface.py	Add `_is_duplicate`; safer handler naming; fix is_interested
panther/core/observer/base/observer_interface.py	Use set for dedup; add `_get_event_type_safely`
panther/core/observer/init.py	Update docs; remove default observer set exports; remove PluginObserver
panther/core/metrics/resource_monitor.py	Add more context to monitoring errors
panther/core/metrics/metrics_exporter.py	Include tracebacks on exporter errors
panther/core/metrics/metrics_collector.py	Add structured JSONL metric writes; suppress repeated warnings
panther/core/experiment_analysis.py	Docstring cleanup; tracebacks on warnings; remove emoji glyphs
panther/core/exceptions/fast_fail.py	Minor string formatting cleanup
panther/core/exceptions/error_handler_mixin.py	Avoid redundant `str()`; preserve tracebacks
panther/core/docker_builder/platform_detection.py	Safer fallback when Dockerfile unreadable
panther/core/docker_builder/image_operations.py	Add build/cached info logs
panther/core/docker_builder/caching/docker_image_cache.py	Docstring cleanup; lazy formatting; atomic save logging
panther/core/docker_builder/caching/docker_build_cache_mixin.py	Improve warnings with rebuild hint
panther/core/docker_builder/buildx_operations.py	Add build logs; add build timeout
panther/core/command_processor/core/validator.py	Rename result type to CommandValidationResult
panther/core/command_processor/core/init.py	Export CommandValidationResult
panther/core/command_processor/builders/service_builder.py	Replace prints with logger warnings
panther/config/core/utils/cli_overrides.py	Wire docker user-mapping CLI overrides
panther/config/core/models/global_config.py	Expand/rename feature log-level fields
panther/config/core/models/environment.py	Make ExecutionEnvironmentConfig.output_format optional
panther/config/core/mixins/validation_ops.py	Deduplicate validation types via shared components
panther/config/core/mixins/config_loading.py	Raise validation completion logs to INFO
panther/config/core/components/validators.py	Add ValidationResult.str
panther/cli/commands/run.py	Add --verbose/--debug; pass console_level; wire plugin dirs
panther/cli/commands/report.py	Clarify `--json` help text
panther/cli/commands/plugins.py	Fix metadata handling; convert click path strings to Path
panther/cli/commands/logs.py	Validate timestamp/regex args; add `logs metrics`
panther/cli/commands/create.py	Remove unsupported CLI options; add debug logging on errors
panther/cli/commands/config.py	Add debug logs on validation/schema errors
panther/cli/commands/check.py	Fix subprocess command construction
panther/cli/commands/admin.py	Add debug logs on failures
experiment-config/base/experiment_config_example_minimal_docker.yaml	Update logging defaults/feature levels; docker cache defaults
.gitignore	Ignore pytest temp dirs
.github/workflows/unittests_codecov.yml	Install `web` extras in CI
.github/workflows/pr-validation.yml	Install editable with tests/lint/web extras

Comments suppressed due to low confidence (1)

panther/tools/docs_gen/mkdocs/fix_markdown_links.py:14

The file has a shebang (#!/usr/bin/env python3) in the middle of the module (after imports) and a second triple-quoted string that becomes a no-op string literal. This is confusing and defeats the purpose of a shebang. Move the shebang to line 1 (or remove it if this isn’t intended to be executed directly) and keep a single module docstring.

Copilot · 2026-03-27T11:50:44Z

        process._log_files = (stdout_handle, stderr_handle)

-        self.logger.debug(f"Background process started with PID: {process.pid}")
+        self.logger.info(f"Background process started with PID: {process.pid}")


This log call uses an f-string, which eagerly formats the message and is inconsistent with the lazy formatting pattern used elsewhere in this PR. Prefer parameterized logging (e.g., passing process.pid as an argument) so formatting is deferred and consistent.

Copilot · 2026-03-27T11:50:45Z

+try:
+    from panther.webapp.components.display.event_viewer import event_viewer
+    from panther.webapp.components.display.metrics_panel import metrics_panel
+    from panther.webapp.components.display.plugin_card import plugin_card
+    from panther.webapp.components.display.plugin_detail_panel import (
+        render_plugin_detail,
+    )
+    from panther.webapp.components.display.service_health_card import (
+        service_health_card,
+    )
+    from panther.webapp.components.display.service_log_browser import (
+        service_log_browser,
+    )
+    from panther.webapp.components.display.test_detail_panel import test_detail_panel
+    from panther.webapp.components.forms.config_form_panel import config_form_panel
+    from panther.webapp.components.forms.test_list_editor import TestListEditor
+    from panther.webapp.components.status.error_boundary import error_boundary
+    from panther.webapp.components.status.notifications import (
+        notify_error,
+        notify_info,
+        notify_success,
+        notify_warning,
+    )
+    from panther.webapp.components.status.progress_bar import ExperimentProgress
+    from panther.webapp.components.status.status_badge import status_badge
+
+    __all__ = [
+        "event_viewer",
+        "metrics_panel",
+        "plugin_card",
+        "render_plugin_detail",
+        "service_health_card",
+        "service_log_browser",
+        "test_detail_panel",
+        "config_form_panel",
+        "TestListEditor",
+        "error_boundary",
+        "notify_error",
+        "notify_info",
+        "notify_success",
+        "notify_warning",
+        "ExperimentProgress",
+        "status_badge",
+    ]
+except ImportError:
+    __all__ = []


The package-level import is wrapped in a blanket except ImportError that silently hides all import failures, including errors caused by regressions inside the webapp modules themselves (not just missing optional dependencies). This makes debugging harder because import panther.webapp.components can succeed while leaving missing attributes. Consider narrowing the exception to known optional third-party deps (or logging a warning with the underlying ImportError) so real internal import errors aren’t swallowed.

Copilot · 2026-03-27T11:50:45Z

+def _build_context_tag() -> str:
+    """Build the ``(phase|service)`` tag from the current LogContext.
+
+    Returns:
+        A string like ``"(initialization|picoquic) "`` when both fields
+        are set, ``"(initialization) "`` when only phase is set, or
+        ``""`` when no context is available.
+    """
+    ctx = get_log_context()
+    parts: list[str] = []
+    if ctx.phase:
+        parts.append(ctx.phase)
+    if ctx.service_id:
+        parts.append(ctx.service_id)
+    if parts:
+        return f"({' | '.join(parts)}) "  # note trailing space
+    return ""


_build_context_tag() returns strings like "(phase | service) " (spaces around the separator), but the docstring/examples describe the format as "(phase|service) ". Either adjust the join separator to match the documented output or update the docstring/examples so they reflect what is actually emitted.

Copilot · 2026-03-27T11:50:45Z

+def _normalize_dt(dt: datetime) -> datetime:
+    """Strip timezone info for safe comparison."""
+    return dt.replace(tzinfo=None) if dt.tzinfo else dt


Timezone handling in time-range filters is incorrect when callers pass offset-aware datetimes. _normalize_dt() strips tzinfo without converting to a common timeline (e.g., UTC), so 2026-01-01T00:00:00+01:00 will be compared as naive 00:00 instead of 23:00Z, yielding wrong query results. Consider normalizing both record timestamps and filter bounds to UTC (or epoch seconds) before comparing.

Copilot · 2026-03-27T11:50:46Z

+        event_name = getattr(event, "name", type(event).__name__)
+        self.logger.debug(f"{self.__class__.__name__} received event: %s", event_name)
+        if event_name == "started":
            self.logger.debug(
                "Service started, environment monitoring should be active"
            )
-        elif event_type == "ServiceStoppedEvent":
+        elif event_name == "stopped":
            self.logger.debug("Service stopped, environment collection complete")
        else:
-            self.logger.debug("Unhandled event type: %s", event_type)
+            self.logger.debug("Unhandled event type: %s", event_name)


handle_event() now branches solely on event.name being "started"/"stopped". Multiple event domains can share these names (e.g., plugin lifecycle), so this can log misleading messages or trigger environment logic for non-service events. Restrict these branches to event.entity_type == EventType.SERVICE (or the concrete service event classes) to preserve the original intent.

Copilot · 2026-03-27T11:50:46Z

+    def __del__(self):
+        """Ensure the file handle is closed on garbage collection."""
+        if self._file_handle is not None and not self._file_handle.closed:
+            try:
+                self._file_handle.close()
+            except Exception:
+                pass


__del__() closes _file_handle without acquiring _lock, which can race with on_event() (which writes under _lock) and lead to writes against a closing/closed handle. To keep thread-safety consistent, delegate destructor cleanup to close() (or acquire _lock inside __del__).

… code) - Add threading.Lock to FastFailHandler for thread-safe error tracking - Add warning log + method guard to _requires_emitter and _emit_event - Remove broken conftest fixtures referencing deleted modules - Fix EventStreamRecorder file handle reset to be inside lock - Return False from LoggerObserver on handler exceptions - Pass global_config to metrics/storage observer factory builders - Move logger assignments below imports (isort compliance)

- Delegate EventStreamRecorder.__del__ to close() for thread safety - Guard handle_event against non-service events via entity_type check - Normalize datetimes to UTC before stripping tzinfo in log query engine - Correct _build_context_tag docstring separator to match code - Log ImportError in webapp components instead of silently swallowing - Use lazy log formatting in subprocess_executor - Skip panther_ivy tests when submodule not installed (fixes CI)

Wrap webapp/components/forms/__init__.py imports in try/except ImportError so test collection succeeds when nicegui is not installed.

…B memory usage Coverage was baked into pyproject.toml addopts, meaning every pytest invocation instrumented 321K lines with branch tracing + generated 3 report formats. This caused ~50GB memory usage on casual test runs. Changes: - Remove --cov* from default addopts (coverage now opt-in) - Set branch = false (halves instrumentation overhead) - Enable parallel = true (distribute across SQLite files with xdist) - Expand omit patterns (webapp, tools, banner, install_templates) - Update CLAUDE.md with explicit coverage commands

Critical fixes: - Fix UnboundLocalError in run_tests() by initializing counters before try - Fix double LoggerFactory.initialize() so root logger gets JSONL handler - Add explicit EventStreamRecorder.close() in cleanup to prevent data loss - Upgrade record_error defensive catches from DEBUG to WARNING - Add concurrent thread-safety tests for FastFailHandler Important fixes: - Replace bare except:pass with circuit-breaker logging in resource_sampler - Stop returning fabricated zero metrics on resource collection error - Return False from _set_state on exception instead of always True - Fix event.event_type -> event.get_type() in docstring example - Rename stale DebugObserver logger to LoggerObserver, add recursion warning - Remove unused enable_colors/structured_output params from _setup_logging - Add experiment_workflow field to FeatureLogLevelsConfig - Count and warn about skipped malformed JSONL lines in StorageObserver - Narrow subprocess retry exception types to prevent wasted retries Tests: - Add FastFailHandler concurrent thread-safety tests - Add MetricsCollector JSONL write circuit-breaker test - Add dot-separated event name dispatch test - Add build-metrics CLI command tests

…tion - Increase stdout/stderr truncation from 500 to 2000 chars in DockerComposeException for better diagnostics - Wrap NiceGUI UI updates in try/except RuntimeError to handle client disconnection gracefully during experiment execution

Reformat multi-line docstrings to single-line where appropriate (ruff D-rule compliance). Add missing module, class, and method docstrings throughout panther/ and tests/. Net reduction of ~370 lines from docstring compaction.

Replace three independent file-write paths (LoggerFactory FileHandler, EventStreamRecorder lock+handle, MetricsCollector lock+open) with a single JsonlWriter that serializes all structured.jsonl output through one threading.Lock and one persistent file handle. This eliminates the critical bug where interleaved writes from concurrent producers could corrupt the JSONL file. - Add panther/core/utils/jsonl_writer.py (JsonlWriter, JsonlLogHandler) - LoggerFactory: use JsonlLogHandler instead of FileHandler, expose get_jsonl_writer() for other components - EventStreamRecorder: remove own lock/handle/close, delegate to writer - MetricsCollector: remove _jsonl_lock and per-write open(), use writer - ExperimentManager: wire shared writer to metrics and event recorder - Update all affected tests (8 new, 3 test files updated)

Fix ImportError handling (C2/C3), experiment_manager context manager (C4), legacy code removal (4a-4f), error handling improvements (5a-5c), bounded dedup set, consistent global_config injection, docstring accuracy fixes, and strict event filtering. Add test coverage for _requires_emitter decorator and ExperimentObserverMixin. Add pytest.mark.unit markers to 3 test files. Fix .add() → OrderedDict in typed_observer and update test_observer_dispatch to remove _HANDLER_OVERRIDES references.

Replace fragile __class__ mutation with direct mixin inheritance, use call.kwargs for observer_id extraction instead of positional indexing, and strengthen warning count assertion from >= 1 to >= 2.

ElNiak added 30 commits March 27, 2026 12:40

test: add unit tests for test execution display helpers

1060283

Verify format_test_header() and format_test_result() output formatting, emoji control, timing precision, and padding behavior.

fix(error-handling): replace bare except:pass with circuit-breaker lo…

72bccde

…gging, prevent data loss

fix(reporting): fix O(n*m) dedup, log pattern errors, remove stale do…

55f88c8

…cstring

docs: fix inaccurate docstrings in logger_factory and typed_observer

fd8c18a

refactor(webapp): add __all__ to components __init__ for explicit pub…

fe3eb08

…lic API Follow established codebase pattern (tools/docs_gen/__init__.py) of defining __all__ on successful import and falling back to __all__ = [] when dependencies are missing.

fix(tests): exclude e2e directory from pytest collection

17278a9

Adds e2e to norecursedirs in both pyproject.toml and tests/pytest.ini so pytest never imports tests/e2e/conftest.py (which requires the uninstalled pytest_asyncio package) when running unit tests with -m unit.

refactor(observer): remove dead create_default_observers/create_defau…

dbcf5ca

…lt_observer_set functions These two factory functions were exported but never called by any production or test code. Removing them and their re-exports.

refactor: strip panther_builder.py to bootstrap-only (188→135 lines)

d563939

Remove _try_forward_to_cli() and dead argparse choices (package, package-test, install-local, docs, serve-docs, deploy-docs, zip-outputs). Keep only the two bootstrap commands: package-dev and clean.

chore: add pytest temp dirs to .gitignore and basetemp config

1f90c12

fix(cli): fix list.remove crash in panther check --format/--imports

2888055

Replace fragile conditional-then-remove command building with clean append-based construction.

ElNiak added 11 commits March 27, 2026 12:40

fix(cli): remove invalid force_overwrite param from create plugin/sub…

f107f18

…plugin create_plugin() and create_subplugin() do not accept force_overwrite.

fix(cli): fix PluginMetadata iteration crashes in plugins params/scan

ac5b4d8

discover_plugins() returns Dict[str, PluginMetadata], not Dict[str, List[str]]. Fix params to use direct dict lookup and scan to group by metadata.type for display.

fix(cli): convert str to Path in plugins validate command

d8137f1

click.Path() returns str but validate() calls .is_dir() and .is_file(). Add Path() conversion matching the existing pattern in check_deps().

fix(tests): remove duplicate test classes from merge artifact

8aa3126

TestServiceHealthTestName and TestErrorCategoryMatching were defined twice in test_status_collector.py due to the production merge.

fix(cli): fix pytest.ini header and type annotations in plugins.py

06d7d5e

- Change [tool:pytest] to [pytest] in tests/cli/pytest.ini (fixes PytestUnknownMarkWarning for @pytest.mark.unit in CLI tests) - Fix validate() and check_deps() type annotations: click.Path() returns str, not Path

Copilot AI review requested due to automatic review settings March 27, 2026 11:45

sourcery-ai Bot reviewed Mar 27, 2026

View reviewed changes

Copilot started reviewing on behalf of ElNiak March 27, 2026 11:45 View session

Copilot AI reviewed Mar 27, 2026

View reviewed changes

ElNiak added 10 commits March 27, 2026 13:27

fix(ci): guard webapp forms imports for missing nicegui

65a6f9e

Wrap webapp/components/forms/__init__.py imports in try/except ImportError so test collection succeeds when nicegui is not installed.

docs: reformat and add missing docstrings across codebase

19959d8

Reformat multi-line docstrings to single-line where appropriate (ruff D-rule compliance). Add missing module, class, and method docstrings throughout panther/ and tests/. Net reduction of ~370 lines from docstring compaction.

fix: improve test_experiment_observer_mixin quality

116a58f

Replace fragile __class__ mutation with direct mixin inheritance, use call.kwargs for observer_id extraction instead of positional indexing, and strengthen warning count assertion from >= 1 to >= 2.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: metrics-to-structured-jsonl with CLI fixes, observer refactoring, and test hardening#116

feat: metrics-to-structured-jsonl with CLI fixes, observer refactoring, and test hardening#116
ElNiak wants to merge 51 commits intofeature/structured-logging-overhaulfrom
feature/metrics-to-structured-jsonl

ElNiak commented Mar 27, 2026

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ElNiak commented Mar 27, 2026

Summary

Changes

Test plan

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants