Skip to content

feat: metrics-to-structured-jsonl with CLI fixes, observer refactoring, and test hardening#116

Open
ElNiak wants to merge 51 commits intofeature/structured-logging-overhaulfrom
feature/metrics-to-structured-jsonl
Open

feat: metrics-to-structured-jsonl with CLI fixes, observer refactoring, and test hardening#116
ElNiak wants to merge 51 commits intofeature/structured-logging-overhaulfrom
feature/metrics-to-structured-jsonl

Conversation

@ElNiak
Copy link
Copy Markdown
Owner

@ElNiak ElNiak commented Mar 27, 2026

Summary

  • Observer refactoring, CLI bug fixes, event system improvements, config deduplication, and test hardening (41 commits)
  • ~117 files changed, net +108 lines

Changes

  1. Observer system: Extracted _is_duplicate helper, fixed test isolation, wired up typed handlers, fixed broken dedup and is_interested substring bug
  2. Event system: Flattened ServiceManagerEventMixin (881→700 lines), extracted _requires_emitter decorator
  3. CLI fixes: Fixed crashes in plugins params/scan, check --format/--imports, create plugin/subplugin, validate command Path handling
  4. Config: Deduplicated ValidationResult/ValidationError across config system, wired 8 orphaned CLI params
  5. Reporting: Fixed O(n*m) dedup, log pattern errors, inaccurate docstrings
  6. Error handling: Replaced bare except:pass with circuit-breaker logging
  7. Logging: Added ConsoleFormatter with phase context, --verbose/--debug flags
  8. Metrics: Write metrics to structured.jsonl, CLI query support
  9. Tests: Fixed 17 pre-existing failures, 62 webapp test failures, early termination test timing issues

Test plan

  • Run pytest tests/ -n auto -m unit — verify no regressions
  • Run pytest tests/ -n auto -m integration — verify Docker-dependent tests
  • Verify CLI commands: panther plugins list, panther check, panther config validate
  • Run experiment end-to-end and verify metrics output

ElNiak added 30 commits March 27, 2026 12:40
Replace the raw colorlog/logging.Formatter in LoggerFactory with a new
ConsoleFormatter that provides:
- Short HH:MM:SS timestamps (no date for live runs)
- Phase/service context from LogContext: (phase|service_id) prefix
- Abbreviated module names (strips panther.core. prefix)
- Static banner() method for phase separators via click.echo()

Wire phase banners into ExperimentManager at each transition:
initialization, plugin loading, test execution, and cleanup.
Add LoggerFactory.set_console_level() classmethod that updates all
StreamHandler instances (root + named loggers) to a given level while
keeping FileHandlers at DEBUG for structured JSONL output.

Wire --verbose (DEBUG) and --debug (TRACE) Click options through
the run command into ExperimentManager via a new console_level
parameter, giving CLI flags final precedence over YAML config.

Includes 8 unit tests covering level updates, FileHandler
preservation, TRACE support, and idempotency.
Verify format_test_header() and format_test_result() output
formatting, emoji control, timing precision, and padding behavior.
Add format_experiment_summary() that displays a structured banner after
all tests complete, showing pass/fail counts with percentage, duration,
output directory, failed test details, and next-step CLI commands.
Replaces the simple one-line completion message.
DockerOutputParser now adapts to console verbosity: concise one-liner
step labels by default, full instruction text when --verbose/--debug is
active. Download/extract progress is suppressed in concise mode. Build
start, completion (with duration), and cache-skip messages use
click.echo() for consistent structured output.
…pport

Metrics are now emitted as JSONL entries (source="metrics") directly
into structured.jsonl instead of separate metrics/ directory with
CSV/JSON files. Adds `panther logs metrics` CLI command and
metric_names/metric_types filters to LogFilter.

- MetricsCollector writes to structured.jsonl via _write_metric_jsonl()
- Removed metrics/ dir creation and CSV/JSON export from cleanup
- Added metric_names/metric_types fields to LogFilter
- Added `panther logs metrics` CLI subcommand
- Removed stale empty outputs/metrics/ directory
…dispatch, and harden error handling

Comprehensive logging cleanup across ~40 files: replace str(e) with e,
f-strings with lazy %s format, add exc_info=True to error/warning logs,
remove emojis from log messages, and replace print(stderr) with proper
logger calls.

Fix observer event naming to match ITypedObserver dispatch convention
(handle_* → on_*) in CommandAuditObserver, update StateEventObserver
event types, add test.failed handler to ExperimentObserver, and widen
MetricsObserver interest to test/step events.

Align FeatureLogLevelsConfig fields with feature_registry canonical names.
Use LoggerFactory.get_logger() in TestCaseBase for feature-level filtering.
Add atomic write for output index, build timeout for buildx, safer BuildKit
fallback, and defensive try/except in RCA pattern matching.
…ssion

Post-commit audit of a8f804d found several issues:

- CommandAuditObserver.on_service_command_modified and
  on_service_config_generated were dead code — the event dispatch
  system resolves handler names from event.name (preparation_completed,
  deployment_started), not from the action field. Replaced with
  action-based sub-dispatch inside on_service_preparation_completed
  and new on_service_deployment_started handler.

- get_supported_event_types() returned non-existent event names
  (service.command_modified, service.config_generated). Fixed to
  return actual dispatched names.

- Structured JSONL file handler level was changed from DEBUG to
  console_level, losing post-mortem debug data. Reverted to DEBUG.

- Removed 5 unused imports in test_case_base.py left over from
  _setup_logging refactoring.

- Updated all 56 tests to use dispatch-aware routing; added 4 new
  on_event() integration tests verifying command_modified and
  config_generated dispatch end-to-end.
…patch tests

- Fix dedup in ITypedObserver.on_event(): change event.event_id to event.id
  and add the actual membership check before appending (was append-only, never
  checked)
- Fix is_interested() substring false positives: replace `in` containment
  with bidirectional prefix matching and add min-length guard to reject
  empty/"e" inputs
- Fix type-handler class name matching in is_interested() to use startswith
  instead of substring containment
- Update docstrings: event.uuid -> event.id in observer_interface.py and
  observer/__init__.py; fix handler naming convention note in
  typed_observer_interface.py
- Add 23 tests covering _handler_name_for(), on_event() dispatch routing,
  dedup, is_interested() correctness, and error handling
…ault, use set for dedup

- LoggerObserver._process_event now dispatches to typed handlers
  (on_experiment_failed, on_test_failed, on_service_error, etc.)
  before falling back to generic _log_event, so emoji-formatted
  ERROR-level logging is no longer dead code (H2)
- ExperimentObserver._handle_test_execution_completed defaults
  success to True (M1: "completed" implies success)
- ExperimentObserver.get_priority docstring corrected to match
  IObserver contract: higher number = higher priority (M12)
- EventStreamRecorder.processed_events_uuids uses set instead
  of list for O(1) dedup lookups (M2)
- base_execution_environment.handle_event uses event.name instead
  of dead class-name comparisons (H8)
…, LoggerObserver dedup+error handling

- Change IObserver.processed_events_uuids from List to set for O(1) lookup
- Fix ITypedObserver to use .add() instead of .append() for set
- Add dedup guard to LoggerObserver._process_event (bypasses ITypedObserver.on_event)
- Add try/except around LoggerObserver typed/named handler calls
- Trim _logger_typed_handlers to name-based-only entries (type-based already handled)
- Fix contradictory test docstring in test_observer_dispatch.py
…, and metrics tests

- Fix is_interested() test expectations: MetricsObserver matches "test" entity prefix,
  StateEventObserver uses exact-match list (docker_build.started not in it)
- Fix CommandAuditObserver tests: route through on_event() instead of calling non-existent
  public handle_* methods, fix fixture processed_events_uuids type (list→set), add _type_handlers
- Fix StateEventObserver test: on_command_generation_started→on_service_preparation_started
- Fix LoggerFactory test: replace sys.stdout mock with real StreamHandler capture
- Fix ExperimentManager counter tests: add missing experiment_dir to _make_manager fixture
- Fix cleanup test: assert finalize() called instead of expecting metrics.json export
…d method

RC1: Add `web` to `dev` extras in pyproject.toml, install with `.[dev]`
in panther_builder.py, add `web` extra to CI workflows, and wrap
components/__init__.py imports in try/except for resilience.

RC2: Replace all `update_gui()` calls with `on_event()` in
test_web_observer.py to match the renamed WebObserver method.
Remove click.echo() from core code (4 files), replace print() with
logger in service_builder, convert ~76 f-string debug calls to lazy
%-style formatting, audit log levels (DEBUG↔INFO adjustments, add
recovery context to warnings, operation context to errors, completion
logs for orphaned starts), and gate LoggerFactory internal debug
logging behind _DEBUG_FACTORY flag.
…lic API

Follow established codebase pattern (tools/docs_gen/__init__.py) of
defining __all__ on successful import and falling back to __all__ = []
when dependencies are missing.
Adds e2e to norecursedirs in both pyproject.toml and tests/pytest.ini
so pytest never imports tests/e2e/conftest.py (which requires the
uninstalled pytest_asyncio package) when running unit tests with -m unit.
… up logging

- Remove CommandAuditObserver and PluginObserver (unused)
- Simplify ExperimentObserver and StateObserver (~2100 lines removed)
- Streamline ObserverFactory registration
- Clean up logger_factory and logging_mixin
- Update tests to match simplified observer system
- Remove stale observer references from plugin_manager and experiment_manager
…oss observer system

Fix 8 critical and 10 important documentation issues found in PR review:
- Fix set/list mismatch in observer_interface docstrings (.append → .add)
- Replace "compile-time type checking" with "runtime type dispatch"
- Clarify multi-layer deduplication architecture
- Replace all tqdm references with Click (rename tqdm_msg → build_msg)
- Add missing event_stream type to factory docstring
- Fix misleading --json help text in CLI commands
- Fix O(log n) claim → O(n) in event_manager
- Trim bloated logger_factory docstring, remove unverified perf claims
- Remove dead ColoredFormatter import and duplicate docstrings
- Fix priority description in EventStreamRecorder
… clean up dead code

- Extract dedup check into ITypedObserver._is_duplicate() reused by all observers
- EventStreamRecorder: keep file handle open instead of re-opening per event
- ExperimentObserver: remove redundant _should_terminate_early flag
- LoggerObserver: remove unused Click progress coordination and priority methods,
  add bounded collections to prevent unbounded memory growth
- StorageObserver: initialize _instance_lock at class level (not lazily)
- Fix 2 pre-existing test failures: mock DockerBuilder.get_instance and
  reset PluginManager singleton for xdist isolation, fix wrong PluginFormInfo
  attribute names in test assertions
…lt_observer_set functions

These two factory functions were exported but never called by any
production or test code. Removing them and their re-exports.
Remove _try_forward_to_cli() and dead argparse choices (package,
package-test, install-local, docs, serve-docs, deploy-docs, zip-outputs).
Keep only the two bootstrap commands: package-dev and clean.
… config system

- Remove duplicate classes from validation_ops.py, import from canonical
  components/validators.py
- Add __str__ to canonical ValidationResult for feature parity
- Rename command_processor's ValidationResult → CommandValidationResult
  to disambiguate (different API: 1-arg add_error vs 2-arg)
- Add docker user mapping params (run_as_host, uid, gid, user_name)
  to CLI_CONFIG_MAP so they flow through to config
- Pass plugin directory params (exec_env_dir, net_env_dir, iut_dir,
  testers_dir) to ConfigurationManager constructor
Extract two helpers to eliminate repeated boilerplate in 12 notify_*
methods:
- _get_service_context(): builds (service_id, service_name) tuple
- _emit_with_fallback(): handles the validated→service_emitter→direct
  triple-fallback emission pattern

All public method signatures preserved. notify_service_event() dispatch
table and emit_test_*/handle_event kept as-is (unique logic per branch).
…methods

Replace repeated inline guard clause (hasattr + truthiness check on
event_emitter) with a module-level @_requires_emitter decorator.
Also replaces dead _emit_with_fallback triple-fallback with direct
_emit_event, and removes no-value debug logs from emit_test_* guards.

Net: ~30 lines removed, single source of truth for the guard logic.
Replace fragile conditional-then-remove command building with clean
append-based construction.
ElNiak added 11 commits March 27, 2026 12:40
…plugin

create_plugin() and create_subplugin() do not accept force_overwrite.
discover_plugins() returns Dict[str, PluginMetadata], not
Dict[str, List[str]]. Fix params to use direct dict lookup and
scan to group by metadata.type for display.
click.Path() returns str but validate() calls .is_dir() and .is_file().
Add Path() conversion matching the existing pattern in check_deps().
… observer comment

- Add missing FastFailHandler import (2 tests crashed with NameError)
- Remove module-level DockerBuilder.reset_singleton() (unsafe for xdist)
- Add step.* prefix tests for MetricsObserver.is_interested()
- Fix stale comment about MetricsObserver event scope
…s, metrics filter)

1.1: Fix Jinja2 whitespace stripping that merged success summary lines
1.2: Silence ~38 unhandled event types with noop handlers + configurable logging
1.3: Broaden MetricsObserver.is_interested() to accept test.* and step.* events
1.4: Group service health tables by test_name in both Jinja2 and fallback reports
1.5: Remove duplicate TestCompleted/TestFailed emissions from ExperimentManager
1.6: Add required/optional pattern distinction for output completeness calculation
TestServiceHealthTestName and TestErrorCategoryMatching were defined
twice in test_status_collector.py due to the production merge.
…create commands

These options were accepted by `panther create plugin` and `panther create subplugin`
but silently ignored — never forwarded to create_plugin()/create_subplugin().
Removes them to eliminate user confusion and clean up the public API surface.
- Change [tool:pytest] to [pytest] in tests/cli/pytest.ini (fixes
  PytestUnknownMarkWarning for @pytest.mark.unit in CLI tests)
- Fix validate() and check_deps() type annotations: click.Path()
  returns str, not Path
- Add pytestmark = pytest.mark.integration at module level
- Fix Mock identity comparison (== -> is) for fast_fail_handler
- Replace DockerConnection test: builder enters cache-only mode (client=None), not exception
- Remove overly specific message/context assertions from docker build failure tests
- Add mock_docker_client fixture to test_docker_client_none_raises_exception (singleton isolation)
- Fix test_docker_client_none: catches PantherException (validate_build_prerequisites fires first)
- Fix test_docker_build_failure_stops_experiment: test handler directly instead of via build_image
- Fix test_configuration_based_fast_fail_behavior: remove critical_only (nonexistent), test real behavior
- Fix mock_event_manager: configure cleanup_none_observers() to return 0 (not MagicMock)
- Fix TestPluginManagerFastFail: use plugin_factory/create_service_manager (service_factory doesn't exist)
- Fix test_experiment_initialization_error: assign experiment_config; catch TestCaseInitializationError
…ed API, mock setup)

- Replace removed _deploy_services_non_blocking() / deploy_services() tests with
  test_deploy_services_monitoring() and test_monitoring_not_active_before_deployment_event()
- Fix thread-based tests to use polling with deadline instead of fixed sleeps
- Mock execute_docker_command (not _is_service_ready) to match _is_service_ready_with_timeout() internals
- Fix stop_monitoring() assertion: now sets monitor_thread=None internally
- Fix docker_compose_env fixture: env_type must be "network_environment" not "network"
- Fix test_test_case_checks_early_termination: use MagicMock for configs, steps.wait attribute
- Add pytestmark = pytest.mark.integration
…ed API, mock setup)

- Add pytestmark = pytest.mark.integration
- Fix thread timing: use stop_monitoring() internal join instead of sleep
- Replace removed _deploy_services_non_blocking/deploy_services API calls
- Fix BackgroundServiceMonitor mock: mock execute_docker_command not _is_service_ready
- Fix TestCase mock setup with proper patches
- Clean up unused imports (threading, TestConfig)
- Fix BackgroundServiceMonitor import path (moved to own module)
Copilot AI review requested due to automatic review settings March 27, 2026 11:45
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @ElNiak, your pull request is larger than the review limit of 150000 diff characters

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR modernizes PANTHER’s logging/observer/metrics pipeline by moving metrics into the shared structured.jsonl stream, refactoring typed observer dispatch/dedup behavior, and hardening multiple CLI commands and tests to be more robust under parallel execution and optional dependencies.

Changes:

  • Added structured JSONL metric emission and query support (engine filters + CLI logs metrics).
  • Refactored observer/event plumbing (typed observer dispatch, state observer routing, recorder persistence/dedup).
  • Fixed and hardened CLI commands and tests (path handling, plugin metadata handling, logging verbosity controls, xdist isolation).

Reviewed changes

Copilot reviewed 114 out of 115 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/unit/test_webapp/test_web_observer.py Update observer API usage (on_event)
tests/unit/test_webapp/test_plugin_forms.py Patch DockerBuilder singleton in tests; adjust form info assertions
tests/unit/test_plugins/test_modern_plugin_system.py Reset singletons for xdist; patch DockerBuilder
tests/unit/test_core/test_utils/test_logger_factory_console_level.py New tests for console-level updates
tests/unit/test_core/test_utils/test_console_formatter.py New tests for ConsoleFormatter/context tagging
tests/unit/test_core/test_status_collector.py Remove redundant/obsolete unit tests
tests/unit/test_core/test_plugin_manager.py Align with plugin event emitter changes
tests/unit/test_core/test_observer_dispatch.py New typed observer routing/dedup/is_interested tests
tests/unit/test_core/test_metrics_structured_jsonl.py New tests for metrics JSONL + query filters
tests/unit/test_core/test_metrics_export_lifecycle.py Cleanup now verifies collector finalization
tests/unit/test_core/test_metrics_counter_increments.py Ensure experiment_dir exists in test harness
tests/unit/test_core/test_logging_mixin.py Update assertions for lazy logging formatting
tests/unit/test_core/test_logger_factory.py Fix log capture strategy for formatting test
tests/unit/test_config/test_environment_hierarchy.py Update expected default for output_format
tests/unit/plugins/environments/execution_environment/conftest.py Update strace fixture output_format
tests/pytest.ini Exclude e2e/ from recursion
tests/integration/test_fast_fail_integration.py Update expectations for fast-fail behavior and APIs
tests/cli/unit/commands/test_plugins_validate_path.py New test for validate path conversion
tests/cli/unit/commands/test_plugins_metadata.py New tests for PluginMetadata handling in CLI
tests/cli/unit/commands/test_create_command_api.py New tests to prevent invalid kwargs to creators
tests/cli/unit/commands/test_check_command_building.py New tests for check command subprocess args
tests/cli/pytest.ini Fix pytest section header
pyproject.toml Add web deps to dev; set pytest basetemp; norecursedirs
panther_builder.py Simplify bootstrap commands; use .[dev] installs
panther/webapp/components/init.py Make webapp components import optional
panther/tools/plugins/services/tutorials/tutorial.py Switch to lazy logging formatting
panther/tools/plugins/environments/tutorials/tutorial.py Improve logging; docstring cleanup; add noqa where needed
panther/tools/docs_gen/mkdocs/fix_markdown_links.py Add module docstring; docstring cleanup
panther/tools/docs_gen/link_rewriting.py Add clarifying fallback comments
panther/tools/docs_gen/generate_plugin_inventory.py Replace bare except with debug logging
panther/tools/config/panther_config_validator.py Add comments; improve error strings
panther/plugins/services/service_manager_mixin.py Docstring formatting cleanup; adjust log levels
panther/plugins/plugin_manager.py Remove PluginObserver; emitter-only event integration
panther/plugins/environments/network_environment/shadow_ns/shadow_network_resolver.py Docstring cleanup; error message formatting
panther/plugins/environments/network_environment/placeholder_parser.py Docstring cleanup; error message formatting
panther/plugins/environments/network_environment/mixins/subprocess_executor.py Docstring cleanup; adjust logging verbosity
panther/plugins/environments/network_environment/localhost_single_container/localhost_network_resolver.py Docstring cleanup; error message formatting
panther/plugins/environments/network_environment/docker_compose/docker_network_resolver.py Docstring cleanup; error message formatting
panther/plugins/environments/network_environment/base_network_resolver.py Docstring cleanup; safer exception messages
panther/plugins/environments/execution_environment/base_execution_environment.py Update event handling to name-based logic
panther/plugins/core/plugin_factory.py Improve error logging; strip None config fields
panther/core/utils/logging_mixin.py Avoid eager work; use lazy formatting + debug gating
panther/core/utils/console_formatter.py New console formatter with context tags + banners
panther/core/utils/init.py Export ConsoleFormatter
panther/core/test_cases/test_case_impl.py Remove emoji glyphs from dry-run logs
panther/core/test_cases/mixins/service_management.py Improve error messages with context
panther/core/test_cases/mixins/environment_management.py Make execution env setup failures fatal
panther/core/test_cases/execution/test_executor.py Use lazy logging formatting
panther/core/test_cases/base/test_case_base.py Switch test-case logging to LoggerFactory + file handler
panther/core/test_cases/analysis/output_analyzer.py Remove emoji glyphs; improve error strings
panther/core/reporting/root_cause_analyzer.py Improve dedup + isolate pattern matcher failures
panther/core/reporting/log_query_engine.py Add metric filters; normalize datetime comparisons
panther/core/reporting/failure_patterns.py Remove outdated comment block
panther/core/reporting/experiment_reporter.py Log with tracebacks; group service health by test
panther/core/outputs/output_index.py Atomic writes for output index
panther/core/outputs/output_aggregator.py Reduce log verbosity; improve warnings
panther/core/outputs/init.py Clarify internal-only interfaces in docs
panther/core/observer/management/event_manager.py Fix performance notes in docstring
panther/core/observer/impl/storage_observer.py Make instance lock eager; keep failed pending events; lazy formatting
panther/core/observer/impl/state_observer.py Table-driven routing for workflow state transitions
panther/core/observer/impl/plugin_observer.py Removed legacy PluginObserver implementation
panther/core/observer/impl/metrics_observer.py Normalize log_level type; add completion log
panther/core/observer/impl/event_stream_recorder.py Cache file handle; reuse dedup helper; warn on write failures
panther/core/observer/impl/command_audit_observer.py Removed command audit observer
panther/core/observer/impl/init.py Remove PluginObserver export
panther/core/observer/base/typed_observer_interface.py Add _is_duplicate; safer handler naming; fix is_interested
panther/core/observer/base/observer_interface.py Use set for dedup; add _get_event_type_safely
panther/core/observer/init.py Update docs; remove default observer set exports; remove PluginObserver
panther/core/metrics/resource_monitor.py Add more context to monitoring errors
panther/core/metrics/metrics_exporter.py Include tracebacks on exporter errors
panther/core/metrics/metrics_collector.py Add structured JSONL metric writes; suppress repeated warnings
panther/core/experiment_analysis.py Docstring cleanup; tracebacks on warnings; remove emoji glyphs
panther/core/exceptions/fast_fail.py Minor string formatting cleanup
panther/core/exceptions/error_handler_mixin.py Avoid redundant str(); preserve tracebacks
panther/core/docker_builder/platform_detection.py Safer fallback when Dockerfile unreadable
panther/core/docker_builder/image_operations.py Add build/cached info logs
panther/core/docker_builder/caching/docker_image_cache.py Docstring cleanup; lazy formatting; atomic save logging
panther/core/docker_builder/caching/docker_build_cache_mixin.py Improve warnings with rebuild hint
panther/core/docker_builder/buildx_operations.py Add build logs; add build timeout
panther/core/command_processor/core/validator.py Rename result type to CommandValidationResult
panther/core/command_processor/core/init.py Export CommandValidationResult
panther/core/command_processor/builders/service_builder.py Replace prints with logger warnings
panther/config/core/utils/cli_overrides.py Wire docker user-mapping CLI overrides
panther/config/core/models/global_config.py Expand/rename feature log-level fields
panther/config/core/models/environment.py Make ExecutionEnvironmentConfig.output_format optional
panther/config/core/mixins/validation_ops.py Deduplicate validation types via shared components
panther/config/core/mixins/config_loading.py Raise validation completion logs to INFO
panther/config/core/components/validators.py Add ValidationResult.str
panther/cli/commands/run.py Add --verbose/--debug; pass console_level; wire plugin dirs
panther/cli/commands/report.py Clarify --json help text
panther/cli/commands/plugins.py Fix metadata handling; convert click path strings to Path
panther/cli/commands/logs.py Validate timestamp/regex args; add logs metrics
panther/cli/commands/create.py Remove unsupported CLI options; add debug logging on errors
panther/cli/commands/config.py Add debug logs on validation/schema errors
panther/cli/commands/check.py Fix subprocess command construction
panther/cli/commands/admin.py Add debug logs on failures
experiment-config/base/experiment_config_example_minimal_docker.yaml Update logging defaults/feature levels; docker cache defaults
.gitignore Ignore pytest temp dirs
.github/workflows/unittests_codecov.yml Install web extras in CI
.github/workflows/pr-validation.yml Install editable with tests/lint/web extras
Comments suppressed due to low confidence (1)

panther/tools/docs_gen/mkdocs/fix_markdown_links.py:14

  • The file has a shebang (#!/usr/bin/env python3) in the middle of the module (after imports) and a second triple-quoted string that becomes a no-op string literal. This is confusing and defeats the purpose of a shebang. Move the shebang to line 1 (or remove it if this isn’t intended to be executed directly) and keep a single module docstring.

process._log_files = (stdout_handle, stderr_handle)

self.logger.debug(f"Background process started with PID: {process.pid}")
self.logger.info(f"Background process started with PID: {process.pid}")
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log call uses an f-string, which eagerly formats the message and is inconsistent with the lazy formatting pattern used elsewhere in this PR. Prefer parameterized logging (e.g., passing process.pid as an argument) so formatting is deferred and consistent.

Copilot uses AI. Check for mistakes.
Comment thread panther/webapp/components/__init__.py Outdated
Comment on lines +32 to +77
try:
from panther.webapp.components.display.event_viewer import event_viewer
from panther.webapp.components.display.metrics_panel import metrics_panel
from panther.webapp.components.display.plugin_card import plugin_card
from panther.webapp.components.display.plugin_detail_panel import (
render_plugin_detail,
)
from panther.webapp.components.display.service_health_card import (
service_health_card,
)
from panther.webapp.components.display.service_log_browser import (
service_log_browser,
)
from panther.webapp.components.display.test_detail_panel import test_detail_panel
from panther.webapp.components.forms.config_form_panel import config_form_panel
from panther.webapp.components.forms.test_list_editor import TestListEditor
from panther.webapp.components.status.error_boundary import error_boundary
from panther.webapp.components.status.notifications import (
notify_error,
notify_info,
notify_success,
notify_warning,
)
from panther.webapp.components.status.progress_bar import ExperimentProgress
from panther.webapp.components.status.status_badge import status_badge

__all__ = [
"event_viewer",
"metrics_panel",
"plugin_card",
"render_plugin_detail",
"service_health_card",
"service_log_browser",
"test_detail_panel",
"config_form_panel",
"TestListEditor",
"error_boundary",
"notify_error",
"notify_info",
"notify_success",
"notify_warning",
"ExperimentProgress",
"status_badge",
]
except ImportError:
__all__ = []
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The package-level import is wrapped in a blanket except ImportError that silently hides all import failures, including errors caused by regressions inside the webapp modules themselves (not just missing optional dependencies). This makes debugging harder because import panther.webapp.components can succeed while leaving missing attributes. Consider narrowing the exception to known optional third-party deps (or logging a warning with the underlying ImportError) so real internal import errors aren’t swallowed.

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +60
def _build_context_tag() -> str:
"""Build the ``(phase|service)`` tag from the current LogContext.

Returns:
A string like ``"(initialization|picoquic) "`` when both fields
are set, ``"(initialization) "`` when only phase is set, or
``""`` when no context is available.
"""
ctx = get_log_context()
parts: list[str] = []
if ctx.phase:
parts.append(ctx.phase)
if ctx.service_id:
parts.append(ctx.service_id)
if parts:
return f"({' | '.join(parts)}) " # note trailing space
return ""
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_build_context_tag() returns strings like "(phase | service) " (spaces around the separator), but the docstring/examples describe the format as "(phase|service) ". Either adjust the join separator to match the documented output or update the docstring/examples so they reflect what is actually emitted.

Copilot uses AI. Check for mistakes.
Comment on lines +26 to +28
def _normalize_dt(dt: datetime) -> datetime:
"""Strip timezone info for safe comparison."""
return dt.replace(tzinfo=None) if dt.tzinfo else dt
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timezone handling in time-range filters is incorrect when callers pass offset-aware datetimes. _normalize_dt() strips tzinfo without converting to a common timeline (e.g., UTC), so 2026-01-01T00:00:00+01:00 will be compared as naive 00:00 instead of 23:00Z, yielding wrong query results. Consider normalizing both record timestamps and filter bounds to UTC (or epoch seconds) before comparing.

Copilot uses AI. Check for mistakes.
Comment on lines +134 to +143
event_name = getattr(event, "name", type(event).__name__)
self.logger.debug(f"{self.__class__.__name__} received event: %s", event_name)
if event_name == "started":
self.logger.debug(
"Service started, environment monitoring should be active"
)
elif event_type == "ServiceStoppedEvent":
elif event_name == "stopped":
self.logger.debug("Service stopped, environment collection complete")
else:
self.logger.debug("Unhandled event type: %s", event_type)
self.logger.debug("Unhandled event type: %s", event_name)
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle_event() now branches solely on event.name being "started"/"stopped". Multiple event domains can share these names (e.g., plugin lifecycle), so this can log misleading messages or trigger environment logic for non-service events. Restrict these branches to event.entity_type == EventType.SERVICE (or the concrete service event classes) to preserve the original intent.

Copilot uses AI. Check for mistakes.
Comment on lines +106 to +112
def __del__(self):
"""Ensure the file handle is closed on garbage collection."""
if self._file_handle is not None and not self._file_handle.closed:
try:
self._file_handle.close()
except Exception:
pass
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__del__() closes _file_handle without acquiring _lock, which can race with on_event() (which writes under _lock) and lead to writes against a closing/closed handle. To keep thread-safety consistent, delegate destructor cleanup to close() (or acquire _lock inside __del__).

Copilot uses AI. Check for mistakes.
ElNiak added 10 commits March 27, 2026 13:27
… code)

- Add threading.Lock to FastFailHandler for thread-safe error tracking
- Add warning log + method guard to _requires_emitter and _emit_event
- Remove broken conftest fixtures referencing deleted modules
- Fix EventStreamRecorder file handle reset to be inside lock
- Return False from LoggerObserver on handler exceptions
- Pass global_config to metrics/storage observer factory builders
- Move logger assignments below imports (isort compliance)
- Delegate EventStreamRecorder.__del__ to close() for thread safety
- Guard handle_event against non-service events via entity_type check
- Normalize datetimes to UTC before stripping tzinfo in log query engine
- Correct _build_context_tag docstring separator to match code
- Log ImportError in webapp components instead of silently swallowing
- Use lazy log formatting in subprocess_executor
- Skip panther_ivy tests when submodule not installed (fixes CI)
Wrap webapp/components/forms/__init__.py imports in try/except
ImportError so test collection succeeds when nicegui is not installed.
…B memory usage

Coverage was baked into pyproject.toml addopts, meaning every pytest
invocation instrumented 321K lines with branch tracing + generated 3
report formats. This caused ~50GB memory usage on casual test runs.

Changes:
- Remove --cov* from default addopts (coverage now opt-in)
- Set branch = false (halves instrumentation overhead)
- Enable parallel = true (distribute across SQLite files with xdist)
- Expand omit patterns (webapp, tools, banner, install_templates)
- Update CLAUDE.md with explicit coverage commands
Critical fixes:
- Fix UnboundLocalError in run_tests() by initializing counters before try
- Fix double LoggerFactory.initialize() so root logger gets JSONL handler
- Add explicit EventStreamRecorder.close() in cleanup to prevent data loss
- Upgrade record_error defensive catches from DEBUG to WARNING
- Add concurrent thread-safety tests for FastFailHandler

Important fixes:
- Replace bare except:pass with circuit-breaker logging in resource_sampler
- Stop returning fabricated zero metrics on resource collection error
- Return False from _set_state on exception instead of always True
- Fix event.event_type -> event.get_type() in docstring example
- Rename stale DebugObserver logger to LoggerObserver, add recursion warning
- Remove unused enable_colors/structured_output params from _setup_logging
- Add experiment_workflow field to FeatureLogLevelsConfig
- Count and warn about skipped malformed JSONL lines in StorageObserver
- Narrow subprocess retry exception types to prevent wasted retries

Tests:
- Add FastFailHandler concurrent thread-safety tests
- Add MetricsCollector JSONL write circuit-breaker test
- Add dot-separated event name dispatch test
- Add build-metrics CLI command tests
…tion

- Increase stdout/stderr truncation from 500 to 2000 chars in
  DockerComposeException for better diagnostics
- Wrap NiceGUI UI updates in try/except RuntimeError to handle
  client disconnection gracefully during experiment execution
Reformat multi-line docstrings to single-line where appropriate
(ruff D-rule compliance). Add missing module, class, and method
docstrings throughout panther/ and tests/.

Net reduction of ~370 lines from docstring compaction.
Replace three independent file-write paths (LoggerFactory FileHandler,
EventStreamRecorder lock+handle, MetricsCollector lock+open) with a
single JsonlWriter that serializes all structured.jsonl output through
one threading.Lock and one persistent file handle.

This eliminates the critical bug where interleaved writes from
concurrent producers could corrupt the JSONL file.

- Add panther/core/utils/jsonl_writer.py (JsonlWriter, JsonlLogHandler)
- LoggerFactory: use JsonlLogHandler instead of FileHandler, expose
  get_jsonl_writer() for other components
- EventStreamRecorder: remove own lock/handle/close, delegate to writer
- MetricsCollector: remove _jsonl_lock and per-write open(), use writer
- ExperimentManager: wire shared writer to metrics and event recorder
- Update all affected tests (8 new, 3 test files updated)
Fix ImportError handling (C2/C3), experiment_manager context manager (C4),
legacy code removal (4a-4f), error handling improvements (5a-5c), bounded
dedup set, consistent global_config injection, docstring accuracy fixes,
and strict event filtering.

Add test coverage for _requires_emitter decorator and
ExperimentObserverMixin. Add pytest.mark.unit markers to 3 test files.
Fix .add() → OrderedDict in typed_observer and update test_observer_dispatch
to remove _HANDLER_OVERRIDES references.
Replace fragile __class__ mutation with direct mixin inheritance,
use call.kwargs for observer_id extraction instead of positional
indexing, and strengthen warning count assertion from >= 1 to >= 2.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants