Skip to content

fix + feat: expose on-screen error detection to library consumers + review fixes#285

Closed
chinmayajha wants to merge 15 commits into
mozarkai:mainfrom
chinmayajha:feature/error-file-addition
Closed

fix + feat: expose on-screen error detection to library consumers + review fixes#285
chinmayajha wants to merge 15 commits into
mozarkai:mainfrom
chinmayajha:feature/error-file-addition

Conversation

@chinmayajha

@chinmayajha chinmayajha commented May 19, 2026

Copy link
Copy Markdown
Collaborator

Summary

Expose on-screen error detection to library consumers and address review feedback.

Feature

  • Refactors the existing CLI-only on-screen error detection into a shared module (optics_framework/common/error_detection.py) so it can be reused by TestRunner (CLI) and the Optics class (library).
  • Adds two new public methods on Optics:
    • add_error_definitions(source) — accepts a raw dict or an ErrorDefinitions model; merges into the active session.
    • capture_and_detect(context_label, test_case) — captures page source via StrategyManager, runs detection, returns matched dicts. Pure data primitive; driver errors are swallowed so a flaky session returns [] instead of raising.
  • Adds docs/usage/error_detection.md documenting CLI CSV, library API, and matching semantics.

Review fixes (this commit)

  • Rename patternmatch_string: the column name now reflects that matching is a case-insensitive substring check, not regex. Updated everywhere: sample CSV, models, reader, parser, optics.py, JUnit handler, logs, and docstrings.
  • Fix regex false positives: the malformed-XML fallback's \b boundary partially matched data-text/aria-label/tab-name as text/label/name. Use (?<![\w-]) lookbehind to match only whole attribute names.
  • Narrow broad except: limit BeautifulSoup parsing exception to (ValueError, TypeError) only; the ImportError is guarded at module load, so it is never masked. Log the parse failure; return empty string instead of raw HTML (avoids CSS/class false positives).
  • Add case-insensitive-substring semantics to docs: prominent note in docstrings and new error_detection.md page (wired into mkdocs nav).
  • Update poetry.lock: added beautifulsoup4 + soupsieve.
  • Merge origin/main: resolve version conflict (1.8.7-beta.1) and bring in 38 commits from main.

Testing

  • Pre-commit (ruff, bandit, trim-whitespace, etc.) passes on all changed files.
  • Import smoke test on all modified modules ✓
  • End-to-end detection flow (XML extraction → substring match → result dict) ✓
  • CSV header identification and loading with new match_string column ✓

After all test cases complete, capture and save the final device/browser
screen state before the session is torn down — a screenshot saved as
{timestamp}-end_of_run.jpg and the page source appended to
page_sources_log.xml (Appium) or page_sources_log.html (Selenium/Playwright).
Each capture is independently guarded so failures never affect the run result.
Adds an error_definitions CSV type (headers: error_code, pattern,
description, severity) that is auto-detected alongside other CSVs.
After the end-of-run page source is captured, each pattern is checked
against it via substring search. Matches are logged as warnings,
written to detected_errors.json, and injected into the JUnit XML
<testsuite> as <properties> elements — visible in Jenkins, Allure,
and any standard JUnit-aware CI dashboard.

Touches: models.py, data_reader.py, execute.py, session_manager.py,
test_runnner.py, Junit_eventhandler.py
Includes 10 common mobile/web error patterns (app crashes, network
failures, session timeout, permission denied, etc.) as a reference
for the on-screen error detection feature.
Replace raw page-source substring search with driver-aware visible-text
extraction: BeautifulSoup get_text() for HTML (Selenium/Playwright) and
a targeted regex over text-bearing XML attributes (text, content-desc,
label, value, hint, name) for Appium XML. This eliminates false positives
where patterns matched resource-ids, class names, or bounds strings instead
of actual on-screen text.

Also fix three correctness issues found during review:
- session_manager.py: sync SessionHandler ABC signature to include
  error_definitions param and mark non-optional args as Optional, matching
  the concrete SessionManager.create_session() implementation
- test_runnner.py: scope detected_errors output file per session
  (detected_errors_{session_id}.json) so concurrent/repeated runs do not
  overwrite each other
- test_runnner.py: fix pre-existing bug where passing test cases emitted
  EventStatus.FAIL to JUnit, causing all PASS results to appear as failures
  in XML reports
Before this change, on-screen error detection only ran inside the
CLI/TestRunner path (`optics execute ...`). Library consumers using
the `Optics` class directly — pytest-generator-template, Robot Framework
integrations — had no access to it even when `error_definitions.csv`
existed alongside their tests.

Three changes:

1. Extract the detection primitives out of `TestRunner` into a new
   shared module `optics_framework/common/error_detection.py`:
   - `extract_visible_text(page_source)`: BeautifulSoup `get_text()` for
     HTML (Selenium/Playwright), targeted attribute regex
     (`text|content-desc|label|value|hint|name`) for Appium XML. Prevents
     false-positive matches against resource-ids, class names, bounds.
   - `detect_errors_in_text(text, defs, context_label, test_case)`:
     OR-matches each row by `pattern` OR `error_code` substring; each
     result carries `matched_on` ("pattern"|"code") plus optional
     `detected_at` and `test_case` context.

2. Add two methods to the `Optics` class (`optics_framework/optics.py`):
   - `add_error_definitions(source)`: loads from CSV path, raw dict, or
     ErrorDefinitions model onto the active session. Repeated calls merge.
   - `capture_and_detect(context_label, test_case)`: captures page source
     via StrategyManager, runs detection, returns matched dicts. Pure
     data primitive — no files written; driver errors swallowed so a
     flaky session returns `[]` instead of raising.

   `TestRunner._capture_end_of_run_artifacts()` now calls the shared
   helpers; no behaviour change for CLI users (the existing
   `detected_errors_{session_id}.json` output and JUnit XML properties
   remain).

3. Two pre-existing bug fixes surfaced while working in the same area:
   - `TestRunner` emitted `EventStatus.FAIL` for *passing* test cases on
     the JUnit event bus, so every PASS was written to JUnit XML as a
     failure. Changed to `EventStatus.PASS`.
   - `JUnitEventHandler.keyword_elements` was keyed by `module_id` on
     write but looked up and deleted by `testcase_id` on read, causing
     every testcase completion to log
     `ERROR - Error in subscriber junit: <id>`. The XML tree is already
     built correctly via nested `ET.SubElement`, so the dict was dead
     code — removed.

Version bumped to 1.8.6-beta10 so downstream consumers can pin the new
APIs.
- optics.py: replace chained startswith() with tuple-arg form
  (config_string.startswith(("{", "[")))
- data_reader.py: replace chained startswith() with tuple-arg form
  (param.startswith(("/", "//", "(")))
- test_runnner.py: reduce cognitive complexity of
  _capture_end_of_run_artifacts from 17 to below 15 by extracting the
  three independent capture phases into private helpers:
    _save_end_of_run_screenshot
    _save_end_of_run_pagesource
    _run_end_of_run_detection
  The orchestrator now reads top-to-bottom as screenshot -> page source
  -> detection, with each phase owning its own error handling. No
  behaviour change.
chinmayajha added a commit to chinmayajha/optics-framework that referenced this pull request May 19, 2026
Addresses the SonarCloud finding on PR mozarkai#285 that flagged the missing
lock file: "Dependency versions are not predictable if the lock file
(uv.lock or poetry.lock) is missing."

- .gitignore: comment out the `poetry.lock` line so poetry.lock is no
  longer ignored.
- poetry.lock: add the freshly-generated lock file (poetry 2.4.1)
  pinning the full dependency graph resolved against pyproject.toml.

Going forward, dependency changes should bump pyproject.toml and re-run
`poetry lock` so the lock file stays in sync. CI builds and contributor
checkouts will now resolve to identical versions.
@chinmayajha chinmayajha force-pushed the feature/error-file-addition branch from 6ee58b2 to 0554e7a Compare May 20, 2026 13:22
thakur-patel pushed a commit that referenced this pull request May 20, 2026
Addresses the SonarCloud finding on PR #285 that flagged the missing
lock file: "Dependency versions are not predictable if the lock file
(uv.lock or poetry.lock) is missing."

- .gitignore: comment out the `poetry.lock` line so poetry.lock is no
  longer ignored.
- poetry.lock: add the freshly-generated lock file (poetry 2.4.1)
  pinning the full dependency graph resolved against pyproject.toml.

Going forward, dependency changes should bump pyproject.toml and re-run
`poetry lock` so the lock file stays in sync. CI builds and contributor
checkouts will now resolve to identical versions.
@malto101

malto101 commented May 26, 2026

Copy link
Copy Markdown
Member

as its a new feature, it would be nice to have it documented when you have free time.

UPD: Added

@sonarqubecloud

Copy link
Copy Markdown

Comment thread optics_framework/optics.py Outdated
Comment thread optics_framework/common/Junit_eventhandler.py Outdated
Comment thread optics_framework/common/error_detection.py Outdated
Comment thread optics_framework/common/Junit_eventhandler.py
Comment thread optics_framework/common/runner/test_runnner.py
Comment thread .gitignore
@chinmayajha chinmayajha self-assigned this Jun 26, 2026
Comment thread optics_framework/common/error_detection.py Outdated
Comment thread optics_framework/common/error_detection.py Outdated
Comment thread optics_framework/common/error_detection.py Outdated
Comment thread optics_framework/common/error_detection.py Outdated
Comment thread optics_framework/common/error_detection.py Outdated
Comment thread optics_framework/common/runner/test_runnner.py Outdated
Comment thread optics_framework/common/error_detection.py Outdated
Comment thread optics_framework/common/error_detection.py Outdated
…stem

- error_detection.py: move bs4 import to module level with availability
  guard; return "" (not raw HTML) on parse failure to prevent false
  positives from CSS class names; use xml.etree.ElementTree for Appium
  XML parsing with regex fallback for malformed input; rename
  _APPIUM_TEXT_ATTRS → _MOBILE_TEXT_ATTRS; add case-insensitive
  substring matching in detect_errors_in_text
- Junit_eventhandler.py: replace <properties> block approach with a
  synthetic "on-screen-error-detection" testcase containing <failure>
  elements per detected error, so standard CI tools (Jenkins, GitLab,
  GitHub Actions) surface and alert on them; defend keyword_elements
  removal — those keys were module-ids while the lookup used testcase-ids,
  making the original attach loop dead code; keywords are still logged
  correctly nested under module elements
- optics.py: remove CSV file-path branch from add_error_definitions to
  decouple library API from file I/O; CLI/runner loading already handled
  by BaseRunner._load_error_definitions; accept only dict | ErrorDefinitions
- test_runnner.py: pass context_label="end_of_run" to detect_errors_in_text
  so the detected_at field is present consistently in both runner and SDK
  code paths
- pyproject.toml: declare beautifulsoup4 ^4.12 dependency
- .gitignore: restore *.csv rule (already-tracked sample CSVs remain
  tracked; prevents accidental staging of test-run output CSVs)
- appium.py: add #nosec B603 to pre-existing subprocess.check_output call
  that uses static trusted args (matching pattern already used elsewhere
  in the same file)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@chinmayajha

Copy link
Copy Markdown
Collaborator Author

@thakur-patel @kun-codes I have added fixes for the changes requested in the comments. Please review the latest commit.

chinmayajha and others added 4 commits June 30, 2026 10:42
…lers

Three compounding causes behind 25k+ line debug logs per Assert Presence call:

1. appium_page_source.py: add 500 ms sleep between assert_elements retry
   iterations (the time.sleep(0.3) was commented out, causing a CPU
   busy-loop that ran dozens of full page-source scans per second)

2. logging_config.py: deduplicate file handlers in initialize_handlers —
   clear existing RotatingFileHandlers before adding new ones so repeated
   calls (CLI startup + ConfigHandler + Session) don't register multiple
   identical file handlers, each writing every log line to the same file

3. utils.py: remove two per-element no-match debug lines from compare_text
   — the fuzzy score line and "no match found" line fired for every UI
   element that didn't match the target, producing O(elements × retries)
   lines; success-path logs (exact, partial, fuzzy hit) are kept

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reverts the removal of the fuzzy-score and no-match debug lines — those
are intentional debug-level output, not noise. Only removes the second
(less informative) of each duplicate pair logged per match type (exact,
partial, fuzzy-hit each previously emitted two identical debug lines for
the same event).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread optics_framework/common/error_detection.py
…dition

# Conflicts:
#	optics_framework/engines/drivers/appium.py
#	optics_framework/helper/version.py
#	pyproject.toml
Resolve reviewer feedback on the on-screen error-detection subsystem:

- Rename the misleading "pattern" column to "match_string" everywhere
  (sample CSV header, ErrorDefinitions model, CSVDataReader, execute.py
  header sniffing, optics.py, JUnit handler, test runner log, and the
  matched_on value). The value is a case-insensitive substring, not a
  regex; the new name makes that explicit.
- Fix attribute-regex false positives: the malformed-XML fallback used a
  \b boundary that partially matched data-text/aria-label/tab-name as
  text/label/name. Use a (?<![\w-]) lookbehind so only whole attribute
  names match.
- Narrow the broad except around BeautifulSoup parsing to
  (ValueError, TypeError) and log it; the ImportError is handled at module
  load, so it is never masked. Raw HTML is never returned (avoids CSS/class
  false positives) — empty string is returned instead, with a warning.
- Document the case-insensitive substring semantics in docstrings and a new
  usage/error_detection.md docs page (wired into the mkdocs nav).
- Add beautifulsoup4 to the resolved poetry.lock.

The "pattern" naming, attribute-regex, broad-except, raw-HTML, missing
docs, and lockfile items were all flagged on the PR.
@sonarqubecloud

Copy link
Copy Markdown

@chinmayajha chinmayajha changed the title fix + feat: expose on-screen error detection to library consumers + fix to pre-existing bugs fix + feat: expose on-screen error detection to library consumers + review fixes Jun 30, 2026
@chinmayajha chinmayajha requested a review from thakur-patel June 30, 2026 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants