chenmingzhe666-hub/agentwebtest

Web Automation Test Agent

An AI-powered web automation testing system, similar in spirit to MAA, designed for testing job assistant applications. It simulates user interactions, analyzes UI elements from screenshots, generates test flows, and verifies the application's responses using the qwen3-vl:8b multimodal model served through Ollama.

Features

  • Webpage Opening: Opens specified URLs in the default browser
  • Screen Capture: Captures screenshots of the current screen
  • AI UI Analysis: Uses qwen3-vl:8b to analyze UI elements from screenshots
  • Test Flow Generation: Automatically generates comprehensive test flows based on UI analysis
  • Automated Execution: Simulates clicks, typing, and other user actions
  • AI Validation: Validates operation results using AI analysis
  • Error Handling: Handles various exception scenarios
  • Test Reporting: Generates detailed test reports
  • OCR Integration: Includes OCR functionality for text recognition

Installation

  1. Clone the repository (or download the files), then change into the project directory:

    # Example path; replace it with the directory where you placed the files
    cd e:\agent自动测试web
  2. Install dependencies:

    pip install -r requirements.txt
  3. Ensure Ollama is running:

    • Install Ollama from ollama.com
    • Pull the qwen3-vl:8b model:
      ollama pull qwen3-vl:8b
    • Make sure the Ollama server is running (it normally starts automatically after installation; if it does not, start it with ollama serve)

Configuration

  1. Edit run_test.py:
    • Set BASE_URL to your job assistant application URL
    • You do not need to set function positions manually; the AI detects UI elements from the screenshot
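
As a small illustration of the configuration step, the sketch below sets BASE_URL and checks that it is a complete http(s) URL before a test run starts. The example address and the check_base_url helper are assumptions made for this sketch; they are not part of run_test.py.

```python
from urllib.parse import urlparse

# Placeholder address (assumption); point this at your job assistant application.
BASE_URL = "http://localhost:3000"


def check_base_url(url):
    """Return url unchanged if it has an http(s) scheme and a host, else raise."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"BASE_URL must be a full http(s) URL, got: {url!r}")
    return url
```

Catching a malformed URL here is cheaper than discovering it after the browser has opened a blank page and the model has analyzed the wrong screenshot.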

Usage

  1. Run the test:

    python run_test.py
  2. Test process:

    • The agent opens the specified URL
    • It captures a screenshot of the initial page
    • It sends the screenshot to qwen3-vl:8b for analysis
    • The AI generates a comprehensive test flow
    • The agent executes each test step automatically
    • Each step is validated with AI analysis
    • A detailed test report is generated
  3. View results:

    • Test reports will be generated in the test_results directory
    • Screenshots for each test step will be saved
    • Detailed logs will be available in test_report.txt
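
The test process above can be sketched as one function that calls the documented WebAutomationTester methods in order. This is a rough outline, not the code in run_test.py: the return values of generate_test_flow and generate_test_report are assumptions, and StubTester is a hypothetical stand-in used only to show the call order without a browser or Ollama.

```python
def run_pipeline(tester, base_url, screenshot_name="initial_page.png"):
    """Run the documented steps in order and return whatever the report step returns."""
    tester.open_webpage(base_url)
    tester.get_screenshot(screenshot_name)
    test_flow = tester.generate_test_flow()
    if not test_flow:
        # Assumption: an empty or None flow means AI generation failed.
        raise RuntimeError("no test flow generated; consider the manual fallback")
    tester.execute_test_flow(test_flow)
    return tester.generate_test_report()


class StubTester:
    """Hypothetical stand-in that records calls; swap in WebAutomationTester."""

    def __init__(self):
        self.calls = []

    def open_webpage(self, url):
        self.calls.append("open_webpage")

    def get_screenshot(self, filename):
        self.calls.append("get_screenshot")

    def generate_test_flow(self):
        self.calls.append("generate_test_flow")
        return [{"description": "Login button", "x": 100, "y": 200}]

    def execute_test_flow(self, test_flow):
        self.calls.append("execute_test_flow")

    def generate_test_report(self):
        self.calls.append("generate_test_report")
        return "report"
```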

Project Structure

web-automation-tester/
├── web_automation_tester.py  # Main test class with AI integration
├── ocr_module.py            # OCR functionality for text recognition
├── run_test.py              # Test runner with configuration
├── requirements.txt         # Dependencies
├── README.md               # This documentation
└── test_results/           # Generated test results (auto-created)
    ├── initial_page.png     # Initial page screenshot for AI analysis
    ├── step_1_[element].png # Screenshots for each test step
    └── test_report.txt      # Detailed test report

Core Class: WebAutomationTester

Methods

  • open_webpage(url): Opens a webpage
  • get_screenshot(filename): Captures and saves a screenshot
  • load_screenshot(filename): Loads a saved screenshot
  • click_position(x, y, description): Clicks at specified coordinates
  • encode_image(image_path): Encodes image to base64 for AI analysis
  • validate_with_ollama(prompt, image_path): Validates results using Ollama with optional image
  • generate_test_flow(): Generates test flow using AI based on screenshot
  • execute_test_flow(test_flow): Executes test flow generated by AI
  • test_all_functions(function_positions): Tests all functions (AI-generated or manual)
  • generate_test_report(): Generates a comprehensive report
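
To make encode_image and validate_with_ollama more concrete, here is a minimal sketch of how a screenshot could be base64-encoded and sent to Ollama's /api/generate endpoint. It uses only the standard library. The helper names build_payload and ask_ollama, and the timeout value, are assumptions for illustration; the project's own methods may be written differently.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default from this README


def encode_image(image_path):
    """Read an image file and return its contents as a base64 string."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


def build_payload(prompt, image_b64=None, model="qwen3-vl:8b"):
    """Build the JSON body for a non-streaming /api/generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_b64 is not None:
        payload["images"] = [image_b64]  # Ollama expects a list of base64 strings
    return payload


def ask_ollama(prompt, image_path=None, url=OLLAMA_URL, timeout=120):
    """POST the prompt (and optional image) and return the model's text reply."""
    image_b64 = encode_image(image_path) if image_path else None
    body = json.dumps(build_payload(prompt, image_b64)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

Keeping payload construction separate from the network call makes it possible to inspect the request without a running Ollama server.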

OCR Module

The system includes an OCR module (ocr_module.py) that provides text recognition capabilities, similar to the reference OCRer code. This module can be used to extract text from images for additional analysis.

Key Features:

  • Text recognition from images
  • Post-processing of OCR results
  • Text replacement based on rules
  • Filtering and validation of OCR results
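
The post-processing features listed above could look roughly like the following. This is a sketch under stated assumptions, not the contents of ocr_module.py: OCR results are assumed to arrive as (text, score) pairs, and the postprocess function, its parameters, and the example rule are all hypothetical.

```python
import re


def postprocess(results, replace_rules=(), expected=None, min_score=0.5):
    """Filter, clean, and validate OCR results given as (text, score) pairs."""
    cleaned = []
    for text, score in results:
        if score < min_score:
            continue  # drop low-confidence detections
        text = text.strip()
        for pattern, replacement in replace_rules:
            text = re.sub(pattern, replacement, text)  # rule-based replacement
        if not text:
            continue
        if expected is not None and not re.search(expected, text):
            continue  # keep only text matching the expected pattern
        cleaned.append(text)
    return cleaned
```

For example, a rule such as ("rn", "m") repairs a common misread, turning "Subrnit" into "Submit", while the expected pattern restricts output to the labels a test step cares about.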

Customization

  1. Change AI model:

    • Update the model parameter in the __init__ method of WebAutomationTester
    • Supported models: qwen3-vl:8b (recommended), other multimodal models
  2. Customize test flow generation:

    • Modify the prompt in the generate_test_flow method to adjust AI instructions
    • Add specific test requirements to the prompt
  3. Adjust execution speed:

    • Modify sleep times in the click_position and execute_test_flow methods
  4. Add manual fallback:

    • If AI test flow generation fails, you can provide manual function positions
    • Pass a function_positions dictionary to test_all_functions
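
The manual fallback and the speed adjustment can be illustrated together. The sketch below assumes function_positions maps a function name to an (x, y) pair; that shape, the run_manual_steps helper, and the example names and coordinates are assumptions rather than the project's confirmed format. The click argument is expected to behave like click_position(x, y, description).

```python
import time


def run_manual_steps(function_positions, click, delay=1.0, sleep=time.sleep):
    """Click each named position in order, pausing `delay` seconds after each."""
    for name, (x, y) in function_positions.items():
        click(int(x), int(y), name)
        sleep(delay)


# Hypothetical positions; measure real coordinates on your own screen.
FUNCTION_POSITIONS = {
    "Resume upload": (320, 240),
    "Job search": (320, 300),
}
```

A call such as run_manual_steps(FUNCTION_POSITIONS, tester.click_position, delay=2.0) would slow the run for applications that render slowly.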

Troubleshooting

  • Ollama connection issues:

    • Ensure Ollama server is running
    • Verify the URL in ollama_url (default: http://localhost:11434/api/generate)
    • Check that the qwen3-vl:8b model is properly pulled
  • AI test flow generation issues:

    • Ensure the screenshot is clear and all UI elements are visible
    • Check that the browser window is not minimized or obscured
    • Try restarting Ollama if the API is unresponsive
  • Execution issues:

    • Ensure no other applications are interfering with mouse/keyboard
    • Check that the browser window is in focus during testing
    • Adjust coordinates in the AI-generated test flow if clicks are inaccurate
  • Performance issues:

    • Adjust sleep times in the code if tests are too fast or too slow
    • Large screenshots may take longer to process - consider resizing before analysis
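
For the Ollama connection checks above, a quick diagnostic can query Ollama's /api/tags endpoint, which lists locally pulled models. The sketch below is one possible way to do this with the standard library; the function names are assumptions and are not part of this project.

```python
import json
import urllib.request
from urllib.parse import urlsplit


def tags_url(generate_url):
    """Derive the /api/tags URL from the configured /api/generate URL."""
    return urlsplit(generate_url)._replace(path="/api/tags").geturl()


def model_is_available(tags, model="qwen3-vl:8b"):
    """Return True if `model` appears in a parsed /api/tags response."""
    return any(entry.get("name") == model for entry in tags.get("models", []))


def check_ollama(generate_url="http://localhost:11434/api/generate",
                 model="qwen3-vl:8b", timeout=5):
    """Return "ok" or a short description of what looks wrong."""
    try:
        with urllib.request.urlopen(tags_url(generate_url), timeout=timeout) as resp:
            tags = json.load(resp)
    except (OSError, ValueError) as exc:
        return f"Ollama not reachable or returned invalid data: {exc}"
    if not model_is_available(tags, model):
        return f"Model {model} not found; try: ollama pull {model}"
    return "ok"
```

Running check_ollama() before a test session separates "server is down" from "model is missing", which otherwise both surface as a failed analysis step.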

AI Capabilities

The system leverages qwen3-vl:8b's multimodal capabilities to:

  1. UI Analysis: Identify buttons, links, input fields, and other UI elements
  2. Coordinate Detection: Estimate screen coordinates for interactive elements
  3. Test Flow Generation: Create logical test sequences based on UI structure
  4. Result Validation: Analyze screenshots to verify successful operations
  5. Error Detection: Identify and report issues in the application
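
Because the coordinates are estimates, it can help to validate the model's reply before executing it. The sketch below assumes the model is prompted to answer with a JSON array of steps, each with description, x, and y fields; that format and the parse_test_flow helper are assumptions, not the project's confirmed output format.

```python
import json
import re


def parse_test_flow(reply, screen_width, screen_height):
    """Extract a JSON array of steps from the reply and keep only on-screen ones."""
    match = re.search(r"\[.*\]", reply, re.DOTALL)  # tolerate prose around the JSON
    if match is None:
        return []
    try:
        raw_steps = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    steps = []
    for item in raw_steps:
        if not isinstance(item, dict):
            continue
        try:
            x, y = int(item["x"]), int(item["y"])
        except (KeyError, TypeError, ValueError):
            continue  # skip steps without usable coordinates
        if 0 <= x < screen_width and 0 <= y < screen_height:
            steps.append({"description": str(item.get("description", "")), "x": x, "y": y})
    return steps
```

Dropping steps whose coordinates fall outside the screen avoids clicks that could never hit the application under test.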

Performance Considerations

  • AI Processing Time: Each screenshot analysis may take 5-15 seconds
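  • Network Usage: Screenshots are sent to Ollama over HTTP; with the default localhost URL this traffic stays on the local machine, while a remote Ollama host requires network bandwidth for each image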
  • System Resources: Running qwen3-vl:8b requires significant RAM (16GB+ recommended)
  • Test Duration: Complex applications may take several minutes to test completely
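
One way to reduce per-screenshot processing time is to downscale large screenshots before analysis and then map the model's coordinates back to screen space. The arithmetic can be sketched as follows; the 1280-pixel limit and both helper names are assumptions, and the actual image resizing (for example with an imaging library) is left out.

```python
def scaled_size(width, height, max_side=1280):
    """Return (new_width, new_height, scale) so the longest side fits max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height, 1.0  # already small enough
    scale = max_side / longest
    return round(width * scale), round(height * scale), scale


def to_screen(x, y, scale):
    """Map coordinates found on the resized image back to screen pixels."""
    return round(x / scale), round(y / scale)
```

If a 2560x1440 screenshot is halved to 1280x720, a button the model reports at (100, 200) on the resized image corresponds to (200, 400) on the real screen.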

License

This project is open for personal and commercial use. Modify and extend it as needed for your specific testing requirements.
