
Enhancement: Update Waiting Logic to handle Cloudflare #9

Open
Binary-Ape wants to merge 2 commits into techinz:main from Binary-Ape:Feature/Cloudflare_AutoWait

@Binary-Ape

Overview

Currently the code waits a fixed, arbitrary amount of time (default: 6 seconds) and then re-checks for the success criteria. This slows down solving whenever the page or Turnstile responds more quickly than that.

This change waits for an event instead, so the check can run as soon as a response arrives rather than after the full delay.
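
As a rough illustration of the idea (not the code in this PR), the difference between a fixed sleep and an event-driven wait can be sketched with plain `asyncio`. The function name `wait_for_success` and the use of an `asyncio.Event` as the success signal are assumptions made for the example; the timeout plays the role of the existing `solve_click_delay` value.

```python
import asyncio


async def wait_for_success(success: asyncio.Event, timeout_s: float) -> bool:
    """Return True as soon as `success` is set, or False once timeout_s elapses.

    Hypothetical helper: unlike `await asyncio.sleep(timeout_s)` followed by a
    re-check, this returns immediately when the signal arrives early.
    """
    try:
        await asyncio.wait_for(success.wait(), timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        return False
```

With a 6 second timeout, a signal that arrives after 200 ms lets the caller continue after roughly 200 ms, while the worst case stays bounded at 6 seconds.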

Not sure whether we want to rename the solve_click_delay variable to something that describes it more accurately, possibly solve_success_timeout?

Cloudflare - Turnstile

This had a bug: the check always returned successful because the success element is technically always present in the DOM; it is just hidden.

        success_elements = await search_shadow_root_elements(framework, iframe, 'div[id="success"]')
        challenge_solved = bool(success_elements)

The logic now grabs that element and waits for it to become visible. The original checkbox disappears completely from the DOM, so there could also be a check for whether it is still there:

    await checkbox.wait_for_element_state("hidden", timeout=solve_click_delay * 1000)
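
A minimal sketch of the visibility-based check described above, assuming `success_element` is a Playwright `ElementHandle` for the `div[id="success"]` node. The function name `turnstile_solved` is hypothetical, and the import fallback exists only so the sketch can be loaded without Playwright installed.

```python
try:
    from playwright.async_api import TimeoutError as PlaywrightTimeout
except ImportError:  # fallback so the sketch still imports without Playwright
    PlaywrightTimeout = TimeoutError


async def turnstile_solved(success_element, timeout_s: float) -> bool:
    """Wait for the (always present, initially hidden) success element to show.

    Returns True once the element is visible, False if the timeout expires.
    """
    try:
        # Playwright timeouts are given in milliseconds.
        await success_element.wait_for_element_state(
            "visible", timeout=timeout_s * 1000
        )
        return True
    except PlaywrightTimeout:
        return False
```

Checking visibility rather than mere presence is what avoids the false positive: `bool(success_elements)` is true even while the element is hidden.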

Cloudflare - Challenge Page

Changed this to wait for a new load event, so that when the page redirects it then evaluates whether the action was successful.

A handler registered with on("load") only runs the next time the event fires, so it triggers only on a new load and does not check the current page state.
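
One way to turn "the next load event" into something awaitable with a timeout is sketched below. It assumes `page` behaves like a Playwright `Page`, using its `once` and `remove_listener` methods; the function name `wait_for_next_load` is hypothetical and this is not necessarily how the PR implements it.

```python
import asyncio


async def wait_for_next_load(page, timeout_s: float) -> bool:
    """Return True if `page` fires a new "load" event within timeout_s seconds.

    Only a load that happens after the listener is registered counts; the
    current page state is deliberately not inspected.
    """
    loaded = asyncio.get_running_loop().create_future()

    def on_load(_page) -> None:
        # Guard against the future having been cancelled by a timeout.
        if not loaded.done():
            loaded.set_result(True)

    page.once("load", on_load)
    try:
        await asyncio.wait_for(loaded, timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        # Clean up so a late load does not call a stale handler.
        page.remove_listener("load", on_load)
        return False
```

The listener is registered only once this coroutine starts running, so it should be started (for example as a task that has been given a chance to run) before the checkbox is clicked; otherwise a fast redirect could be missed.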

*Overview*
The current method of verifying if the Captcha is solved is to wait an arbitrary amount of time then re-check for the elements.

This change waits for the success criteria instead, using the original value as a timeout, so solving finishes faster when things go smoothly.

**Turnstile**

This had a bug in it where it would always return as successful as the `success` element is technically always there - it is just hidden.

Changing the logic to grab that element and wait for it to become visible.

**Challenge Page**
Change this to wait for a new `load` event, so that when the page redirects it will then check whether the iframes are there.

The `on("load")` handler triggers the next time the event fires.

**shadow_root.py**

Changed this to use `element_handle.wait_for_selector` so that it automatically waits for a selector to appear in the shadow root instead of looping with an arbitrary wait time in between.

Since a document can have multiple shadow roots, we kick off one task per shadow root, await them all concurrently, and break as soon as we find the turnstile. This could be enhanced to handle multiple turnstiles, with a flag that controls whether to break on the first element found.
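
The concurrent search can be sketched roughly as follows. This assumes each entry in `roots` exposes an awaitable `wait_for_selector(selector, timeout=...)`, as Playwright's `ElementHandle` does; the names `find_in_any_root` and `probe` are hypothetical and not taken from the PR.

```python
import asyncio


async def find_in_any_root(roots, selector: str, timeout_ms: float):
    """Probe every root concurrently and return the first match, or None."""

    async def probe(root):
        try:
            return await root.wait_for_selector(selector, timeout=timeout_ms)
        except Exception:
            # A timeout or a detached root is treated as "not found here".
            return None

    tasks = [asyncio.ensure_future(probe(root)) for root in roots]
    try:
        # Results arrive in completion order, not list order.
        for finished in asyncio.as_completed(tasks):
            element = await finished
            if element is not None:
                return element  # break on the first turnstile found
        return None
    finally:
        # Stop the probes that are still waiting and let them unwind.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
```

Supporting multiple turnstiles would mean collecting every non-`None` result instead of returning on the first, which is where the flag mentioned above would come in.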

**solve_by_click.py**

Modified the steps to merge Click and Validate into one process as `turnstiles` and `interstitial` have different success criteria. They both still wait asynchronously with different triggers (page reloading vs `success` element becoming visible).

Broke click_checkbox out into a separate function to keep things clean, since that code is shared between both solving styles.