Hacking Google reCAPTCHA v3 using Reinforcement Learning
1 Pith paper cites this work.
abstract
We present a Reinforcement Learning (RL) methodology to bypass Google reCAPTCHA v3. We formulate the problem as a grid world where the agent learns how to move the mouse and click on the reCAPTCHA button to receive a high score. We study the performance of the agent when we vary the cell size of the grid world and show that the performance drops when the agent takes big steps toward the goal. Finally, we use a divide-and-conquer strategy to defeat the reCAPTCHA system for any grid resolution. Our proposed method achieves a success rate of 97.4% on a 100x100 grid and 96.7% on a 1000x1000 screen resolution.
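The grid-world formulation in the abstract can be illustrated with a toy sketch. The abstract does not say which RL algorithm the authors use, so tabular Q-learning is swapped in here purely as a stand-in. All names (`MouseGridWorld`, `train`, `greedy_path`), the reward values, and the target position are hypothetical. The sketch does not touch a browser or any reCAPTCHA service: the "score" is replaced by a simple reward for reaching a target cell.

```python
import random

# Four cursor moves, expressed as (dx, dy) in grid cells.
ACTIONS = [(0, -1), (0, 1), (-1, 0), (1, 0)]


class MouseGridWorld:
    """Toy grid world: a cursor moves one cell at a time toward a target cell."""

    def __init__(self, screen=100, cell_size=10, target_px=(75, 45), max_steps=200):
        self.n = screen // cell_size  # cells per side; a larger cell_size means bigger steps
        self.goal = (target_px[0] // cell_size, target_px[1] // cell_size)
        self.max_steps = max_steps
        self.pos = (0, 0)
        self.steps = 0

    def reset(self, rng):
        # Start each episode from a random cell (an assumption of this sketch).
        self.pos = (rng.randrange(self.n), rng.randrange(self.n))
        self.steps = 0
        return self.pos

    def step(self, action):
        dx, dy = ACTIONS[action]
        # Clip the move so the cursor stays on the screen.
        x = min(max(self.pos[0] + dx, 0), self.n - 1)
        y = min(max(self.pos[1] + dy, 0), self.n - 1)
        self.pos = (x, y)
        self.steps += 1
        reached = self.pos == self.goal
        reward = 0.0 if reached else -1.0  # hypothetical reward, not a real score
        done = reached or self.steps >= self.max_steps
        return self.pos, reward, done


def train(env, episodes=5000, alpha=1.0, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(x, y): [0.0] * len(ACTIONS) for x in range(env.n) for y in range(env.n)}
    for _ in range(episodes):
        state = env.reset(rng)
        done = False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            nxt, reward, done = env.step(action)
            # Do not bootstrap from the goal cell: the episode ends there.
            future = 0.0 if nxt == env.goal else gamma * max(q[nxt])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = nxt
    return q


def greedy_path(env, q, start):
    """Follow the learned greedy policy from `start` and return the visited cells."""
    env.pos, env.steps = start, 0
    path = [start]
    done = start == env.goal
    while not done:
        action = max(range(len(ACTIONS)), key=lambda a: q[env.pos][a])
        _, _, done = env.step(action)
        path.append(env.pos)
    return path


if __name__ == "__main__":
    env = MouseGridWorld(screen=100, cell_size=10)
    q_table = train(env)
    print(greedy_path(env, q_table, start=(0, 0)))
```

Changing `cell_size` changes how coarse each move is, which is the knob the abstract describes varying. The divide-and-conquer strategy for larger resolutions is not sketched here, since the abstract gives no detail on how it is carried out.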
fields: cs.CR (1)
years: 2025 (1)
verdicts: CONDITIONAL (1)
representative citing papers
COGNITION: From Evaluation to Defense against Multimodal LLM CAPTCHA Solvers
Multimodal LLMs reliably solve many CAPTCHA tasks but can be defended by adding fine-grained localization and implicit counting that drops state-of-the-art success from over 95% to 0%.