
pydoll-python
Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions.
A 100% typed, async-native automation library built for modern bot evasion and high-performance scraping.
📖 Full Documentation • 🚀 Getting Started • ⚡ Advanced Features • 🧠 Deep Dives • 💖 Support This Project
Pydoll is built on a simple philosophy: powerful automation shouldn't require you to fight the browser.
Forget broken webdrivers, compatibility issues, or being blocked by navigator.webdriver=true. Pydoll connects directly to the Chrome DevTools Protocol (CDP), providing a natively asynchronous, robust, and fully typed architecture.
It's designed for modern scraping, combining an intuitive high-level API (for productivity) with deep-level control over the network and browser behavior (for evasion), allowing you to bypass complex anti-bot defenses.
Pydoll is proudly sponsored by Thordata: a residential proxy network built for serious web scraping and automation. With 190+ real residential and ISP locations, fully encrypted connections, and infrastructure optimized for high-performance workflows, Thordata is an excellent choice for scaling your Pydoll automations.
Sign up through our link to support the project and get 1GB free to get started.
Pydoll excels at behavioral evasion, but it doesn't solve captchas. That's where CapSolver comes in: an AI-powered service that handles reCAPTCHA, Cloudflare challenges, and more, integrating seamlessly with your automation workflows.
Register with our invite code and use code PYDOLL to get an extra 6% balance bonus.
- Async-native: built on asyncio and 100% type-checked with mypy. This means top-tier I/O performance for concurrent tasks and a fantastic Developer Experience (DX) with autocompletion and error-checking in your IDE.
- Hybrid automation: use tab.request to make blazing-fast API calls that inherit the entire browser session.
- Intuitive finding: tab.find() for 90% of cases and tab.query() for complex CSS/XPath selectors (see the short sketch below).

Installation:

pip install pydoll-python
That's it. No webdrivers. No external dependencies.
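The split between tab.find() and tab.query() deserves a quick illustration. Here is a minimal sketch; the target page and selectors are illustrative and not taken from the Pydoll docs:

import asyncio

from pydoll.browser import Chrome

async def finding_demo():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

        # find(): locate elements by their HTML attributes (keyword arguments)
        heading = await tab.find(tag_name='h1', timeout=5)

        # query(): drop down to a raw CSS or XPath selector when attributes aren't enough
        more_info_link = await tab.query('a[href]')
        await more_info_link.click()

asyncio.run(finding_demo())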
Humanized typing (humanize=True)
Pydoll now includes a humanized typing engine that simulates realistic human typing behavior:
Forget the interval parameter: just use humanize=True for anti-bot evasion.

# Old way (detectable)
await element.type_text("hello", interval=0.1)
# New way (human-like, anti-bot)
await element.type_text("hello", humanize=True)
Humanized scrolling (humanize=True)
The scroll API now features a cubic Bezier curve physics engine for realistic scrolling:
# Smooth scroll (CSS animation, predictable timing)
await tab.scroll.by(ScrollPosition.DOWN, 500, smooth=True)
# Humanized scroll (physics engine, anti-bot)
await tab.scroll.by(ScrollPosition.DOWN, 500, humanize=True)
await tab.scroll.to_bottom(humanize=True)
| Mode | Parameter | Use Case |
|---|---|---|
| Instant | smooth=False | Speed-critical operations |
| Smooth | smooth=True | General browsing simulation |
| Humanized | humanize=True | Anti-bot evasion |
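The instant mode from the table isn't shown in the snippet above; it is the same tab.scroll.by call with the animation turned off, for example:

# Instant scroll (no animation): jump 500px down immediately
await tab.scroll.by(ScrollPosition.DOWN, 500, smooth=False)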
Thanks to its async architecture and context managers, Pydoll is clean and efficient.
import asyncio

from pydoll.browser import Chrome
from pydoll.constants import Key

async def google_search(query: str):
    # Context manager handles browser start() and stop()
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://www.google.com')

        # Intuitive finding API: find by HTML attributes
        search_box = await tab.find(tag_name='textarea', name='q')

        # "Human-like" interactions simulate typing
        await search_box.insert_text(query)
        await search_box.press_keyboard_key(Key.ENTER)

        # Find by text and click (simulates mouse movement)
        first_result = await tab.find(
            tag_name='h3',
            text='autoscrape-labs/pydoll',  # Supports partial text matching
            timeout=10,
        )
        await first_result.click()

        # Wait for an element to confirm navigation
        await tab.find(id='repository-container-header', timeout=10)
        print(f"Page loaded: {await tab.title}")

asyncio.run(google_search('pydoll python'))
Pydoll is a complete toolkit for professional automation.
Tired of manually extracting and managing cookies to use requests or httpx? Pydoll solves this.
Use the UI automation to pass a complex login (with CAPTCHAs, JS challenges, etc.) and then use tab.request to make authenticated API calls that automatically inherit all cookies, headers, and session state from the browser. It's the best of both worlds: the robustness of UI automation for auth, and the speed of direct API calls for data extraction.
# 1. Log in via the UI (handles CAPTCHAs, JS, etc.)
await tab.go_to('https://my-site.com/login')
await (await tab.find(id='username')).type_text('user')
await (await tab.find(id='password')).type_text('pass123')
await (await tab.find(id='login-btn')).click()
# 2. Now, use the browser's session to hit the API!
# This request automatically INHERITS the login cookies
response = await tab.request.get('https://my-site.com/api/user/profile')
user_data = response.json()
print(f"Welcome, {user_data['name']}!")
Take full control of the network stack. Pydoll allows you to not only monitor traffic for reverse-engineering APIs but also to intercept requests in real-time.
Use this to block ads, trackers, images, or CSS to dramatically speed up your scraping and save bandwidth, or even to modify headers and mock API responses for testing.
import asyncio

from pydoll.browser.chromium import Chrome
from pydoll.protocol.fetch.events import FetchEvent, RequestPausedEvent
from pydoll.protocol.network.types import ErrorReason

async def block_images():
    async with Chrome() as browser:
        tab = await browser.start()

        async def block_resource(event: RequestPausedEvent):
            request_id = event['params']['requestId']
            resource_type = event['params']['resourceType']
            url = event['params']['request']['url']

            # Block images and stylesheets
            if resource_type in ['Image', 'Stylesheet']:
                await tab.fail_request(request_id, ErrorReason.BLOCKED_BY_CLIENT)
            else:
                # Continue other requests
                await tab.continue_request(request_id)

        await tab.enable_fetch_events()
        await tab.on(FetchEvent.REQUEST_PAUSED, block_resource)

        await tab.go_to('https://example.com')
        await asyncio.sleep(3)

        await tab.disable_fetch_events()

asyncio.run(block_images())
A User-Agent isn't enough. Pydoll gives you granular control over Browser Preferences, allowing you to modify hundreds of internal Chrome settings to build a robust and consistent fingerprint.
Our documentation doesn't just give you the tool; it explains in detail how canvas, WebGL, font, and timezone fingerprinting works, and how to use these preferences to defend your automation.
options = ChromiumOptions()

# Create a realistic and clean browser profile
options.browser_preferences = {
    'profile': {
        'default_content_setting_values': {
            'notifications': 2,  # Block notification popups
            'geolocation': 2,    # Block location requests
        },
        'password_manager_enabled': False,  # Disable "save password" prompt
    },
    'intl': {
        'accept_languages': 'en-US,en',  # Make consistent with your proxy IP
    },
    'browser': {
        'check_default_browser': False,  # Don't ask to be default browser
    },
}
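As a quick sketch of how these preferences might be applied at launch: the ChromiumOptions import path and the options= constructor parameter below are assumptions, so verify them against the Pydoll documentation.

import asyncio

# Assumed import paths and parameter names -- check the Pydoll docs
from pydoll.browser.chromium import Chrome
from pydoll.browser.options import ChromiumOptions

async def launch_with_profile():
    options = ChromiumOptions()
    options.browser_preferences = {
        'intl': {'accept_languages': 'en-US,en'},  # keep language consistent with your proxy
    }
    async with Chrome(options=options) as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

asyncio.run(launch_with_profile())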
Pydoll is built for scale. Its async architecture allows you to manage multiple tabs and browser contexts (isolated sessions) in parallel.
Furthermore, Pydoll excels in production architectures. You can run your browser in a Docker container and connect to it remotely from your Python script, decoupling the controller from the worker. Our documentation includes guides on how to set up your own remote server.
# Example: Scrape 2 sites in parallel
async def scrape_page(url, tab):
    await tab.go_to(url)
    return await tab.title

async def concurrent_scraping():
    async with Chrome() as browser:
        tab_google = await browser.start()
        tab_ddg = await browser.new_tab()  # Create a new tab

        # Execute both scraping tasks concurrently
        tasks = [
            scrape_page('https://google.com/', tab_google),
            scrape_page('https://duckduckgo.com/', tab_ddg),
        ]
        results = await asyncio.gather(*tasks)
        print(results)

asyncio.run(concurrent_scraping())
Reliable Engineering: Pydoll is fully typed, providing a fantastic Developer Experience (DX) with full autocompletion in your IDE and error-checking before you even run your code. Read about our Type System.
Robust-by-Design: The @retry decorator turns fragile scripts into production-ready automations. It doesn't just "try again"; it lets you execute custom recovery logic (on_retry), like refreshing the page or rotating a proxy, before the next attempt.
from pydoll.decorators import retry
from pydoll.exceptions import ElementNotFound, NetworkError

@retry(
    max_retries=3,
    exceptions=[ElementNotFound, NetworkError],  # Only retry on these specific errors
    on_retry=my_recovery_function,  # Run your custom recovery logic
    exponential_backoff=True,  # Wait 2s, 4s, 8s...
)
async def scrape_product(self, url: str):
    ...  # your scraping logic
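my_recovery_function is not defined in the snippet above; here is a hypothetical recovery hook to illustrate what on_retry can do. The name, the signature, and tab.refresh() are assumptions to check against the Pydoll docs, not guaranteed API.

# Hypothetical on_retry hook: signature and tab.refresh() are assumptions
async def my_recovery_function(self):
    # Log the failure and reload the page so the next attempt starts clean
    print('Attempt failed, refreshing the page before retrying...')
    await self.tab.refresh()
    # You could also rotate a proxy or re-authenticate here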
Pydoll is not a black box. We believe that to defeat anti-bot systems, you must understand them. Our documentation is one of the most comprehensive public resources on the subject, teaching you not just the "how," but the "why."
Understand how bots are detected and how Pydoll is designed to win.
- How canvas, WebGL, and fonts create your unique ID.
- Proxies are more than just --proxy-server.
Software engineering you can trust.
- How Browser, Tab, and WebElement fit together.
- The find() API.
- How Pydoll uses asyncio and the CDP.

We would love your help to make Pydoll even better! Check out our contribution guidelines to get started.
If you find Pydoll useful, consider sponsoring my work on GitHub. Every contribution helps keep the project alive and drives new features!
Pydoll is licensed under the MIT License.
Pydoll — Web automation, taken seriously.