Crawlee is a web scraping and browser automation library for Python, designed for building reliable crawlers. It provides tools for efficient data extraction and persistent storage, with configurations intended to help crawlers fly under the radar of modern bot protections. It is available on PyPI and offers two crawler classes: BeautifulSoupCrawler for fast HTML parsing and PlaywrightCrawler for JavaScript-heavy pages. Features include type hints, proxy rotation, automatic retries, and more. The documentation has detailed guides and examples, and contributions and bug reports are welcome on GitHub.
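To make the "proxy rotation" and "automatic retries" features concrete, here is a minimal, library-independent sketch in plain Python. It is not Crawlee's API or its implementation: `fetch_with_retries`, the `fetch` callable, and the proxy strings are all hypothetical names chosen for illustration. It assumes a failed request surfaces as a `ConnectionError`.

```python
import itertools
from typing import Callable, Sequence


def fetch_with_retries(
    url: str,
    fetch: Callable[[str, str], str],  # hypothetical: fetch(url, proxy) -> body
    proxies: Sequence[str],
    max_retries: int = 3,
) -> str:
    """Try `fetch` up to `max_retries` times, using the next proxy on each attempt."""
    if not proxies:
        raise ValueError("at least one proxy is required")

    # Cycle through the proxies so each retry goes out through a different one.
    proxy_cycle = itertools.cycle(proxies)
    last_error = None

    for _ in range(max_retries):
        proxy = next(proxy_cycle)
        try:
            return fetch(url, proxy)
        except ConnectionError as exc:
            # Assumption: a blocked or failed request raises ConnectionError.
            last_error = exc

    raise RuntimeError(f"giving up on {url} after {max_retries} attempts") from last_error
```

Any function that takes a URL and a proxy and returns the response body can be passed as `fetch`. A real crawler would typically add backoff delays and per-proxy health tracking, which this sketch leaves out.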

