AnyCrawl is a high-performance Node.js/TypeScript web crawling application that supports multiple scraping engines (Cheerio, Playwright, Puppeteer) and search engine result extraction. It features a multi-threaded architecture and batch processing, and is optimized for LLM data preparation. The tool offers Docker deployment, proxy support, and REST API endpoints for both single-page scraping and full-site crawling.
