Cloudflare's Browser Rendering now offers a /crawl endpoint in open beta that crawls an entire website from a single API call. Submit a starting URL and the service automatically discovers pages, renders them in a headless browser, and returns their content as HTML, Markdown, or structured JSON. Crawl jobs run asynchronously. Key features include configurable crawl depth and page limits, URL pattern filtering, automatic page discovery via sitemaps or links, incremental crawling with the modifiedSince and maxAge parameters, a static mode that skips browser rendering for faster crawls, and robots.txt compliance. The endpoint is available on both the Workers Free and Paid plans, and is useful for training ML models, building RAG pipelines, and monitoring content.
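As a rough illustration of how the options above might be combined into one request, here is a minimal Python sketch that only builds the request body and target URL, without sending anything. The endpoint path and the field names `url`, `formats`, `render`, `depth`, and `limit` are assumptions based on the summary, not confirmed API details. Only `modifiedSince` is named in the text, and its value format (a Unix timestamp here) is also a guess. The helper names `crawl_endpoint` and `build_crawl_payload` are hypothetical.

```python
import json
from typing import Optional, Sequence

# Assumption: path modeled on Cloudflare's account-scoped REST API layout.
API_BASE = "https://api.cloudflare.com/client/v4"


def crawl_endpoint(account_id: str) -> str:
    """Return the (assumed) URL a crawl job would be submitted to."""
    return f"{API_BASE}/accounts/{account_id}/browser-rendering/crawl"


def build_crawl_payload(
    start_url: str,
    *,
    depth: Optional[int] = None,
    limit: Optional[int] = None,
    formats: Sequence[str] = ("markdown",),
    render: bool = True,
    modified_since: Optional[int] = None,
) -> dict:
    """Assemble a JSON-serializable request body for a crawl job.

    Field names other than modifiedSince are hypothetical placeholders.
    """
    if not start_url.startswith(("http://", "https://")):
        raise ValueError("start_url must be an absolute http(s) URL")

    payload = {
        "url": start_url,
        "formats": list(formats),  # e.g. "html", "markdown", "json"
        "render": render,          # False stands in for the static mode
    }
    # Optional settings are included only when the caller sets them.
    if depth is not None:
        payload["depth"] = depth
    if limit is not None:
        payload["limit"] = limit
    if modified_since is not None:
        payload["modifiedSince"] = modified_since
    return payload


if __name__ == "__main__":
    body = build_crawl_payload(
        "https://example.com", depth=2, limit=50, render=False
    )
    print(crawl_endpoint("YOUR_ACCOUNT_ID"))
    print(json.dumps(body, indent=2))
```

Because crawl jobs run asynchronously, a real client would presumably POST this body with an API token and fetch the results in a later request. The official documentation should be checked for the actual parameter names before relying on any of these fields.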

2m read time, from developers.cloudflare.com