Best of Crawling — December 2025

1
Article
Product Hunt·24w
BrowserBook: The Browser Automation IDE
BrowserBook is an AI-powered IDE that combines a Jupyter-style notebook interface with an inline browser and a context-aware coding assistant for building Playwright-based browser automations. It addresses common problems with browser agents (cost, speed, reliability, debugging) by moving AI assistance to the coding phase rather than the execution phase. Key features include interactive browser testing, notebook-style cell execution, DOM-aware code suggestions, built-in authentication management, screenshot tools, data extraction helpers, and API deployment for production use.
51
2
Video
Oxylabs·23w
n8n Web Scraping: Complete Automation Guide
n8n enables no-code web scraping through its visual workflow builder. The platform offers multiple approaches: basic HTTP Request and HTML nodes for static sites, Markdown conversion for AI processing, and third-party tools like Oxylabs AI Studio for JavaScript-heavy pages. Workflows can be configured with error handling, retry logic, and rate limiting. Scraped data integrates directly with databases, spreadsheets, and LLMs. Both cloud-hosted and self-hosted deployments are available; self-hosting is free but requires you to manage the infrastructure yourself.
37
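For readers who want to see what the retry logic and rate limiting mentioned above amount to outside a visual builder, here is a minimal Python sketch. It is not n8n's implementation: the names `fetch_with_retry` and `MinIntervalLimiter`, the exponential backoff schedule, and the default values are all assumptions chosen for illustration. The fetch, sleep, and clock functions are passed in so the logic can be exercised without a network.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[str], str],
    url: str,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url), retrying failures with exponential backoff.

    Hypothetical helper: waits base_delay, then 2x, then 4x, and so on
    between attempts, and raises once max_attempts is exhausted.
    """
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except Exception as exc:  # a real scraper would catch narrower errors
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(
        f"giving up on {url} after {max_attempts} attempts"
    ) from last_error


class MinIntervalLimiter:
    """Hypothetical rate limiter: enforces a minimum gap between requests."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        if self._last is not None:
            remaining = self._min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

In use, a scraper would call `limiter.wait()` before each `fetch_with_retry(...)` call, so that retries and fresh requests both respect the same request spacing.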
3
Article
Carlos González·21w
Stop maintaining PDF infrastructure. Just send JSON.
PDF generation in production often involves memory-intensive tools like Puppeteer, layout inconsistencies, and deployment overhead for minor changes. Hundred Docs offers an alternative approach: a visual template editor for non-technical users and a JSON-based API for developers that handles PDF generation as a service. The platform includes 1,000 free monthly PDF generations and aims to eliminate infrastructure maintenance and CSS layout struggles.
12
7
4
Article
Hacker News·22w
Backing up Spotify
Anna's Archive created a comprehensive backup of Spotify containing metadata for 256 million tracks and 86 million music files (~300 TB), representing 99.6% of listens. The archive is distributed via torrents and prioritizes tracks by popularity. Files are stored as OGG Vorbis (160 kbit/s) for popular tracks and OGG Opus (75 kbit/s) for unpopular tracks. It also includes the world's largest publicly available music metadata database, with 186 million unique ISRCs. The metadata is structured in SQLite databases covering artists, albums, tracks, playlists, audio features, and file information, making it the first fully open music preservation archive that can be easily mirrored.
11
