A walkthrough of a production-grade Amazon price comparison and web scraping system built entirely in Python. The architecture covers an event-driven backend that uses FastAPI and Inngest for orchestration, BeautifulSoup for HTML parsing, a residential proxy network (ThorData) to get around IP blocks and scrape from multiple countries, MongoDB for raw data storage, Qdrant as a vector database for AI-powered querying, and a LangChain agent backed by OpenAI for natural-language product queries. All services are containerized with Docker Compose. The video also demonstrates how an LLM was used to generate the scraping code from raw HTML, and how the design scales to large numbers of products across regions.
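To make the parse-and-compare step concrete, here is a minimal sketch, not the code from the video. It uses Python's standard-library `html.parser` in place of BeautifulSoup so it runs with no dependencies. The `price` class name, the `Offer` record, and the helpers `parse_price` and `cheapest` are hypothetical, and the prices are assumed to be already expressed in one common currency.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text of the first element whose class list contains 'price'.

    The class name is an assumption for illustration; real product pages
    use different markup that changes over time.
    """

    def __init__(self):
        super().__init__()
        self._depth = 0      # > 0 while inside the target element
        self._done = False   # True once the first match has been closed
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if self._depth:
            self._depth += 1  # nested tag inside the price element
        elif "price" in (dict(attrs).get("class") or "").split():
            self._depth = 1

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                self._done = True

    def handle_data(self, data):
        if self._depth:
            self.text += data


@dataclass
class Offer:
    country: str
    price: float  # assumed to be converted to a common currency already


def parse_price(html: str) -> float:
    """Extract a price such as '$1,299.00' from an HTML fragment."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    # Keep digits and the decimal point; drops currency symbols and commas.
    cleaned = "".join(ch for ch in parser.text if ch.isdigit() or ch == ".")
    if not cleaned:
        raise ValueError("no price found in fragment")
    return float(cleaned)


def cheapest(offers: list[Offer]) -> Offer:
    """Return the offer with the lowest price."""
    return min(offers, key=lambda offer: offer.price)
```

In a full pipeline each country's HTML would be fetched through the proxy network and the raw page stored before parsing; here the fragments would simply be passed to `parse_price` and the resulting `Offer` objects to `cheapest`.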

•23m watch time
