A developer experimented with GPT-4o's structured outputs to build an AI-assisted web scraper. The model handled both simple and complex tables well, but it struggled with combined rows and with generating XPaths. Cost is also a concern, since the approach sends a large volume of characters to the model. Future improvements
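To make the structured-output idea concrete, here is a minimal sketch of what a table schema and a check on the model's reply could look like. It does not call any API. The field names `headers` and `rows`, the helper `parse_table_response`, and the sample reply are all assumptions made for illustration and are not taken from the post.

```python
import json

# Hypothetical JSON Schema for one scraped table. A schema of this kind
# could be supplied as the structured-output format; the field names
# "headers" and "rows" are assumptions, not the post's actual schema.
TABLE_SCHEMA = {
    "type": "object",
    "properties": {
        "headers": {"type": "array", "items": {"type": "string"}},
        "rows": {
            "type": "array",
            "items": {"type": "array", "items": {"type": "string"}},
        },
    },
    "required": ["headers", "rows"],
    "additionalProperties": False,
}


def parse_table_response(raw: str) -> list[dict[str, str]]:
    """Parse a JSON reply shaped like TABLE_SCHEMA into one dict per row."""
    data = json.loads(raw)
    headers, rows = data["headers"], data["rows"]
    records = []
    for row in rows:
        # A row whose cell count differs from the header count (as can
        # happen with combined rows) cannot be aligned, so reject it.
        if len(row) != len(headers):
            raise ValueError(
                f"row has {len(row)} cells, expected {len(headers)}"
            )
        records.append(dict(zip(headers, row)))
    return records


# Stand-in for a model reply; a real call would return JSON of this shape.
sample_reply = (
    '{"headers": ["name", "price"], '
    '"rows": [["apple", "1.20"], ["pear", "0.90"]]}'
)
records = parse_table_response(sample_reply)
```

Rejecting misaligned rows instead of padding them is a deliberate choice in this sketch: it surfaces the combined-row failure mode rather than silently producing shifted columns.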

6m read time • From blancas.io
Table of contents
Asking GPT-4o to scrape data
Parsing complex tables
Combined rows break the model
Asking GPT-4o to return XPaths
Combining the two approaches
GPT-4o is very expensive
Conclusions and demo