Two engineering interns at Agoda developed a system to analyze historical bug data and identify patterns that could help prevent future bugs. They built an ETL pipeline using PySpark on Kubeflow to collect code complexity and coverage metrics, created a Superset dashboard for visualization, and used LLMs to identify bug-causing files with 90% accuracy. While code metrics alone couldn't reliably predict bugs, their work led to a successful integration with Agoda's AI-assisted code review tool, which now gives developers historical bug context during reviews. The feature is deployed across 400 projects and receives 78% positive feedback.
Table of contents
How It All Began
Our Path to the Goal
Collecting the Data
Visualizing the Metrics in the Dashboard
Identifying Correlations
Focusing on Actual Bug Files
Integrating Into the AI Internal Code Review Tool
Visibility and Support
Conclusion