Test Smarter, Not Harder: Risk-Based Data Quality Without Pipeline Paralysis
Vinted's data engineering team shares how they overcame frequent finance pipeline failures caused by overly strict data quality tests. By applying financial accounting concepts like materiality and the Informational Quality Framework, they developed a risk-based testing framework that classifies tests by impact and frequency. High-impact tests block the main pipeline run, while low-impact tests run separately for monitoring only. In dbt, this is implemented via tags and exclusion flags, with Airflow reading meta configuration to pass --exclude flags to the dbt CLI. The result: fewer false alarms, actionable alerts, and pipelines that reliably deliver data at the start of the business day.
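The split between blocking and monitoring-only tests can be sketched roughly as follows. This is a minimal illustration rather than Vinted's actual code: the `exclude_tags` meta key, the `low_impact` tag name, the helper function names, and the choice of `dbt build` for the main run are all assumptions made for the example. Only the `--exclude` and `--select` flags and the `tag:` selector syntax come from the dbt CLI itself.

```python
from typing import List, Mapping, Sequence


def _tag_selectors(tags: Sequence[str]) -> List[str]:
    """Turn tag names into dbt node selectors such as 'tag:low_impact'."""
    return [f"tag:{tag}" for tag in tags]


def build_main_run_args(meta: Mapping[str, Sequence[str]]) -> List[str]:
    """Arguments for the main run: everything except the excluded tags.

    'exclude_tags' is a hypothetical meta key, not a dbt or Airflow built-in.
    """
    args = ["dbt", "build"]
    selectors = _tag_selectors(meta.get("exclude_tags", []))
    if selectors:
        # Excluded tests cannot fail, and therefore cannot block, this run.
        args += ["--exclude", *selectors]
    return args


def build_monitoring_args(meta: Mapping[str, Sequence[str]]) -> List[str]:
    """Arguments for the separate monitoring run: only the excluded tags."""
    selectors = _tag_selectors(meta.get("exclude_tags", []))
    if not selectors:
        raise ValueError("no tags configured for the monitoring run")
    return ["dbt", "test", "--select", *selectors]


if __name__ == "__main__":
    meta = {"exclude_tags": ["low_impact"]}  # hypothetical configuration
    print(" ".join(build_main_run_args(meta)))
    print(" ".join(build_monitoring_args(meta)))
```

In an orchestrator such as Airflow, the resulting argument lists could be handed to whichever operator runs shell commands, so that a failure in the monitoring run raises an alert without delaying the main pipeline.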
Table of contents
We obsessed over data quality. Why?
Materiality and informational quality
The risk-based approach
Managing a process change
Designing for continuity: first principles and inversion as a mental model
Appendix