Pinterest built an AI-powered system that measures the prevalence of policy-violating content by tracking what users actually see, rather than relying solely on user reports. The system draws ML-assisted samples weighted by impressions and risk scores, then labels the sampled content at scale using multimodal LLMs whose decisions are validated against human reviewers. It produces daily metrics with 95% confidence intervals for each policy area, which enables faster detection of emerging threats, real-time monitoring of interventions, and data-driven resource allocation. Measurement latency is 15x lower than with human-only review.
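To make the sampling idea concrete, here is a minimal sketch of impression- and risk-weighted sampling followed by an inverse-probability-weighted prevalence estimate with a 95% normal-approximation interval. It is not Pinterest's implementation: the estimator (Hansen-Hurwitz), the field names `impressions`, `risk`, and `violating`, and both function names are assumptions chosen for illustration, and the `violating` label stands in for whatever the LLM labeling step would return.

```python
import math
import random


def sample_items(items, n, rng):
    """Draw n items with replacement, with probability proportional to
    impressions * risk. Every risk score must be > 0, otherwise some
    items can never be sampled and the estimate becomes biased."""
    weights = [it["impressions"] * it["risk"] for it in items]
    total = sum(weights)
    probs = [w / total for w in weights]
    idx = rng.choices(range(len(items)), weights=probs, k=n)
    # Keep each item's selection probability; the estimator needs it.
    return [(items[i], probs[i]) for i in idx]


def estimate_prevalence(sample, total_impressions):
    """Hansen-Hurwitz estimate of impression-weighted prevalence.

    Each draw contributes impressions * label / p, which is an unbiased
    estimate of total violating impressions; dividing by total
    impressions turns it into a prevalence. Returns (estimate, low, high)
    using a 95% normal-approximation confidence interval."""
    draws = [
        it["impressions"] * it["violating"] / p / total_impressions
        for it, p in sample
    ]
    n = len(draws)
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)
    half = 1.96 * math.sqrt(var / n)
    return mean, max(0.0, mean - half), mean + half


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic population (hypothetical data, for illustration only).
    population = [
        {
            "impressions": rng.randint(1, 1000),
            "risk": rng.uniform(0.05, 1.0),
            "violating": int(rng.random() < 0.02),
        }
        for _ in range(10_000)
    ]
    total_imps = sum(it["impressions"] for it in population)
    est, low, high = estimate_prevalence(
        sample_items(population, 2_000, rng), total_imps
    )
    print(f"prevalence ~ {est:.4f} (95% CI {low:.4f} to {high:.4f})")
```

Dividing each sampled label by its selection probability is what keeps the estimate unbiased even though high-risk, high-impression content is deliberately oversampled. The interval narrows when the risk scores track true violations well, which is the point of spending the labeling budget this way.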

9m read time · From medium.com
Table of contents
Why Prevalence Matters
What We Measure
Methods at a Glance
System Overview
Implementation Notes
Dashboard and Alerting
Impact
Constraints and Trade‑offs
Future Focus
Acknowledgements
