Moving a machine learning model from exploration into production is never a straight path.
Academic success often relies on clean datasets, controlled environments, and benchmark metrics. In the real world, models face data drift, latency constraints, integration challenges, and the risk that wrong predictions directly affect customers and business processes.
In this talk, we share our journey of bringing fraud detection models from research into production at Swiss Post. We highlight the role of the shadow model approach, where new models run in parallel with production systems to safely validate performance on live traffic.
You’ll learn how shadow testing helps measure robustness, monitor data drift, and align with business KPIs without introducing operational risk.
Attendees will take away concrete practices for bridging the gap between academic experimentation and real-world operations: how to design safe rollouts, build monitoring pipelines, and decide when a model is ready for prime time.

Devoxx

A solution architect at Swiss Post presents how they evolved their ML fraud detection system (IRIS) from manual deployments to a fully automated, reliable pipeline. The talk covers their AWS-based architecture using SageMaker for training and inference, EKS for serving, and Kafka for streaming. The core technique is shadow model deployment: running a new candidate model in parallel with the production model on real traffic, with zero user impact. This enables validation of operational metrics (latency), data drift, concept drift, and model-to-model drift using real production data. Synthetic requests with known labels are used to confirm drift direction. Once the shadow model proves superior, it is promoted to production via a riskless switch using Lambda and EventBridge, with no downtime.
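
The shadow pattern described above can be illustrated with a minimal sketch. This is not the IRIS implementation: the `ShadowRouter` class, the `threshold` value, and the toy scoring functions are hypothetical names chosen for illustration, and the shadow call is made inline here, whereas a real system would more likely run it asynchronously (for example off a stream) so it cannot add latency. The sketch shows the two properties the summary relies on: only the production model's decision is ever returned, and both predictions are logged so that model-to-model disagreement can be measured on real traffic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Features = Dict[str, float]
Model = Callable[[Features], float]  # hypothetical: returns a fraud score in [0, 1]


@dataclass
class ShadowRouter:
    """Serves the production model and scores the shadow model on the side."""
    production: Model
    shadow: Model
    threshold: float = 0.5  # assumed decision threshold, for illustration only
    log: List[dict] = field(default_factory=list)

    def predict(self, features: Features) -> bool:
        # The caller only ever sees the production decision.
        prod_flag = self.production(features) >= self.threshold
        try:
            shadow_flag = self.shadow(features) >= self.threshold
            self.log.append({"prod_flag": prod_flag, "shadow_flag": shadow_flag})
        except Exception as exc:
            # A failing shadow model must never affect the live response.
            self.log.append({"shadow_error": repr(exc)})
        return prod_flag

    def disagreement_rate(self) -> float:
        """Fraction of compared requests where the two models disagree."""
        compared = [r for r in self.log if "shadow_error" not in r]
        if not compared:
            return 0.0
        disagreements = sum(1 for r in compared if r["prod_flag"] != r["shadow_flag"])
        return disagreements / len(compared)


# Toy stand-ins for the two models (assumptions, not real fraud logic).
def production_model(f: Features) -> float:
    return 0.9 if f["amount"] > 1000 else 0.1


def candidate_model(f: Features) -> float:
    return 0.9 if f["amount"] > 500 else 0.1


router = ShadowRouter(production=production_model, shadow=candidate_model)
decisions = [router.predict({"amount": a}) for a in (100.0, 700.0, 2000.0, 50.0)]
print(decisions, router.disagreement_rate())
```

A disagreement rate on its own does not say which model is right; that is where labelled traffic, such as the synthetic requests mentioned in the summary, would come in.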

From Shadows to Spotlight - How Swiss Post Performs Reliable ML Deployment by Giovanni Degiorgi