This guide demonstrates how to build an end-to-end machine learning pipeline for fraud detection using Kubeflow. The tutorial covers the complete ML lifecycle: data preparation with Apache Spark, feature engineering with Feast, model training and registration, and real-time inference deployment with KServe. The workflow runs on a local Kubernetes cluster created with kind and integrates several open-source tools, including MinIO for object storage, ONNX as the model format for serving, and the Kubeflow Model Registry for governance. Each pipeline component is containerized and orchestrated through Kubeflow Pipelines, giving an MLOps framework that can be adapted to other machine learning projects.
Table of contents
Project Overview
A Note on the Data
Why Kubeflow?
Getting Started: Prerequisites and Cluster Setup
Building and Understanding the Pipeline Images
The Kubeflow Pipeline
Importing and Running the Pipeline
Testing the Live Endpoint
Conclusion