Hybrid Neuro-Symbolic Fraud Detection: Guiding Neural Networks with Domain Rules

This experiment injects domain knowledge as a differentiable soft constraint into a neural network's loss function for fraud detection on a severely imbalanced dataset (0.17% positive rate). The hybrid neuro-symbolic approach adds a rule penalty that fires on transactions with high amounts and unusual PCA norms, so the model still receives a fraud-related training signal when a batch contains no labeled fraud. Across 5 random seeds, the hybrid shows a consistent but small ROC-AUC improvement (0.970 vs 0.967), while the F1 and PR-AUC differences fall within seed-to-seed noise. Key lessons: both models must be evaluated under the same threshold-selection procedure for a fair comparison on imbalanced data, single-seed results are unreliable, and high lambda values (≥1.0) can override the BCE signal and degrade performance. Full code with the lambda sweep and variance analysis is available on GitHub.
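To make the idea of a rule penalty added to the loss concrete, here is a minimal PyTorch sketch. It is not the article's implementation: the function names (`rule_penalty`, `hybrid_loss`), the sigmoid-based soft rule, the thresholds, and the `sharpness` parameter are all assumptions chosen for illustration. It only shows how a "high amount and unusual PCA norm" rule could be made differentiable and weighted by lambda alongside BCE.

```python
import torch
import torch.nn.functional as F


def rule_penalty(probs, amount, pca_norm, amount_thresh, norm_thresh, sharpness=5.0):
    """Soft rule penalty (illustrative assumption, not the article's exact form).

    A transaction counts as "suspicious" to the degree that both its amount
    and its PCA norm exceed their thresholds. Sigmoids keep the rule smooth,
    so gradients can flow through it.
    """
    high_amount = torch.sigmoid(sharpness * (amount - amount_thresh))
    odd_norm = torch.sigmoid(sharpness * (pca_norm - norm_thresh))
    suspicious = high_amount * odd_norm  # soft logical AND, in (0, 1)
    # Penalize a low predicted fraud probability on suspicious transactions.
    return (suspicious * (1.0 - probs)).mean()


def hybrid_loss(logits, labels, amount, pca_norm, lam=0.1,
                amount_thresh=1.0, norm_thresh=1.0):
    """BCE plus lambda times the rule penalty (hypothetical defaults)."""
    bce = F.binary_cross_entropy_with_logits(logits, labels)
    probs = torch.sigmoid(logits)
    penalty = rule_penalty(probs, amount, pca_norm, amount_thresh, norm_thresh)
    return bce + lam * penalty
```

Because the penalty depends only on the features and the predicted probabilities, it contributes a gradient even for a batch whose labels are all zero. With `lam=0` the loss reduces to plain BCE, and as `lam` grows the rule term increasingly dominates, which matches the summary's warning about values of 1.0 and above.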

#deep-learning

#pytorch

#fraud-detection

Mar 10 • 14 min read • From towardsdatascience.com

Table of contents

Abstract
The Problem: When ROC-AUC Lies
The Setup
The Model
The Rule Loss
Tuning Lambda
Results
Variance Analysis: 5 Random Seeds
Why Does the Rule Loss Help ROC-AUC?
On Threshold Evaluation in Imbalanced Classification
Things to Watch Out For
Closing Thoughts
References
Disclosure
