A developer building Marginalia Search documents the process of implementing an NSFW content filter from scratch. After finding transformer-based models too slow for real-time search, they tried fastText but got poor results because of biased training data. They solved the labeling problem by using ollama+qwen3.5 as an LLM-based annotator to generate training samples at scale. The final solution, implemented in Java, is a neural network with a single hidden layer, hand-picked features, and chi-squared scoring for disambiguation. The post includes full derivations of forward propagation, backpropagation, and gradient descent, plus evaluation metrics showing ~90% accuracy on training data, with caveats about real-world false positive rates, since NSFW content has a low base rate.
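
The base-rate caveat can be made concrete with a small calculation. The sketch below is not taken from the post: the class name `BaseRateSketch`, the method `precision`, and all numbers (a 90% true positive rate, a 10% false positive rate, and a 1% share of NSFW documents) are illustrative assumptions. It applies Bayes' rule to estimate what fraction of flagged documents would actually be NSFW.

```java
public class BaseRateSketch {

    /**
     * Fraction of flagged documents that are truly NSFW (precision),
     * computed with Bayes' rule from hypothetical classifier rates.
     */
    public static double precision(double truePositiveRate,
                                   double falsePositiveRate,
                                   double baseRate) {
        // Share of all documents that are NSFW and get flagged
        double flaggedNsfw = truePositiveRate * baseRate;
        // Share of all documents that are safe but get flagged anyway
        double flaggedSafe = falsePositiveRate * (1.0 - baseRate);
        return flaggedNsfw / (flaggedNsfw + flaggedSafe);
    }

    public static void main(String[] args) {
        // Assumed numbers for illustration only, not the post's measurements.
        double tpr = 0.9;
        double fpr = 0.1;

        // Balanced evaluation set: half of the samples are NSFW
        System.out.println("balanced set:  " + precision(tpr, fpr, 0.5));
        // Hypothetical live index: 1 in 100 documents is NSFW
        System.out.println("1% base rate:  " + precision(tpr, fpr, 0.01));
    }
}
```

Under these assumed rates, precision is about 0.90 on a balanced set but only about 0.083 at a 1% base rate, meaning most flagged documents would be false positives even though the classifier itself has not changed.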

14m read time · From marginalia.nu
Table of contents: Fasttext, The Neural Network, Prediction, Training
