Hacker News is a community-driven platform for sharing and discussing technology news, startups, and programming topics. User submissions and comments surface emerging technology trends, industry developments, and entrepreneurial ventures. Readers can join the discussions, contribute their own perspectives, and keep up with the latest advances in technology.

Hacker News

A developer building Marginalia Search documents implementing an NSFW content filter from scratch. Transformer-based models proved too slow for real-time search, and fastText gave poor results because of biased training data. The labeling problem was solved by using ollama+qwen3.5 as an LLM-based annotator to generate training samples at scale. The final solution, implemented in Java, is a neural network with a single hidden layer, hand-picked features, and chi-squared scoring for disambiguation. The post includes full math derivations for forward propagation, backpropagation, and gradient descent, along with evaluation metrics showing ~90% accuracy on training data. It also cautions that real-world false positive rates look worse than that figure suggests, because NSFW content has a low base rate.
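The base-rate caveat in the summary can be made concrete with a small Bayes' rule calculation. The sketch below is illustrative only: the class name `BaseRateSketch`, the method `precision`, and every rate used (90% sensitivity, 90% specificity, and the three base rates) are assumptions chosen for the example, not figures or code from the post.

```java
// Illustrative sketch: all names and numbers here are assumptions,
// not taken from the Marginalia Search post.
final class BaseRateSketch {

    /**
     * Fraction of flagged documents that are actually NSFW,
     * i.e. P(NSFW | flagged), computed with Bayes' rule.
     *
     * @param sensitivity P(flagged | NSFW)
     * @param specificity P(not flagged | not NSFW)
     * @param baseRate    P(NSFW) in the corpus being searched
     */
    static double precision(double sensitivity, double specificity, double baseRate) {
        double truePositives = sensitivity * baseRate;
        double falsePositives = (1.0 - specificity) * (1.0 - baseRate);
        return truePositives / (truePositives + falsePositives);
    }

    public static void main(String[] args) {
        // Assumed classifier quality: right 90% of the time on both classes.
        double sensitivity = 0.90;
        double specificity = 0.90;

        // A balanced training set versus progressively rarer NSFW content.
        for (double baseRate : new double[] {0.50, 0.10, 0.01}) {
            System.out.printf("base rate %.2f -> precision %.3f%n",
                    baseRate, precision(sensitivity, specificity, baseRate));
        }
    }
}
```

Under these assumed numbers, precision is about 0.90 on a balanced set, 0.50 when one document in ten is NSFW, and roughly 0.08 when one in a hundred is. The same classifier looks very different once the rare class is as rare as it is in real search results.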

An NSFW filter for Marginalia Search