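A practical hack for normalizing BM25 lexical search scores to the 0-1 range. Lucene's TF component is already bounded between 0 and 1 because Lucene drops the (k1+1) numerator. The IDF component can be capped at 1 by dividing it by log(N), where N is the number of documents in the corpus. Multiplying the two gives a BM25 score in the 0-1 range, and because log(N) is the same constant for every document, ranking order is preserved. The approach requires no knowledge of the corpus-wide score distribution. The trade-offs are that the resulting values tend to be small, since two fractions are multiplied together, and that the scores are not calibrated to actual relevance probabilities.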
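The normalization described above can be sketched in a few lines of Python. This is a minimal illustration for a single query term, not Lucene's actual code: the function names are hypothetical, the IDF is assumed to be the Lucene-style log(1 + (N - n + 0.5) / (n + 0.5)), and k1 = 1.2 and b = 0.75 are assumed as the usual defaults.

```python
import math


def tf_component(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Saturating TF without the (k1 + 1) numerator, so it stays in [0, 1)."""
    if tf <= 0:
        return 0.0
    # Length normalization: longer-than-average documents are penalized.
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return tf / (tf + length_norm)


def idf_component(doc_freq, num_docs):
    """Lucene-style IDF (assumed formula) divided by log(N) to cap it at 1."""
    if num_docs < 2 or not 1 <= doc_freq <= num_docs:
        raise ValueError("need num_docs >= 2 and 1 <= doc_freq <= num_docs")
    idf = math.log(1.0 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    return idf / math.log(num_docs)


def normalized_bm25(tf, doc_len, avg_doc_len, doc_freq, num_docs,
                    k1=1.2, b=0.75):
    """Product of the two bounded components for one query term."""
    return (tf_component(tf, doc_len, avg_doc_len, k1, b)
            * idf_component(doc_freq, num_docs))


if __name__ == "__main__":
    # Hypothetical numbers: term appears 3 times in an average-length
    # document and occurs in 10 of 1000 documents.
    print(normalized_bm25(tf=3, doc_len=100, avg_doc_len=100,
                          doc_freq=10, num_docs=1000))
```

Because each factor is at most 1, the product tends to sit well below 1 even for strong matches, which is the "small values" trade-off mentioned above. Combining the scores of several query terms is outside this sketch.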