A comparison of two approaches to evaluating search quality: query-based evaluation (aggregating clicks per query into relevance labels) vs. session-based evaluation (replaying individual user sessions). Session-based evaluation offers better sampling accuracy because it treats each user interaction equally, similar to probability-based
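To make the contrast concrete, here is a minimal Python sketch of the two scoring styles. Everything in it is an illustrative assumption rather than something taken from the source: the `SESSIONS` click log, the `toy_ranker` function, the choice of "most-clicked document" as the per-query relevance label, and the top-1 hit rate as the metric.

```python
from collections import defaultdict

# Hypothetical click log: one (query, clicked_doc) pair per user session.
SESSIONS = [
    ("python", "docs.python.org"),
    ("python", "docs.python.org"),
    ("python", "docs.python.org"),
    ("python", "snake-wiki"),
    ("jaguar", "car-site"),
]


def query_based_score(sessions, ranker):
    """Aggregate clicks per query into one label, then average over unique queries."""
    clicks = defaultdict(lambda: defaultdict(int))
    for query, doc in sessions:
        clicks[query][doc] += 1
    # Assumption: the relevance label for a query is its most-clicked document.
    labels = {q: max(docs, key=docs.get) for q, docs in clicks.items()}
    hits = sum(ranker(q) == label for q, label in labels.items())
    return hits / len(labels)


def session_based_score(sessions, ranker):
    """Replay every session individually; each interaction counts exactly once."""
    hits = sum(ranker(q) == doc for q, doc in sessions)
    return hits / len(sessions)


def toy_ranker(query):
    """Hypothetical ranker that returns a single top document per query."""
    return {"python": "docs.python.org", "jaguar": "animal-site"}.get(query)


if __name__ == "__main__":
    print("query-based:  ", query_based_score(SESSIONS, toy_ranker))
    print("session-based:", session_based_score(SESSIONS, toy_ranker))
```

In this toy log the query-based score weights "python" and "jaguar" equally even though "python" accounts for four of the five sessions, and the minority click on `snake-wiki` disappears during aggregation. The session-based score counts each of the five interactions once, so frequent queries and minority clicks both influence the result in proportion to how often they occur.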