Google DeepMind and Kaggle launched the FACTS Benchmark Suite to systematically evaluate LLM factual accuracy across four areas: parametric knowledge (internal knowledge recall), search-augmented retrieval, multimodal reasoning with images, and grounding (context-based answers). The suite contains 3,513 publicly available