<p>Stanford and Caltech researchers just published the first comprehensive taxonomy of how llms fail at reasoning

not a list of cherry-picked gotchas. a 2-axis framework that finally lets you compare failure modes across tasks instead of treating each one as a random anecdote

the https://t.co/YsF4UTjBci</p>

Robert Youssef

Stanford and Caltech researchers just published the first comprehensive taxonomy of how llms fail at reasoning

not a list of cherry-picked gotchas. a 2-axis framework that finally lets you compare failure modes across tasks instead of treating each one as a random anecdote

the findings are uncomfortable