An informal experiment benchmarking several LLMs (Claude, GPT, Gemini, Qwen, Kimi, GLM) on their ability to predict the cooling curve of hot coffee in a ceramic mug. Each model was asked to produce a temperature-over-time equation given specific physical parameters. The author then ran the actual experiment and compared

5m read time · From dynomight.net
