A software engineer ran two vibecoding challenges on Lobsters, asking participants to complete programming tasks using AI coding tools. The post evaluates the results, finding that LLMs failed to build working solutions because they lack 'Naur theories' — the mental models humans develop about how systems behave. The author argues that AI models confabulate results, produce hacky code that doesn't fit existing theory, and overfit to training data rather than generalizing. The challenges involved RPython-based tasks including a Brainfuck interpreter, a compiler port, and an NP-hard optimization problem. Only one external participant (using OpenCode) made a meaningful attempt, placing in C tier. The author concludes that copying someone's output doesn't transfer their underlying mental model, and that AI coding tools cannot replace genuine theory-building in software engineering.
Table of contents
Propagate the Fuck (Can't Propagate the Fuck)
Late, as in the late unknown-linux-musl
Don't you know? Python makes you fast. (Haha, one!)
An Object Finale
The unattempted
Analysis
No, like, analysis of the vibecoded outputs
Conclusion