Hacker News

A developer applies Karpathy's Autoresearch framework to an old ML research project (eCLIP), using Claude Code as the autonomous agent. The setup is a constrained optimization loop: the agent iteratively modifies the training code, runs an experiment, and commits or reverts the change based on eval metrics. Over 42 experiments in a single day, the agent cut mean rank (lower is better) from 344.68 to 157.43, a 54% reduction. The biggest win came from the agent spotting a bug in how the temperature parameter was clamped. Gains diminished sharply in later phases, which involved architectural changes and moonshot ideas. Key takeaways: sandboxing is essential; the commit-or-revert loop works well for well-defined search spaces; and LLM agents struggle with 'unknown unknowns' in research.
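The commit-or-revert loop can be sketched in a few lines of Python. This is a minimal illustration and not the author's or Autoresearch's actual code: the names `commit_or_revert`, `toy_mean_rank`, and `percent_reduction`, the config keys, and the toy metric are all hypothetical. The "commit" and "revert" steps are simulated in memory instead of calling git, and the metric is a stand-in for a real training and eval run.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def commit_or_revert(
    baseline: Config,
    proposals: List[Config],
    evaluate: Callable[[Config], float],
) -> Tuple[Config, float, List[bool]]:
    """Greedy loop: keep a proposed change only if it lowers the metric.

    Assumes a lower score is better (as with mean rank).
    """
    best = dict(baseline)
    best_score = evaluate(best)
    decisions: List[bool] = []
    for patch in proposals:
        # Apply the proposed change on top of the last "committed" state.
        candidate = {**best, **patch}
        score = evaluate(candidate)
        kept = score < best_score
        if kept:
            # "Commit": the candidate becomes the new baseline.
            best, best_score = candidate, score
        # Otherwise "revert": the candidate is discarded, best is unchanged.
        decisions.append(kept)
    return best, best_score, decisions


def toy_mean_rank(cfg: Config) -> float:
    """Hypothetical stand-in for a real training + eval run (lower is better)."""
    return (
        100.0
        + 1000.0 * abs(cfg["temperature"] - 0.07)
        + 1e5 * abs(cfg["lr"] - 3e-4)
    )


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction of a lower-is-better metric, in percent."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    baseline = {"temperature": 0.5, "lr": 1e-3}
    # Hypothetical agent proposals, one change per experiment.
    proposals = [{"temperature": 0.1}, {"lr": 1e-2}, {"lr": 3e-4}]
    best, score, decisions = commit_or_revert(baseline, proposals, toy_mean_rank)
    print(best, score, decisions)
    # The summary's headline numbers, checked with the same arithmetic.
    print(round(percent_reduction(344.68, 157.43)))
```

In this toy run the second proposal makes the metric worse and is dropped, so the third proposal builds on the first one. A greedy loop like this can only exploit a search space that the proposals already cover, which fits the summary's note that gains shrank once the agent moved on to architectural changes and moonshot ideas.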

Autoresearch on an old research idea