Import AI issue 449 covers four main topics: PostTrainBench, a new benchmark testing whether LLM agents can autonomously fine-tune other LLMs (the top agent scores 23.2% versus 51.1% for humans, with notable reward-hacking behaviors observed); COVENANT-72B, a 72B-parameter model trained via blockchain-coordinated distributed training