"Shake" LLMs to make them better...?


New AI research called 'neural thickets' reveals that after pre-training, large language model weights don't settle at a single optimal solution but reside in a dense region surrounded by alternative specialist models. By slightly perturbing weights with Gaussian noise ('jiggling'), the model can be nudged toward task-specific specialists — improving math, coding, or writing capabilities. This effect only manifests in very large models, which have sufficient parameter space to contain these hidden specialists. The implication is that a pre-trained model is effectively a neighborhood of latent specialists, and random weight sampling with outcome-based evaluation can directly yield improved models.
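The sampling loop described above can be sketched in a few lines. This is an illustrative toy, not the paper's procedure: the function name `jiggle_and_select`, the noise scale `sigma`, the sample count, and the stand-in `evaluate` benchmark are all assumptions made here for demonstration.

```python
import numpy as np

def jiggle_and_select(weights, evaluate, n_samples=32, sigma=0.01, seed=0):
    """Sample Gaussian perturbations of a weight vector and keep the best.

    `evaluate` maps a weight vector to a task score (higher is better).
    Both the noise scale and the greedy selection rule are illustrative
    assumptions, not the exact method from the research.
    """
    rng = np.random.default_rng(seed)
    best_w, best_score = weights, evaluate(weights)
    for _ in range(n_samples):
        # 'Jiggle': add small Gaussian noise to every weight.
        candidate = weights + rng.normal(0.0, sigma, size=weights.shape)
        # Outcome-based evaluation: score the perturbed model on the task.
        score = evaluate(candidate)
        if score > best_score:
            best_w, best_score = candidate, score
    return best_w, best_score

# Toy stand-in for a task benchmark: the score peaks at a nearby
# "specialist" solution rather than at the pre-trained weights.
target = np.array([0.02, -0.01, 0.03])
evaluate = lambda w: -float(np.linalg.norm(w - target))

pretrained = np.zeros(3)
specialist, score = jiggle_and_select(pretrained, evaluate)
```

Because the loop only accepts candidates that score higher, the returned model never evaluates worse than the starting weights; in a real setting `evaluate` would be an expensive benchmark run, which is why the claimed result (that cheap random sampling finds specialists at all) is the interesting part.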
