A deep dive into 'Bonsai Networks': extremely sparse computational graphs created by training small RNNs with custom components and then pruning them down to their minimal form. The author details a custom activation function designed for boolean logic, a modified RNN cell architecture, a custom sparsity regularizer, and trainable initial states. A grokking-like phase transition during training was key to achieving high sparsity (more than 90% of weights pruned). The post then reverse-engineers the learned solutions for several logic problems, including delay lines, parenthesis balancing, and a multi-mode logic gate selector, showing how the networks discovered compact FSM-like representations.

24m read time
From cprimozic.net
Table of contents
Background
Bonsai Networks Overview
Custom Neural Network Components
Training
Constructing the Graphs
Results
Conclusion
