A summary of my research and experiments on growing sparse computational graphs by training small RNNs. This post describes the architecture, training process, and pruning method used to create the graphs and then examines some of the learned solutions to a variety of objectives.

Casey Primozic

A deep dive into 'Bonsai Networks': extremely sparse computational graphs created by training small RNNs with custom components and then pruning them down to their minimal form. The author details a custom activation function designed for boolean logic, a modified RNN cell architecture, a custom sparsity regularizer, and trainable initial states. A grokking-like phase transition during training was key to achieving high sparsity (more than 90% of weights pruned). The post then reverse-engineers the learned solutions for several logic problems, including delay lines, parenthesis balancing, and a multi-mode logic gate selector, showing how the networks discovered compact, FSM-like representations.

Growing Bonsai Networks with RNNs