A 2016 critique of GPU-accelerated deep learning, arguing that GPUs excel at dense, parallelizable operations like convolutional neural networks but struggle with the logarithmic data structures required by techniques like hierarchical softmax in word2vec. The author suggests that for NLP and collaborative filtering at billion-parameter scale, CPU-friendly sparse/logarithmic architectures may outperform brute-force GPU approaches, and proposes hybrid CPU+GPU architectures as a promising research direction.

2 min read · From erikbern.com
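To make the dense-vs-logarithmic contrast concrete, here is a minimal sketch (not from the post; all names such as `hierarchical_softmax_prob` and the toy tree layout are hypothetical) of the hierarchical softmax computation word2vec uses. Scoring one word takes roughly log2(V) sequential, data-dependent steps down a binary tree over the vocabulary, and each step reads an unpredictable row of the node matrix; this pointer-chasing pattern suits a CPU's caches and branch predictors far better than the uniform, dense parallelism GPUs are built for.

```python
import numpy as np

def hierarchical_softmax_prob(word_vec, path_nodes, path_signs, node_vecs):
    """Probability of a target word under hierarchical softmax.

    word_vec:   embedding of the context word, shape (d,)
    path_nodes: indices of inner tree nodes from the root to the word's leaf
    path_signs: +1/-1 per node, encoding the left/right branch taken
    node_vecs:  matrix of inner-node vectors, shape (num_inner_nodes, d)

    The loop is O(log V) but inherently sequential: each iteration
    depends on the previous one and touches a different row of node_vecs.
    """
    prob = 1.0
    for node, sign in zip(path_nodes, path_signs):
        # sigmoid(sign * <word_vec, node_vec>): branch probability at this node
        prob *= 1.0 / (1.0 + np.exp(-sign * (word_vec @ node_vecs[node])))
    return prob

# Hypothetical toy example: a vocabulary of 8 words gives paths of length 3,
# versus a full softmax that would touch all 8 output rows.
rng = np.random.default_rng(0)
d = 16
node_vecs = rng.normal(size=(7, d))  # 7 inner nodes in a full binary tree
word_vec = rng.normal(size=d)
print(hierarchical_softmax_prob(word_vec, [0, 2, 5], [+1, -1, +1], node_vecs))
```

The payoff, and the post's point, is that this turns an O(V) output layer into an O(log V) one, but only by trading the GPU-friendly dense matrix multiply for branchy, memory-bound tree traversal that a CPU handles comparatively well.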