A 2016 critique of GPU-accelerated deep learning, arguing that GPUs excel at dense, parallelizable workloads such as the convolutions in convolutional neural networks but struggle with the logarithmic, tree-based data structures required for techniques like hierarchical softmax in word2vec. The author suggests that for NLP and collaborative filtering at