Andrej Karpathy implements LLM training in just over 1,000 lines of C on top of CUDA. The project includes a tutorial on implementing LayerNorm by porting an existing Python implementation.
1m read time • From simonwillison.net