Best of PyTorch · June 2024

  1. Video
    Sam Witteveen · 2y

    Gemma 2 - Local RAG with Ollama and LangChain

    Gemma 2 has been released in multiple formats, including Keras, PyTorch, and Hugging Face Transformers. This post covers the author's experience running the 9B and 27B models in Ollama, noting that the 9B model is fast enough for real-time responses. A straightforward script builds a fully local Retrieval-Augmented Generation (RAG) system from Gemma 2, Nomic embeddings, and ChromaDB, all executed within VSCode. The steps involve setting up an indexer, embedding transcripts from Alex Hormozi's YouTube channel, and choosing a text-splitting method. Debugging tips and possible add-ons for the RAG system are also discussed.
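
    A minimal sketch of such a pipeline, assuming the langchain-community Ollama and Chroma integrations and a placeholder transcript.txt file (not the author's exact script):

    ```python
    # Local RAG sketch: Gemma 2 via Ollama, Nomic embeddings, ChromaDB.
    # Assumes `ollama pull gemma2:9b` and `ollama pull nomic-embed-text`
    # were run, and that langchain-community is installed.
    from langchain_community.chat_models import ChatOllama
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import Chroma
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    # Index: split a transcript into chunks and embed them locally.
    text = open("transcript.txt").read()  # placeholder path
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_text(text)
    store = Chroma.from_texts(chunks, OllamaEmbeddings(model="nomic-embed-text"))

    # Retrieve + generate: stuff the top chunks into a Gemma 2 prompt.
    question = "What is the main advice in this video?"
    docs = store.similarity_search(question, k=3)
    context = "\n\n".join(d.page_content for d in docs)
    llm = ChatOllama(model="gemma2:9b")
    answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQ: {question}")
    print(answer.content)
    ```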

  2. Video
    Community Picks · 2y

    Let's reproduce GPT-2 (124M)

    This post discusses the process of reproducing the GPT-2 (124M) model, including loading the weights, implementing the model from scratch, and generating text. It also introduces the Tiny Shakespeare dataset and shows how to use it for training. The author demonstrates how to calculate loss and perform optimization using PyTorch.
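
    For flavor, a generic sketch of the loss-and-optimize step on a batch of token ids (the model and shapes are placeholders, not the video's exact code):

    ```python
    import torch
    import torch.nn.functional as F

    # One GPT-2 training step on Tiny Shakespeare token ids.
    # `model` maps (B, T) token ids to (B, T, vocab_size) logits; details elided.
    def train_step(model, optimizer, x, y):
        logits = model(x)  # (B, T, V)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    # AdamW is the usual optimizer choice for GPT-style training:
    # optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
    ```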

  3. Article
    Towards AI · 2y

    Build your own Large Language Model (LLM) From Scratch Using PyTorch

    A step-by-step guide to building and training a Large Language Model (LLM) using PyTorch; the model's task is to translate text from English to Malay. Since the Transformer architecture is the core foundation of LLMs, the post gives a comprehensive explanation of how to build it from scratch.
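
    As a taste of that architecture, a minimal single attention head in PyTorch (a generic sketch, not the article's code):

    ```python
    import math
    import torch
    import torch.nn as nn

    class SelfAttentionHead(nn.Module):
        """One scaled dot-product attention head, the basic Transformer unit."""
        def __init__(self, d_model, d_head):
            super().__init__()
            self.q = nn.Linear(d_model, d_head, bias=False)
            self.k = nn.Linear(d_model, d_head, bias=False)
            self.v = nn.Linear(d_model, d_head, bias=False)

        def forward(self, x):  # x: (B, T, d_model)
            q, k, v = self.q(x), self.k(x), self.v(x)
            scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
            return scores.softmax(dim=-1) @ v  # (B, T, d_head)
    ```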

  4. Article
    Hacker News · 2y

    From Scratch - Generative Adversarial Networks

    Generative Adversarial Networks (GANs) are a method in generative AI that aims to train a Generator (G) model and a Discriminator (D) model simultaneously. The G model learns to generate samples from a given distribution, while the D model learns to distinguish between real and generated samples. The training regime involves updating the D model to maximize the probability of correct classification, and updating the G model to maximize the probability of the D model making a mistake. The Discriminator model has 4 linear layers with dropout and ReLU activations.
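
    A hedged sketch of that setup, with illustrative layer widths (the article's exact sizes may differ):

    ```python
    import torch
    import torch.nn as nn

    # Discriminator matching the description: 4 linear layers with
    # ReLU and dropout. Widths here are illustrative (flattened 28x28 input).
    D = nn.Sequential(
        nn.Linear(784, 256), nn.ReLU(), nn.Dropout(0.3),
        nn.Linear(256, 128), nn.ReLU(), nn.Dropout(0.3),
        nn.Linear(128, 64),  nn.ReLU(), nn.Dropout(0.3),
        nn.Linear(64, 1),    nn.Sigmoid(),
    )

    def gan_step(G, D, opt_g, opt_d, real, z, bce=nn.BCELoss()):
        ones = torch.ones(real.size(0), 1, device=real.device)
        zeros = torch.zeros(real.size(0), 1, device=real.device)
        # Update D: classify real samples as 1 and generated samples as 0.
        opt_d.zero_grad()
        d_loss = bce(D(real), ones) + bce(D(G(z).detach()), zeros)
        d_loss.backward(); opt_d.step()
        # Update G: try to make D classify fakes as real.
        opt_g.zero_grad()
        g_loss = bce(D(G(z)), ones)
        g_loss.backward(); opt_g.step()
        return d_loss.item(), g_loss.item()
    ```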

  5. Article
    GoPenAI · 2y

    Understanding Kolmogorov-Arnold Networks (KANs) and Their Application in Variational Autoencoders

    Kolmogorov-Arnold Networks (KANs) are based on a mathematical theorem that allows any continuous function of multiple variables to be represented as a combination of one-dimensional functions. These networks could revolutionize neural network design, particularly for Variational Autoencoders (VAEs), by improving efficiency, interpretability, and flexibility. Key methods involve using splines and piecewise polynomials. Although the post features a standard VAE implementation, it discusses how KAN layers could be incorporated, highlighting potential future research directions in KAN-based models.
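
    A toy illustration of the idea, with fixed Gaussian bumps standing in for the splines used by real KAN implementations (not the post's code):

    ```python
    import torch
    import torch.nn as nn

    class ToyKANLayer(nn.Module):
        """Toy KAN-style layer: each input/output edge carries its own learnable
        1-D function, here a weighted sum of fixed Gaussian basis bumps."""
        def __init__(self, in_dim, out_dim, n_basis=8):
            super().__init__()
            self.register_buffer("centers", torch.linspace(-2, 2, n_basis))
            self.coef = nn.Parameter(torch.randn(out_dim, in_dim, n_basis) * 0.1)

        def forward(self, x):  # x: (B, in_dim)
            # Evaluate every basis bump at every input value.
            phi = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)  # (B, in, n_basis)
            # Sum the per-edge univariate functions into each output.
            return torch.einsum("bik,oik->bo", phi, self.coef)
    ```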

  6. Article
    Medium · 2y

    Want to Learn Quantization in The Large Language Model?

    This post provides a detailed guide to quantization for large language models: what it is, why it is needed, and how to apply it using PyTorch. It covers methods such as asymmetric and symmetric quantization and includes step-by-step code for quantizing and de-quantizing model weight parameters.
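
    A minimal sketch of both schemes on a weight tensor (illustrative, not the post's exact code):

    ```python
    import torch

    def quantize_symmetric(w, bits=8):
        """Symmetric quantization: zero-point fixed at 0, scale from max |w|."""
        qmax = 2 ** (bits - 1) - 1  # e.g. 127 for int8
        scale = w.abs().max() / qmax
        q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax).to(torch.int8)
        return q, scale

    def quantize_asymmetric(w, bits=8):
        """Asymmetric quantization: scale and zero-point from the [min, max] range."""
        qmin, qmax = 0, 2 ** bits - 1  # e.g. uint8 range [0, 255]
        scale = (w.max() - w.min()) / (qmax - qmin)
        zero_point = qmin - torch.round(w.min() / scale)
        q = torch.clamp(torch.round(w / scale) + zero_point, qmin, qmax).to(torch.uint8)
        return q, scale, zero_point

    def dequantize(q, scale, zero_point=0):
        """Map integer codes back to approximate float weights."""
        return (q.float() - zero_point) * scale
    ```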

  7. Article
    GoPenAI · 2y

    Yoga-LLM, Part 2: Instruction Fine-tuning

    This is a tutorial on fine-tuning a large language model (LLM) to answer questions about Yoga. The process involves instruction tuning with parameter-efficient methods such as LoRA. The post discusses the choice of tools and frameworks, including HuggingFace, Unsloth, and LitGPT, then details the implementation steps, from preparing the data to setting training parameters. Inference is also covered, using a trained Gemma 2B model to demonstrate the result.
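
    As an illustration of the LoRA idea, a minimal adapter around a frozen nn.Linear (a generic sketch, not what Unsloth or LitGPT do internally):

    ```python
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Wrap a frozen linear layer with a low-rank update: W + (alpha/r) * B @ A.
        Only A and B are trained, which is what makes LoRA parameter-efficient."""
        def __init__(self, base: nn.Linear, r=8, alpha=16):
            super().__init__()
            self.base = base
            for p in self.base.parameters():  # freeze the pretrained weights
                p.requires_grad = False
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # no-op at init
            self.scaling = alpha / r

        def forward(self, x):
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling
    ```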