Best of PyTorch — 2024

  1. Article · Community Picks · 1y

    🤗 Transformers

    🤗 Transformers provides APIs and tools for easily downloading and training state-of-the-art pretrained models for tasks in natural language processing, computer vision, audio, and multimodal categories. It supports interoperability between PyTorch, TensorFlow, and JAX, allowing for flexible model training and deployment. The library also offers comprehensive documentation, tutorials, and guides to help users get started and achieve specific goals.
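
    The library's pipeline API wraps checkpoint download, tokenization, and inference in a single call. A minimal sketch, assuming transformers is installed with a PyTorch backend; the checkpoint name here is the library's standard SST-2 sentiment model, chosen for illustration rather than taken from the summary:

```python
from transformers import pipeline

# Pin the checkpoint explicitly; pipeline() would otherwise warn and
# fall back to a task-default model.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

result = classifier("PyTorch makes research code pleasant to write.")[0]
print(result["label"], round(result["score"], 3))
```

    The same `pipeline()` entry point covers other tasks ("summarization", "image-classification", "automatic-speech-recognition") by swapping the task string and checkpoint.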

  2. Article · Hugging Face · 1y

    Visualize and understand GPU memory in PyTorch

    This tutorial explains how to visualize and understand GPU memory usage in PyTorch during model training. It provides step-by-step instructions on generating and interpreting memory profiles using PyTorch's built-in tools. The tutorial also covers how to estimate and optimize memory requirements for training large models, offering practical tips to manage GPU memory efficiently.
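
    The tutorial's full profiles rely on PyTorch's memory-snapshot tooling (the private `torch.cuda.memory._record_memory_history` / `_dump_snapshot` functions). As a lighter sketch of the same idea, the stable allocation counters already show current and peak usage; the helper name below is invented for illustration:

```python
import torch

def report_cuda_memory(tag: str) -> None:
    """Print current and peak allocated CUDA memory in MiB."""
    if not torch.cuda.is_available():
        print(f"{tag}: no CUDA device available")
        return
    current = torch.cuda.memory_allocated() / 2**20
    peak = torch.cuda.max_memory_allocated() / 2**20
    print(f"{tag}: {current:.1f} MiB allocated, {peak:.1f} MiB peak")

device = "cuda" if torch.cuda.is_available() else "cpu"
report_cuda_memory("before")
x = torch.randn(1024, 1024, device=device)  # 1024*1024*4 bytes ≈ 4 MiB in fp32
report_cuda_memory("after allocation")
```
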

  3. Article · Machine Learning Mastery · 2y

    5 Tips for Getting Started with Deep Learning

    Deep learning, a subset of machine learning inspired by the human brain, has become essential in areas like computer vision, speech recognition, and text generation. To get started, focus on understanding machine learning basics, select a comfortable deep-learning framework (such as TensorFlow, PyTorch, or Keras), learn neural network architectures, start with simple projects, and practice regularly while engaging with the community for feedback and guidance.

  4. Article · Towards AI · 2y

    The Ultimate Beginner-to-Advanced Guide to Machine Learning

    Learn machine learning from scratch with a structured three-phase approach. Start with Python basics and small projects, then delve into essential libraries like Pandas, NumPy, and Matplotlib. Finally, explore foundational machine learning concepts and tools like TensorFlow or PyTorch. The guide provides resources, tips, and recommended learning paths for advancing to more complex topics like Natural Language Processing, Generative AI, and Computer Vision.

  5. Article · GoPenAI · 2y

    How to Build a Neural Network with a Real-World Dataset Using PyTorch

    Learn how to build and train a neural network model using the FitBit Fitness Tracker Dataset and PyTorch. The post provides a step-by-step guide and covers topics such as importing libraries, loading and preparing the data, defining the model, training and evaluating the model, and making predictions on new data. By following the post, readers can build and train their own neural network models for various use cases.
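
    The core loop the post walks through looks roughly like this. A minimal sketch using synthetic regression data as a stand-in for the FitBit dataset (the data, layer sizes, and hyperparameters are illustrative assumptions, not the post's):

```python
import torch
from torch import nn

# Synthetic stand-in for tabular tracker data: 8 features -> 1 target.
torch.manual_seed(0)
X = torch.randn(256, 8)
y = X @ torch.randn(8, 1) + 0.1 * torch.randn(256, 1)

# Define the model, optimizer, and loss.
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Train: forward pass, loss, backward pass, parameter update.
for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

print(f"final MSE: {loss.item():.4f}")
```
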

  6. Article · Community Picks · 2y

    mlflow/mlflow: Open source platform for the machine learning lifecycle

    MLflow is an open-source platform designed to streamline machine learning development. It facilitates tracking experiments, packaging code into reproducible runs, and deploying models. Key components include MLflow Tracking for logging and comparing experiments, MLflow Projects for sharing code, MLflow Models for deploying models, and the MLflow Model Registry for managing model lifecycles. It supports various ML libraries and can be integrated into local and cloud environments.

  7. Article · Hacker News · 2y

    PyTorch is dead. Long live JAX.

    The post critiques PyTorch's effectiveness in industrial-scale scientific computing, arguing it was never designed for large, distributed systems. In contrast, JAX, developed at Google, offers a compiler-centered approach with better scalability and performance, making it more suitable for large-scale AI research. JAX's commitment to functional programming and reproducibility further enhances its utility, while PyTorch's attempts to integrate multiple backends lead to fragmentation and inefficiency. The post urges adopting JAX for improved research productivity and reliability.

  8. Article · Towards AI · 2y

    A Practical Guide to Building GPT-2 with PyTorch (Part 1)

    Learn how to build and train a GPT-2 language model from scratch using PyTorch. This guide outlines steps to create a custom tokenizer, data loader, and a simple language model, demonstrating the process with Taylor Swift and Ed Sheeran song data. Follow along with the code provided to understand and implement each part of the model.

  9. Article · gitconnected · 1y

    Let’s Build our own GPT Model from Scratch with PyTorch

    Learn how to build a basic Generative Pre-trained Transformer (GPT) model from scratch using PyTorch. This tutorial covers auto-regressive models, character-level tokenization, data batching, and training using text in the style of William Shakespeare. It provides a detailed implementation of a bi-gram language model including the use of multi-head attention, forward and training operations, and generating new text tokens.
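
    The heart of such a model is causal self-attention: each position may attend only to earlier positions. A minimal single-head sketch in the spirit of the tutorial (class and dimension choices are illustrative, not the article's exact code):

```python
import torch
from torch import nn
import torch.nn.functional as F

class CausalSelfAttentionHead(nn.Module):
    """One attention head with a causal mask, as used in a GPT block."""

    def __init__(self, embed_dim: int, head_dim: int, block_size: int):
        super().__init__()
        self.key = nn.Linear(embed_dim, head_dim, bias=False)
        self.query = nn.Linear(embed_dim, head_dim, bias=False)
        self.value = nn.Linear(embed_dim, head_dim, bias=False)
        # Lower-triangular mask: position t may only attend to positions <= t.
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)
        # Scaled dot-product scores, masked so the future is invisible.
        scores = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5
        scores = scores.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        return F.softmax(scores, dim=-1) @ v

head = CausalSelfAttentionHead(embed_dim=32, head_dim=16, block_size=64)
out = head(torch.randn(4, 10, 32))
print(out.shape)  # torch.Size([4, 10, 16])
```

    Multi-head attention runs several such heads in parallel and concatenates their outputs.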

  10. Article · freeCodeCamp · 2y

    How to Use the Hugging Face Transformer Library

    Learn about the Hugging Face Transformers library, its user-friendliness, and how to use it to implement a text summarization script.

  11. Article · Machine Learning News · 2y

    SpeechBrain: A PyTorch-based Speech Toolkit

    SpeechBrain is a PyTorch-based toolkit designed to address the complexities of modern speech and audio processing tasks, including automatic speech recognition, text-to-speech synthesis, and speaker recognition. It offers a modular and flexible framework that leverages PyTorch’s efficient tensor operations and GPU acceleration to enable faster training and inference. Researchers and developers can experiment with different neural network architectures and techniques to adapt models to specific tasks and datasets, achieving state-of-the-art results.

  12. Video · Sam Witteveen · 2y

    Gemma 2 - Local RAG with Ollama and LangChain

    Gemma 2 has been released for multiple formats including Keras, PyTorch, and Hugging Face transformers. This post details the author's experience using the 9B and 27B models in Ollama, highlighting the better performance of the 9B model for real-time responses. A straightforward script is provided to create a fully local Retrieval-Augmented Generation (RAG) system using Gemma 2, Nomic embeddings, and ChromaDB, all executed within VSCode. The steps involve setting up an indexer, embedding transcripts from Alex Hormozi's YouTube channel, and handling text splitting methods. Debugging tips and additional add-ons for the RAG system are also discussed.

  13. Article · DigitalOcean Community · 2y

    PyTorch 101: Understanding Hooks

    Learn how to use hooks in PyTorch for debugging and visualization during the training process. This tutorial explains the concept and functionality of hooks, including both forward and backward hooks, and provides code examples to demonstrate their usage. It also discusses the intricacies of using hooks with tensors and nn.Module objects, cautioning about potential complications in complex networks.
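
    The pattern the tutorial describes can be sketched in a few lines: register a forward hook on a submodule to capture its activations (the model and helper names below are illustrative, not the tutorial's):

```python
import torch
from torch import nn

activations = {}

def save_activation(name):
    # A forward hook is called as hook(module, inputs, output)
    # after the module's forward() runs.
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
handle = model[1].register_forward_hook(save_activation("relu"))

_ = model(torch.randn(3, 4))
print(activations["relu"].shape)  # torch.Size([3, 8])

handle.remove()  # remove hooks you no longer need to avoid leaks
```

    Backward hooks (`register_full_backward_hook`) work analogously for inspecting gradients.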

  14. Article · Hacker News · 2y

    SylphAI-Inc/LightRAG: The "PyTorch" library for LLM applications.

    LightRAG is a PyTorch library designed to assist developers with building and optimizing Retriever-Agent-Generator (RAG) pipelines for large language model (LLM) applications. It emphasizes a light, modular, and robust codebase that is 100% readable. LightRAG caters to diverse LLM use cases, from general AI applications like chatbots and summarization to traditional NLP tasks. With a clean, customizable setup, developers can trust and effectively implement it in production.

  15. Article · Uber Engineering · 2y

    Open Source and In-House: How Uber Optimizes LLM Training

    Uber uses a mix of open-source and closed-source models to optimize the performance of large language models (LLMs) for various applications such as Uber Eats recommendations, customer support chatbots, and code development. The training infrastructure leverages robust tools like PyTorch, Kubernetes, Ray, and DeepSpeed for distributed training on both on-premises and cloud-based NVIDIA GPUs. Through continuous pre-training and fine-tuning, Uber enhances models to handle large-scale traffic efficiently, achieving performance comparable to industry-leading models like GPT-4.

  16. Article · Daily Dose of Data Science (Avi Chawla) · 2y

    Building Multi-task Learning Models

    A practical guide to building multi-task learning models in PyTorch.
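
    The standard PyTorch pattern for this is hard parameter sharing: one trunk feeds several task heads, and the training loss is a weighted sum of per-task losses. A minimal sketch with invented dimensions and task heads, not the guide's exact model:

```python
import torch
from torch import nn

class MultiTaskNet(nn.Module):
    """Shared trunk with one head per task (hard parameter sharing)."""

    def __init__(self, in_dim=16, hidden=32):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.regression_head = nn.Linear(hidden, 1)
        self.classification_head = nn.Linear(hidden, 3)

    def forward(self, x):
        shared = self.trunk(x)
        return self.regression_head(shared), self.classification_head(shared)

model = MultiTaskNet()
x = torch.randn(8, 16)
reg_out, cls_out = model(x)

# Combined loss: a weighted sum of the per-task losses (targets are random here).
loss = (
    nn.functional.mse_loss(reg_out, torch.randn(8, 1))
    + 0.5 * nn.functional.cross_entropy(cls_out, torch.randint(0, 3, (8,)))
)
```
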

  17. Article · Towards AI · 1y

    An Introduction to PyTorch versus TensorFlow for Deep Learning

    PyTorch and TensorFlow are the most popular deep learning frameworks, providing customizable building blocks for coding neural network architectures and optimizing computations on GPUs. Without them, deep learning models would have to be coded from scratch in NumPy, which is far more cumbersome and lacks GPU acceleration. Familiarity with these frameworks significantly speeds up neural network development.

  18. Video · Community Picks · 2y

    Let's reproduce GPT-2 (124M)

    This post discusses the process of reproducing the GPT-2 (124M) model, including loading the weights, implementing the model from scratch, and generating text. It also introduces the Tiny Shakespeare dataset and shows how to use it for training. The author demonstrates how to calculate loss and perform optimization using PyTorch.
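
    The loss calculation is next-token cross-entropy: logits over the vocabulary at each position, compared against the sequence shifted one token left. A minimal sketch with random logits standing in for model output (shapes are illustrative; only the vocabulary size is GPT-2's):

```python
import torch
import torch.nn.functional as F

vocab_size, B, T = 50257, 4, 8  # GPT-2's vocabulary size; toy batch shape
logits = torch.randn(B, T, vocab_size)          # stand-in for model output
targets = torch.randint(0, vocab_size, (B, T))  # next-token targets

# Flatten batch and time so each position is one classification example.
loss = F.cross_entropy(logits.view(B * T, vocab_size), targets.view(B * T))

# A sanity check from the video: a freshly initialized model should start
# near -ln(1/50257) ≈ 10.82; random logits land in the same ballpark.
print(round(loss.item(), 2))
```
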

  19. Article · Replicate · 2y

    FLUX is fast and it's open source

    FLUX is now much faster on Replicate, and all optimizations have been made open-source for the community. Key improvements include model optimization using torch.compile and fast CuDNN attention kernels, along with a new synchronous HTTP API. The open-source initiative aims to make these enhancements accessible for further advancements. Users can fine-tune, edit, and deploy custom versions of FLUX, and explore model outputs on a new playground.

  20. Article · Towards AI · 2y

    Build your own Large Language Model (LLM) From Scratch Using PyTorch

    A step-by-step guide to building and training a Large Language Model (LLM) using PyTorch. The model's task is to translate text from English into Malay. The core foundation of LLMs is the Transformer architecture, and this post provides a comprehensive explanation of how to build it from scratch.

  21. Article · Machine Learning News · 2y

    Top Ten Python Libraries for Machine Learning and Deep Learning in 2024

    A roundup of the top ten Python libraries for machine learning and deep learning in 2024: TensorFlow, PyTorch, Scikit-learn, Keras, XGBoost, LightGBM, JAX, FastAI, Hugging Face Transformers, and OpenCV.

  22. Article · Daily Dose of Data Science (Avi Chawla) · 2y

    A Subtle Trick to Optimize Neural Network Training

    Discover a subtle optimization trick for neural network training that involves normalizing data after transferring it to the GPU. This simple rearrangement can significantly reduce data transfer time, especially in tasks like image classification where pixel values are initially 8-bit integers. While the technique may not apply to all use cases, such as NLP, it can offer noticeable performance gains in applicable scenarios.
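
    The trick reduces transfer volume because uint8 pixels are 1 byte each versus 4 bytes for float32. A minimal sketch with a fake image batch (shapes are illustrative; on a CPU-only machine the transfer is a no-op but the pattern is the same):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# A fake image batch as a dataloader would yield it: 8-bit integer pixels.
batch_uint8 = torch.randint(0, 256, (32, 3, 224, 224), dtype=torch.uint8)

# Transfer the compact uint8 tensor first (1 byte/pixel), THEN convert
# and normalize on the device, instead of sending float32 (4 bytes/pixel).
batch = batch_uint8.to(device, non_blocking=True)
batch = batch.float().div_(255.0)
```
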

  23. Article · Hacker News · 2y

    KwaiVGI/LivePortrait: Make one portrait alive!

    LivePortrait is a GitHub repository containing the official PyTorch implementation of the LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control paper. The repository includes the initial version of the inference code and models, with continuous updates. Users can clone the repo, set up the environment using conda, install necessary dependencies, download pretrained weights, and run various scripts to animate portraits. The post also offers performance evaluation results on an RTX 4090 GPU and provides a Gradio interface for enhanced usability.

  24. Article · Daily Dose of Data Science (Avi Chawla) · 2y

    Deep Learning Models Can Learn Non-Existing Patterns

    Deep learning models can sometimes learn non-existing patterns, especially when data is not properly shuffled during training. This post illustrates an example where a classification neural network failed to converge due to label-ordered data but performed well when the data was shuffled. Shuffling helps in mini-batch gradient descent by ensuring that each mini-batch contains a balanced representation of classes. Be mindful of this and other potential pitfalls to improve model generalization and performance.
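
    The failure mode is easy to reproduce: with label-ordered data and no shuffling, every early mini-batch contains a single class. A minimal sketch (synthetic data; the dataset and sizes are invented for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Labels deliberately ordered by class: all 0s, then all 1s.
X = torch.randn(100, 4)
y = torch.cat([torch.zeros(50, dtype=torch.long), torch.ones(50, dtype=torch.long)])
dataset = TensorDataset(X, y)

# Without shuffling, every early mini-batch is pure class 0.
ordered = DataLoader(dataset, batch_size=10, shuffle=False)
# shuffle=True reshuffles each epoch, so batches mix both classes.
shuffled = DataLoader(dataset, batch_size=10, shuffle=True,
                      generator=torch.Generator().manual_seed(0))

first_ordered = next(iter(ordered))[1]
first_shuffled = next(iter(shuffled))[1]
print(first_ordered.unique().tolist(), first_shuffled.unique().tolist())
```
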

  25. Article · Towards AI · 1y

    Build And Train GPT From Scratch

    Learn how to build and train a Generative Pretrained Transformer (GPT) model from scratch using Python and PyTorch. Understand the internal mechanisms of GPT models, including self-attention and multi-head attention. Follow step-by-step instructions to construct the GPT architecture, tokenize data, implement self-attention, and train the model on a dataset. Discover techniques to improve model performance and optimize training and inference processes.