Best of LLM · September 2024

  1. Article
    Medium · 2y

    Start Building These Projects to Become an LLM Engineer

    To become an LLM engineer, start by building practical projects that showcase skills in API usage and real-world applications, like chatbots for WhatsApp, Discord, or Telegram. Initial projects could include summarizing YouTube videos or handling various user queries via chatbots. The post also introduces a project-based course to help you build LLM applications and serve them as WhatsApp chatbots.

  2. Article
    The Developing Dev · 2y

    A New Era of Writing Code

    Large language models (LLMs) are revolutionizing coding by automating repetitive tasks, making debugging easier, and providing more efficient ways to carry out coding tasks. However, relying on LLMs can lead to a loss of understanding of the code, and they struggle with open-ended tasks and complex environment setups. Despite these limitations, they’re valuable for focused changes, basic UI, and transpiling. Knowing how to query and troubleshoot LLMs effectively is crucial for maximizing their benefits.

  3. Article
    Go · 2y

    Building LLM-powered applications in Go

    As Large Language Models (LLMs) and embedding models improve, more developers are integrating LLMs into their applications. Go excels in building LLM-powered applications due to its support for REST/RPC protocols, concurrency, and performance. This post demonstrates creating a Retrieval Augmented Generation (RAG) server in Go, which uses HTTP endpoints to add documents to a knowledge base and answer user questions. It explores implementing this with tools like Google Gemini API, Weaviate, LangChainGo, and Genkit for Go, highlighting Go's strengths in cloud-native application development.

  4. Article
    GoPenAI · 2y

    Introduction to LLM Agents: How to Build a Simple Reasoning and Acting Agent from Scratch (Part 1)

    Learn the fundamental concepts of building AI agents by implementing a simple reasoning and acting agent. This guide uses Ollama to run large language models locally and demonstrates the agent's ability to understand user queries, leverage web searches, and provide responses. The post outlines setting up necessary dependencies, implementing core functionalities, and explains the workflow of agent interaction, making it an excellent starting point for AI agent development.
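
To make the workflow concrete, here is a minimal ReAct-style sketch in Python. The model and search tool are scripted stubs standing in for Ollama and a real web search, and all names are illustrative rather than the article's code:

```python
# Minimal reasoning-and-acting loop: the model proposes either an action
# (here, a web search) or a final answer; the loop executes actions and
# feeds observations back. Stubs stand in for the LLM and the search tool.

def stub_model(history):
    # Pretend LLM: searches first, then answers using the observation.
    if not any(line.startswith("Observation:") for line in history):
        return "Action: search[capital of France]"
    return "Answer: Paris"

def stub_search(query):
    return "France's capital is Paris."

def run_agent(question, model, search, max_steps=5):
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        reply = model(history)
        history.append(reply)
        if reply.startswith("Answer:"):
            return reply.removeprefix("Answer:").strip()
        if reply.startswith("Action: search["):
            query = reply[len("Action: search["):-1]
            history.append(f"Observation: {search(query)}")
    return None

print(run_agent("What is the capital of France?", stub_model, stub_search))
# → Paris
```

Swapping the stubs for an Ollama chat call and a real search API turns this into a working agent without changing the loop.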

  5. Article
    GoPenAI · 2y

    Build an Advanced RAG App: Query Routing

    This post explores how to build an advanced RAG application using a technique called Query Routing. Query Routing enables the application to make decisions based on a user's query, selecting the most appropriate action from predefined choices such as retrieving context from multiple data sources, using different indexes, or performing a web search. Various types of Query Routers are discussed, including LLM Selector Router, LLM Function Calling Router, Semantic Router, and more. Example implementations demonstrate how to create Query Routers and enhance the decision-making capabilities of RAG applications.
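
As a sketch of the idea, the selector below picks a route for a query. A real LLM Selector Router would ask the model to choose; keyword overlap stands in here so the example runs offline, and the route names are invented:

```python
# Query routing sketch: given named routes with descriptions, pick the
# route that best matches the user's query. Word overlap is a stand-in
# for an LLM's (or an embedding model's) choice.

ROUTES = {
    "sql_index": "sales revenue numbers tables database",
    "docs_index": "product manual documentation how-to guide",
    "web_search": "news current events today latest",
}

def route_query(query, routes=ROUTES):
    words = set(query.lower().split())
    scores = {name: len(words & set(desc.split()))
              for name, desc in routes.items()}
    return max(scores, key=scores.get)

print(route_query("What were the latest sales revenue numbers?"))
# → sql_index
```

The application then executes only the chosen branch: querying the matching index, or falling through to web search.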

  6. Article
    Collections · 2y

    Build Your Own Llama, LLMs From Scratch, and Understanding Meta’s Transfusion Model

    Discover a comprehensive guide to improving your Large Language Model (LLM) skills in 2024. Learn to create a local-first vector database with RxDB and transformers.js, delve into Meta's Transfusion model merging text and image generation, and understand how to build the Llama 3 architecture from scratch using PyTorch. Explore essential concepts like genetic algorithms and neural networks, and pick up strategies for local-first AI solutions for document management and chat applications. Engage with the community and remember the importance of taking breaks for productivity.

  7. Article
    Laravel News · 2y

    PHP and LLMs with Alfred Nutile

    Alfred Nutile explores the relationship between Laravel and Large Language Models (LLMs), discussing how LLMs are influencing programming and content creation. With extensive experience, including introducing Laravel at Pfizer, Alfred shares insights on leveraging LLMs in development and their potential for the Laravel ecosystem.

  8. Article
    Community Picks · 2y

    Building an Advanced RAG System With Self-Querying Retrieval

    Learn how to build an advanced Retrieval Augmented Generation (RAG) system that leverages self-querying retrieval to improve search relevance. This tutorial covers extracting metadata filters from natural language queries, combining metadata filtering with vector search, and generating structured outputs using LLMs. The guide focuses on developing an investment assistant to answer financial questions using MongoDB as the vector store and LangGraph for orchestration.
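
The core move, splitting a question into a metadata filter plus a semantic query, can be sketched like this. In the tutorial an LLM emits the structured output; a regex stands in here, and the MongoDB-style filter shape is an assumption:

```python
# Self-querying retrieval sketch: turn a natural-language question into
# a structured metadata filter plus a semantic search string. The filter
# narrows candidates before vector search ranks them.

import re

def parse_query(question):
    # A real system would have an LLM produce this structure; here a
    # regex extracts a single illustrative "year" constraint.
    m = re.search(r"(?:after|since)\s+(\d{4})", question)
    filt = {"year": {"$gte": int(m.group(1))}} if m else {}
    semantic = re.sub(r"(?:after|since)\s+\d{4}", "", question).strip()
    return {"filter": filt, "query": semantic}

print(parse_query("reports on semiconductor stocks since 2022"))
```

The resulting filter is applied as a pre-filter on the vector store while the remaining text is embedded for similarity search.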

  9. Article
    GoPenAI · 2y

    Building LLM Agents from Scratch (Part 2): A Conversational Search Agent with Ollama

    Explore the advanced development of a conversational search agent using Ollama, Llama 3.1, Jina Embeddings, and ChromaDB. Learn how function calling enables the agent to interact with external web search tools for more accurate responses. Step-by-step instructions are provided for installing dependencies, creating a custom search tool, and setting up an agent system, complete with memory and tool execution capabilities.

  10. Article
    Data Science Central · 2y

    30 Features that Dramatically Improve LLM Performance – Part 2

    The post explores 10 additional features that significantly enhance LLM performance, particularly in terms of speed, latency, relevancy, memory use, and security. It discusses distillation for concise outputs, reproducibility with PRNGs, explainable AI using few parameters, and the benefits of no-training LLMs. The author also touches on the advantages of transformer-free LLMs, taxonomy-based evaluation, prompt data augmentation, and the importance of backend over frontend engineering. The importance of cautious use of NLP tools to avoid glitches is also highlighted.

  11. Article
    Docker · 2y

    Using an AI Assistant to Read Tool Documentation

    Exploring the use of LLM-based AI assistants to improve workflows by reading and executing tool documentation within Docker containers. This approach leverages Docker's isolated environments and standard documentation retrieval methods like `man` pages and `--help` options. Experiments demonstrate how an AI can help with running commands, handling errors, and iterating on tool usage, while highlighting the potential benefits and challenges of integrating AI with traditional Unix tool ecosystems.

  12. Video
    bycloud · 2y

    The Unreasonable Effectiveness of Prompt "Engineering"

    Prompt engineering is about effectively communicating with AI to enhance its performance. This involves providing context and resembling the data the model was trained on. Techniques like Chain of Thought and various prompting methods help AI articulate reasoning steps, improving its performance on complex tasks. The post discusses how prompt engineering has evolved to become critical in AI applications and introduces a free guide from HubSpot on leveraging AI for business. It also touches upon new language models and the importance of proper training and synthetic data in AI development.
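
A minimal illustration of the technique: a few-shot Chain-of-Thought prompt simply prepends a worked example and asks the model to show its steps. This builds the prompt text only and makes no model call:

```python
# Chain-of-Thought prompt construction: show the model one worked example
# that resembles its training data, then ask it to reason step by step.

def cot_prompt(question):
    example = (
        "Q: A shop has 3 boxes of 12 apples. How many apples in total?\n"
        "A: Each box holds 12 apples, and 3 * 12 = 36. The answer is 36.\n"
    )
    return f"{example}Q: {question}\nA: Let's think step by step."

print(cot_prompt("If a train travels 60 km/h for 2.5 hours, how far does it go?"))
```

The same scaffold works zero-shot by dropping the example and keeping only the "Let's think step by step" instruction.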

  13. Article
    Medium · 2y

    An AI Agent Architecture & Framework Is Emerging

    AI agents are evolving with new architectures that enable autonomous operation and dynamic interaction within digital environments. Core components include Large Action Models (LAMs), which enable meaningful actions, and Model Orchestration, which leverages smaller specialized models for specific tasks. Function calling enhances AI's ability to perform structured actions, while vision-enabled models allow for interaction with digital environments. The integration of tools, including human-in-the-loop mechanisms, extends the capabilities and modularity of AI agents.
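
Function calling reduces to a simple contract: the host advertises JSON tool schemas, the model replies with a structured call, and the host dispatches it. The sketch below hard-codes a model reply so it runs offline; the schema shape and all names are illustrative:

```python
# Function calling sketch: tool schemas are sent to the model alongside
# the conversation; instead of free text, the model answers with a
# structured call that the host code executes.

import json

TOOLS = {
    "get_weather": {
        "description": "Current weather for a city",
        "parameters": {"city": "string"},
    },
}

def dispatch(call_json, handlers):
    call = json.loads(call_json)
    return handlers[call["name"]](**call["arguments"])

handlers = {"get_weather": lambda city: f"18C and clear in {city}"}

# What a model's structured reply would look like:
model_reply = '{"name": "get_weather", "arguments": {"city": "Lisbon"}}'
print(dispatch(model_reply, handlers))
# → 18C and clear in Lisbon
```

Human-in-the-loop integration fits naturally here: the host can show the proposed call to a person before dispatching it.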

  14. Article
    InfoWorld · 2y

    Using PostgreSQL as a vector database in RAG

    Learn how to build a local retrieval-augmented generation (RAG) application using PostgreSQL with the pgvector extension, Ollama, and the Llama 3 large language model. This guide describes how Postgres can store both vector and tabular data, making it a versatile option for medium-sized RAG applications. It covers setting up a vector database, ingesting text from multiple sources, conducting similarity searches, and querying a large language model to generate answers. Practical coding examples and step-by-step instructions are provided to help developers get started quickly.
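
The two halves of the pattern can be sketched briefly: the SQL shows the shape of pgvector queries (`<=>` is pgvector's cosine-distance operator), and the pure-Python ranking mimics what Postgres does so the example runs without a database. Table and column names are made up:

```python
# pgvector-style similarity search in miniature. The SQL strings show the
# schema and query shape; the Python below reproduces the ranking logic.

SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE docs (id serial PRIMARY KEY, body text, embedding vector(3));
"""
SEARCH_SQL = "SELECT body FROM docs ORDER BY embedding <=> %s LIMIT 3;"

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return 1 - dot / (na * nb)

docs = [("postgres tuning", [1.0, 0.0, 0.0]),
        ("llama 3 setup", [0.0, 1.0, 0.0]),
        ("vector search", [0.7, 0.7, 0.0])]

query = [1.0, 0.1, 0.0]
ranked = sorted(docs, key=lambda d: cosine_distance(d[1], query))
print([body for body, _ in ranked])  # nearest first
```

In the real application the embeddings come from a model rather than being hand-written, and Postgres evaluates the ordering server-side.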

  15. Article
    freeCodeCamp · 2y

    From PhD drop-out to Google Data Scientist with Megan Risdal [Podcast #142]

    Megan Risdal, a data scientist and Product Manager at Google's Kaggle, discusses the platform that hosts 300k open data sets and runs weekly data science competitions. She also compares the communication styles in academia versus tech, and contrasts her experiences with Stack Overflow and Kaggle. Additionally, Megan touches upon the importance of linguistics in AI research and her work on Google's Gemma open models project.

  16. Article
    ML & AI · 2y

    Setting Up a REST API Service for AI with Local LLMs Using Ollama!

    A guide to setting up a REST API backed by local Large Language Models (LLMs) via Ollama, aimed at developers who want to run AI models locally and maintain control over their data.
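
A hypothetical sketch of the wrapper's core: building the request body for Ollama's local `/api/generate` endpoint. The payload is constructed but not sent, so the example runs without Ollama installed:

```python
# Shape of a request to a local Ollama server
# (POST http://localhost:11434/api/generate). A REST service would
# translate its own request bodies into this payload and forward it.

import json

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt, model="llama3", stream=False):
    return json.dumps({"model": model, "prompt": prompt, "stream": stream})

payload = build_payload("Why is the sky blue?")
print(payload)

# To actually call a running Ollama server:
#   import urllib.request
#   req = urllib.request.Request(OLLAMA_URL, payload.encode(),
#                                {"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read())
```

Because everything stays on localhost, prompts and responses never leave the machine, which is the data-control point the guide emphasizes.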

  17. Article
    Community Picks · 2y

    tcsenpai/ol1-p1: Using Ollama or Perplexity to create o1-like reasoning chains

    This repository showcases an early prototype for improving an LLM's reasoning through o1-like reasoning chains, using either local Ollama models or the Perplexity API. It enhances the model's ability to solve logical problems by employing multiple reasoning steps, re-examination, and exploration of alternative answers. This experimental setup aims to inspire the open-source community to develop new strategies for dynamic reasoning. Initial testing shows it can achieve 60-80% accuracy on tasks that typically stump leading models.
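
The chain's control flow can be sketched as a loop that requests one JSON step at a time until the model declares a final answer. A scripted stub replaces the Ollama or Perplexity call, and the step format is an assumption modeled on the repo's description:

```python
# o1-like reasoning chain sketch: the model emits one step at a time
# (title, content, next_action) and the loop continues, allowing
# re-examination steps, until it reaches a final answer.

import json

SCRIPT = [
    {"title": "Decompose", "content": "Split problem into cases.",
     "next_action": "continue"},
    {"title": "Re-examine", "content": "Check the edge case again.",
     "next_action": "continue"},
    {"title": "Answer", "content": "42", "next_action": "final_answer"},
]

def stub_llm(steps_so_far):
    # Stand-in for a model call; replays a fixed script.
    return json.dumps(SCRIPT[len(steps_so_far)])

def reasoning_chain(llm, max_steps=10):
    steps = []
    for _ in range(max_steps):
        step = json.loads(llm(steps))
        steps.append(step)
        if step["next_action"] == "final_answer":
            break
    return steps

chain = reasoning_chain(stub_llm)
print(len(chain), chain[-1]["content"])
```

Exploring alternative answers fits the same loop: an extra `next_action` value can branch into a fresh sub-chain before committing to a final answer.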

  18. Video
    IBM Technology · 2y

    RAG vs. Fine Tuning

    Retrieval augmented generation (RAG) and fine-tuning are two techniques for enhancing large language models. RAG retrieves external, up-to-date information to augment responses, making it effective for dynamic data sources and mitigating model hallucinations. Fine-tuning adapts a model to a specific domain or style by incorporating labeled and targeted data into the model's weights, providing more specialized and consistent outputs. Both techniques have their strengths and weaknesses, and the choice between them or a combination depends on specific use cases, data requirements, and desired model behavior.

  19. Article
    Baeldung · 2y

    Building a RAG App Using MongoDB and Spring AI

    Learn how to build a Retrieval-Augmented Generation (RAG) Wiki application using MongoDB and Spring AI. The tutorial details setting up MongoDB Atlas Vector Search for storing documents, adding necessary dependencies, and configuring the application to save and retrieve documents based on context. The application leverages a vector store for similarity search and utilizes LLM for generating responses, making it suitable for developing chatbots, automated wikis, and search engines.

  20. Article
    It's FOSS · 2y

    Generative AI & LLMs: How are They Different or Similar?

    Generative AI and Large Language Models (LLMs) are distinct technologies, differing in purpose, architecture, and capabilities. Generative AI creates new content like images or music by learning patterns from data, while LLMs focus on understanding and generating human language using NLP techniques. Combining these technologies has transformative potential in content creation, chatbot enhancement, document interaction, and translation. However, significant challenges include bias, hallucinations, resource intensiveness, and ethical concerns regarding data privacy and misuse.

  21. Article
    Machine Learning News · 2y

    MemLong: Revolutionizing Long-Context Language Modeling with Memory-Augmented Retrieval

    MemLong introduces a novel memory-augmented retrieval mechanism to tackle the limitations of handling long contexts in Large Language Models (LLMs). By integrating an external retrieval system, MemLong efficiently extends context length without sacrificing model performance. It stores past contexts in a memory bank, retrieves relevant historical data during text generation, and maintains distributional consistency. The approach significantly reduces training costs and has shown superior performance in long-context benchmarks, managing up to 80,000 tokens on a single GPU.
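
The retrieval idea, stripped to a toy: chunks that fall out of the context window go into a memory bank, and the most relevant ones are pulled back at generation time. Word overlap stands in for MemLong's real key/value embeddings, and all names are illustrative:

```python
# Memory-augmented retrieval in miniature: store evicted context chunks,
# then retrieve the top-k most similar chunks for the current query.

class MemoryBank:
    def __init__(self):
        self.chunks = []

    def store(self, chunk):
        # Real systems store an embedding; a word set stands in here.
        self.chunks.append((chunk, set(chunk.lower().split())))

    def retrieve(self, query, k=2):
        q = set(query.lower().split())
        ranked = sorted(self.chunks, key=lambda p: -len(q & p[1]))
        return [chunk for chunk, _ in ranked[:k]]

bank = MemoryBank()
bank.store("the experiment used a 3b parameter model")
bank.store("lunch was served at noon")
print(bank.retrieve("which model did the experiment use", k=1))
```

The effective context becomes the live window plus whatever the bank returns, which is how such systems stretch far beyond the model's native window.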

  22. Article
    GoPenAI · 2y

    Orchestrating Intelligence: A Deep Dive into AI/ML Agents and Frameworks

    AI agents, which can operate sequentially, in parallel, or hierarchically, bring a new level of dynamism and complexity to the AI field. Various frameworks support these agents, including LangChain, OpenAI, Google Gemini, CrewAI, and AutoGen, each offering unique features for creating and deploying AI agents. The choice of framework and type of agent depends on the project’s specific requirements, whether it involves simple step-by-step tasks, handling multiple inputs simultaneously, or managing complex, multi-step processes.

  23. Article
    The Art of Simplicity · 2y

    Interact with Ollama through C#

    C# developers can interact with Ollama, which runs local large language models, using Semantic Kernel or OllamaSharp. Semantic Kernel now has an Ollama connector built on OllamaSharp, providing .NET bindings for Ollama's API. A tutorial demo shows how to set up and use OllamaSharp in a simple console application.

  24. Article
    Couchbase · 2y

    From Concept to Code: LLM + RAG with Couchbase

    The post discusses the creation and implementation of a recommendation system that leverages Generative AI (GenAI) and Retrieval Augmented Generation (RAG) techniques. The system uses Couchbase for high-availability architecture and vector similarity search, coupled with LangChain and LangGraph for managing application flows. The focus is on transforming data into embeddings for similarity searches, setting up Couchbase collections and indices, and integrating the results into an LLM application to provide event recommendations.

  25. Article
    Portkey · 2y

    DSPy in Production

    DSPy transforms the approach to AI by focusing on building and optimizing AI pipelines rather than crafting prompts. It manages complexity, allows model independence, and automates prompt optimization. A case study at Zoro UK demonstrates its ability to standardize product attributes across numerous suppliers, improving efficiency and reducing costs. Metrics are key to DSPy's optimization, ranging from simple accuracy measures to sophisticated approaches. The future of AI development lies in orchestrating rather than arguing with AI, making DSPy a vital tool for AI systems in production.