Best of LLM — April 2025

1
Article
Hacker News·1y
Get the hell out of the LLM as soon as possible
Large Language Models (LLMs) should not be used for decision-making or implementing business logic due to their poor performance in these areas. Instead, LLMs should be employed as an interface for translating user inputs into API calls, with the actual logic handled by specialized systems. This approach enhances performance, debugging, and reliability. LLMs are best utilized for tasks involving transformation, interpretation, and communication, rather than maintaining critical application state.
449
46
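The pattern summarized above can be sketched as follows. This is a minimal illustration, not the article's code: `stub_llm`, `parse_command`, `apply_command`, and the `Account` type are all hypothetical names, and the canned JSON stands in for a real model call. The model only translates text into a structured command; validation and state changes stay in ordinary code.

```python
import json
from dataclasses import dataclass

ALLOWED_ACTIONS = {"deposit", "withdraw"}


@dataclass(frozen=True)
class Account:
    balance_cents: int


def stub_llm(user_text: str) -> str:
    # Hypothetical stand-in for a real model call; returns canned JSON.
    if "withdraw" in user_text.lower():
        return '{"action": "withdraw", "amount_cents": 2500}'
    return '{"action": "unknown"}'


def parse_command(raw: str) -> dict:
    # Validate the model's output before any business logic sees it.
    cmd = json.loads(raw)
    if cmd.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {cmd.get('action')!r}")
    amount = cmd.get("amount_cents")
    if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
        raise ValueError("amount_cents must be a positive integer")
    return cmd


def apply_command(account: Account, cmd: dict) -> Account:
    # Deterministic logic: the model never touches the balance directly.
    if cmd["action"] == "deposit":
        return Account(account.balance_cents + cmd["amount_cents"])
    if cmd["amount_cents"] > account.balance_cents:
        raise ValueError("insufficient funds")
    return Account(account.balance_cents - cmd["amount_cents"])
```

Calling `apply_command(Account(10_000), parse_command(stub_llm("Please withdraw $25")))` exercises the whole path, and any malformed model output fails in `parse_command` rather than corrupting state.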
2
Article
Javarevisited·1y
5 Best Books to Learn AI and LLM Engineering in 2025 (That Aren’t a Waste of Time)
Discover the top five books recommended for mastering AI and LLM engineering in 2025. These selections focus on practical systems design, deployment, and real-world applications, helping readers save time and effectively build production-ready models. Written by experienced practitioners, these books offer guidance for those serious about becoming proficient in large language models and AI systems.
388
9
3
Article
freeCodeCamp·1y
How to Build RAG AI Agents with TypeScript
Learn how to build a Retrieval-Augmented Generation (RAG) AI agent using TypeScript and Langbase SDK. This comprehensive tutorial covers setting up your project, creating AI memory for storing and retrieving context, uploading documents, adding API keys, and generating responses using LLMs like OpenAI. By the end, you'll have a context-aware AI agent capable of handling complex tasks and queries with precision.
359
5
4
Article
Daily Dose of Data Science | Avi Chawla | Substack·1y
9 RAG, LLM, and AI Agent Cheat Sheets
This post provides visual cheat sheets for AI engineers covering various topics, including Transformer vs. Mixture of Experts in LLMs, fine-tuning techniques, RAG vs Agentic RAG, strategies for chunking in RAG, levels of agentic AI systems, and more. These resources are designed to help cultivate essential skills for developing impactful AI and ML systems in the industry.
334
1
5
Article
SwirlAI·1y
The evolution of Modern RAG Architectures.
The post delves into the evolution of Retrieval Augmented Generation (RAG) architectures, discussing their development from Naive RAG to advanced techniques, including Cache Augmented Generation (CAG) and Agentic RAG. It highlights the challenges addressed at each stage, advanced methods to improve accuracy, and the potential future advancements in RAG systems.
308
5
6
Article
Hugging Face·1y
Tiny Agents: a MCP-powered agent in 50 lines of code
Discover how to implement a small and powerful AI agent using Model Context Protocol (MCP) in just 50 lines of code. The post covers the integration of MCP with large language models (LLMs) to create agentic AI, featuring JavaScript and TypeScript components with Hugging Face's SDKs and tools. It also demonstrates the use of MCP servers and shows how tools can be utilized within an LLM inference client.
182
2
7
Article
freeCodeCamp·1y
How to Build Autonomous Agents using Prompt Chaining with AI Primitives (No Frameworks)
Autonomous agents are AI systems capable of making decisions and taking actions independently using large language models (LLMs), tools, and memory. This tutorial emphasizes building these agents using AI primitives without relying on heavy frameworks. Using Langbase’s agentic architecture called prompt chaining, tasks are split into sequential prompts, which makes debugging easier and improves output quality. It walks through setting up a TypeScript project with the Langbase SDK to create a prompt chaining agent that transforms raw product descriptions into refined marketing copy.
141
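Prompt chaining as described above can be sketched in a few lines. The tutorial itself uses TypeScript and the Langbase SDK; this Python version is only an assumption-laden illustration in which `run_chain` is a hypothetical helper and `llm` is any callable that maps a prompt string to a completion string.

```python
from typing import Callable, List


def run_chain(llm: Callable[[str], str], templates: List[str], initial: str) -> List[str]:
    """Run prompts in sequence, feeding each output into the next prompt.

    Each template must contain exactly one placeholder, `{input}`.
    Returns every intermediate output so each step can be inspected.
    """
    outputs: List[str] = []
    current = initial
    for template in templates:
        current = llm(template.format(input=current))
        outputs.append(current)
    return outputs


def echo_llm(prompt: str) -> str:
    # Hypothetical stand-in for a model call: wraps the prompt so the flow is visible.
    return f"<{prompt}>"
```

Keeping every intermediate output is what makes the chain easy to debug: a bad final result can be traced to the step where the text first went wrong.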
8
Article
Tinybird·1y
Using LLMs to generate user-defined real-time data visualizations
Developers are increasingly using Tinybird to track LLM usage, costs, and performance in AI applications. A new app template called the LLM Performance Tracker allows users to generate real-time data visualizations. The core components include a Tinybird datasource, a Tinybird pipe, a React component, and an AI API route. The backend processes user input to generate chart parameters, while the frontend visualizes the data. This approach emphasizes the importance of performant analytics backends and cautious LLM usage for secure and scalable data visualization.
117
9
Article
Hacker News·1y
HandsOnLLM/Hands-On-Large-Language-Models: Official code repo for the O'Reilly Book
The Hands-On Large Language Models repository provides code examples from the book by Jay Alammar and Maarten Grootendorst. The book, known for its visual educational approach with almost 300 custom-made figures, covers practical tools and concepts needed to use Large Language Models. The authors recommend using Google Colab for running examples, but any cloud provider should work. Additional visual guides related to LLMs are also available. The book is a valuable resource for understanding and working with state-of-the-art language models.
96
1
10
Article
InfoWorld·1y
Vibe code or retire
Vibe coding refers to the use of advanced code generation tools, such as GitHub Copilot, that are powered by large language models (LLMs). The article argues that embracing these tools is essential for staying relevant in software development. While the initial learning curve may be steep, mastering vibe coding can significantly boost productivity. Developers are encouraged to experiment, adapt, and integrate these tools into their workflows to avoid falling behind in the industry.
93
33
11
Article
Machine Learning Mastery·1y
Advanced Techniques to Build Your RAG System
Learn advanced techniques to optimize retrieval-augmented generation (RAG) systems, focusing on improving query prompts, hybrid retrieval methods, and implementing multi-stage retrieval with re-ranking to enhance document retrieval and generation quality.
85
12
Video
Aaron Jack·1y
Learn AI Agents - How they Work & Build Your Own
Understanding AI agents is crucial for staying relevant in tech. AI agents are composable building blocks that augment or replace code functions, allowing for complex workflows and decision-making processes. Learning how to utilize and build these agents can be a career-boosting skill. The post also discusses running large language models locally and using structured courses to get immersed in AI.
81
13
Video
Tech With Tim·1y
Build Anything With a CUSTOM MCP Server - Python Tutorial
Learn to build a custom MCP server in Python and connect it to an AI agent. The tutorial walks through the process step by step, from installing necessary packages like uv to setting up the MCP Python SDK. The guide provides clear instructions on making tools and resources accessible to AI agents, including writing, reading, and summarizing notes with code examples. It also covers debugging and integration with Claude Desktop, and highlights the versatility of MCP servers.
75
14
Article
Docker·1y
Run LLMs Locally with Docker: A Quickstart Guide to Model Runner
Running large language models (LLMs) locally can be challenging. Docker Model Runner, now available in Beta with Docker Desktop 4.40 for macOS on Apple silicon, simplifies this process by enabling easy pulling, running, and experimentation with LLMs on local machines. It features GPU acceleration, an OpenAI-compatible API, and a collection of popular models available as standard OCI artifacts. The guide provides steps to enable Model Runner, use its CLI, and integrate it into applications.
40
1
15
Article
Daily Dose of Data Science | Avi Chawla | Substack·1y
Implement Planning Agentic Pattern from Scratch
Part 11 of the AI Agents crash course introduces the Planning agentic pattern, focusing on implementing it from scratch using Python and a large language model (LLM). The post highlights the importance of structured planning to improve the thoroughness of LLM decisions, covering recent research, the Planning loop pattern, and best practices. It also includes detailed instructions for creating a system prompt and a lightweight agent class, along with examples of both manual and automated Planning loops.
36
16
Article
Meilisearch·1y
Why you shouldn't use vector databases for RAG
This post argues against the use of vector databases for retrieval augmented generation (RAG) systems, highlighting their limitations in query refinement and precision. It suggests a more effective approach using hybrid search that combines full-text and semantic capabilities, mirroring human search behaviors to improve relevance and simplicity.
34
2
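One common way to combine a full-text ranking with a semantic ranking is reciprocal rank fusion (RRF). The post does not say which fusion method it has in mind, so RRF is a swapped-in technique here, and `reciprocal_rank_fusion` and the constant `k = 60` are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1).
    Ties are broken by document ID so the result is deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword search and a vector search.
keyword_hits = ["a", "b", "c"]
semantic_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever still appear lower down instead of being dropped.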
17
Article
Lightbend·1y
Demystifying AI, LLMs, and RAG
Kevin Hoffman simplifies the topics of vectors, embeddings, prompts, prompt engineering, RAG, agents, and agentic AI in a developer-friendly manner.
31
18
Article
DigitalOcean Community·1y
MCP Server in Python — Everything I Wish I’d Known on Day One
This guide provides a step-by-step tutorial on building and integrating an MCP server in Python. The MCP (Model Context Protocol) allows Large Language Models (LLMs) like GPT and Claude to query external data sources and execute real-world actions. By following this guide, developers can set up an MCP server that interacts with applications such as Cursor and Claude Desktop, enabling the AI to perform tasks like querying databases, sending emails, and more.
25
19
Article
Shaaf·1y
TechTalk - Java + LLMs: A hands-on guide to building LLM Apps in Java with Jakarta
LangChain4j is a preferred framework for working with large language models (LLMs) in Java. Recently, Bazlur and the author presented various demos at Jakarta Tech Talk, focusing on practical applications such as prompt engineering, tools, RAG, and Model Context Protocol. The presentation highlighted an increased general understanding of LLMs and included a comprehensive demonstration to engage the audience. Source code and a step-by-step guide are available on GitHub.
22
1
20
Article
InfoWorld·1y
MarkItDown: Microsoft’s open-source tool for Markdown conversion
Microsoft has introduced MarkItDown, an open-source Python utility that converts various file formats into Markdown. The tool is designed to help with fine-tuning large language models (LLMs) and building retrieval-augmented generation (RAG) systems. MarkItDown preserves document structures, supports multi-modal data like images and audio files, and integrates with LLMs for enhanced functionality. Despite some limitations, it addresses key challenges in document processing and offers a modular and extensible architecture for developers.
18
21
Article
portkey·1y
The hidden technical debt in LLM apps
Interest in large language models (LLMs) has surged, leading to hidden technical debt that can harm scalability, maintainability, and cost-efficiency. Key areas of concern in LLM apps include prompt engineering, fragile pipelines, lack of observability and feedback, and cost unpredictability. Effective management strategies include investing in prompt management systems, implementing observability, automating evaluation and feedback loops, abstracting model providers, centralizing cost controls, and enforcing security and compliance. Using LLMOps tools can help mitigate these issues and build sustainable AI products.
16
1
22
Article
Community Picks·1y
Vali-98/ChatterUI: Simple frontend for LLMs built in react-native.
ChatterUI is a native mobile frontend for LLMs that allows you to run models locally or connect to various APIs. It offers features such as chat structuring, character cards, and text-to-speech integration. You can download the latest APK from the releases page and set up development easily with Java SDK, Android SDK, and Node.js.
16
23
Video
Fireship·1y
Redditors shocked to learn they’re arguing with AI bots
Reddit users are upset after researchers from the University of Zurich conducted an unauthorized study using AI bots to infiltrate the Change My View subreddit, revealing these bots to be significantly more persuasive than human users. The study raises ethical concerns due to lack of disclosure, leading to possible legal action by Reddit. The post also discusses AI's impact on scams, like voice cloning, and highlights threats like prompt injection within coding communities.
15
2
24
Article
Community Picks·1y
Up-to-date documentation for LLMs and AI code editors
Generate context with up-to-date documentation for LLMs and AI code editors. Easily copy the latest documentation and code for any library and paste it into tools like Cursor or Claude.
15
1
25
Article
Daily Dose of Data Science | Avi Chawla | Substack·1y
Implement ReAct Agentic Pattern from Scratch
Learn how to implement the ReAct pattern from scratch using Python and a large language model (LLM). This approach combines reasoning and acting to enable intelligent decision-making in AI agents. Discover step-by-step instructions to build a ReAct agent, including system prompts, lightweight agent classes, and manual and automated ReAct loops.
15
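A ReAct loop of the kind described above can be sketched as follows. This is not the post's implementation: the `Action: tool[arg]` and `Final Answer:` text formats, `react_loop`, `scripted_llm`, and the `add` tool are all assumptions chosen for illustration, and the scripted function stands in for a real model.

```python
import re
from typing import Callable, Dict

ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]")
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")


def react_loop(llm: Callable[[str], str], tools: Dict[str, Callable[[str], str]],
               question: str, max_steps: int = 5) -> str:
    """Alternate model reasoning and tool calls until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            raise ValueError("reply contained neither an action nor a final answer")
        name, arg = action.groups()
        # Run the tool and feed the result back as an observation.
        observation = tools[name](arg) if name in tools else f"unknown tool {name}"
        transcript += f"Observation: {observation}\n"
    raise RuntimeError("no final answer within max_steps")


def scripted_llm(transcript: str) -> str:
    # Hypothetical stand-in for a model: act first, then answer from the observation.
    if "Observation:" not in transcript:
        return "Thought: I need to add the numbers.\nAction: add[2, 3]"
    last_observation = transcript.rstrip().rsplit("Observation: ", 1)[1]
    return f"Thought: I have the result.\nFinal Answer: {last_observation}"


TOOLS = {"add": lambda arg: str(sum(int(x) for x in arg.split(",")))}
answer = react_loop(scripted_llm, TOOLS, "What is 2 + 3?")
```

The `max_steps` cap is the main safety valve in this sketch: without it, a model that never emits a final answer would loop indefinitely.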
