Best of LLM — June 2025

1
Article
Daily Dose of Data Science | Avi Chawla | Substack·50w
9 MCP Projects for AI Engineers
A comprehensive collection of 9 Model Control Protocol (MCP) projects designed for AI engineers, covering various applications from local MCP clients and agentic RAG systems to voice agents and synthetic data generators. The projects demonstrate how to integrate MCP with popular tools like Claude Desktop and Cursor IDE, enabling developers to build more sophisticated AI applications with enhanced tool connectivity and context sharing capabilities.
614
12
2
Article
Sebastian Raschka·50w
Coding LLMs from the Ground Up: A Complete Course
Sebastian Raschka shares a comprehensive video course series on building Large Language Models from scratch using Python and PyTorch. The course covers seven key areas: environment setup, text data preprocessing and tokenization, attention mechanisms implementation, LLM architecture coding, pretraining on unlabeled data, classification fine-tuning, and instruction fine-tuning. The content serves as supplementary material to his book 'Build a Large Language Model (From Scratch)' and emphasizes hands-on learning through implementation rather than using pre-built frameworks.
420
5
3
Article
WebCraft·47w
prompts.chat
A directory website that curates and organizes AI prompts for various use cases. The platform serves as a resource for finding pre-written prompts to use with AI language models like ChatGPT and other LLMs.
251
13
4
Article
ByteByteGo·49w
EP167: Top 20 AI Concepts You Should Know
A comprehensive overview of 20 essential AI concepts including machine learning, deep learning, neural networks, NLP, computer vision, and transformers. Also covers the AI application stack for building RAG applications, featuring components like large language models, frameworks, vector databases, data extraction tools, and text embeddings. Additionally includes insights into Shopify's tech stack architecture and job opportunities in AI and software engineering.
204
1
5
Article
SwirlAI·51w
Breaking into AI Engineering in 2025.
A comprehensive roadmap for becoming an AI Engineer in 2025, covering essential skills from Python fundamentals and LLM APIs to advanced topics like AI agents, RAG systems, and observability. The guide emphasizes learning fundamentals while building practical skills, starting with basic LLM integration and progressing through vector databases, prompt engineering, agentic systems, infrastructure deployment, and security considerations. Key recommendations include mastering FastAPI and Pydantic, understanding different LLM types and structured outputs, implementing RAG with proper data preprocessing, and learning agent design patterns like ReAct and task decomposition.
204
2
6
Article
Towards Data Science·51w
How to Design My First AI Agent
A comprehensive guide to designing AI agents covering model selection, tooling choices, and reliability strategies. Explores different LLM options including OpenAI GPT-4, DeepSeek, Claude, and Mistral, each suited for specific use cases. Discusses infrastructure considerations, frameworks like LangGraph and Pydantic-AI, and security aspects. Emphasizes the importance of structured prompting techniques like Chain-of-Thought and ReAct, output validation, and failure handling to build reliable production-ready agents.
188
7
Article
Machine Learning Mastery·47w
Your First Local LLM API Project in Python Step-By-Step
A comprehensive guide for setting up a local large language model API using Python, Ollama, and FastAPI. The tutorial covers downloading and running LLMs locally, creating a REST API endpoint, and testing the setup through a web interface. This approach enables developers to interact with language models without relying on external cloud services, providing complete control over the inference process.
161
1
8
Article
freeCodeCamp·51w
The Open Source LLM Agent Handbook: How to Automate Complex Tasks with LangGraph and CrewAI
LLM agents are proactive AI systems that can break down complex tasks, make decisions, and use tools autonomously, unlike traditional reactive chatbots. The guide demonstrates building agents using open-source frameworks LangGraph and CrewAI to automate daily tasks like email summarization and schedule generation. LangGraph provides graph-based workflows for single agents, while CrewAI enables multi-agent collaboration with specialized roles. The tutorial includes practical code examples for creating an email processing agent that extracts meetings and deadlines, then formats them into organized daily schedules. Both frameworks integrate with OpenAI's models and offer structured approaches to agent development without requiring extensive custom code.
151
9
Article
Supabase·48w
Build a Personalized AI Assistant with Postgres
A comprehensive guide to building a personalized AI assistant using PostgreSQL as the backbone for long-term memory and data management. The system combines LLMs with a scoped database schema, scheduled tasks via pg_cron, vector search using pgvector, and external integrations through Zapier MCP. Key features include three-layer memory architecture (message history, semantic search, structured data), autonomous scheduling capabilities, and secure database access controls. The tutorial covers practical use cases like run tracking, meal planning, and feedback analysis, with complete implementation steps using Supabase, OpenAI, and Telegram. Total monthly operating costs are estimated at around $0.54 for moderate usage.
126
1
10
Article
Tarzzo Tech·51w
What is MCP (Model Context Protocol)?
Model Context Protocol (MCP) is a communication standard that enables AI models to interact with external systems and data sources. It provides a structured way for large language models to access and exchange contextual information, improving their ability to provide relevant and accurate responses by connecting them to real-time data and external services.
123
2
11
Article
LangChain·48w
The rise of "context engineering"
Context engineering is emerging as a critical skill for AI engineers, focusing on building dynamic systems that provide LLMs with the right information, tools, and formatting to accomplish tasks reliably. Unlike traditional prompt engineering, context engineering emphasizes providing complete, structured context rather than clever wording. The approach addresses the primary cause of agent failures: inadequate context rather than model limitations. Key components include dynamic information retrieval, appropriate tool selection, proper formatting, and comprehensive system design. LangGraph and LangSmith are positioned as enabling technologies for implementing effective context engineering practices.
113
12
Article
Hacker News·47w
The New Skill in AI is Not Prompting, It's Context Engineering
Context Engineering emerges as a more comprehensive approach than prompt engineering for building effective AI agents. Rather than focusing solely on crafting perfect prompts, it involves designing dynamic systems that provide LLMs with the right information, tools, and format at the right time. The concept encompasses system prompts, user inputs, conversation history, long-term memory, retrieved information (RAG), available tools, and structured outputs. The key difference between basic and sophisticated AI agents lies not in code complexity but in context quality - successful agents gather comprehensive contextual information before generating responses, while failures often stem from inadequate context rather than model limitations.
110
2
13
Article
Community Picks·48w
jujumilk3/leaked-system-prompts: Collection of leaked system prompts
A GitHub repository collecting leaked system prompts from popular LLM-based services. The project accepts contributions through pull requests with verifiable sources and reproducible prompts, while avoiding sensitive commercial code to prevent DMCA takedowns. The repository serves as a research resource cited in academic papers.
106
1
14
Video
The Coding Gopher·48w
99% of Developers Don't Get LLMs
Large language models work by predicting the next token in a sequence using transformer architecture with self-attention mechanisms. They're trained on massive text datasets to learn patterns, grammar, and relationships between concepts. The transformer processes all tokens simultaneously rather than sequentially, allowing better capture of long-range dependencies. Generation happens through probability distributions over vocabulary, with techniques like temperature and top-k sampling controlling randomness. Models become more capable with scale, exhibiting emergent behaviors not present in smaller versions. Raw models are aligned with human preferences through reinforcement learning with human feedback (RLHF). Despite their fluency, LLMs have significant limitations including hallucination, lack of persistent memory, and sensitivity to input phrasing.
104
10
15
Article
Daily Dose of Data Science | Avi Chawla | Substack·48w
10 MCP, RAG and AI Agents Projects
A curated collection of 10 advanced AI engineering projects covering MCP-powered applications, RAG systems, and AI agents. Projects include video RAG with exact timestamp retrieval, corrective RAG with self-assessment, multi-agent flight booking systems, voice-enabled RAG agents, and local alternatives to ChatGPT's research features. The repository contains 70+ hands-on tutorials focusing on real-world implementations of LLMs, memory-enabled agents, multimodal document processing, and performance optimization techniques like binary quantization for 40x faster RAG systems.
99
16
Article
Docker·51w
Learn how to make an AI chatbot from scratch
Docker Model Runner simplifies AI chatbot development by integrating LLM execution into familiar Docker workflows. The tutorial demonstrates building a production-ready chatbot with React frontend, Go backend, and comprehensive observability using Prometheus, Grafana, and Jaeger. Key benefits include local model execution for privacy and cost control, streaming responses, real-time metrics collection, and simplified deployment through Docker Compose. The architecture treats AI models as first-class services, eliminating complex setup while providing detailed performance insights including tokens per second, memory usage, and response latency.
88
17
Article
WunderGraph·51w
We accidentally built a backend framework for LLMs
WunderGraph accidentally created a backend framework for LLMs while solving API orchestration challenges. Their Cosmo Plugins system uses LLMs to generate proxy code that connects non-GraphQL services to a unified Supergraph, enabling companies to build modular monoliths or microservices without rewriting existing REST APIs. The framework leverages GraphQL Federation to create a single API entry point from multiple services, reducing complexity and API calls while maintaining deployment flexibility.
78
1
18
Article
Syncfusion·49w
Best 5 Open-Source LLMs for Developers: ChatGPT Alternatives in 2025
Five powerful open-source large language models offer cost-effective alternatives to ChatGPT for developers in 2025. Llama 3 delivers GPT-3.5-level performance with commercial licensing freedom, while Mistral AI provides exceptional efficiency with smaller parameter counts. Falcon offers flexible model sizes from 7B to 180B parameters, BLOOM supports 46 natural languages for global applications, and Pythia serves as a research-grade suite for AI interpretability studies. These models enable local deployment, complete customization, and freedom from API restrictions, though they require careful consideration of hardware requirements and deployment strategies.
70
19
Article
Neon·51w
app.build: An Open-Source AI Agent That Builds Full-Stack Apps
app.build is an open-source AI agent that automatically builds and deploys full-stack applications with end-to-end testing and automated deployments. The tool can be started with a simple npx command, creates GitHub repositories, and deploys apps with authentication, databases, and hosting infrastructure. The agent uses a divide-and-conquer approach, breaking app creation into smaller tasks with quality checks at each step to ensure working applications.
61
1
20
Article
Hacker News·50w
Fine-Tuning LLMs is a Huge Waste of Time
Fine-tuning advanced LLMs for knowledge injection is counterproductive because it overwrites existing valuable information stored in densely interconnected neurons. Instead of adding knowledge, fine-tuning risks destroying the carefully built ecosystem of an already trained model. Better alternatives include retrieval-augmented generation (RAG), adapter modules like LoRA, and contextual prompting, which inject new information without damaging the underlying model's knowledge base. These modular approaches preserve the integrity of pre-trained networks while achieving the desired knowledge enhancement goals.
56
2
21
Article
Laravel News·51w
Laravel OpenRouter
A new Laravel package enables easy integration with OpenRouter, a unified API for accessing multiple Large Language Models. The package supports both standard and streaming chat requests, allowing developers to interact with various AI models like Mistral through a single interface. It includes features for real-time streaming responses and can be easily installed via Composer.
54
1
22
Article
Towards Dev·50w
vLLM: A Quick Start
vLLM is an open-source library optimized for high-throughput serving of large language models in production. Its core innovation, PagedAttention, manages memory more efficiently by breaking the key-value cache into fixed-size pages instead of contiguous buffers, similar to virtual memory in operating systems. The tutorial covers installation on macOS M1, serving models via OpenAI-compatible API, using the native Python API, and integrating with LangChain for enhanced tooling capabilities.
52
23
Article
Hacker News·51w
Claude's System Prompt Changes Reveal Anthropic's Priorities
Analysis of Claude 4.0's system prompt reveals how Anthropic uses natural language instructions to program chatbot behavior. Key changes include removal of old hotfixes (now handled in training), encouragement of search functionality, expanded artifact use cases, context optimization for coding, and new cybersecurity guardrails. The 23,000-token system prompt consumes 11% of Claude's context window and demonstrates a user-driven development cycle where observed behaviors are first addressed through prompt modifications, then incorporated into model training.
48
24
Article
LangChain·49w
How and when to build multi-agent systems
Multi-agent systems require careful consideration of when and how to implement them effectively. Context engineering emerges as the most critical challenge, requiring sophisticated strategies to ensure each agent has appropriate context for their tasks. Systems focused on reading tasks (like research) are generally easier to implement than those emphasizing writing tasks, as read actions are more parallelizable and less prone to conflicting outputs. Production reliability requires durable execution, comprehensive debugging tools, and proper evaluation frameworks. Multi-agent architectures work best for breadth-first queries with high parallelization potential and tasks valuable enough to justify increased token costs.
42
25
Article
Simon Willison·49w
Design Patterns for Securing LLM Agents against Prompt Injections
A comprehensive research paper by 11 authors from IBM, Google, Microsoft and other organizations presents six design patterns to mitigate prompt injection attacks in LLM agents. The patterns include Action-Selector, Plan-Then-Execute, LLM Map-Reduce, Dual LLM, Code-Then-Execute, and Context-Minimization approaches. Each pattern trades some agent flexibility for security by constraining actions and preventing untrusted input from triggering arbitrary tasks. The paper includes ten detailed case studies covering practical applications like SQL agents, email assistants, and customer service chatbots, providing threat models and mitigation strategies for each scenario.
35

See all LLM archives