Best of ai-agents: October 2025

  1. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 32w

    A 100% Open-source Alternative to n8n!

    Sim is an open-source drag-and-drop platform for building agentic workflows that runs locally with any LLM. The article demonstrates building a finance assistant connected to Telegram using agents, MCP servers, and APIs. It also covers four RAG indexing strategies: chunk indexing (splitting documents into embedded chunks), sub-chunk indexing (breaking chunks into finer pieces while retrieving larger context), query indexing (generating hypothetical questions for better semantic matching), and summary indexing (using LLM-generated summaries for dense data).
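    The sub-chunk strategy above is the least obvious of the four, so here is a minimal sketch of it in plain Python: small "child" chunks are matched against the query, but the larger "parent" chunk is what gets returned as context. The word-based splitter, chunk sizes, and overlap-count matching are toy stand-ins for a real splitter and embedding similarity, not anything from Sim or the article.

```python
def split_words(text: str, size: int) -> list[str]:
    """Split text into fixed-size word windows (a stand-in for a real splitter)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def build_subchunk_index(doc: str, parent_size: int = 8, child_size: int = 3):
    """Index small child chunks, each pointing back at its larger parent chunk."""
    index = []  # list of (child_chunk, parent_chunk) pairs
    for parent in split_words(doc, parent_size):
        for child in split_words(parent, child_size):
            index.append((child, parent))
    return index

def retrieve(index, query: str) -> str:
    """Toy retrieval: match children on word overlap, return the parent context."""
    qwords = set(query.lower().split())
    best = max(index, key=lambda pair: len(qwords & set(pair[0].lower().split())))
    return best[1]  # the larger parent chunk, not the tiny matched child

doc = ("Sim runs locally with any LLM. "
       "Agents connect to Telegram through MCP servers and APIs. "
       "Summary indexing uses LLM generated summaries for dense data.")
index = build_subchunk_index(doc)
print(retrieve(index, "MCP servers"))
```

    Swapping the overlap heuristic for cosine similarity over embeddings gives the production version of the same idea: fine-grained matching, coarse-grained context.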

  2. Article
    Simon Willison · 30w

    Claude Skills are awesome, maybe a bigger deal than MCP

    Anthropic introduced Claude Skills, a new pattern for extending LLM capabilities using Markdown files with instructions, scripts, and resources. Skills are token-efficient (loading only when needed), depend on code execution environments, and are simpler to create than MCP implementations. The system enables general computer automation beyond just coding tasks, with skills shareable as single files or folders. Skills work with other models too, potentially sparking wider adoption than the Model Context Protocol.

  3. Article
    LangChain · 31w

    Not Another Workflow Builder

    LangChain's CEO explains why they haven't built a visual workflow builder despite frequent requests. The argument centers on workflow builders being squeezed from two directions: simple use cases are better served by no-code agents (prompt + tools), while complex scenarios require code-based workflows like LangGraph. As AI models improve, the middle ground for visual workflow builders shrinks—agents handle more complexity reliably, and code generation lowers the barrier for building sophisticated workflows. The focus should shift to making no-code agents more reliable and improving code generation for LLM-powered systems.

  4. Article
    Hacker News · 32w

    The RAG Obituary: Killed by Agents, Buried by Context Windows

    RAG (Retrieval-Augmented Generation) architectures are becoming obsolete as LLM context windows expand dramatically from 4K to 2M+ tokens. The author argues that agentic search systems using simple tools like grep and filesystem navigation outperform complex RAG pipelines involving chunking, embeddings, hybrid search, and reranking. Drawing from experience building financial research platforms, they demonstrate how agents can navigate complete documents and follow cross-references naturally, eliminating the infrastructure burden and accuracy problems inherent in fragment-based retrieval. The shift from context scarcity to abundance fundamentally changes how AI systems should process information.
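    The "simple tools" claim can be made concrete with a sketch: give the agent a grep-like tool over whole documents plus a file reader, and let it chase cross-references itself instead of querying a vector index. The corpus and tool names below are invented for illustration; the article's financial platform is only the inspiration.

```python
def grep(corpus: dict[str, str], pattern: str) -> list[tuple[str, int, str]]:
    """Return (doc_name, line_number, line) for every line matching pattern."""
    hits = []
    for name, text in corpus.items():
        for i, line in enumerate(text.splitlines(), start=1):
            if pattern.lower() in line.lower():
                hits.append((name, i, line.strip()))
    return hits

def read_file(corpus: dict[str, str], name: str) -> str:
    """Let the agent pull a complete document into its (large) context window."""
    return corpus[name]

corpus = {
    "10-K.txt": "Revenue grew 12%.\nSee note 7 for debt covenants.\n",
    "notes.txt": "Note 7: covenants require a leverage ratio below 3.0.\n",
}

# An agent loop would chain these calls: grep for a term, then follow the
# cross-reference it finds into the full source document.
hits = grep(corpus, "note 7")
print(hits)
print(read_file(corpus, "notes.txt"))
```

    No chunking, embedding, or reranking infrastructure is involved; the trade-off is that every hop consumes context tokens, which is exactly why the argument hinges on 2M+ token windows.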

  5. Article
    HelixML · 28w

    Technical Deep Dive on Streaming AI Agent Desktop Sandboxes: When Gaming Protocols Meet Multi-User Access

    Helix adapted Moonlight, a gaming streaming protocol designed for single-player sessions, to stream GPU-accelerated desktop environments for AI agents to multiple users simultaneously. The team initially used "apps mode" with a workaround where their API pretended to be a client to start containers, but are migrating to "lobbies mode" which natively supports multi-user access to shared sessions. The solution enables low-latency (50-100ms) streaming of full Linux desktops with AI agents working in real IDEs and browsers, though challenges remain with input scaling and video corruption across different client resolutions.

  6. Video
    Fireship · 31w

    OpenAI just made your entire tech stack obsolete...

    OpenAI announced several new features at their dev day, including ChatGPT apps platform with 800 million weekly active users, Agent Kit for building AI workflows without extensive coding, GitHub Actions integration for automated code reviews, and API access to GPT-5 Pro and Sora 2. The updates include smaller, cost-effective models for voice and image generation, positioning ChatGPT as a potential operating system for app interactions.

  7. Article
    LangChain · 28w

    Introducing DeepAgents CLI

    DeepAgents CLI is a new command-line tool for building AI agents with persistent memory that can code, research, and execute tasks. The tool supports file operations, shell command execution with approval, web search, API requests, and cross-session memory retention. Agents store knowledge in local memory files and follow a memory-first protocol to recall information across sessions. Users can create multiple specialized agents for different projects, with the default using Anthropic's Claude Sonnet 4 model.

  8. Article
    The Register · 28w

    AI layoffs to backfire: half quietly rehired at lower pay

    Forrester predicts that half of AI-attributed layoffs will be reversed, with 55% of employers regretting workforce cuts made in anticipation of AI automation. Many companies are laying off workers based on future AI promises rather than actual automation capabilities, with research showing AI agents achieving only 58% success rates on single-step tasks. Organizations like Klarna and Duolingo have already walked back aggressive AI strategies, while companies like Salesforce and Amazon continue cutting thousands of jobs citing AI efficiency gains.

  9. Article
    LangChain · 28w

    Introducing LangSmith’s No Code Agent Builder

    LangSmith introduces Agent Builder, a no-code platform that enables non-developers to create AI agents without writing code. Unlike visual workflow builders, it focuses on agent-based decision-making through four core components: prompts, tools (via MCP integration), triggers, and subagents. The platform simplifies prompt creation through guided conversations and includes built-in memory that learns from corrections over time. Built on the deepagents package and informed by LangChain and LangGraph development, it targets internal productivity use cases like email assistants, chat automation, and Salesforce integrations.

  10. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 32w

    [Hands-on] Build a Real-time Knowledge Base for Agents

    Learn to build a real-time, bi-temporal knowledge base using Airweave, an open-source framework that enables AI agents to search across applications, databases, and document stores. The setup runs locally in Docker and integrates with tools like Notion, Google Drive, and SQL databases, exposing functionality through APIs and MCP servers.
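    "Bi-temporal" is the load-bearing term here: every fact carries two timelines, when it was true in the world (valid time) and when the system learned it (transaction time). A toy store illustrating the distinction; Airweave's actual schema is not assumed, and all names below are made up.

```python
from dataclasses import dataclass
import datetime as dt

@dataclass
class Fact:
    subject: str
    value: str
    valid_from: dt.date   # when this became true in the real world
    recorded_at: dt.date  # when the knowledge base ingested it

facts = [
    Fact("acme.ceo", "Alice", dt.date(2020, 1, 1), dt.date(2024, 5, 1)),
    Fact("acme.ceo", "Bob",   dt.date(2024, 3, 1), dt.date(2024, 6, 1)),
]

def as_of(facts, subject, valid: dt.date, known: dt.date):
    """What did we believe on `known` about who held the role on `valid`?"""
    candidates = [f for f in facts
                  if f.subject == subject
                  and f.valid_from <= valid
                  and f.recorded_at <= known]
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.valid_from).value

# In May 2024 the KB still believed Alice was CEO in April 2024,
# because Bob's appointment had not been ingested yet.
print(as_of(facts, "acme.ceo", dt.date(2024, 4, 1), dt.date(2024, 5, 15)))
print(as_of(facts, "acme.ceo", dt.date(2024, 4, 1), dt.date(2024, 7, 1)))
```

    For agents this matters because it lets them answer both "what is true now" and "what did we think was true when we made that decision" from the same store.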

  11. Article
    LangChain · 29w

    LangChain and LangGraph Agent Frameworks Reach v1.0 Milestones

    LangChain and LangGraph have reached their 1.0 stable releases, marking a commitment to no breaking changes until 2.0. LangChain 1.0 introduces the create_agent abstraction with middleware support for customization, standardized content blocks across providers, and a streamlined package focused on core agent functionality. LangGraph 1.0 provides production-ready features including durable state, built-in persistence, and human-in-the-loop patterns for complex workflows. Both frameworks are backward compatible, with LangChain built on top of LangGraph's runtime, allowing developers to start with high-level abstractions and drop down to lower-level control when needed.

  12. Article
    Databricks · 30w

    Build High-Quality, Domain-Specific Agents at 95% Lower Cost

    Databricks introduces token-based pricing for MLflow GenAI evaluation, reducing costs by up to 95% compared to fixed-block pricing. The platform now supports custom judges using any LLM provider (OpenAI, Anthropic, or fine-tuned models) and open-sources production-tested evaluation prompts validated across finance, healthcare, and technical documentation domains. Teams can evaluate agents across metrics like correctness, faithfulness, relevance, and safety while maintaining full control over evaluation logic and scaling to production workloads.

  13. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 31w

    AI Agent Deployment Strategies

    Four deployment patterns for AI agents are explored: batch deployment for scheduled bulk processing with high throughput, stream deployment for continuous real-time data pipeline processing, real-time deployment via APIs for instant user interactions, and edge deployment on user devices for privacy and offline functionality. Each pattern serves different performance requirements, with batch optimizing throughput, stream enabling continuous monitoring, real-time providing sub-second responses, and edge ensuring data privacy without server dependencies.
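    The batch-versus-real-time split above boils down to when the agent runs: immediately per request, or on a schedule over a queue. A toy dispatcher making that trade-off explicit; `run_agent` is a stand-in for any actual agent invocation, and no framework is assumed.

```python
import queue

def run_agent(task: str) -> str:
    """Stand-in for an actual agent invocation."""
    return f"done:{task}"

def realtime(task: str) -> str:
    """Real-time pattern: one request in via an API, one response out, low latency."""
    return run_agent(task)

def batch(tasks: queue.Queue) -> list[str]:
    """Batch pattern: drain a queue on a schedule, optimizing throughput over latency."""
    results = []
    while not tasks.empty():
        results.append(run_agent(tasks.get()))
    return results

q = queue.Queue()
for t in ["summarize-report", "tag-tickets", "refresh-index"]:
    q.put(t)

print(realtime("answer-user"))  # served immediately
print(batch(q))                 # served in bulk later
```

    Stream deployment would replace the drain-when-called loop with a consumer that never stops, and edge deployment would ship `run_agent` itself to the device.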

  14. Article
    JetBrains · 30w

    JetBrains Is Sunsetting CodeCanvas

    JetBrains announced the discontinuation of CodeCanvas, their cloud development environment platform launched in 2024. The company concluded that traditional CDEs are becoming obsolete in the AI-enabled development landscape and cannot meet evolving developer needs. CodeCanvas will stop accepting new licenses immediately, provide support until January 2026, and shut down completely by March 2026. JetBrains is pivoting to develop a new AI-first, cloud-native product focused on autonomous AI agents, with CDEs playing a supporting role in the new architecture.

  15. Article
    Daily Dose of Data Science | Avi Chawla | Substack · 28w

    Another MCP Moment by Anthropic?

    Anthropic released Claude Skills, a feature designed to solve agent memory persistence by acting as standard operating procedures for AI agents. The announcement includes comparisons to Model Context Protocol (MCP), projects, and subagents, with practical examples of building custom skills. The piece also promotes a comprehensive MCP crash course series covering fundamentals, architecture, integration with frameworks like LangGraph and LlamaIndex, and real-world implementations.

  16. Article
    Hacker News · 29w

    Context engineering is sleeping on the humble hyperlink

    Context engineering for LLMs faces a key challenge: providing all necessary context without overwhelming the model. While techniques like RAG and subagents help, hyperlinks offer an underutilized solution. By implementing a simple read_resources tool that accepts URIs, agents can dynamically load relevant context on-demand, similar to how humans navigate documentation. This approach is token-efficient, flexible, and enables just-in-time context loading. The Model Context Protocol (MCP) Resources provides the infrastructure needed, though most clients don't yet expose resources to models directly. The Firebase MCP Server demonstrates this pattern in practice with linked workflows for project initialization.
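    The mechanism is simple enough to sketch: the model's initial context holds only short link stubs, and a `read_resources` tool resolves URIs to full content when the model asks. The `doc://` scheme and resource table below are invented for illustration; in practice MCP Resources would supply the real lookup.

```python
RESOURCES = {
    "doc://setup": "Run `firebase init`, then pick Hosting and Firestore.",
    "doc://deploy": "Run `firebase deploy` after building your app.",
}

def read_resources(uris: list[str]) -> dict[str, str]:
    """Resolve each URI to its content; unknown URIs report an error string."""
    return {uri: RESOURCES.get(uri, "<error: unknown resource>") for uri in uris}

# The agent's initial context only needs the links, not the documents:
context_stub = "Setup guide: doc://setup\nDeploy guide: doc://deploy"

# When the model decides it needs detail, it issues a tool call like:
loaded = read_resources(["doc://setup"])
print(loaded["doc://setup"])
```

    The token economics follow directly: two link stubs cost a handful of tokens up front, and the full documents are paid for only on the turns that actually need them.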

  17. Article
    Buildkite · 31w

    Make it work, make it better: What's new with the Buildkite MCP server

    Buildkite released major updates to their Model Context Protocol (MCP) server, introducing a fully managed remote server with OAuth authentication, dramatically improved log fetching and parsing using Apache Parquet format with smart caching, and specialized tooling for monitoring running builds. The updates address key pain points discovered after the initial release: handling massive build logs that overwhelm AI agents, eliminating local server maintenance overhead, and bridging the gap between API capabilities and practical AI agent usage. The server now enables developers to bootstrap pipelines, debug failures more efficiently, and integrate CI/CD workflows with AI tools like Claude and VS Code with minimal configuration.

  18. Article
    LangChain · 28w

    Doubling down on DeepAgents

    LangChain announces version 0.2 of DeepAgents, a Python package for building autonomous agents capable of complex, long-running tasks. The major update introduces pluggable backends that allow agents to use various filesystems (LangGraph state, local filesystem, S3) instead of just virtual storage. Additional improvements include automatic handling of large tool results, conversation history compression, and tool call repair. DeepAgents is positioned as an "agent harness" that sits above LangChain (agent framework) and LangGraph (agent runtime), providing built-in features like planning tools and filesystem access for developers building autonomous agents.
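    The pluggable-backend idea is worth a sketch: agent code targets one small filesystem interface, and the storage behind it (in-memory state, local disk, S3, ...) is swapped without touching the agent. The method names and `FileBackend` protocol below are invented for illustration, not the deepagents API.

```python
from pathlib import Path
from typing import Protocol
import tempfile

class FileBackend(Protocol):
    def read(self, path: str) -> str: ...
    def write(self, path: str, content: str) -> None: ...

class StateBackend:
    """Virtual filesystem held in agent state (lives only for the session)."""
    def __init__(self) -> None:
        self.files: dict[str, str] = {}
    def read(self, path: str) -> str:
        return self.files[path]
    def write(self, path: str, content: str) -> None:
        self.files[path] = content

class LocalBackend:
    """Real files on disk, so artifacts persist across sessions."""
    def __init__(self, root: str) -> None:
        self.root = Path(root)
    def read(self, path: str) -> str:
        return (self.root / path).read_text()
    def write(self, path: str, content: str) -> None:
        (self.root / path).write_text(content)

def agent_step(fs: FileBackend) -> str:
    """The agent code is identical regardless of which backend it receives."""
    fs.write("plan.md", "1. research\n2. write\n3. review")
    return fs.read("plan.md")

print(agent_step(StateBackend()))
with tempfile.TemporaryDirectory() as d:
    print(agent_step(LocalBackend(d)))
```

    An S3 backend would be a third class with the same two methods, which is the whole appeal: long-running agents get durable scratch space without any change to the agent logic.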

  19. Article
    LangChain · 29w

    Agent Frameworks, Runtimes, and Harnesses - oh my!

    LangChain's team proposes a taxonomy for AI agent tooling: frameworks (like LangChain) provide abstractions and mental models for building with LLMs; runtimes (like LangGraph) handle production infrastructure concerns such as durable execution, streaming, and persistence; harnesses (like DeepAgents) are batteries-included solutions with default prompts and opinionated tooling. The distinctions help developers understand when to use each layer, though the boundaries remain somewhat fluid as the space evolves.