Best of Daily Dose of Data Science | Avi Chawla | Substack — October 2025

1
Daily Dose of Data Science | Avi Chawla | Substack·28w
A 100% Open-source Alternative to n8n!
Sim is an open-source drag-and-drop platform for building agentic workflows that runs locally with any LLM. The article demonstrates building a finance assistant connected to Telegram using agents, MCP servers, and APIs. It also covers four RAG indexing strategies: chunk indexing (splitting documents into embedded chunks), sub-chunk indexing (breaking chunks into finer pieces while retrieving larger context), query indexing (generating hypothetical questions for better semantic matching), and summary indexing (using LLM-generated summaries for dense data).
156
7
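As a rough illustration of the sub-chunk indexing strategy summarized in this entry, the sketch below matches a query against fine-grained sentences but returns the larger parent chunk. It is a minimal sketch, not the article's code: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and all function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding' (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_sub_chunk_index(chunks):
    """Index each sentence (sub-chunk) with a pointer to its parent chunk."""
    index = []
    for chunk_id, chunk in enumerate(chunks):
        for sentence in re.split(r"(?<=[.!?])\s+", chunk.strip()):
            if sentence:
                index.append((embed(sentence), chunk_id))
    return index


def retrieve(query, index, chunks):
    """Match on the best sub-chunk, then return its whole parent chunk."""
    q = embed(query)
    _, best_chunk_id = max(index, key=lambda entry: cosine(q, entry[0]))
    return chunks[best_chunk_id]


# Hypothetical sample data: two chunks of three sentences each.
chunks = [
    "Revenue grew in the third quarter. Operating costs were flat. "
    "Headcount rose slightly.",
    "The new data center opened in March. It runs on renewable energy. "
    "Cooling uses recycled water.",
]
index = build_sub_chunk_index(chunks)
result = retrieve("What energy does the data center run on?", index, chunks)
print(result)
```

The design point is that the similarity search runs over small units, which tend to match a specific question more precisely, while the text handed to the LLM is the full chunk, which carries more surrounding context.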
2
Daily Dose of Data Science | Avi Chawla | Substack·24w
Every LangGraph User We know is Making the Same Mistake!
The supervisor pattern in LangGraph has a fundamental limitation: it routes each query to only one specialized agent at a time, so it breaks down when a user asks a multi-topic question. An alternative approach, dynamic guideline matching (implemented in the open-source Parlant framework), loads every relevant guideline into context at once, enabling a coherent response across topics. LangGraph excels at workflow automation, while Parlant is designed for free-form conversations, and the two can be used together.
43
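The difference between single-agent routing and multi-guideline matching described in this entry can be sketched in a few lines. This is only a toy under stated assumptions: it uses neither the LangGraph nor the Parlant API, the keyword overlap is a stand-in for whatever LLM-based matching a real system would use, and the guideline names and texts are hypothetical.

```python
import re

# Hypothetical guidelines: each has trigger keywords and an instruction.
GUIDELINES = {
    "billing": {
        "keywords": {"refund", "invoice", "charge"},
        "instruction": "Explain the refund policy.",
    },
    "shipping": {
        "keywords": {"delivery", "shipping", "tracking"},
        "instruction": "Share the tracking steps.",
    },
    "account": {
        "keywords": {"password", "login"},
        "instruction": "Walk through a password reset.",
    },
}


def score(query, keywords):
    """Count how many trigger keywords appear in the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return len(words & keywords)


def route_to_single_agent(query):
    """Supervisor-style routing: exactly one topic wins."""
    return max(GUIDELINES, key=lambda name: score(query, GUIDELINES[name]["keywords"]))


def match_guidelines(query):
    """Guideline-style matching: every relevant topic is loaded into context."""
    return [
        name
        for name, guideline in GUIDELINES.items()
        if score(query, guideline["keywords"]) > 0
    ]


query = "I want a refund, and where is my delivery tracking number?"
print(route_to_single_agent(query))  # only one topic is handled
print(match_guidelines(query))       # both relevant topics are kept
```

With the two-topic query above, the router keeps only the highest-scoring topic and drops the refund request, while the matcher keeps both, which is the failure mode and the remedy the summary describes.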
3
Daily Dose of Data Science | Avi Chawla | Substack·28w
[Hands-on] Build a Real-time Knowledge Base for Agents
Learn to build a real-time, bi-temporal knowledge base using Airweave, an open-source framework that enables AI agents to search across applications, databases, and document stores. The setup runs locally in Docker and integrates with tools like Notion, Google Drive, and SQL databases, exposing functionality through APIs and MCP servers.
34
2
4
Daily Dose of Data Science | Avi Chawla | Substack·25w
ARQ: A New Structured Reasoning Approach for LLMs
Researchers introduced Attentive Reasoning Queries (ARQs), a structured reasoning approach that aims to reduce LLM hallucinations by guiding models through explicit, domain-specific questions encoded in JSON schemas. Unlike free-form techniques such as Chain-of-Thought (CoT), ARQs force the LLM to follow controlled reasoning steps, with a reported success rate of 90.2% compared to 86.1% for CoT. The approach is implemented in Parlant, an open-source framework for building instruction-following agents, where ARQs are integrated into guideline proposers, tool callers, and message generators to maintain alignment throughout multi-turn conversations.
22
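To make the idea of reasoning queries encoded in a JSON structure more concrete, here is a minimal sketch. The field names, the questions, and the helper functions are all hypothetical and are not taken from the ARQ paper or from Parlant; the model reply is a hard-coded string rather than a real LLM call.

```python
import json

# Hypothetical reasoning queries, answered in this fixed order.
ARQ_TEMPLATE = {
    "active_guideline": "Which guideline applies to the customer's last message?",
    "facts_from_context": "Which facts in the conversation are relevant?",
    "allowed_action": "What action does the guideline permit?",
    "final_response": "What is the reply to the customer?",
}


def build_prompt(conversation):
    """Embed the queries as a JSON object the model is asked to fill in."""
    schema = json.dumps(
        {key: f"<{question}>" for key, question in ARQ_TEMPLATE.items()}, indent=2
    )
    return (
        f"Conversation:\n{conversation}\n\n"
        f"Answer by filling in every field of this JSON object, in order:\n{schema}"
    )


def parse_response(raw):
    """Reject replies that skip any reasoning query."""
    data = json.loads(raw)
    missing = [key for key in ARQ_TEMPLATE if not str(data.get(key, "")).strip()]
    if missing:
        raise ValueError(f"unanswered reasoning queries: {missing}")
    return data


prompt = build_prompt("Customer: Can I return shoes I bought 10 days ago?")

# Stand-in for a model reply (assumption: a real system would call an LLM here).
fake_reply = json.dumps(
    {
        "active_guideline": "Returns are accepted within 30 days.",
        "facts_from_context": "The purchase was made 10 days ago.",
        "allowed_action": "Offer a return.",
        "final_response": "Yes, you can return the shoes.",
    }
)
answer = parse_response(fake_reply)
print(answer["final_response"])
```

Because the final response is the last field, the model has to state the applicable guideline and the supporting facts before it answers, which is the kind of controlled ordering the summary contrasts with free-form Chain-of-Thought.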
5
Daily Dose of Data Science | Avi Chawla | Substack·27w
AI Agent Deployment Strategies
Four deployment patterns for AI agents are explored: batch deployment for scheduled bulk processing with high throughput, stream deployment for continuous real-time data pipeline processing, real-time deployment via APIs for instant user interactions, and edge deployment on user devices for privacy and offline functionality. Each pattern serves different performance requirements, with batch optimizing throughput, stream enabling continuous monitoring, real-time providing sub-second responses, and edge ensuring data privacy without server dependencies.
19
