Best of Daily Dose of Data Science | Avi Chawla | Substack · August 2025

  1. 8 RAG Architectures for AI Engineers (Article · 35w)

    Eight different RAG (Retrieval-Augmented Generation) architectures are explained with their specific use cases: Simple Vector RAG for basic semantic matching, Multi-modal RAG for cross-modal retrieval, HyDE for handling dissimilar queries, Self-RAG for validation against trusted sources, Graph RAG for structured relationships, Hybrid RAG combining vector and graph approaches, Adaptive RAG for dynamic query handling, and Agentic RAG for complex workflows with AI agents.
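Of the eight, Simple Vector RAG is the easiest to sketch: embed the query, rank stored chunks by similarity, and pass the top hits to the model as context. The toy corpus and hand-made "embeddings" below are invented for illustration; a real system would use an embedding model and a vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy corpus: (text, embedding) pairs. Real embeddings come from a model.
corpus = [
    ("Paris is the capital of France.", [0.9, 0.1, 0.0]),
    ("Mitochondria are the powerhouse of the cell.", [0.0, 0.8, 0.2]),
    ("The Eiffel Tower is in Paris.", [0.8, 0.2, 0.1]),
]

def retrieve(query_emb, k=2):
    """Return the k chunks most similar to the query embedding."""
    ranked = sorted(corpus, key=lambda item: cosine(query_emb, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Retrieved chunks become the context prepended to the LLM prompt.
context = retrieve([1.0, 0.0, 0.0])  # embedding of a Paris-related question
prompt = "Answer using this context:\n" + "\n".join(context)
```

The other architectures vary this loop: HyDE embeds a hypothetical answer instead of the raw query, Graph RAG swaps the similarity search for graph traversal, and Agentic RAG lets an agent decide when and what to retrieve.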

  2. JSON prompting for LLMs (Article · 34w)

    JSON prompting improves LLM outputs by providing a structured format in place of vague natural-language instructions. The technique leverages AI models' training on structured data from APIs and web applications, resulting in more consistent and predictable responses. JSON prompts eliminate ambiguity, enable output control, and create reusable templates for scalable AI workflows. While JSON is effective, alternatives such as XML (for Claude) and Markdown also work well; the key is structure, not the specific syntax.
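A reusable JSON prompt can be as simple as a dict serialized with the standard library; the task name, fields, and rules below are invented for illustration, not a schema from the article.

```python
import json

def make_prompt(article_text: str) -> str:
    """Build a JSON prompt: explicit fields leave the model no room
    to guess the output shape, unlike free-form instructions."""
    instruction = {
        "task": "summarize_article",
        "input": article_text,
        "output_format": {
            "summary": "string, max 2 sentences",
            "key_points": "list of strings",
            "sentiment": "one of: positive, neutral, negative",
        },
        "rules": ["respond with valid JSON only", "no extra commentary"],
    }
    return json.dumps(instruction, indent=2)

prompt = make_prompt("LLMs trained on structured data follow JSON instructions well.")
```

Because the template is plain data, swapping `"task"` or the `"output_format"` fields turns the same scaffold into a different, equally unambiguous prompt.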

  3. The Full MLOps/LLMOps Blueprint (Article · 37w)

    MLOps extends beyond model training to encompass the entire production ML system lifecycle, including data pipelines, deployment, monitoring, and infrastructure management. The crash course covers foundational concepts like why MLOps matters, differences from traditional DevOps, and system-level concerns, followed by hands-on implementation of the complete ML workflow from training to API deployment. MLOps applies software engineering and DevOps practices to manage the complex infrastructure surrounding ML code, ensuring reliable delivery of ML-driven features at scale.

  4. Build a 100% local MCP Server and Client (Article · 33w)

    Learn to build a completely local Model Context Protocol (MCP) server and client setup for enterprise-grade AI applications. The tutorial covers creating MCP servers using FastMCP, building secure local clients with mcp-use library, and integrating with Stagehand for browser automation. This approach keeps data on your own servers while enabling AI agents to perform tasks like web scraping and form filling through natural language commands.
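FastMCP's core idea, registering plain Python functions as named tools an agent can invoke, can be sketched without the library. The decorator and dispatcher below are a plain-Python stand-in that mimics the pattern only; they are not the real FastMCP API, and both tool bodies are placeholders.

```python
# FastMCP-style tool registry, sketched in plain Python: a decorator
# registers functions by name, and the client dispatches calls to them.
TOOLS = {}

def tool(func):
    """Register a function as a callable tool, keyed by its name."""
    TOOLS[func.__name__] = func
    return func

@tool
def scrape_page(url: str) -> str:
    # Placeholder: a real server would fetch and return page content.
    return f"<content scraped from {url}>"

@tool
def fill_form(field: str, value: str) -> str:
    # Placeholder: a real server would drive a browser (e.g. via Stagehand).
    return f"filled {field} with {value}"

def call_tool(name: str, **kwargs):
    """What an MCP client does: invoke a named tool with arguments."""
    return TOOLS[name](**kwargs)
```

The local-first benefit follows from the same structure: because the server process owns the tool implementations, the data they touch never leaves your machine; only tool names and arguments cross the protocol boundary.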

  5. Make RAG systems 32x Memory Efficient! (Article · 36w)

    Binary quantization can make RAG systems 32x more memory efficient by converting float32 embeddings to binary vectors. The technique involves ingesting documents, generating binary embeddings, storing them in a vector database like Milvus, and using Hamming distance for retrieval. A complete implementation demonstrates querying 36M+ vectors in under 30ms using LlamaIndex, Milvus, and Groq for inference, with deployment via Beam Cloud.
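The 32x figure follows directly from the quantization: a float32 coordinate occupies 32 bits, while its sign occupies one. Below is a minimal pure-Python sketch of the binarize-and-Hamming step; Milvus and LlamaIndex handle this at scale, and the 8-dimensional embeddings here are toy values.

```python
def binarize(vec):
    """Quantize a float vector: keep only the sign of each dimension.
    32 bits per float32 dimension become 1 bit, a 32x memory reduction."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits  # one integer packing the whole vector

def hamming(a: int, b: int) -> int:
    """Count differing bits: the retrieval distance for binary vectors."""
    return bin(a ^ b).count("1")

# Toy 8-dim embeddings (invented values).
doc = binarize([0.3, -0.1, 0.7, 0.2, -0.5, 0.9, -0.2, 0.4])
query = binarize([0.2, -0.3, 0.6, -0.1, -0.4, 0.8, -0.1, 0.5])
dist = hamming(doc, query)  # low distance = likely relevant
```

Hamming distance is also why retrieval stays fast at the scale the article reports: it reduces to an XOR and a popcount, operations CPUs execute in a few cycles per 64 bits.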