Best of Daily Dose of Data Science | Avi Chawla | Substack · June 2025

  1.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 44w

    9 MCP Projects for AI Engineers

    A collection of 9 Model Context Protocol (MCP) projects for AI engineers, ranging from local MCP clients and agentic RAG systems to voice agents and synthetic data generators. The projects show how to integrate MCP with popular tools such as Claude Desktop and Cursor IDE, so developers can build AI applications with better tool connectivity and context sharing.

  2.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 45w

    48 Most Popular Open ML Datasets

    A compilation of 48 widely used open machine learning datasets organized by domain: computer vision (ImageNet, COCO), natural language processing (SQuAD, GLUE), recommendation systems (MovieLens, the new Yambda-5B), tabular data (UCI datasets, Titanic), reinforcement learning (OpenAI Gym), and multimodal learning (LAION-5B, VQA). Each dataset is briefly described with its primary use case and key characteristics, making the list a reference for researchers and practitioners choosing datasets for their ML projects.

  3.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 43w

    10 MCP, RAG and AI Agents Projects

    A curated collection of 10 advanced AI engineering projects covering MCP-powered applications, RAG systems, and AI agents. Projects include video RAG with exact timestamp retrieval, corrective RAG with self-assessment, multi-agent flight booking systems, voice-enabled RAG agents, and local alternatives to ChatGPT's research features. The repository contains 70+ hands-on tutorials focusing on real-world implementations of LLMs, memory-enabled agents, multimodal document processing, and performance optimization techniques like binary quantization for 40x faster RAG systems.

  4.
    Article
    Daily Dose of Data Science | Avi Chawla | Substack · 45w

    An MCP-powered Voice Agent

    A technical demonstration of building a voice agent using Model Context Protocol (MCP) that can query databases and perform web searches. The system uses AssemblyAI for speech-to-text, Firecrawl for web search, Supabase as the database, LiveKit for orchestration, and Qwen3 as the LLM. The agent transcribes user speech, determines whether to query the database or search the web, and responds via text-to-speech.
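
The routing step in the voice agent summary above (decide between a database query and a web search, then answer) can be sketched roughly as follows. This is only an illustration, not the article's implementation: the function names, the keyword list, and the keyword-based decision are all assumptions made for the example. In the described system the LLM makes the decision, and the placeholder functions stand in for the real Supabase and Firecrawl calls, whose APIs are not shown here.

```python
from typing import Callable, Dict


def query_database(question: str) -> str:
    """Hypothetical placeholder for a database lookup (Supabase in the article)."""
    return f"[database] answer to: {question}"


def search_web(question: str) -> str:
    """Hypothetical placeholder for a web search (Firecrawl in the article)."""
    return f"[web] results for: {question}"


# Tool registry: maps a tool name to the callable that handles the question.
TOOLS: Dict[str, Callable[[str], str]] = {
    "database": query_database,
    "web": search_web,
}

# Assumed keywords that suggest the answer lives in the database.
# A real agent would let the LLM choose the tool instead.
DATABASE_HINTS = ("order", "customer", "account", "invoice")


def choose_tool(transcript: str) -> str:
    """Pick a tool name from the transcribed speech using simple keyword matching."""
    text = transcript.lower()
    return "database" if any(hint in text for hint in DATABASE_HINTS) else "web"


def handle_utterance(transcript: str) -> str:
    """Route one transcribed utterance to a tool and return its text reply."""
    return TOOLS[choose_tool(transcript)](transcript)


if __name__ == "__main__":
    # The transcript would come from speech-to-text; the reply would go to text-to-speech.
    print(handle_utterance("What is the status of my order?"))
    print(handle_utterance("Latest news on MCP"))
```

Keeping the tools in a registry means a new tool can be added without touching the routing function, which is one reason a design like this pairs naturally with a protocol that exposes tools by name.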