Best of MediumJuly 2025

  1. 1
    Article
    Avatar of medium_jsMedium·44w

    The Open Source Project That Became an Essential Library for Modern AI Engineering

    A GitHub repository collecting system prompts from AI tools has grown from 12,000 to 70,000 stars, becoming a collaborative library for understanding AI behavior. System prompts are configuration files that define AI model behavior, personality, and ethical boundaries before user interaction. The project provides transparency into how popular AI tools like Cursor work, but raises dual-use concerns as the same information could help both developers build better AI and malicious actors bypass safety features. The author advocates for transparency over security through obscurity, believing an informed community is the best defense. Future plans include better organization, quality control, and expanded security resources.

  2. 2
    Article
    Avatar of medium_jsMedium·45w

    How to get Kimi-K2 Free API?

    Moonshot AI released Kimi K2, a 1 trillion parameter open source model that outperforms Claude 4 Sonnet, GPT 4.1, and DeepSeek V3. While the model requires significant GPU resources to run locally, developers can access it for free through OpenRouter's unified API platform. The guide provides step-by-step instructions to obtain a free API key and includes sample Python code for making requests to the Kimi K2 model through OpenRouter's endpoint.

  3. 3
    Article
    Avatar of medium_jsMedium·47w

    Why PDF Extraction Still Feels LikeHack

    PDF extraction remains challenging because the format was designed for print fidelity, not machine readability. Created in 1991 to solve cross-platform document consistency, PDFs treat content as positioned text boxes rather than structured data. Modern AI tools now require complex multi-layer processing (layout analysis, OCR, vision models) to extract meaningful information from PDFs. While Tagged PDF and other standards attempt to add structure, adoption remains limited. The solution involves choosing semantic formats for new content and supporting open standards that preserve both visual fidelity and machine readability.

  4. 4
    Article
    Avatar of medium_jsMedium·44w

    The problem with Claude Code and Cursor: The AI Coding "Death Spiral"

    AI coding assistants like Claude Code and Cursor often create a "death spiral" where fixing one error leads to multiple new problems. This happens because LLMs are pattern-matching machines that confidently generate plausible but incorrect solutions. The author proposes teaching AI assistants better principles by encoding timeless software wisdom (like Rich Hickey's "Simple Made Easy") into specific rules. By creating a simple-mindset.md file with actionable guidelines, developers can transform their AI's behavior from generating complex, intertwined solutions to building simple, maintainable code that actually works.

  5. 5
    Article
    Avatar of medium_jsMedium·45w

    SmolLM3 : The best small LLM for everything

    SmolLM3 is a 3-billion parameter language model from Hugging Face that outperforms larger models through extensive training on 11.2 trillion tokens. Key features include extended thinking mode for step-by-step reasoning, native 64k token context length (extendable to 128k), multilingual support for six languages, and built-in tool calling capabilities. The model excels in benchmarks for math, reasoning, and programming tasks while being deployable on edge devices and single-GPU setups through various frameworks like transformers, vLLM, and llama.cpp.

  6. 6
    Article
    Avatar of medium_jsMedium·44w

    Real-Time Server-Sent Events in ASP.NET Core and .NET 10

    Server-Sent Events (SSE) in .NET 10 provide a lightweight alternative to SignalR for one-way real-time communication from server to client. The new TypedResults.ServerSentEvents API enables streaming data over HTTP with built-in reconnection support and Last-Event-ID handling. SSE works over plain HTTP, requires minimal setup, and is ideal for live feeds, notifications, and progress tracking where bidirectional communication isn't needed. The implementation includes creating async enumerable streams, handling reconnections, and consuming events via JavaScript's EventSource API.

  7. 7
    Article
    Avatar of medium_jsMedium·44w

    Flutter’s Dirty Little Secret: How to Cut Your Build Time in Half (2025 Guide)

    Five practical techniques to significantly reduce Flutter build times without requiring architectural changes. The guide covers using dart-define flags for faster debug builds, enabling parallel dependency downloads, optimizing Gradle settings, explicitly listing assets to avoid bundling unnecessary files, and using DevTools to identify build bottlenecks. These optimizations can cut build times from over 3 minutes to under 90 seconds and make Hot Reload nearly instant.

  8. 8
    Article
    Avatar of medium_jsMedium·48w

    Using Gemma for Flutter Apps

    Gemma 3N enables on-device AI capabilities in Flutter apps through the flutter_gemma package, offering offline functionality, enhanced privacy, and no server costs. The tutorial demonstrates building an offline menu translator that can process both text and images locally on mobile devices, covering model downloading from Hugging Face, chat instance creation, and real-time response generation without internet connectivity.

  9. 9
    Article
    Avatar of medium_jsMedium·46w

    Chat with your documents tool — RAG (vector DBs + cosine sim.) & Claude API implementation

    A detailed implementation of a RAG system for a law firm that processes 1TB of legal documents using vector embeddings, FAISS indexing, and Claude API. The system chunks documents, creates embeddings with a trilingual MiniLM model, performs cosine similarity search, and includes citation verification to prevent hallucinations. Key features include OCR processing, privacy-focused local deployment, sub-20ms query response times, and costs around $0.02 per query.