Best of ByteByteGo March 2026

  1. Article
    ByteByteGo · 5w

    How Stripe’s Minions Ship 1,300 PRs a Week

    Stripe ships over 1,300 fully automated pull requests per week using internal coding agents called Minions. The agents run unattended: they spin up isolated cloud machines in under ten seconds, read documentation, write code, run linters, and submit PRs ready for review. The system rests on four foundational layers: isolated devbox environments built for human engineers long before LLMs existed; hybrid 'blueprint' orchestration that mixes deterministic steps with agentic loops; curated context delivery via scoped rule files and a centralized MCP tool server called Toolshed; and fast feedback loops capped at two CI rounds to avoid diminishing returns. The key insight is that strong developer infrastructure (test suites, isolated environments, fast feedback), not model selection, is the prerequisite for effective coding agents.
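The hybrid orchestration described above can be illustrated with a minimal sketch. It is not Stripe's implementation: the step names, the `agent_step` and `run_ci` callables, and the hand-off to a human after the cap is reached are all assumptions made for illustration. Only the mix of fixed steps with an agentic loop and the two-round CI cap come from the summary.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_CI_ROUNDS = 2  # cap mentioned in the summary; everything else is hypothetical


@dataclass
class RunResult:
    status: str                      # "pr_opened" or "needs_human"
    ci_rounds: int                   # number of CI rounds actually used
    log: List[str] = field(default_factory=list)


def run_blueprint(
    agent_step: Callable[[str], str],   # hypothetical: prompt in, patch out
    run_ci: Callable[[str], bool],      # hypothetical: True if CI passes
    task: str,
) -> RunResult:
    # Deterministic steps: always run, no model involved (names are made up).
    log = ["provision_devbox", "load_scoped_rules"]

    # Agentic step: the model produces a first patch.
    patch = agent_step(task)
    log.append("agent_write_code")

    # Deterministic step again.
    log.append("run_linters")

    # Bounded feedback loop: at most MAX_CI_ROUNDS CI runs.
    for round_no in range(1, MAX_CI_ROUNDS + 1):
        log.append(f"ci_round_{round_no}")
        if run_ci(patch):
            log.append("open_pr")
            return RunResult("pr_opened", round_no, log)
        if round_no < MAX_CI_ROUNDS:
            # Give the agent one chance to react to the failure.
            patch = agent_step(f"{task}\nfix CI failure in: {patch}")
            log.append("agent_fix")

    # Assumption: once the cap is hit, the run stops and a person takes over.
    log.append("hand_off_to_human")
    return RunResult("needs_human", MAX_CI_ROUNDS, log)
```

Calling `run_blueprint` with a `run_ci` that always fails stops after exactly two CI runs, which is the point of the cap: the loop cannot spend unbounded compute chasing a failing build.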

  2. Article
    ByteByteGo · 4w

    EP207: Top 12 GitHub AI Repositories

    A curated list of 12 popular GitHub AI repositories ranked by stars, including Ollama, LangChain, Dify, Open WebUI, DeepSeek-V3, Claude Code, CrewAI, and others. Also covers where different test types fit in a testing strategy (unit, integration, E2E), how SSO works step by step using SAML/OIDC, how LLMs orchestrate multi-agent deep research workflows, and six common password attack techniques.

  3. Article
    ByteByteGo · 4w

    How Anthropic’s Claude Thinks

    Anthropic's interpretability team built tools to trace Claude's actual internal computations, revealing a significant gap between what Claude says it does and what actually happens. Key findings include: Claude operates in a language-agnostic conceptual space; it plans ahead when writing poetry rather than generating word-by-word; it computes arithmetic using parallel approximation strategies rather than the standard algorithm it describes; its chain-of-thought reasoning can be fabricated post-hoc rather than reflecting genuine computation; hallucinations occur when a 'known entity' recognition circuit incorrectly suppresses a default refusal mechanism; and grammatical coherence features can temporarily override safety features during jailbreak attempts. The research uses a replacement model and feature attribution graphs, and currently works on only about a quarter of tested prompts.

  4. Article
    ByteByteGo · 3w

    EP208: Load Balancer vs API Gateway

    A system design newsletter covering several backend topics: the difference between load balancers and API gateways (and how they complement each other in production), an explanation of Model Context Protocol (MCP) by Anthropic, a comparison of REST vs gRPC across data format, API style, streaming, type safety, and browser support, a breakdown of session-based vs JWT-based authentication tradeoffs, and a cheat sheet of the most commonly used Linux commands by category.

  5. Article
    ByteByteGo · 4w

    How Netflix Live Streams to 100 Million Devices in 60 Seconds

    Netflix's Live Origin is a custom-built server bridging cloud live streaming pipelines and the Open Connect CDN. Key architectural decisions include dual redundant regional pipelines for fault tolerance, predictable 2-second segment templates, and intelligent segment selection that picks the best candidate from either pipeline. To optimize CDN performance, Netflix extended nginx with millisecond-grain caching, implemented request holding at the live edge, and used custom HTTP headers to propagate streaming metadata to millions of devices. Storage evolved from AWS S3 to a Cassandra-backed key-value store with EVCache write-through caching, achieving a median latency of 25 ms and supporting 200+ Gbps of read throughput. The system uses strict publishing isolation, priority-based rate limiting (live edge over DVR), and hierarchical metadata caching to handle 404 storms and traffic surges. During the 2024 Tyson vs. Paul fight, Netflix handled 65 million concurrent streams.
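As a rough illustration of picking the best candidate from either pipeline, the sketch below chooses between two copies of the same segment. It is not Netflix's algorithm: the `Candidate` fields, the `defects` quality signal, and the tie-breaking rule are assumptions introduced for the example. The summary only states that two redundant pipelines exist and that the origin selects the better segment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    pipeline: str        # hypothetical label, e.g. "region-a" or "region-b"
    segment_index: int   # position of the segment in the stream
    present: bool        # False if the pipeline never published this segment
    defects: int         # hypothetical quality signal; lower is better


def select_segment(
    a: Optional[Candidate], b: Optional[Candidate]
) -> Optional[Candidate]:
    """Return the better of two candidates for the same segment, or None."""
    # Drop candidates that are missing or were never published.
    usable = [c for c in (a, b) if c is not None and c.present]
    if not usable:
        return None
    # Fewest defects wins; min() keeps the first on a tie, so `a` is preferred.
    return min(usable, key=lambda c: c.defects)
```

With this shape, a segment that is missing or damaged in one pipeline is served from the other, so a single-pipeline fault does not reach viewers as long as the second copy is usable.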