Best of ByteByteGo: February 2026

  1. How OpenAI Scaled to 800 Million Users With Postgres (ByteByteGo · 9w)

    OpenAI scaled PostgreSQL to handle millions of queries per second for 800 million ChatGPT users on a single-primary architecture with read replicas. Their approach rested on three pillars: minimizing primary database load through read offloading and write optimization, tuning queries and connections with PgBouncer for connection pooling, and preventing cascading failures with cache locking and rate limiting. To work within PostgreSQL's MVCC constraints, they migrated write-heavy workloads to separate sharded systems and enforced strict schema-change rules. The core deployment achieves five-nines availability with low double-digit-millisecond p99 latency through systematic optimization rather than sharding the primary database.
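The cache-locking idea above can be sketched in a few lines: when a hot key expires, only the first requester recomputes it while concurrent requesters wait, so the primary database sees one query instead of a stampede. This is a minimal illustration of the general technique, not OpenAI's actual code; `Cache` and `load_from_db` are hypothetical names.

```python
import threading

class Cache:
    """Per-key locking so a cache miss triggers exactly one backend load."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()

    def _key_lock(self, key):
        # Create/fetch the lock for this key under a short global guard.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get_or_load(self, key, loader):
        if key in self._data:                  # fast path: cache hit
            return self._data[key]
        with self._key_lock(key):              # one loader per key
            if key not in self._data:          # re-check after acquiring
                self._data[key] = loader(key)  # single DB round trip
            return self._data[key]

calls = []

def load_from_db(key):
    calls.append(key)  # stand-in for an expensive primary-DB query
    return f"value-for-{key}"

cache = Cache()
threads = [
    threading.Thread(target=cache.get_or_load, args=("user:42", load_from_db))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(calls))  # prints 1: the loader ran once despite 8 concurrent requests
```

The double-check inside the lock is what turns eight concurrent misses into one database query; without it, every waiter would reload the key after acquiring the lock.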

  2. How Uber Reinvented Access Control for Microservices (ByteByteGo · 8w)

    Uber built Charter, an attribute-based access control (ABAC) system that handles authorization across thousands of microservices with microsecond-level evaluation latency. Traditional role-based policies couldn't express complex conditions like region-matching or ownership relationships. Charter distributes policies to services, which evaluate them locally using an embedded authfx library. Conditions are written in Google's Common Expression Language (CEL) and evaluated against attributes fetched at runtime from typed attribute stores (actor, resource, action, environment). A real-world example shows how a single ABAC policy replaced thousands of individual Kafka topic policies by dynamically checking ownership data from Uber's uOwn service. Since rollout, 70 Uber services have adopted attribute-based policies, gaining fine-grained, dynamic, and scalable authorization without code deployments.
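The shape of attribute-based evaluation can be shown in a short sketch. Charter's conditions are written in CEL and attributes come from typed stores at runtime; here a plain Python predicate over actor/resource/environment dicts stands in for a CEL expression, and all names (`Policy`, `is_allowed`, the attribute keys) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Attrs = Dict[str, str]

@dataclass
class Policy:
    action: str
    # Predicate over (actor, resource, environment) attributes,
    # playing the role a CEL expression plays in Charter.
    condition: Callable[[Attrs, Attrs, Attrs], bool]

def is_allowed(policy: Policy, action: str,
               actor: Attrs, resource: Attrs, env: Attrs) -> bool:
    if action != policy.action:
        return False
    return policy.condition(actor, resource, env)

# One ownership-based policy replaces per-topic rules: allow "consume"
# when the actor's team owns the topic (in Charter, ownership data
# would be fetched from a service like uOwn) and regions match.
policy = Policy(
    action="consume",
    condition=lambda actor, res, env: (
        actor["team"] == res["owner_team"]
        and actor["region"] == res["region"]
    ),
)

actor = {"team": "payments", "region": "us-east"}
topic = {"owner_team": "payments", "region": "us-east"}
print(is_allowed(policy, "consume", actor, topic, {}))  # True
print(is_allowed(policy, "publish", actor, topic, {}))  # False
```

Because the condition reads live attributes rather than enumerating principals, one policy covers every topic and team without redeploying code, which is the scaling property the article highlights.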

  3. How LinkedIn Built a Next-Gen Service Discovery for 1000s of Services (ByteByteGo · 10w)

    LinkedIn replaced its decade-old Zookeeper-based service discovery system with a next-generation architecture using Kafka for writes and gRPC/xDS for reads. The new system handles hundreds of thousands of service instances with 10x better median latency (P50 < 1s vs 10s) and 6x better P99 latency. Key improvements include horizontal scalability through Go-based Observer components, eventual consistency over strong consistency, multi-language support via xDS protocol, and cross-fabric capabilities. The migration used a dual-mode strategy where applications ran both systems simultaneously, with automated dependency analysis to safely transition thousands of services without downtime.
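The write/read split above can be sketched minimally: registrations are appended to a log (standing in for Kafka) and an observer applies them to a local, read-optimized view, so reads never touch the write path and the view is eventually consistent. Class names here are illustrative, not LinkedIn's actual components.

```python
from collections import defaultdict

class WriteLog:
    """Append-only event log, standing in for a Kafka topic."""
    def __init__(self):
        self.events = []

    def append(self, event):
        self.events.append(event)

class Observer:
    """Applies log events to a local registry serving reads."""
    def __init__(self, log):
        self.log = log
        self.offset = 0
        self.registry = defaultdict(set)  # service -> live instances

    def sync(self):
        # Consume only the events appended since the last sync.
        for op, service, instance in self.log.events[self.offset:]:
            if op == "up":
                self.registry[service].add(instance)
            else:
                self.registry[service].discard(instance)
        self.offset = len(self.log.events)

    def resolve(self, service):
        # Local read: no traffic to the write path.
        return sorted(self.registry[service])

log = WriteLog()
observer = Observer(log)
log.append(("up", "profile-service", "host-a:7000"))
log.append(("up", "profile-service", "host-b:7000"))
observer.sync()
log.append(("down", "profile-service", "host-a:7000"))
print(observer.resolve("profile-service"))  # still shows host-a: stale until next sync
observer.sync()
print(observer.resolve("profile-service"))  # ['host-b:7000']
```

The stale read between the `down` event and the next `sync` is the eventual-consistency trade-off the article describes: accepting a briefly out-of-date view in exchange for horizontally scalable, low-latency reads.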

  4. How Grab Built a Vision LLM to Scan Images (ByteByteGo · 11w)

    Grab built a custom 1B-parameter Vision LLM to extract information from Southeast Asian documents for eKYC verification. Starting with Qwen2-VL 2B, they progressed from LoRA fine-tuning to full parameter training, then built a lightweight model from scratch combining Qwen2-VL's vision encoder with Qwen2.5's compact language decoder. The four-stage training process included projector alignment, vision enhancement, language-specific visual training on synthetic OCR data, and task-specific fine-tuning. The final model achieved comparable accuracy to the 2B version while delivering 48-56% faster latency, addressing challenges with non-Latin scripts and diverse document formats across the region.
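The four-stage curriculum can be expressed as a small training plan: each stage names which model components are unfrozen and what data it consumes. The stage names follow the summary, but the specific trainable-module sets and the `TrainingStage` structure are assumptions for illustration, not Grab's actual configuration.

```python
from dataclasses import dataclass

ALL_MODULES = frozenset({"vision_encoder", "projector", "decoder"})

@dataclass(frozen=True)
class TrainingStage:
    name: str
    trainable: frozenset  # modules unfrozen in this stage (assumed)
    data: str             # kind of training data used

stages = [
    TrainingStage("projector_alignment",
                  frozenset({"projector"}),
                  "image-text pairs to align vision and language spaces"),
    TrainingStage("vision_enhancement",
                  frozenset({"vision_encoder", "projector"}),
                  "general visual data"),
    TrainingStage("language_specific_visual_training",
                  frozenset({"projector", "decoder"}),
                  "synthetic OCR data for non-Latin regional scripts"),
    TrainingStage("task_specific_finetuning",
                  ALL_MODULES,
                  "document-extraction tasks for eKYC"),
]

def frozen_modules(stage: TrainingStage) -> list:
    # Everything not listed as trainable stays frozen in that stage.
    return sorted(ALL_MODULES - stage.trainable)

for s in stages:
    print(f"{s.name}: frozen={frozen_modules(s)}")
```

Structuring the plan as data makes the curriculum auditable: early stages touch only the cheap projector to align the pretrained encoder and decoder, and later stages progressively unfreeze more of the model.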