Best of Distributed Systems — August 2025

1
Article
System Design Codex·39w
A Quick Guide to RabbitMQ
RabbitMQ is a message broker that enables asynchronous communication between applications by acting as a middleman. Messages flow from producers to exchanges, which route them to queues based on bindings and routing keys, where consumers can process them. The system supports different exchange types (direct, topic, fanout) for various routing patterns, providing decoupling, scalability, and reliability for distributed systems.
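The routing step described in this summary can be sketched without a broker. This is a toy in-memory model, not the RabbitMQ client API: the `route` and `topic_matches` functions and the queue names are hypothetical and exist only for illustration. It assumes the standard AMQP topic rules, where `*` matches exactly one dot-separated word and `#` matches zero or more.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more words."""
    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


def route(exchange_type: str, bindings, routing_key: str):
    """Return the queues that would receive a message.

    bindings is a list of (queue_name, binding_key) pairs.
    """
    if exchange_type == "fanout":
        # Fanout ignores the routing key and copies to every bound queue.
        return [q for q, _ in bindings]
    if exchange_type == "direct":
        # Direct requires an exact match between binding key and routing key.
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unsupported exchange type: {exchange_type}")


bindings = [("orders", "order.created"), ("audit", "order.#"), ("eu", "*.created")]
print(route("direct", bindings, "order.created"))      # ['orders']
print(route("topic", bindings, "order.created"))       # ['orders', 'audit', 'eu']
print(route("topic", bindings, "order.item.shipped"))  # ['audit']
```

A real broker also handles acknowledgements, durability, and delivery to consumers, none of which this sketch models.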
2
Article
Javarevisited·38w
How ByteByteGo Makes System Design Easy for Visual Learners?
ByteByteGo excels at teaching system design through visual-first learning, using clear diagrams and step-by-step breakdowns to explain complex concepts like caching, load balancing, and distributed systems. The platform offers consistent visual materials across books, videos, and courses, featuring real-world case studies of systems like YouTube, Twitter, and Uber. Visual learners benefit from the diagram-driven approach that transforms abstract concepts into clear, memorable mental maps, making it particularly effective for technical interview preparation.
3
Video
Awesome·40w
The complete system designs crash course
A comprehensive overview of system design fundamentals covering web protocols, load balancing, databases, caching strategies, messaging systems, scalability patterns, security measures, and fault tolerance. Explains key concepts like CAP theorem, microservices communication, horizontal vs vertical scaling, and practical applications through examples like URL shorteners and file storage systems.
4
Article
Programming Digest·39w
How to Keep Services Running During Failures?
Graceful degradation is a design principle that allows systems to maintain essential functionality during failures by operating at reduced capacity rather than crashing completely. Key strategies include rate limiting to control traffic, request coalescing to reduce duplicate queries, load shedding to prioritize critical requests, retry mechanisms with jitter to prevent thundering herd problems, circuit breakers to isolate failing services, request timeouts to prevent resource exhaustion, and comprehensive monitoring with alerting for proactive issue detection.
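One of the strategies listed here, retries with jitter, might look roughly like the sketch below. It is not taken from the article: the function name, parameters, and default values are assumptions, and it uses the "full jitter" variant, where each wait is a random fraction of an exponentially growing cap.

```python
import random
import time


def retry_with_jitter(fn, max_attempts=5, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn until it succeeds or max_attempts is reached.

    Between attempts, wait a random time in [0, cap), where cap doubles
    each attempt up to max_delay. The randomness spreads retries out so
    that many clients do not all retry at the same instant.
    sleep and rng are injectable to make the function easy to test.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Assumption: every exception is retryable. Real code should
            # retry only on errors known to be transient.
            if attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * cap)
```

A caller would wrap the flaky operation, for example `retry_with_jitter(lambda: client.get(url))`, where `client` and `url` are placeholders for whatever is being called.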
5
Article
Javarevisited·39w
ByteByteGo Books vs. ByteByteGo Course — Which Should You Buy?
A detailed comparison between ByteByteGo's system design books and online course for interview preparation. The books offer affordable, structured text-based learning with 15+ problems and clear explanations, while the course provides 100+ visual lessons with animations, regular updates, and interactive content. Books are better for budget-conscious beginners who prefer reading, while the course suits visual learners seeking comprehensive, up-to-date preparation. The author recommends starting with the books for a foundation and upgrading to the course for advanced practice, especially during the current 50% discount offer.
6
Article
Lobsters·40w
Release v2.0.0 · syncthing/syncthing
Syncthing v2.0.0 introduces major architectural changes including a switch from LevelDB to SQLite database backend, structured logging with per-package log levels, automatic deletion of old database entries after six months, modernized command line options, removal of rolling hash detection, multiple device connections by default, and improved conflict resolution for deleted files. The release includes numerous bug fixes and performance improvements but drops prebuilt binaries for several platforms due to SQLite cross-compilation complexities.
7
Article
Distributed Computing Musings·39w
Thundering Herd Problem: Preventing the Stampede
The thundering herd problem occurs when multiple concurrent requests cause cache misses for the same key, leading all requests to hit the database simultaneously and defeating the purpose of caching. The post demonstrates this issue with a Spring Boot application using Redis cache and Postgres database, then presents two solutions: distributed locking using Redis and in-process synchronization with CompletableFuture and ConcurrentHashMap. The distributed lock approach works across multiple nodes but requires additional network calls, while in-process synchronization is faster but only coordinates requests within a single node.
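The in-process approach can be sketched as follows. The post itself uses Java with CompletableFuture and ConcurrentHashMap; this is a rough Python analogue using `threading` and `concurrent.futures.Future`, and the `SingleFlight` class and its `do` method are hypothetical names, not code from the post. Like the original technique, it only coordinates callers inside one process.

```python
import threading
from concurrent.futures import Future


class SingleFlight:
    """Coalesce concurrent loads of the same key into one loader call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> Future shared by all current callers

    def do(self, key, loader):
        with self._lock:
            fut = self._inflight.get(key)
            leader = fut is None
            if leader:
                # First caller for this key becomes the leader.
                fut = Future()
                self._inflight[key] = fut
        if leader:
            try:
                # Only the leader hits the slow backend (e.g. the database).
                fut.set_result(loader())
            except Exception as exc:
                # Followers see the same failure as the leader.
                fut.set_exception(exc)
            finally:
                with self._lock:
                    del self._inflight[key]
        # Followers block here until the leader publishes a result.
        return fut.result()
```

On a cache miss, each request would call something like `flights.do(cache_key, load_from_db)`, so a burst of misses for one key results in a single database query per node.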
8
Article
swizec.com·40w
Quick tips for distributed event-based systems
Moving side effects out of request endpoints and into an event-based system improves reliability and performance. Key practices include using thin task packets with minimal data, implementing idempotent task functions that can handle retries, maintaining backup scheduling mechanisms, and proper logging throughout the process. Queue systems provide persistent storage and retry capabilities, reducing error rates from compounding side-effect failures.
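The thin-packet and idempotency practices might be combined as in the sketch below. Everything here is illustrative rather than taken from the post: the `Order` type, the in-memory `ORDERS` store, the `SENT` list standing in for an email service, and the `send_receipt_task` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Order:
    id: int
    email: str
    receipt_sent: bool = False


# Hypothetical stand-ins for a database table and an email service.
ORDERS = {42: Order(id=42, email="ada@example.com")}
SENT = []


def send_receipt_task(packet: dict) -> str:
    """Idempotent task: running it twice for one order sends one receipt."""
    # Thin packet: it carries only the ID, so the task always reads
    # current state instead of trusting stale data copied into the queue.
    order = ORDERS[packet["order_id"]]
    if order.receipt_sent:
        return "skipped"
    SENT.append(order.email)
    order.receipt_sent = True
    return "sent"


print(send_receipt_task({"order_id": 42}))  # sent
print(send_receipt_task({"order_id": 42}))  # skipped (e.g. a queue retry)
```

In a real system the check and the update would need to be atomic, for example through a conditional database update, so that two workers retrying at the same moment cannot both send.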
9
Article
SingleStore·40w
Scaling PostgreSQL vs. SingleStore: Overcoming Performance & Complexity Limits
PostgreSQL faces significant scaling challenges as applications grow, requiring complex workarounds like read replicas, manual partitioning, and additional tools for caching and analytics. These solutions create operational overhead and architectural complexity. The database's row-based storage struggles with analytical workloads, while lock contention limits write throughput. AI workloads with vector embeddings further strain the system. SingleStore is presented as an alternative that combines transactional and analytical capabilities in a unified, horizontally scalable SQL engine with built-in vector search and hybrid storage formats.
