Best of Distributed Systems — September 2024

1
Article
System Design Codex·2y
Message Queues & Message Brokers
Message queues enable asynchronous communication between producers and consumers by storing messages in FIFO order. They are useful for processing background tasks, distributing tasks, email services, buffering, and payment retries. Message brokers manage these queues and provide additional features like message routing, transformation, protocol translation, and support for the publish-subscribe pattern. This allows seamless integration and communication between different services in applications, such as in an e-commerce platform where order, inventory, shipping, and notification services interact efficiently.
242
4
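The FIFO hand-off described in the summary above can be sketched with Python's standard library. This is only an in-process illustration, not a message broker: the `run_pipeline` function and the `done:` prefix are hypothetical names chosen for the example, and a real deployment would place a broker between separate services.

```python
import queue
import threading


def run_pipeline(tasks):
    """Send tasks through a FIFO queue to a single consumer thread."""
    q = queue.Queue()   # queue.Queue delivers items in FIFO order
    processed = []
    stop = object()     # sentinel that tells the consumer to exit

    def consumer():
        while True:
            item = q.get()
            if item is stop:
                break
            processed.append(f"done:{item}")

    worker = threading.Thread(target=consumer)
    worker.start()
    for task in tasks:  # the producer enqueues without waiting for processing
        q.put(task)
    q.put(stop)
    worker.join()
    return processed
```

With one consumer, results come back in the order the producer enqueued them; with several competing consumers, dequeue order is still FIFO but completion order is no longer guaranteed.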
2
Article
Community Picks·2y
Things I Wished More Developers Knew About Databases
Working effectively with databases requires understanding concepts such as ACID properties, network reliability, transaction isolation levels, optimistic locking, and the impact of auto-incrementing IDs. Appreciating the complexity of database design helps in anticipating issues like dirty reads, data loss, and write skew. Accounting for latency, sharding, and clock skew up front can prevent operational surprises. It is also important to evaluate performance requirements per transaction, avoid nested transactions, and understand how the query planner works. Careful planning of online migrations keeps downtime low and the migrated data accurate.
194
4
3
Article
System Design Codex·2y
3 Interview Questions on Event-Driven Patterns
System Design interviews often test candidates on event-driven patterns such as Competing Consumer, Retry Messages, and Async Request-Response. The Competing Consumer Pattern allows multiple instances to process messages concurrently. The Retry Messages Pattern manages transient errors by retrying failed transactions with mechanisms like exponential backoff. The Async Request-Response Pattern involves using correlation IDs to relate requests and responses across multiple instances, ensuring asynchronous communication between services.
182
1
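The retry pattern mentioned in the summary above can be sketched as follows. This is a minimal illustration under stated assumptions: `TransientError`, `backoff_delay`, and `retry_with_backoff` are hypothetical names, the default delays are arbitrary, and the random jitter is a common addition that the summary does not mention.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def backoff_delay(attempt, base_delay, max_delay):
    """Upper bound for the wait after `attempt` failures: base * 2^attempt, capped."""
    return min(max_delay, base_delay * 2 ** attempt)


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            bound = backoff_delay(attempt, base_delay, max_delay)
            # Random jitter spreads out retries from many clients.
            sleep(random.uniform(0, bound))
```

The `sleep` parameter exists only so callers (and tests) can substitute a stub instead of actually waiting.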
4
Article
Tech World With Milan·2y
What are the main Cloud Design Patterns?
Cloud Design Patterns offer practical solutions to the problems behind the well-known fallacies of distributed computing, such as assuming that the network is reliable, that latency is zero, and that the network is secure. These patterns are essential for building dependable, scalable, and secure cloud systems. Key groups of patterns include data management, design and implementation, messaging, security, and reliability. Learn about implementing these patterns to improve system performance and resilience, as well as the various load-balancing options available in Azure for enhanced system availability and performance.
134
1
5
Video
YouTube·2y
When to Use Kafka or RabbitMQ | System Design
Kafka and RabbitMQ serve different purposes in distributed systems. Kafka is designed for high-throughput stream processing, fanning out messages to multiple consumers, and handling uniform, short processing tasks. RabbitMQ is a traditional message queue system better suited for complex routing, long-running tasks, and handling sporadic data flow with acknowledgments for message processing. Choose Kafka for scenarios requiring high-speed, real-time data distribution and RabbitMQ for more controlled message queuing and processing.
102
1
6
Article
DEV·2y
How to implement a Distributed Lock using Redis
Running multiple instances of an application can create issues with concurrent database writes, potentially leading to inconsistent states. Distributed locking, particularly using Redis, addresses this by ensuring only one instance can perform a critical operation at a time. The Redlock algorithm implements distributed locks across multiple independent Redis instances, so the lock remains usable even if some instances fail, provided a majority of them stays reachable.
98
1
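The majority-based locking idea in the summary above can be sketched without a Redis server. Everything here is an assumption made for illustration: `FakeRedisNode` is a hypothetical in-memory stand-in whose methods mimic `SET key token NX PX ttl` and a compare-and-delete script, and the sketch omits the clock-drift allowance and retry logic of the full Redlock algorithm. It is not the linked article's implementation.

```python
import time
import uuid


class FakeRedisNode:
    """Hypothetical in-memory stand-in for one Redis instance."""

    def __init__(self, up=True):
        self.up = up
        self.data = {}  # key -> (owner token, expiry timestamp)

    def set_if_absent(self, key, token, ttl, now):
        # Mimics `SET key token NX PX ttl`: succeed only if the key is free.
        if not self.up:
            raise ConnectionError("node down")
        current = self.data.get(key)
        if current is not None and current[1] > now:
            return False
        self.data[key] = (token, now + ttl)
        return True

    def delete_if_owner(self, key, token):
        # Mimics a compare-and-delete script: only the owner may release.
        if not self.up:
            raise ConnectionError("node down")
        current = self.data.get(key)
        if current is not None and current[0] == token:
            del self.data[key]


def release(nodes, key, token):
    """Best-effort release on every node, ignoring unreachable ones."""
    for node in nodes:
        try:
            node.delete_if_owner(key, token)
        except ConnectionError:
            pass


def acquire(nodes, key, ttl=10.0, clock=time.monotonic):
    """Return an owner token if a majority of nodes grant the lock, else None."""
    token = uuid.uuid4().hex  # unique value identifying this lock holder
    start = clock()
    granted = 0
    for node in nodes:
        try:
            if node.set_if_absent(key, token, ttl, start):
                granted += 1
        except ConnectionError:
            pass  # a failed node simply does not count toward the majority
    elapsed = clock() - start
    if granted >= len(nodes) // 2 + 1 and elapsed < ttl:
        return token
    release(nodes, key, token)  # undo partial acquisition
    return None
```

The random token matters: releasing by key alone could delete a lock that has since expired and been taken by another instance.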
7
Article
ByteByteGo·2y
How Uber Scaled Cassandra for Tens of Millions of Queries Per Second?
Uber's Cassandra database service supports tens of millions of queries per second, petabytes of data, and tens of thousands of nodes, providing high reliability and low latency for mission-critical workloads. The dedicated Cassandra team manages daily operations, implements new features, and ensures 99.99% availability. Uber's custom framework and service discovery mechanisms enable efficient lifecycle management and real-time node discovery. Key challenges included unreliable node replacement, lightweight transaction errors, and data inconsistency issues, which were addressed through various engineering improvements.
71
8
Article
Quastor Daily·2y
Introduction to Chaos Engineering
Chaos Engineering involves applying the scientific method to distributed systems to discover potential failure modes by intentionally introducing disruptions. This practice helps increase confidence in system reliability. Key principles include running experiments in production, minimizing the blast radius, and automating experiments. Chaos Engineering differs from Fault Injection Testing by uncovering unknown unknowns rather than testing specific conditions. Examples from companies like Facebook, LinkedIn, Audible, Twitch, and Target illustrate varied implementations and benefits.
64
1
9
Article
Trendyol Tech·2y
Optimizing Kafka Performance Through Data Compression
Data compression in Kafka improves system efficiency and performance by reducing message size, which lowers network and storage needs and reduces disk I/O. The study benchmarks algorithms including Gzip, Zstd, Lz4, and Snappy, highlighting their trade-offs in compression ratio, speed, and resource consumption. Zstd at level 3 was found to offer the best balance between compression efficiency and resource usage. Implementing the right compression strategy can significantly optimize Kafka's handling of large datasets, reduce costs, and maintain high performance under heavy loads.
34
10
Article
Hacker News·2y
ATProto for distributed systems engineers
AT Protocol aims to revolutionize social networking by decentralizing backend systems, allowing for shared state and user accounts across applications. It leverages eventual consistency and stream processing architectures to achieve high scalability. Key components include NoSQL data repositories, cryptographically signed records, and public APIs for external service integration. The protocol combines high-scale systems practices with peer-to-peer technology to create a highly scalable, open network.
21
11
Article
Hacker News·2y
ergo-services/ergo: An actor-based Framework with network transparency for creating event-driven architecture in Golang. Inspired by Erlang. Zero dependencies.
The Ergo Framework brings Erlang-inspired ideas and design patterns to Golang, offering a robust solution for creating scalable, distributed, and fault-tolerant systems. Key features include the actor model for isolated actor interaction through message passing, network transparency for location-independent actor interaction, and observability for streamlined service discovery and route management. The framework also provides ready-to-use components for ease of development, built-in support for clustered systems, and robust fault tolerance with supervisor trees. New tools and a separate Erlang network stack module enhance its capabilities, backed by comprehensive documentation and examples.
20
