Best of Distributed Systems - 2024

  1.
    Article
    gitconnected · 2y

    Message Queues in System Design

    Message queues are durable components that support asynchronous communication, helping to decouple events and handle tasks without immediate processing. This allows better scalability and durability, especially under high traffic. Different types of queues like FIFO and priority queues, as well as different models like push-based and pull-based queues, provide versatile solutions for various needs. Examples of message queues include RabbitMQ for versatility, Kafka for high throughput, and Amazon SQS for managed cloud-based services.
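
The difference between FIFO and priority delivery can be sketched in a few lines of Python. This is an in-memory illustration only, not how RabbitMQ, Kafka, or Amazon SQS are implemented; the function names `drain_fifo` and `drain_priority` and the `(priority, body)` message shape are assumptions made for this example.

```python
import heapq
from collections import deque


def drain_fifo(messages):
    """Deliver messages in arrival order, as a FIFO queue would."""
    queue = deque(messages)
    delivered = []
    while queue:
        delivered.append(queue.popleft())
    return delivered


def drain_priority(messages):
    """Deliver the lowest priority number first; ties keep arrival order."""
    # The sequence number breaks ties so equal priorities stay in FIFO order.
    heap = [(priority, seq, body) for seq, (priority, body) in enumerate(messages)]
    heapq.heapify(heap)
    delivered = []
    while heap:
        priority, _, body = heapq.heappop(heap)
        delivered.append((priority, body))
    return delivered


if __name__ == "__main__":
    incoming = [(2, "send email"), (1, "charge payment"), (2, "build report")]
    print(drain_fifo(incoming))
    print(drain_priority(incoming))
```

A real broker adds durability, acknowledgments, and redelivery on top of this ordering logic.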

  2.
    Article
    System Design Codex · 2y

    8 Strategies for Reducing Latency

    High latency can render an application unusable, frustrating users and negatively impacting business outcomes. Developers need to understand low-latency strategies such as caching, using Content Delivery Networks (CDNs), load balancing, asynchronous processing, database indexing, data compression, pre-caching, and utilizing keep-alive connections to mitigate these issues and improve performance.

  3.
    Article
    Javarevisited · 2y

    How to Design Twitter (X) in a System Design Interview?

    Designing a system like Twitter (X) in a system design interview involves outlining core functionalities such as composing and sharing tweets, following users, and favoriting tweets. Non-functional requirements like scalability, high availability, and stability are crucial for handling large-scale operations. Key aspects include capacity estimation, API design, database design, and understanding queries per second (QPS). Employing a structured approach and utilizing tools like Redis for caching, MySQL for data consistency, and Amazon S3 for media storage are essential. Detailed component design includes load balancers, CDNs, and handling failure scenarios to ensure robust system performance.

  4.
    Article
    Medium · 2y

    40 Must-Read White Papers to Learn System Design and Software Architecture

    This post lists 40 essential white papers for learning system design and software architecture. It is geared towards those preparing for system design interviews or aiming to understand complex system architectures. Each white paper provides in-depth technical insights from industry leaders like Google and AWS, covering topics from distributed file systems to data processing models and consensus algorithms.

  5.
    Article
    Javarevisited · 2y

    Most-Used Distributed System Design Patterns

    Distributed system design patterns offer architectural solutions and best practices for developing distributed applications. This post discusses widely used patterns: Ambassador for proxy tasks, Circuit Breaker to prevent cascading failures, CQRS for separating reads from writes, Event Sourcing for recording state changes as events, Sidecar for managing cross-cutting concerns, Leader Election for choosing a single leader node, Publisher/Subscriber for asynchronous communication, Sharding for data distribution, Bulkhead to isolate system components, and Cache-Aside for optimized caching. Examples of tools and implementations for each pattern illustrate their applications and benefits.
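
As one illustration of the patterns listed above, here is a minimal Circuit Breaker sketch in Python. It is not taken from the linked post; the class name, the `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are assumptions chosen to keep the example small and testable.

```python
import time


class CircuitBreaker:
    """Hypothetical minimal breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting the struggling dependency.
                raise RuntimeError("circuit open: call rejected")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Production libraries usually add per-exception filtering, metrics, and thread safety, all of which this sketch omits.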

  6.
    Article
    Medium · 2y

    System Design: Load Balancer

    Load balancers are essential in distributing workloads effectively across multiple servers in distributed applications. They can operate at various application layers and employ static or dynamic algorithms to manage requests. Static algorithms depend on predefined parameters while dynamic ones use real-time system state data. Popular load balancing strategies include Round Robin (and its variations), Least Connections, Least Response Time, IP Hashing, and URL Hashing. The choice of strategy depends on specific system needs and configurations to ensure optimal performance.
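
The static versus dynamic distinction can be sketched with two small Python classes: Round Robin ignores server state, while Least Connections consults it on every pick. The class and method names are assumptions for this example, and the server names are placeholders.

```python
import itertools


class RoundRobinBalancer:
    """Static strategy: rotate through servers regardless of their load."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Dynamic strategy: choose the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first server among ties (dict insertion order).
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a request finishes so the count reflects live connections.
        self.active[server] -= 1


if __name__ == "__main__":
    rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    print([rr.pick() for _ in range(4)])
```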

  7.
    Article
    Javarevisited · 2y

    System Design Basics - Rate Limiter

    A rate limiter is a mechanism used in software systems and network communications to control the rate at which requests or operations are performed. It helps maintain system stability, prevent resource overuse, and ensure fair usage among users. Rate limiters are critical in high-traffic, distributed architectures. Common rate limiting algorithms include Token Bucket, Leaky Bucket, and Sliding Window. Understanding rate limiting is important for system design interviews, where it is often discussed alongside concepts like API gateways and load balancers.
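
A Token Bucket limiter, one of the algorithms named above, might look like the following single-process Python sketch. The `rate`, `capacity`, and `clock` parameters are assumptions for this example; a distributed deployment would typically keep the bucket state in a shared store rather than in memory.

```python
import time


class TokenBucket:
    """Hypothetical in-memory token bucket for one client or key."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The bucket permits bursts up to `capacity` while holding the long-run average to `rate` requests per second.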

  8.
    Article
    System Design Codex · 2y

    Message Queues & Message Brokers

    Message queues enable asynchronous communication between producers and consumers by storing messages in FIFO order. They are useful for processing background tasks, distributing tasks, email services, buffering, and payment retries. Message brokers manage these queues and provide additional features like message routing, transformation, protocol translation, and support for the publish-subscribe pattern. This allows seamless integration and communication between different services in applications, such as in an e-commerce platform where order, inventory, shipping, and notification services interact efficiently.

  9.
    Article
    Community Picks · 2y

    9 Software Architecture Patterns for Distributed Systems

    In modern software development, distributed systems require efficient design to manage data and communication between components. Key architectural patterns like Peer-to-Peer, API Gateway, Pub-Sub, Request-Response, Event Sourcing, ETL, Batching, Streaming Processing, and Orchestration offer solutions for reliability, scalability, and maintainability. These patterns are essential not only for system robustness but also for system design interviews, providing a deep understanding of their strengths and trade-offs.

  10.
    Article
    Hacker News · 2y

    taubyte/tau: Open source distributed Platform as a Service (PaaS). A self-hosted Vercel / Netlify / Cloudflare alternative.

    Tau is an open-source, distributed Platform as a Service (PaaS) designed to compete with major providers like Vercel, Netlify, and Cloudflare. It's a developer-friendly framework focused on minimal configuration, auto-discovery, and peer-to-peer networking. Using Git for infrastructure management, Tau emphasizes local development and seamless production deployment. Features include WebAssembly support, content-addressed storage, and a plugin system for extensibility.

  11.
    Article
    Community Picks · 2y

    Things I Wished More Developers Knew About Databases

    Effective database management involves understanding numerous critical concepts such as ACID properties, network reliability, transaction isolation levels, optimistic locking, and the impact of auto-incrementing IDs. Appreciating the complexity of database design helps in predicting potential issues like dirty reads, data loss, and write skews. Additionally, proper handling of latency, sharding, and clock skews can prevent operational surprises. It's essential to evaluate performance requirements per transaction, avoid nested transactions, and understand query planners for optimized database performance. Navigating online migrations smoothly ensures minimal downtime and accurate data migration.

  12.
    Article
    ByteByteGo · 2y

    EP126: The Ultimate Kafka 101 You Cannot Miss

    This edition of the ByteByteGo newsletter covers several key topics, including a guide to understanding Apache Kafka, tips for efficient API design, an overview of AWS Services, and an advertisement for QA Wolf, an automated testing solution. Kafka is detailed with its core concepts like messages, topics, partitions, producers, consumers, clusters, and use cases. The AWS Services cheat sheet simplifies the exploration of AWS's expansive offerings. Additionally, the newsletter includes 8 practical tips for better API design.

  13.
    Article
    System Design Codex · 2y

    3 Interview Questions on Event-Driven Patterns

    System Design interviews often test candidates on event-driven patterns such as Competing Consumer, Retry Messages, and Async Request-Response. The Competing Consumer Pattern allows multiple instances to process messages concurrently. The Retry Messages Pattern manages transient errors by retrying failed transactions with mechanisms like exponential backoff. The Async Request-Response Pattern involves using correlation IDs to relate requests and responses across multiple instances, ensuring asynchronous communication between services.
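
The retry-with-exponential-backoff idea can be sketched in Python as follows. The function names, default delays, and the use of full jitter are assumptions for this example rather than details from the linked post.

```python
import random
import time


def backoff_delays(attempts, base=0.5, factor=2.0, cap=30.0):
    """Delay before each retry: base, base*factor, ... capped at `cap` seconds."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def retry(operation, attempts=5, base=0.5, factor=2.0, cap=30.0,
          sleep=time.sleep, jitter=random.random):
    """Call `operation` until it succeeds or `attempts` calls have failed."""
    delays = backoff_delays(attempts - 1, base, factor, cap)
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Sketch only: real code should catch just the transient error types.
            if attempt == attempts - 1:
                raise
            # Full jitter: sleep a random fraction of the backoff delay so that
            # many clients do not retry in lockstep.
            sleep(delays[attempt] * jitter())
```

Retries are only safe when the operation is idempotent or the consumer deduplicates messages.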

  14.
    Article
    Medium · 2y

    How Did LinkedIn Handle 7 Trillion Messages Daily With Apache Kafka?

    LinkedIn uses Apache Kafka to manage and process up to 7 trillion messages daily. They achieve reliability and scalability through a multi-tiered Kafka deployment across multiple data centers, leveraging local and aggregate clusters. LinkedIn ensures message completeness with an internal auditing tool that tracks sent and consumed messages. They maintain a close relationship with the open-source Kafka community by regularly integrating features and patches from their internal branches into the upstream Kafka branch.

  15.
    Article
    Tinybird · 2y

    How to choose the right type of database

    Covers the different types of databases, the factors to consider when choosing one, and the implications of the CAP theorem for database selection.

  16.
    Article
    Tech World With Milan · 2y

    What are the main Cloud Design Patterns?

    Cloud Design Patterns offer practical solutions to address the common fallacies associated with distributed computing, such as network reliability, latency, and security concerns. These patterns are essential for building dependable, scalable, and secure cloud systems. Key groups of patterns include data management, design and implementation, messaging, security, and reliability. Learn about implementing these patterns to improve system performance and resilience, as well as the various load-balancing options available in Azure for enhanced system availability and performance.

  17.
    Article
    Community Picks · 2y

    How Does Facebook Manage to Serve Billions of Users Daily?

    Facebook serves billions of users daily in large part through its caching systems, particularly Memcache. A cache stores data in anticipation of future requests, enabling quicker retrieval than a database. Facebook's Memcache deployment optimizes performance through techniques like parallelizing requests using a DAG of data dependencies, batching requests, and leases to prevent stale data and manage heavy loads. These strategies allow efficient handling of massive request volumes while maintaining data integrity.

  18.
    Article
    Community Picks · 2y

    gRPC

    gRPC is a high-performance, open-source RPC framework that connects services across data centers and to backend services from devices, mobile apps, and browsers. It uses Protocol Buffers for service definitions, supports quick scaling, works across various languages and platforms, and offers bi-directional streaming with integrated authentication.

  19.
    Article
    ByteByteGo · 1y

    How Tinder Recommends To 75 Million Users with Geosharding

    Tinder has improved its recommendation engine for over 75 million users by implementing geosharding, where user data is divided into geographically bound shards. This approach enhances performance, reduces latency, and improves scalability. The system leverages tools like Google's S2 Library and Apache Kafka, and addresses consistency challenges and traffic imbalances by using smart load balancing and dynamic adjustments. As a result, Tinder can manage 20 times more computations efficiently while maintaining low latency.

  20.
    Article
    Community Picks · 2y

    Microservices Architecture, The Hard Parts : Trap of Distributed Monolith

    Seasoned senior software engineers often encounter significant challenges when implementing a microservices architecture. Initial enthusiasm can give way to difficulties, particularly when releasing new features or managing performance and latency across interdependent services. Identifying and addressing issues such as inadequate service boundaries, excessive synchronous communication, overly fine-grained services, service coupling, and shared code without versioning is critical to avoiding a distributed monolith.

  21.
    Article
    freeCodeCamp · 2y

    How to Build Resilient Microservice Systems – SOLID Principles for Microservices

    Learn about the SOLID principles and best practices for building efficient microservices.

  22.
    Video
    YouTube · 2y

    When to Use Kafka or RabbitMQ | System Design

    Kafka and RabbitMQ serve different purposes in distributed systems. Kafka is designed for high-throughput stream processing, fanning out messages to multiple consumers, and handling uniform, short processing tasks. RabbitMQ is a traditional message queue system better suited for complex routing, long-running tasks, and handling sporadic data flow with acknowledgments for message processing. Choose Kafka for scenarios requiring high-speed, real-time data distribution and RabbitMQ for more controlled message queuing and processing.

  23.
    Article
    DEV · 2y

    How to implement a Distributed Lock using Redis

    Running multiple instances of an application can create issues with concurrent database writes, potentially leading to inconsistent states. Distributed locking, particularly using Redis, provides a solution by ensuring only one instance can perform critical operations at a time. The Redlock algorithm is an effective method for implementing distributed locks across multiple Redis instances, ensuring consistency even if some instances fail.
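
The core of a single-instance lock can be sketched without a running Redis server. The example below is not Redlock and is not taken from the linked article: `FakeStore` is a hypothetical in-memory stand-in whose two methods mimic `SET key value NX PX ttl` and a compare-then-delete release (which on real Redis is usually done atomically in a Lua script). Redlock repeats the acquire step against several independent instances and requires a majority.

```python
import time
import uuid


class FakeStore:
    """Hypothetical in-memory stand-in for one Redis instance."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expiry time)
        self._clock = clock

    def set_if_absent(self, key, value, ttl):
        """Mimics SET key value NX PX ttl: succeed only if no live entry exists."""
        current = self._data.get(key)
        if current is not None and current[1] > self._clock():
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def delete_if_equal(self, key, value):
        """Mimics the compare-then-delete release step."""
        current = self._data.get(key)
        if current is not None and current[1] > self._clock() and current[0] == value:
            del self._data[key]
            return True
        return False


def acquire(store, key, ttl=10.0):
    """Return a unique owner token on success, or None if the lock is held."""
    token = uuid.uuid4().hex
    return token if store.set_if_absent(key, token, ttl) else None


def release(store, key, token):
    """Release only if this caller still owns the lock."""
    return store.delete_if_equal(key, token)
```

The random token matters: without it, an instance whose lock has already expired could delete a lock that now belongs to another instance.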

  24.
    Video
    Community Picks · 2y

    7 Must-know Strategies to Scale Your Database

    Understanding when and why to scale your database is essential to maintain optimal performance as your application grows. Key strategies include indexing for quick data retrieval, using materialized views for pre-computed snapshots of data, and implementing denormalization to simplify complex queries. Vertical scaling, adding resources to a single server, and caching frequently accessed data in a fast storage layer can enhance responsiveness. Replication bolsters availability and fault tolerance by creating database copies on multiple servers. Sharding, which involves splitting a database into smaller sections, enables horizontal scaling and manages large data loads efficiently.
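
Of the strategies above, sharding is the easiest to show in code. The sketch below routes a key to a shard by hashing it; the function name `shard_for` and the key format are assumptions for this example, and the video may describe a different partitioning scheme (for instance range-based).

```python
import hashlib


def shard_for(key, shard_count):
    """Map a string key to a shard index in [0, shard_count)."""
    # A stable hash is used because Python's built-in hash() for strings
    # varies between processes.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    for user_id in ("user:1", "user:2", "user:3"):
        print(user_id, "-> shard", shard_for(user_id, 4))
```

Plain modulo routing remaps most keys whenever `shard_count` changes, which is why consistent hashing is a common alternative when shards are added or removed often.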

  25.
    Article
    Hacker News · 2y

    exo-explore/exo: Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

    Run an AI cluster at home using exo, software that unifies everyday devices into one powerful GPU. It supports LLaMA and other popular models, and connects devices peer-to-peer rather than through a master-worker architecture. Install it from source with Python>=3.12.0 and access models via a ChatGPT-compatible API endpoint.