Best of Distributed Systems · October 2024

  1.
    Article
    Community Picks · 2y

    gRPC

    gRPC is a high-performance, open-source RPC framework that connects services across data centers and to backend services from devices, mobile apps, and browsers. It uses Protocol Buffers for service definitions, supports quick scaling, works across various languages and platforms, and offers bi-directional streaming with integrated authentication.

  2.
    Article
    Hacker News · 2y

    How to do distributed locking — Martin Kleppmann’s blog

    The post examines the Redlock algorithm for distributed locking on Redis and highlights two limitations: it depends on timing assumptions, and it does not provide fencing tokens, which are crucial for correctness. It argues that while Redis is suitable for some non-critical locking scenarios, Redlock may not be reliable for critical applications requiring strong consistency. When correctness is essential, it suggests using a system built on a proper consensus protocol, such as ZooKeeper.
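
A fencing token can be sketched in a few lines. The example below is a minimal, in-memory illustration and not code from the post: `LockService` and `FencedStore` are hypothetical names, and a real lock service and storage system would sit behind the network. The idea shown is that every lock grant carries a strictly increasing number, and the storage side rejects any write whose number is lower than one it has already accepted.

```python
class LockService:
    """Hypothetical lock service: each grant comes with a strictly increasing token."""

    def __init__(self):
        self._next_token = 0

    def acquire(self):
        # Assumption: the previous lease has expired, so a new grant is allowed.
        self._next_token += 1
        return self._next_token


class FencedStore:
    """Hypothetical storage that refuses writes carrying a stale token."""

    def __init__(self):
        self._highest_token = 0
        self.value = None

    def write(self, token, value):
        if token < self._highest_token:
            return False  # an older lock holder woke up late; reject its write
        self._highest_token = token
        self.value = value
        return True


locks = LockService()
store = FencedStore()

token_a = locks.acquire()  # client A gets the lock, then stalls (e.g. a long pause)
token_b = locks.acquire()  # A's lease expires; client B gets the lock with a newer token

accepted_b = store.write(token_b, "written by B")
accepted_a = store.write(token_a, "written by A")  # stale token, so this is rejected
```

The check lives in the storage layer on purpose: the paused client cannot know its lease has expired, so only the resource being protected can reliably turn the late write away.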

  3.
    Article
    System Design Codex · 2y

    Eventual Consistency is Tricky

    Eventual consistency is essential in distributed systems, allowing scalability despite temporary inconsistencies. Key patterns to achieve eventual consistency include Event-Based, Background Sync, Saga, and CQRS. Each pattern has specific use cases, pros, and cons, ranging from loosely coupled systems to complex long-running transactions, ensuring data congruence through various methods.

  4.
    Article
    Three Dots Labs · 2y

    Distributed Transactions in Go: Read Before You Try

    The post discusses the complexities of using distributed transactions in microservices with Go. It warns against using distributed transactions due to their complications and instead suggests alternatives like embracing eventual consistency and using the outbox pattern. The post also provides a detailed implementation approach for using event-driven architecture with Redis and Watermill in Go, including handling events asynchronously and ensuring data consistency. It emphasizes the importance of correct service boundaries and provides guidance on testing and monitoring event-driven systems.

  5.
    Article
    P99 Conf · 2y

    14 Books by P99 CONF Speakers: Latency, Wasm, Databases & More

    P99 CONF features over 60 speakers sharing insights on performance topics like distributed databases, Rust, C++, Go, Wasm, and more. The post highlights 14 books authored by the speakers, providing a rich resource for attendees. Registrants get 30 days of access to O’Reilly’s library and discounts from Manning Publications. The highlighted books cover diverse topics such as data-intensive applications, technical writing, latency reduction, distributed systems, and eBPF.

  6.
    Article
    Quastor Daily · 2y

    How Stripe synchronizes time across their distributed system

    Stripe employs both physical and logical clocks to maintain accurate timekeeping in their distributed billing system. Physical clocks are synchronized using protocols like NTP and PTP to avoid drift, while logical clocks order events without depending on real-world time, using methods like Vector and Lamport Clocks. Stripe combines these approaches with hybrid logical clocks to ensure accurate billing and simulate future events for testing purposes. Additional highlights include a deep dive on caching in system design, eBay's use of LLMs for developer productivity, and Eventbrite’s CSRF defense mechanisms.
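
As an illustration of the logical-clock idea mentioned above, here is a minimal sketch of a Lamport clock. It is a generic textbook version, not Stripe's implementation, and the class and method names are chosen for this example only. Each process keeps a counter, increments it on every local event or send, and on receive jumps past the timestamp carried by the message.

```python
class LamportClock:
    """Textbook Lamport clock: a counter that orders events without wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, remote_time):
        # Move past both our own history and the sender's timestamp.
        self.time = max(self.time, remote_time) + 1
        return self.time


node_a = LamportClock()
node_b = LamportClock()

node_a.tick()                         # local event on A
stamp = node_a.send()                 # A sends a message to B
received_at = node_b.receive(stamp)   # B's clock jumps ahead of the send timestamp
```

The guarantee is one-directional: if one event causally precedes another, its timestamp is smaller, but a smaller timestamp alone does not prove causality. Vector clocks and hybrid logical clocks extend this basic scheme.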

  7.
    Article
    InfoWorld · 2y

    The best Python libraries for parallel processing

    The post introduces seven Python libraries that help distribute a heavy workload across multiple CPUs or compute clusters, addressing Python's single-threaded limitations. Libraries discussed include Ray, Dask, Dispy, Pandaral·lel, Ipyparallel, Joblib, and Parsl, each catering to different needs such as machine learning, data science, and general parallel processing tasks. Highlights include Ray's minimal syntax and cluster management, Dask's centralized scheduler and actor model, and Joblib's efficient disk caching and parallelization capabilities.

  8.
    Video
    codeHeim · 2y

    #58 Golang - Asynchronous Messaging with RabbitMQ

    Learn how to integrate RabbitMQ with Go for building scalable distributed systems. The video covers the basic architecture of RabbitMQ, how to set up a producer to send messages, and how to set up a consumer to receive them over AMQP. The tutorial uses the Gin framework to set up a web server and demonstrates sending and receiving messages through HTTP requests. It also discusses handling errors and system interrupts.

  9.
    Article
    Towards Data Science · 2y

    Dataflow Architecture-Derived Data Views and Eventual Consistency

    The post outlines the evolution of a health and fitness data pipeline through versions 3.0 to 5.0, focusing on how a distributed event-driven architecture improves user experience. Starting with personalized fitness insights, it progresses to community-driven gamification and finally to personalized challenges through recommendation engines. The key technological concepts discussed include the deterministic nature of dataflow, eventual consistency, and the choreography of materialization processes. The piece highlights the importance of derived data views for efficient query performance and resilience in distributed systems.

  10.
    Article
    InfoQ · 2y

    Building a Global Caching System at Netflix: A Deep Dive to Global Replication

    Netflix uses a global replication strategy with EVCache, a distributed key-value store, to ensure data availability across four regions. EVCache handles 30 million global replication events and 400 million operations per second, leveraging 200 Memcached clusters and 22,000 servers. Features include client-initiated replication, topology-aware clients, and batch compression, which reduce network costs and enhance performance. The replication process involves client-initiated data mutations, Kafka for metadata handling, and SQS for robust error handling.

  11.
    Article
    Milan Jovanović · 2y

    Implementing Idempotent REST APIs in ASP.NET Core

    Implementing idempotency in REST APIs is crucial for enhancing service reliability and consistency, ensuring identical requests yield the same result. This is particularly beneficial in distributed systems to prevent unintended duplicates and gracefully handle network issues. The post provides a comprehensive guide on how to achieve idempotency using ASP.NET Core, including utilizing unique keys, response caching, and handling concurrency issues.
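
The unique-key approach can be sketched independently of any web framework. The example below is written in Python rather than ASP.NET Core and is not the article's code: `IdempotentHandler`, `create_order`, and the key format are hypothetical. It caches the first response under the client-supplied key and replays it for any repeat of the same key, so a retried request does not run the side effect twice.

```python
import threading


class IdempotentHandler:
    """Wraps a handler so that repeated requests with the same key run it only once."""

    def __init__(self, handler):
        self._handler = handler
        self._responses = {}           # assumption: in-memory; a real service would use a shared store
        self._lock = threading.Lock()  # guards against two concurrent requests with the same key

    def handle(self, idempotency_key, payload):
        # Simplification: holding one lock while the handler runs serializes all requests.
        with self._lock:
            if idempotency_key in self._responses:
                return self._responses[idempotency_key]  # replay the stored response
            response = self._handler(payload)
            self._responses[idempotency_key] = response
            return response


orders = []


def create_order(payload):
    # Hypothetical side effect that must not happen twice for one logical request.
    orders.append(payload)
    return {"order_id": len(orders), "item": payload["item"]}


api = IdempotentHandler(create_order)
first = api.handle("key-123", {"item": "book"})
retry = api.handle("key-123", {"item": "book"})   # e.g. the client retried after a timeout
other = api.handle("key-456", {"item": "pen"})
```

A production version would also need expiry for stored keys and a decision about what to do when the same key arrives with a different payload; both are left out of this sketch.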

  12.
    Article
    Milan Jovanović · 2y

    Implementing the Outbox Pattern

    The Outbox Pattern addresses the challenge of maintaining data consistency in distributed systems by ensuring atomicity between database operations and message publication. This pattern saves messages to an Outbox table within the same database transaction and later publishes them via a separate process, ensuring at-least-once delivery. Implementation details, such as creating the Outbox table and a processing job using Quartz, are discussed along with considerations for scalability, idempotency, and database performance.

  13.
    Video
    ByteByteGo · 2y

    Scalability Simply Explained in 10 Minutes

    Understanding scalability in system design is crucial for handling sudden traffic surges. Effective scalability means managing increased loads efficiently by adding resources without compromising performance. Key principles include statelessness, loose coupling, and asynchronous processing. Techniques like load balancing, caching, sharding, and modular design help distribute workload and avoid bottlenecks. Scaling can be approached vertically by upgrading single machines or horizontally by adding multiple machines. Continuous monitoring and adaptation are essential as scalability needs evolve.

  14.
    Article
    ITNEXT · 2y

    Mesh

    Mesh is an architectural pattern that uses interconnected, layered shards as middleware, enhancing scalability and fault tolerance in distributed environments. It supports dynamic scaling and high availability but may introduce communication artifacts and performance overhead. Meshes vary significantly by structure, connectivity, and layers, and are implemented in various forms such as Peer-to-Peer Networks, Leaf-Spine Architecture, Actors, and Service Meshes.

  15.
    Article
    Programming Digest · 2y

    Practices of Reliable Software Design

    The post explores various practices for reliable software design, including building high-performance in-memory caches and understanding the CAP Theorem in distributed systems. It also covers the efficiency of different data structures for associative arrays, highlights tools for speeding up code reviews with AI, and addresses the impact of deployment speed on productivity.