Best of Datacenters 2024

  1. Article
    ByteByteGo · 2y

    How Facebook Syncs Time Across Millions of Servers

    Facebook struggled to keep time precisely synchronized across millions of servers because the clock oscillators inside each machine drift. It first relied on Network Time Protocol (NTP) and later switched to Precision Time Protocol (PTP) to reach nanosecond-level precision. The move was driven by the need for higher accuracy to support advanced systems and applications such as the metaverse. PTP uses hardware timestamping and transparent clocks to account for network and switch delays, which improves synchronization accuracy. More accurate clocks in turn benefit logging, coordination, and the handling of user requests across a distributed system.

  2. Article
    Hacker News · 2y

    teslamotors/ttpoe

    Tesla has open-sourced the Tesla Transport Protocol over Ethernet (TTPoE) and joined the Ultra Ethernet Consortium to help standardize high-speed, low-latency networking protocols for AI, ML, and datacenter workloads. TTPoE runs without a CPU or OS and was first deployed in Tesla's Dojo supercomputer. The protocol emphasizes simplicity and decentralized congestion management. The GitHub repo includes source code, compilation instructions, and unit tests.

  3. Article
    Jeff Geerling · 1y

    AmpereOne: Cores are the new MHz

    High-core-count servers have become increasingly important in datacenters, and Ampere's 192-core Arm CPU is a significant player. It offers competitive performance and efficiency at a lower cost than AMD's EPYC CPUs and is particularly suited to Telco Edge deployments. The server is designed with front-facing ports for 5G use cases and supports extensive RAM and PCIe expansion. Despite high idle power consumption, it delivers strong performance per dollar and can handle a variety of workloads, including machine learning and web applications.