Best of Kafka: July 2025

  1.
    Article
    Netflix TechBlog · 45w

    Netflix Tudum Architecture: from CQRS with Kafka to CQRS with RAW Hollow

    Netflix migrated the architecture of its Tudum fan site from a CQRS pattern built on Kafka and traditional caching to RAW Hollow, an in-memory compressed object database. The original architecture suffered from eventual-consistency delays: content changes took minutes to appear. RAW Hollow removed the need for separate databases and Kafka infrastructure by holding the entire dataset in memory in each application process, which cut homepage construction time from 1.4 seconds to 0.4 seconds and enabled real-time content previews.

  2.
    Article
    Salesforce Engineering · 47w

    Architecting AI Agent Auditing Systems in Agentforce

    Salesforce's Feedback and Audit Trail team built an AI auditing system for Agentforce that handles 20 million model interactions per month across 500 enterprise customers. The system overcame significant integration challenges with Data Cloud by using Kafka-based ingestion to absorb unpredictable AI traffic patterns. Key elements of the approach included dynamic flow control mechanisms, Tiger Team coordination across 8-10 cross-functional teams, and iterative development. The architecture prioritizes trust, security, and compliance while maintaining scalability through continuous performance monitoring and architectural improvements.

  3.
    Article
    Architecture Weekly · 45w

    The Order of Things: Why You Can't Have Both Speed and Ordering in Distributed Systems

    Distributed systems force a fundamental trade-off between ordering guarantees and performance. PostgreSQL prioritizes correctness through locking but suffers from contention under load. MongoDB optimizes for speed but leaves eventual consistency to be handled at the application level. Kafka scales through partitioning but guarantees ordering only within a partition. The article explores the mechanics behind these trade-offs, including transaction isolation, replication lag, and coordination costs, and concludes that the answer is to choose appropriate guarantees for each use case rather than to look for a perfect solution.
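
The per-partition ordering guarantee mentioned in the last summary can be illustrated with a small simulation. This is only a sketch, not code from the article and not a Kafka client: the names `partition_for`, `produce`, and `per_key_order` are hypothetical, the topic size of 3 is arbitrary, and CRC32 stands in for the hash used by Kafka's default partitioner (which is murmur2).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical topic size, chosen only for the example


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    # What matters here is that the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(events):
    """Append (key, value) events to per-partition logs in arrival order."""
    logs = defaultdict(list)
    for key, value in events:
        logs[partition_for(key)].append((key, value))
    return dict(logs)


def per_key_order(logs, key):
    """Return the values for one key, read from the single partition holding it."""
    return [v for k, v in logs.get(partition_for(key), []) if k == key]


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
logs = produce(events)

# Events for a single key keep their relative order inside their partition.
print(per_key_order(logs, "order-1"))
print(per_key_order(logs, "order-2"))
```

Because each key lands in exactly one partition, the events for that key are read back in the order they were written. Nothing in the per-partition logs records how events in different partitions were interleaved, so a consumer reading several partitions cannot rely on a global order.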