Best of Kafka: December 2025

  1.
    Article
    Datadog · 22w

    How microservice architectures have shaped the usage of database technologies

    Microservices have transformed database usage from monolithic, single-database architectures to distributed systems where organizations run multiple database technologies simultaneously. Analysis of 2.5 million services shows over half of organizations now use both SQL and NoSQL databases side by side, with many adopting 3+ different database technologies. This shift enables teams to choose the right tool for each service but introduces new challenges: fragmented schemas require data integration layers like GraphQL, analytics demands OLAP systems like Snowflake, and service communication relies heavily on message queues like Kafka and RabbitMQ for asynchronous decoupling.

  2.
    Article
    ByteByteGo · 25w

    How Netflix Built a Distributed Write Ahead Log For Its Data Platform

    Netflix built a distributed Write-Ahead Log (WAL) system to solve data reliability issues across their platform. The WAL captures every data change before applying it to databases, enabling automatic retries, cross-region replication, and multi-partition consistency. Built on top of their Data Gateway Infrastructure, it uses Kafka and Amazon SQS as pluggable backends, supports multiple use cases through namespaces, and scales independently through sharded deployments. The system provides durability guarantees while allowing teams to configure retry logic, delays, and targets without code changes.
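
The log-before-apply idea in this summary can be illustrated with a toy in-memory sketch. This is a generic illustration of the write-ahead pattern with retries, not Netflix's implementation: the `WriteAheadLog` class, the `flaky_put` target, and the `max_attempts` setting are all hypothetical names, and a real system would persist entries to a durable backend instead of a Python list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WriteAheadLog:
    """Toy WAL: record each change first, then apply it to the target with retries."""

    apply: Callable[[str, str], None]  # hypothetical target writer, e.g. a database client
    max_attempts: int = 3              # hypothetical retry budget per delivery round
    entries: List[dict] = field(default_factory=list)

    def write(self, key: str, value: str) -> bool:
        # Step 1: log the change before touching the target.
        entry = {"key": key, "value": value, "applied": False, "attempts": 0}
        self.entries.append(entry)
        # Step 2: try to apply it; the entry stays in the log either way.
        return self._deliver(entry)

    def _deliver(self, entry: dict) -> bool:
        while entry["attempts"] < self.max_attempts:
            entry["attempts"] += 1
            try:
                self.apply(entry["key"], entry["value"])
            except ConnectionError:
                continue  # transient failure: retry
            entry["applied"] = True
            return True
        return False

    def replay(self) -> int:
        """Retry every logged entry that has not been applied yet."""
        recovered = 0
        for entry in self.entries:
            if not entry["applied"]:
                entry["attempts"] = 0
                if self._deliver(entry):
                    recovered += 1
        return recovered


# Demo: a target that rejects the first two writes, then recovers.
store: Dict[str, str] = {}
failures_left = [2]


def flaky_put(key: str, value: str) -> None:
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("target unavailable")
    store[key] = value


wal = WriteAheadLog(apply=flaky_put)
ok = wal.write("user:1", "alice")
```

Because the change is logged before delivery is attempted, a write that exhausts its retries is not lost; calling `replay()` later picks it up again from the log.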

  3.
    Article
    Tinybird · 23w

    Build a Real-Time E-Commerce Analytics API from Kafka in 15 Minutes

    A step-by-step guide to building a real-time e-commerce analytics API using Kafka as the data source. Covers connecting to Kafka, ingesting order events, enriching data with dimension tables and PostgreSQL, creating materialized views for pre-aggregated metrics, and exposing multiple API endpoints. The tutorial progresses from a basic 5-minute setup querying raw Kafka data to advanced features including data enrichment, automated PostgreSQL syncing, and optimized aggregations using materialized views. All implementation uses SQL and configuration without requiring application code.

  4.
    Article
    ByteByteGo · 24w

    Dropbox Multimedia Search: Making File Search More Useful

    Dropbox built multimedia search capabilities for Dropbox Dash by implementing a metadata-first indexing pipeline, geolocation-aware retrieval using reverse geocoding, and just-in-time preview generation. The architecture leverages their existing Riviera compute framework to process metadata from images, videos, and audio files while avoiding expensive pre-computation. Key design decisions include indexing lightweight metadata instead of deep content analysis, generating previews on-demand rather than upfront, and caching results for 30 days. The system handles location-based searches by converting GPS coordinates into hierarchical location IDs and parallelizes preview generation with ranking operations to minimize latency.

  5.
    Article
    Debezium · 22w

    Debezium 3.4.0.Final Released

    Debezium 3.4.0.Final has been released with over 125 new features, improvements, and fixes. Key updates include Kafka 4.1.1 support, PostgreSQL 18 compatibility, new geometry transformations, Quarkus DevService extensions for native CDC applications, and improved Oracle LogMiner metrics. Breaking changes affect IBMi string trimming, Oracle XML dependencies, PostgreSQL 13 support ending, and SQL Server streaming query modes. The release adds incremental snapshots for IBMi, AWS IAM authentication for PostgreSQL, multiple DevService support in Quarkus, and geometry data type handling in the JDBC sink.

  6.
    Article
    ByteByteGo · 23w

    EP193: Database Types You Should Know in 2025

    A comprehensive overview of 13 database types for modern applications, including relational, columnar, key-value, in-memory, time-series, graph, document, vector, and others. Also covers the differences between Apache Kafka and RabbitMQ for message handling, explains HTTP protocol evolution and ecosystem components, details the DNS resolution process from browser to web server, and compares real-time update mechanisms including polling, WebSocket, and server-sent events.
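
One commonly cited difference between Kafka and RabbitMQ is the delivery model: Kafka keeps records in an append-only log and each consumer group tracks its own offset, while a classic RabbitMQ queue hands each message to one consumer and then removes it. The sketch below models only that contrast in plain Python. `LogTopic` and `WorkQueue` are hypothetical names, and partitions, acknowledgements, exchanges, and retention limits are deliberately left out.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class LogTopic:
    """Kafka-style model: records are retained; each group reads from its own offset."""

    def __init__(self) -> None:
        self.records: List[str] = []
        self.offsets: Dict[str, int] = {}

    def publish(self, message: str) -> None:
        self.records.append(message)

    def poll(self, group: str) -> List[str]:
        # Return everything this group has not seen yet, then advance its offset.
        position = self.offsets.get(group, 0)
        batch = self.records[position:]
        self.offsets[group] = len(self.records)
        return batch


class WorkQueue:
    """RabbitMQ-style model: a message goes to one consumer and leaves the queue."""

    def __init__(self) -> None:
        self.pending: Deque[str] = deque()

    def publish(self, message: str) -> None:
        self.pending.append(message)

    def get(self) -> Optional[str]:
        return self.pending.popleft() if self.pending else None


topic = LogTopic()
queue = WorkQueue()
for event in ("order-1", "order-2"):
    topic.publish(event)
    queue.publish(event)

analytics_batch = topic.poll("analytics")  # first group reads both records
billing_batch = topic.poll("billing")      # second group independently reads both too
first_worker = queue.get()                 # takes "order-1" off the queue
second_worker = queue.get()                # gets the next message, not a copy
```

In the log model, adding a new consumer group does not affect existing ones; in the queue model, consumers compete for messages.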