ClickHouse Cloud's GCS ClickPipes now supports unordered mode, allowing data ingestion from Google Cloud Storage without requiring files to arrive in lexicographical order. Instead of polling the bucket every 30 seconds, the connector listens for Pub/Sub OBJECT_FINALIZE notifications and processes files as they land, regardless of naming order. This handles backfills, retries, and late-arriving data while maintaining exactly-once semantics to prevent duplicates. Setup requires configuring a Pub/Sub topic, a service account with appropriate IAM permissions, and enabling the 'Any order ingestion mode' in the ClickHouse Cloud console. The feature is also available for Amazon S3, with Azure Blob Storage support planned.
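
To make the idea of order-independent, duplicate-free ingestion concrete, here is a minimal sketch in Python. It is not ClickPipes' implementation, which the summary does not describe. The class `UnorderedIngestor` and its fields are hypothetical, and deduplicating on bucket, object name, and generation is an assumption chosen for illustration. The attribute names (`eventType`, `bucketId`, `objectId`, `objectGeneration`) are the ones Cloud Storage attaches to its Pub/Sub notifications.

```python
from dataclasses import dataclass, field


@dataclass
class UnorderedIngestor:
    """Hypothetical sketch: accept finalized objects in any order, each at most once."""

    # Keys of objects already handled (assumed dedup key: bucket, name, generation).
    seen: set = field(default_factory=set)
    # Object names in the order they were accepted, standing in for real ingestion.
    ingested: list = field(default_factory=list)

    def handle(self, attributes: dict) -> bool:
        """Process one notification's attributes; return True if the object was ingested."""
        # Only newly created (finalized) objects are of interest.
        if attributes.get("eventType") != "OBJECT_FINALIZE":
            return False

        key = (
            attributes["bucketId"],
            attributes["objectId"],
            attributes["objectGeneration"],
        )
        # Pub/Sub may redeliver a message, so a repeated key is skipped.
        if key in self.seen:
            return False

        self.seen.add(key)
        self.ingested.append(attributes["objectId"])
        return True
```

Because nothing in `handle` compares file names, a late-arriving `2024-01-01.parquet` is accepted even after `2024-01-03.parquet` has been processed, and a redelivered notification for the same object generation is ignored. A production system would also need durable state and atomic commits to approach exactly-once behavior, which this in-memory sketch does not attempt.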

5 min read · From clickhouse.com
