Best of Data Warehouse 2025

  1.
    Article
    Medium · 1y

    Building a modern Data Warehouse from scratch

    Learn how to build a modern data warehouse using SQL Server. The project walks through designing the data architecture with the Medallion Architecture, setting up ETL pipelines, developing data models, and creating analytics and reporting solutions. Key steps include setting up project tools, implementing data quality checks, and building the bronze, silver, and gold layers that organize data processing. Resources and detailed instructions are provided for each phase.
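
The bronze, silver, and gold layering mentioned in this summary can be illustrated with a minimal sketch. This is not the article's SQL Server implementation: the sample records, the field names, and the helper functions `to_silver` and `to_gold` are hypothetical, chosen only to show raw data being cleaned and then aggregated.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as ingested (hypothetical sample data).
bronze = [
    {"order_id": "1", "country": " us ", "amount": "19.99"},
    {"order_id": "2", "country": "DE", "amount": "5.00"},
    {"order_id": "2", "country": "DE", "amount": "5.00"},  # duplicate row
    {"order_id": "3", "country": "US", "amount": "n/a"},   # unparseable amount
]


def to_silver(rows):
    """Silver: cast types, normalize text, and drop invalid or duplicate rows."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # data quality check: skip rows with a bad amount
        order_id = int(row["order_id"])
        if order_id in seen_ids:
            continue  # data quality check: skip duplicate orders
        seen_ids.add(order_id)
        cleaned.append(
            {
                "order_id": order_id,
                "country": row["country"].strip().upper(),
                "amount": amount,
            }
        )
    return cleaned


def to_gold(rows):
    """Gold: aggregate cleaned rows into a reporting-friendly shape."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping the bronze records untouched means the silver and gold layers can be rebuilt at any time if a cleaning rule changes.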

  2.
    Video
    Coding with Lewis · 24w

    The Database Query That Cost $1,000,000

    Shopify nearly incurred $1 million in monthly BigQuery costs because inefficient queries scanned 75 GB per request. By clustering the data by date, geography, and timestamp, they reduced the data scanned per query to 508 MB, cutting costs to under $1,400 monthly. The case demonstrates how proper data warehouse optimization, such as clustering and partitioning, can prevent massive cloud infrastructure expenses.
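
As a rough check on the per-query figures above, the following sketch computes how much less data each query scans after clustering. It assumes decimal units (1 GB = 1000 MB), and the function name `scan_reduction` is hypothetical. It does not attempt to reproduce the dollar amounts, which also depend on query volume and pricing that the summary does not state.

```python
GB_BEFORE = 75.0    # data scanned per query before clustering (from the summary)
MB_AFTER = 508.0    # data scanned per query after clustering (from the summary)
MB_PER_GB = 1000.0  # assumption: decimal units


def scan_reduction(before_mb, after_mb):
    """Return (reduction factor, fraction of scanned data saved)."""
    factor = before_mb / after_mb
    saved = 1.0 - after_mb / before_mb
    return factor, saved


factor, saved = scan_reduction(GB_BEFORE * MB_PER_GB, MB_AFTER)
print(f"{factor:.1f}x less data scanned per query ({saved:.1%} reduction)")
```

Under on-demand pricing, where charges follow bytes processed, a reduction of this kind in scanned data is what drives the lower bill.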

  3.
    Article
    databricks · 38w

    Architecting a High-Concurrency, Low-Latency Data Warehouse on Databricks That Scales

    A comprehensive guide to building high-performance data warehouses on Databricks that handle hundreds of concurrent users with sub-second query response times. Covers architectural best practices including SQL Serverless Warehouses, Liquid Clustering, Unity Catalog governance, and AI-powered optimizations. Provides a structured framework for assessment, implementation, and monitoring, with a real-world case study showing how an email marketing platform reduced costs while improving performance through materialized views and modern data organization techniques.

  4.
    Article
    DuckDB · 1y

    Preview: Amazon S3 Tables in DuckDB

    DuckDB announces a new preview feature that supports Apache Iceberg REST Catalogs, enabling easy connection to Amazon S3 Tables and Amazon SageMaker Lakehouse. It allows DuckDB users to read and query Iceberg tables directly from these platforms. The guide provides detailed steps for installing necessary extensions from the core_nightly repository and setting up S3 table buckets. The feature is currently experimental and a stable release is expected later in the year.

  5.
    Article
    Programming Digest · 47w

    Which Data Architecture Should I Choose for My Workplace? — A Data Engineer’s Approach

    A comprehensive guide comparing four major data architecture approaches: Data Warehouse, Data Lake, Data Lakehouse, and Data Mesh. The article explains when to use each approach, their advantages and challenges, and provides platform recommendations. It focuses on the Medallion Architecture with its Bronze, Silver, and Gold layers for modern data warehouse design, emphasizing the importance of requirement analysis and proper architectural selection based on data types, analytical needs, and organizational structure.