Amazon Redshift is AWS's fully managed, petabyte-scale data warehouse. It is based on PostgreSQL and uses massively parallel processing (MPP) and columnar storage for high-performance analytics. The platform offers two deployment options: provisioned clusters, whose RA3 nodes decouple compute from storage, and Redshift Serverless, which scales automatically without cluster management. Key features include zero-ETL integrations with Aurora and RDS, Redshift Spectrum for querying data in S3, and workload management (WLM) with concurrency scaling. The article covers architecture, deployment options, data ingestion methods, schema design with distribution and sort keys, maintenance operations such as vacuuming and analyzing, monitoring approaches, and troubleshooting of common issues. A comparison with Snowflake concludes that Redshift excels in AWS-native environments but requires more manual tuning, while Snowflake offers near-zero maintenance and runs across multiple clouds.
Table of contents
Amazon Redshift Overview (and comparison to Snowflake)
Architecture Overview
MPP and columnar storage
Node types and managed storage
Serverless workgroups
Deployment and Setup
Two options: Creating a cluster or a workgroup
Network and security configuration
Connecting to Redshift
Redshift UI
Data Ingestion
COPY command
Zero‑ETL and streaming integrations
Redshift Spectrum and Federated Query
Schema Design and Performance Tuning
Distribution styles
DISTKEY Selection guidelines
Sort keys and compression
SORTKEY Selection Guidelines
Workload Management (WLM) and concurrency scaling
Maintenance Operations
Vacuuming
ANALYZE command
Monitoring and alerts
Common Errors and Troubleshooting
Best Practices
Summary
When to Use Snowflake vs Redshift
What’s Next?