Data Warehouse Guide

Learn about data warehouse concepts, architecture and how to set up a warehouse.

Data Warehouse Concepts and Architecture

Data warehouses store summarized historical data from many different applications, with a one-to-many relationship between a data warehouse and the applications that serve as its data sources. Examples of data sources include customer relationship management (CRM), enterprise resource planning (ERP), marketing, social media and product data. This data is used for analytics and reporting.
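As a rough illustration of the one-to-many relationship described above, the sketch below consolidates records from two source applications into a single summarized table. The source names, field names and figures are all hypothetical, and a real warehouse would do this with an ETL pipeline and SQL rather than in-memory Python.

```python
from collections import defaultdict

# Hypothetical records from two source applications (illustrative data only).
crm_orders = [
    {"customer": "acme", "month": "2023-01", "amount": 120.0},
    {"customer": "acme", "month": "2023-01", "amount": 80.0},
]
erp_invoices = [
    {"customer": "acme", "month": "2023-01", "amount": 50.0},
    {"customer": "globex", "month": "2023-02", "amount": 300.0},
]


def summarize(sources):
    """Aggregate rows from every source into one total per (source, month)."""
    totals = defaultdict(float)
    for source_name, rows in sources.items():
        for row in rows:
            totals[(source_name, row["month"])] += row["amount"]
    return dict(totals)


# One warehouse table fed by many source applications.
warehouse_table = summarize({"crm": crm_orders, "erp": erp_invoices})
```

The warehouse keeps the monthly totals rather than every individual transaction, which is what "summarized historical data" amounts to in this toy setting.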

Broadly speaking, data warehouses come in two flavors: on-premises and cloud. The traditional approach has been on-premises, but in the last five years, data warehousing has shifted to the cloud for both technical and business reasons.

This guide provides a closer look at both data warehouse concepts and data warehouse architecture, as well as the differences between a data warehouse and other data structures, such as a data mart or a database.

Amazon Redshift

Amazon Redshift is a fast, fully managed data warehouse for simple and cost-effective data analysis using standard SQL and business intelligence (BI) tools. It allows data engineers and data analysts to run complex analytic queries against petabytes of structured data, using sophisticated query optimization, columnar storage on high-performance local disks, and massively parallel query execution.
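To give a feel for why columnar storage helps analytic queries, here is a minimal sketch that compares a row-oriented layout with a column-oriented one. It is a toy in-memory illustration with made-up table and column names, not a description of how Redshift actually lays out data on disk.

```python
# Hypothetical order table in a row-oriented layout: one dict per row.
rows = [
    {"order_id": 1, "region": "EU", "amount": 10.0},
    {"order_id": 2, "region": "US", "amount": 25.0},
    {"order_id": 3, "region": "EU", "amount": 5.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every full row is visited even though only one field is needed.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: only the "amount" column is read; other columns are untouched.
total_from_columns = sum(columns["amount"])
```

Both totals are the same, but the columnar version touches only the values the query needs. For an aggregate over one column of a wide table, that means far less data to read.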

Explore the pages below to understand (1) how Amazon Redshift works and why it is a popular cloud data warehouse; (2) what columnar storage means for Redshift’s performance and the challenges that come with a columnar structure; and (3) how dynamic clustering enables Redshift to handle petabyte-scale data quickly.

This guide also offers pragmatic advice on cluster setup, management, operations and common challenges.