4 min read
On this page

Analytical Databases

Overview

Analytical databases are optimized for complex queries over large datasets, prioritizing read throughput and aggregation performance over single-row transactional latency. The ecosystem spans traditional data warehouses, modern columnar engines, MPP cloud services, and the emerging lakehouse architecture.


OLAP vs OLTP

OLAP vs OLTP Architecture Comparison

Characteristic OLTP OLAP
Workload Short transactions, CRUD Complex aggregations, scans
Query pattern Point lookups, small range Full table scans, joins
Data volume per query Rows to thousands Millions to billions
Users Application servers Analysts, dashboards
Freshness Real-time Minutes to hours (or real-time)
Schema Normalized (3NF) Denormalized (star/snowflake)
Optimization Index lookup latency Scan throughput, compression

Dimensional Modeling

Star Schema

A central fact table references dimension tables via foreign keys. Simple, fast for queries with few joins.

-- Fact table: one row per transaction event
CREATE TABLE fact_sales (
    sale_id       BIGINT,
    date_key      INT REFERENCES dim_date(date_key),
    product_key   INT REFERENCES dim_product(product_key),
    store_key     INT REFERENCES dim_store(store_key),
    customer_key  INT REFERENCES dim_customer(customer_key),
    quantity      INT,
    unit_price    DECIMAL(10,2),
    total_amount  DECIMAL(12,2)
);

-- Dimension table
CREATE TABLE dim_product (
    product_key   INT PRIMARY KEY,
    product_name  VARCHAR(200),
    category      VARCHAR(100),
    subcategory   VARCHAR(100),
    brand         VARCHAR(100)
);

Snowflake Schema

Dimensions are further normalized into sub-dimensions (e.g., dim_product -> dim_category -> dim_department). Saves storage but adds joins.

Star:       fact_sales -> dim_product (category column inside)
Snowflake:  fact_sales -> dim_product -> dim_category -> dim_department

ETL / ELT

ETL (Extract, Transform, Load)

Data is extracted from sources, transformed in a staging area (cleaning, joining, aggregating), and loaded into the warehouse.

Sources (OLTP DBs, APIs, files)
    |  Extract
    v
Staging Area (transform: clean, deduplicate, conform)
    |  Load
    v
Data Warehouse (star schema)

ELT (Extract, Load, Transform)

Raw data is loaded into the warehouse (or data lake) first, then transformed using the warehouse's compute engine. Preferred in cloud-native architectures.

Sources -> Load (raw) -> Data Lake/Warehouse -> Transform (dbt, SQL)
                                                    |
                                                    v
                                              Curated Tables

dbt (data build tool) has become the standard for ELT transformations, managing SQL-based transformations as version-controlled code.


Materialized Views

Pre-computed query results that accelerate repeated analytical queries at the cost of storage and maintenance.

-- PostgreSQL materialized view
CREATE MATERIALIZED VIEW monthly_revenue AS
SELECT
    d.year, d.month, p.category,
    SUM(f.total_amount) AS revenue,
    COUNT(*) AS num_sales
FROM fact_sales f
JOIN dim_date d ON f.date_key = d.date_key
JOIN dim_product p ON f.product_key = p.product_key
GROUP BY d.year, d.month, p.category;

-- Refresh strategies
REFRESH MATERIALIZED VIEW monthly_revenue;                -- full refresh
REFRESH MATERIALIZED VIEW CONCURRENTLY monthly_revenue;   -- non-blocking

-- ClickHouse: incrementally maintained with aggregating merge tree
CREATE MATERIALIZED VIEW mv_revenue
ENGINE = AggregatingMergeTree()
ORDER BY (year, month, category)
AS SELECT
    toYear(event_date) AS year,
    toMonth(event_date) AS month,
    category,
    sumState(amount) AS revenue  -- partial aggregate state
FROM sales
GROUP BY year, month, category;

Columnar Analytical Engines

ClickHouse

Open-source columnar OLAP database with exceptional single-node scan performance.

-- MergeTree: primary engine family
CREATE TABLE events (
    event_date  Date,
    user_id     UInt64,
    event_type  LowCardinality(String),  -- dictionary encoded
    duration_ms UInt32
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, user_id, event_date)
SETTINGS index_granularity = 8192;

-- Sparse primary index: one entry per 8192 rows
-- Enables skipping entire granules during scans
-- Compression: LZ4 by default, ZSTD for cold data

DuckDB

In-process analytical database (the "SQLite for analytics"). Vectorized execution, columnar storage, zero-copy integration with Pandas/Arrow.

import duckdb

# Query Parquet files directly without loading
result = duckdb.sql("""
    SELECT category, SUM(amount) AS total
    FROM 'sales/*.parquet'
    WHERE sale_date >= '2025-01-01'
    GROUP BY category
    ORDER BY total DESC
""").fetchdf()  # returns Pandas DataFrame

# Zero-copy from Pandas
import pandas as pd
df = pd.read_csv("large_file.csv")
duckdb.sql("SELECT AVG(price) FROM df WHERE qty > 10")

Apache Druid

Real-time analytical database designed for sub-second OLAP queries on event data. Uses a combination of columnar storage, inverted indexes, and pre-aggregation.


MPP (Massively Parallel Processing) Systems

Amazon Redshift

  • Columnar storage on shared-nothing cluster of compute nodes
  • Distribution styles: KEY, EVEN, ALL (broadcast small tables)
  • Zone maps: min/max metadata per block for scan skipping
  • Redshift Spectrum: query S3 data without loading
  • RA3 nodes: managed storage decoupled from compute

Google BigQuery

  • Serverless: no cluster management, auto-scaling
  • Dremel execution engine: tree-structured dispatch
  • Capacitor columnar format on Colossus distributed filesystem
  • Slot-based pricing: query cost proportional to data scanned
  • BI Engine: in-memory acceleration for dashboards
-- BigQuery: query petabytes with standard SQL
SELECT
    EXTRACT(MONTH FROM trip_start) AS month,
    AVG(trip_distance) AS avg_distance,
    COUNT(*) AS num_trips
FROM `bigquery-public-data.chicago_taxi_trips.taxi_trips`
WHERE EXTRACT(YEAR FROM trip_start) = 2024
GROUP BY month
ORDER BY month;
-- Scans only relevant columns and partitions

Snowflake

  • Separation of storage (S3/GCS/Azure Blob), compute (virtual warehouses), and cloud services (metadata, optimization)
  • Virtual warehouses: independent compute clusters that scale independently
  • Automatic clustering: maintains sort order for efficient pruning
  • Time travel: access historical data up to 90 days
  • Zero-copy cloning: instant database/table copies via metadata
Architecture:
  Cloud Services Layer: query parsing, optimization, metadata
       |
  Compute Layer: Virtual Warehouses (T-shirt sizing: XS to 6XL)
       |
  Storage Layer: Columnar micro-partitions on object storage
       (immutable, 50-500MB compressed, automatically clustered)

Lakehouse Architecture

The lakehouse combines the raw storage economics of data lakes with the management features (ACID, schema enforcement, indexing) of data warehouses.

Delta Lake

Open-source storage layer on top of Parquet files, adding ACID transactions via a transaction log.

Delta Table Structure:
  /delta_table/
    /_delta_log/
      00000000000000000000.json  # initial commit
      00000000000000000001.json  # add/remove file actions
      00000000000000000010.checkpoint.parquet  # periodic checkpoint
    /part-00000.parquet
    /part-00001.parquet

Transaction Log Entry:
  {"add": {"path": "part-00002.parquet", "size": 1048576,
           "partitionValues": {"date": "2025-11-15"},
           "stats": {"numRecords": 50000, "minValues": {...}}}}

Key features: ACID transactions, schema evolution, time travel, Z-ordering for multi-dimensional clustering, MERGE (upsert) support.

Apache Iceberg

Table format with snapshot isolation, hidden partitioning, and schema evolution. Avoids file listing overhead via manifest files.

Iceberg Metadata Hierarchy:
  Metadata File (current snapshot pointer)
    -> Snapshot (list of manifest lists)
      -> Manifest List (list of manifests)
        -> Manifest File (list of data files + column stats)
          -> Data Files (Parquet/ORC/Avro)

Hidden Partitioning:
  -- No need for partition columns in queries
  ALTER TABLE events ADD PARTITION FIELD hours(event_time);
  -- Iceberg automatically maps event_time to partition
  -- Query: WHERE event_time > X  (Iceberg prunes partitions)

Apache Hudi

Optimized for incremental processing with copy-on-write and merge-on-read table types. Supports record-level upserts and deletes on data lakes.


Comparison of Lakehouse Formats

Feature Delta Lake Iceberg Hudi
ACID transactions Yes Yes Yes
Schema evolution Yes Yes (full) Yes
Time travel Yes Yes Yes
Hidden partitioning No Yes No
Record-level updates MERGE MERGE Native upsert
Partition evolution Limited Full Limited
Primary ecosystem Databricks Multi-engine AWS/Uber

Choosing an Analytical System

Scenario Recommended
Embedded analytics, local files DuckDB
Real-time event analytics ClickHouse, Druid
Cloud data warehouse (managed) Snowflake, BigQuery, Redshift
Data lake with warehouse features Delta Lake + Spark, Iceberg + Trino
Sub-second dashboards Druid, ClickHouse, BigQuery BI Engine