Analytical Databases
Overview
Analytical databases are optimized for complex queries over large datasets, prioritizing read throughput and aggregation performance over single-row transactional latency. The ecosystem spans traditional data warehouses, modern columnar engines, MPP cloud services, and the emerging lakehouse architecture.
OLAP vs OLTP

| Characteristic | OLTP | OLAP |
|---------------|------|------|
| Workload | Short transactions, CRUD | Complex aggregations, scans |
| Query pattern | Point lookups, small ranges | Full table scans, joins |
| Data volume per query | Rows to thousands | Millions to billions |
| Users | Application servers | Analysts, dashboards |
| Freshness | Real-time | Minutes to hours (or real-time) |
| Schema | Normalized (3NF) | Denormalized (star/snowflake) |
| Optimization | Index lookup latency | Scan throughput, compression |
Dimensional Modeling
Star Schema
A central fact table references dimension tables via foreign keys. Queries stay simple and fast because every dimension is a single join away from the fact table.
```sql
-- Fact table: one row per transaction event
CREATE TABLE fact_sales (
    sale_id      BIGINT,
    date_key     INT REFERENCES dim_date(date_key),
    product_key  INT REFERENCES dim_product(product_key),
    store_key    INT REFERENCES dim_store(store_key),
    customer_key INT REFERENCES dim_customer(customer_key),
    quantity     INT,
    unit_price   DECIMAL(10,2),
    total_amount DECIMAL(12,2)
);

-- Dimension table
CREATE TABLE dim_product (
    product_key  INT PRIMARY KEY,
    product_name VARCHAR(200),
    category     VARCHAR(100),
    subcategory  VARCHAR(100),
    brand        VARCHAR(100)
);
```
Snowflake Schema
Dimensions are further normalized into sub-dimensions (e.g., dim_product -> dim_category -> dim_department). Saves storage but adds joins.
```text
Star:      fact_sales -> dim_product (category column inside)
Snowflake: fact_sales -> dim_product -> dim_category -> dim_department
```
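The query shape that a star schema buys can be seen in a minimal sketch using an in-memory SQLite database. The tables follow the `fact_sales` / `dim_product` example above, trimmed to a few columns; the data values are made up for illustration.

```python
# Star-schema query sketch: one join from the fact table to a dimension,
# then aggregate. Uses stdlib sqlite3 with an in-memory database.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_product (product_key INT PRIMARY KEY, product_name TEXT, category TEXT);
    CREATE TABLE fact_sales (sale_id INT, product_key INT, quantity INT, total_amount INT);
    INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gizmo', 'Electronics');
    INSERT INTO fact_sales VALUES (10, 1, 2, 20), (11, 2, 1, 50), (12, 1, 5, 50);
""")

# The defining star-schema pattern: fact -> dimension is a single join.
rows = con.execute("""
    SELECT p.category, SUM(f.total_amount) AS revenue
    FROM fact_sales f
    JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY p.category
    ORDER BY revenue DESC
""").fetchall()
print(rows)  # [('Hardware', 70), ('Electronics', 50)]
```

In a snowflake schema the same query would need an extra hop (`dim_product -> dim_category`) to reach the `category` column.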
ETL / ELT
ETL (Extract, Transform, Load)
Data is extracted from sources, transformed in a staging area (cleaning, joining, aggregating), and loaded into the warehouse.
```text
Sources (OLTP DBs, APIs, files)
        | Extract
        v
Staging Area (transform: clean, deduplicate, conform)
        | Load
        v
Data Warehouse (star schema)
```
ELT (Extract, Load, Transform)
Raw data is loaded into the warehouse (or data lake) first, then transformed using the warehouse's compute engine. Preferred in cloud-native architectures.
```text
Sources -> Load (raw) -> Data Lake/Warehouse -> Transform (dbt, SQL)
                                                      |
                                                      v
                                               Curated Tables
```
dbt (data build tool) has become the de facto standard for the transform step in ELT, managing SQL models as version-controlled, testable code.
Materialized Views
Pre-computed query results that accelerate repeated analytical queries at the cost of storage and maintenance.
```sql
-- PostgreSQL materialized view
CREATE MATERIALIZED VIEW monthly_revenue AS
SELECT
    d.year, d.month, p.category,
    SUM(f.total_amount) AS revenue,
    COUNT(*) AS num_sales
FROM fact_sales f
JOIN dim_date d ON f.date_key = d.date_key
JOIN dim_product p ON f.product_key = p.product_key
GROUP BY d.year, d.month, p.category;

-- Refresh strategies
REFRESH MATERIALIZED VIEW monthly_revenue;              -- full refresh
REFRESH MATERIALIZED VIEW CONCURRENTLY monthly_revenue; -- non-blocking (requires a unique index on the view)
```

```sql
-- ClickHouse: incrementally maintained with AggregatingMergeTree
CREATE MATERIALIZED VIEW mv_revenue
ENGINE = AggregatingMergeTree()
ORDER BY (year, month, category)
AS SELECT
    toYear(event_date) AS year,
    toMonth(event_date) AS month,
    category,
    sumState(amount) AS revenue  -- partial aggregate state; read back with sumMerge()
FROM sales
GROUP BY year, month, category;
```
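The idea behind `sumState`/`sumMerge` is that each batch of inserted rows collapses to a per-key partial aggregate, and partial aggregates from different batches merge associatively at read time. A minimal Python sketch of that mechanism (the key tuples and amounts are invented for illustration):

```python
# Partial aggregate states: sum_state() plays the role of sumState (collapse
# a batch to per-key partial sums), sum_merge() the role of sumMerge (combine
# partial states at read time).
from collections import defaultdict

def sum_state(batch):
    """Collapse a batch of (key, amount) rows into per-key partial sums."""
    state = defaultdict(int)
    for key, amount in batch:
        state[key] += amount
    return dict(state)

def sum_merge(states):
    """Merge partial states from many batches into the final aggregate."""
    merged = defaultdict(int)
    for state in states:
        for key, partial in state.items():
            merged[key] += partial
    return dict(merged)

# Two inserts into the base table produce two independent partial states...
batch1 = [(("2025", "01", "Hardware"), 20), (("2025", "01", "Hardware"), 50)]
batch2 = [(("2025", "01", "Hardware"), 30), (("2025", "01", "Software"), 10)]
states = [sum_state(batch1), sum_state(batch2)]

# ...which merge to the same result as aggregating all rows at once.
totals = sum_merge(states)
print(totals[("2025", "01", "Hardware")])  # 100
```

Because the merge is associative, background merges of data parts can combine states in any order without recomputing from raw rows.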
Columnar Analytical Engines
ClickHouse
Open-source columnar OLAP database with exceptional single-node scan performance.
```sql
-- MergeTree: primary engine family
CREATE TABLE events (
    event_date  Date,
    user_id     UInt64,
    event_type  LowCardinality(String),  -- dictionary encoded
    duration_ms UInt32
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, user_id, event_date)
SETTINGS index_granularity = 8192;

-- Sparse primary index: one entry per 8192 rows,
-- enabling entire granules to be skipped during scans.
-- Compression: LZ4 by default, ZSTD for cold data.
```
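The sparse-index mechanism can be sketched in a few lines of Python: the table is kept sorted by the primary key, the index stores only the first key of each granule (here 4 rows instead of 8192), and a lookup scans only granules whose key range can contain the target. The toy data and granule size are invented for illustration.

```python
# ClickHouse-style sparse primary index: one index entry per granule,
# granule skipping via the key range [first_key, last_key] of each granule.
GRANULE = 4

# Rows sorted by primary key (event_type), as MergeTree stores them.
rows = sorted([("click", i) for i in range(6)] + [("view", i) for i in range(6)])

# Sparse index: the first primary-key value of every granule.
index = [rows[i][0] for i in range(0, len(rows), GRANULE)]

def granules_to_scan(target):
    """Return granule numbers whose key range may contain `target`."""
    hits = []
    for g, first_key in enumerate(index):
        last_key = rows[min((g + 1) * GRANULE, len(rows)) - 1][0]
        if first_key <= target <= last_key:
            hits.append(g)
    return hits

# 12 rows = 3 granules; only the granules that can hold "view" are read.
print(granules_to_scan("view"))  # [1, 2]
```

The index stays tiny (rows / 8192 entries) and fits in memory even for billions of rows, which is why point filters on the `ORDER BY` prefix are cheap despite the lack of per-row indexes.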
DuckDB
In-process analytical database (the "SQLite for analytics"). Vectorized execution, columnar storage, zero-copy integration with Pandas/Arrow.
```python
import duckdb

# Query Parquet files directly without loading them first
result = duckdb.sql("""
    SELECT category, SUM(amount) AS total
    FROM 'sales/*.parquet'
    WHERE sale_date >= '2025-01-01'
    GROUP BY category
    ORDER BY total DESC
""").fetchdf()  # returns a Pandas DataFrame

# Zero-copy from Pandas: DuckDB scans the DataFrame in place by name
import pandas as pd
df = pd.read_csv("large_file.csv")
duckdb.sql("SELECT AVG(price) FROM df WHERE qty > 10").fetchall()
```
Apache Druid
Real-time analytical database designed for sub-second OLAP queries on event data. Uses a combination of columnar storage, inverted indexes, and pre-aggregation.
MPP (Massively Parallel Processing) Systems
Amazon Redshift
- Columnar storage on shared-nothing cluster of compute nodes
- Distribution styles: KEY, EVEN, ALL (broadcast small tables)
- Zone maps: min/max metadata per block for scan skipping
- Redshift Spectrum: query S3 data without loading
- RA3 nodes: managed storage decoupled from compute
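The distribution styles above can be sketched in Python: KEY hashes rows to a slice by one column (co-locating matching join keys), EVEN round-robins rows, and ALL replicates the whole table to every slice. The slice count, hash function, and sample rows are illustrative, not Redshift's actual internals.

```python
# Sketch of MPP distribution styles over NUM_SLICES compute slices.
import hashlib

NUM_SLICES = 4

def slice_for(value):
    """Deterministically hash a distribution-key value onto a slice."""
    digest = hashlib.md5(str(value).encode()).hexdigest()
    return int(digest, 16) % NUM_SLICES

def distribute_key(rows, key_index):
    """KEY: rows with the same key land on the same slice (join co-location)."""
    slices = [[] for _ in range(NUM_SLICES)]
    for row in rows:
        slices[slice_for(row[key_index])].append(row)
    return slices

def distribute_even(rows):
    """EVEN: round-robin, balanced but no co-location guarantees."""
    slices = [[] for _ in range(NUM_SLICES)]
    for i, row in enumerate(rows):
        slices[i % NUM_SLICES].append(row)
    return slices

def distribute_all(rows):
    """ALL: replicate small dimension tables so joins never cross the network."""
    return [list(rows) for _ in range(NUM_SLICES)]

sales = [(1, "cust_a", 20), (2, "cust_b", 50), (3, "cust_a", 30)]
by_key = distribute_key(sales, 1)  # both cust_a rows share one slice
```

Distributing a fact table and a large dimension on the same join key lets each slice join locally; mismatched keys force a network redistribution at query time.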
Google BigQuery
- Serverless: no cluster management, auto-scaling
- Dremel execution engine: tree-structured dispatch
- Capacitor columnar format on Colossus distributed filesystem
- Pricing: on-demand cost proportional to bytes scanned, or capacity-based slot reservations
- BI Engine: in-memory acceleration for dashboards
```sql
-- BigQuery: query petabytes with standard SQL
SELECT
    EXTRACT(MONTH FROM trip_start_timestamp) AS month,
    AVG(trip_miles) AS avg_distance,
    COUNT(*) AS num_trips
FROM `bigquery-public-data.chicago_taxi_trips.taxi_trips`
WHERE EXTRACT(YEAR FROM trip_start_timestamp) = 2024
GROUP BY month
ORDER BY month;
-- Scans only the referenced columns (and, for partitioned tables, only matching partitions)
```
Snowflake
- Separation of storage (S3/GCS/Azure Blob), compute (virtual warehouses), and cloud services (metadata, optimization)
- Virtual warehouses: isolated compute clusters that resize (scale up) and multi-cluster (scale out) independently
- Automatic clustering: maintains sort order for efficient pruning
- Time travel: query historical data for up to 90 days (default retention is 1 day; 90 days requires Enterprise edition)
- Zero-copy cloning: instant database/table copies via metadata
Architecture:
```text
Cloud Services Layer: query parsing, optimization, metadata
        |
Compute Layer: Virtual Warehouses (T-shirt sizing: XS to 6XL)
        |
Storage Layer: Columnar micro-partitions on object storage
               (immutable, 50-500 MB of uncompressed data, automatically clustered)
```
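Zero-copy cloning falls out of this design: because micro-partitions are immutable, a table is just metadata (a list of files), and a clone copies only that list. A hypothetical sketch of the mechanism (the `Table` class and file names are invented for illustration):

```python
# Zero-copy cloning over immutable storage: cloning duplicates metadata only,
# and later writes to either table add new files without touching shared ones.
class Table:
    def __init__(self, name, files=None):
        self.name = name
        self.files = list(files or [])  # metadata: which micro-partitions belong to us

    def clone(self, new_name):
        """O(metadata) clone -- no data files are copied."""
        return Table(new_name, self.files)

    def insert(self, new_file):
        """Micro-partitions are immutable, so writes append new files."""
        self.files.append(new_file)

prod = Table("sales", ["mp_001.parquet", "mp_002.parquet"])
dev = prod.clone("sales_dev")   # instant, regardless of table size
dev.insert("mp_003.parquet")    # diverges without affecting prod

print(prod.files)  # ['mp_001.parquet', 'mp_002.parquet']
```

The same immutability underlies time travel: old micro-partitions are retained, and a historical query simply reads an earlier metadata version.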
Lakehouse Architecture
The lakehouse combines the raw storage economics of data lakes with the management features (ACID, schema enforcement, indexing) of data warehouses.
Delta Lake
Open-source storage layer on top of Parquet files, adding ACID transactions via a transaction log.
Delta Table Structure:
```text
/delta_table/
    /_delta_log/
        00000000000000000000.json               # initial commit
        00000000000000000001.json               # add/remove file actions
        00000000000000000010.checkpoint.parquet # periodic checkpoint (every 10 commits by default)
    part-00000.parquet
    part-00001.parquet
```
Transaction Log Entry:
```json
{"add": {"path": "part-00002.parquet", "size": 1048576,
         "partitionValues": {"date": "2025-11-15"},
         "stats": {"numRecords": 50000, "minValues": {...}}}}
```
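The table's current state is defined by replaying the log: the live file set is whatever the ordered add/remove actions leave behind. A minimal sketch of that replay, using simplified single-action commits (real Delta commits carry more fields and multiple actions):

```python
# Replaying a Delta-style transaction log to compute the live file set.
import json

commits = [
    '{"add": {"path": "part-00000.parquet"}}',
    '{"add": {"path": "part-00001.parquet"}}',
    # A compaction commit removes a small file and adds its replacement.
    '{"remove": {"path": "part-00000.parquet"}}',
    '{"add": {"path": "part-00002.parquet"}}',
]

def live_files(log):
    """Apply add/remove actions in commit order."""
    files = set()
    for line in log:
        action = json.loads(line)
        if "add" in action:
            files.add(action["add"]["path"])
        elif "remove" in action:
            files.discard(action["remove"]["path"])
    return sorted(files)

print(live_files(commits))      # ['part-00001.parquet', 'part-00002.parquet']
# Time travel: replay only the commits up to an earlier version.
print(live_files(commits[:2]))  # ['part-00000.parquet', 'part-00001.parquet']
```

ACID comes from the commit protocol: writers race to create the next numbered JSON file, and atomic rename/put-if-absent on the log file decides the winner.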
Key features: ACID transactions, schema evolution, time travel, Z-ordering for multi-dimensional clustering, MERGE (upsert) support.
Apache Iceberg
Table format with snapshot isolation, hidden partitioning, and schema evolution. Avoids file listing overhead via manifest files.
Iceberg Metadata Hierarchy:
```text
Metadata File (schema, partition specs, snapshot log, current snapshot pointer)
    -> Snapshot (points to one manifest list)
        -> Manifest List (list of manifest files + partition-range stats)
            -> Manifest File (list of data files + column stats)
                -> Data Files (Parquet/ORC/Avro)
```
Hidden Partitioning:
```sql
-- No need for partition columns in queries
ALTER TABLE events ADD PARTITION FIELD hours(event_time);
-- Iceberg derives the partition value from event_time at write time
-- Query: WHERE event_time > X (Iceberg prunes partitions automatically)
```
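Hidden partitioning works because the partition value is a declared transform of a data column, so the planner can translate a predicate on `event_time` into partition pruning without the user referencing a partition column. A simplified sketch of an `hours(...)` transform and the resulting pruning (file names and timestamps are invented; real Iceberg planning also uses per-file min/max stats):

```python
# Sketch of hidden partitioning: hours_transform() maps event_time to its
# partition value, and scan planning prunes files by partition alone.
from datetime import datetime, timezone

def hours_transform(ts):
    """Map a timestamp to its hour-granularity partition value."""
    return ts.replace(minute=0, second=0, microsecond=0)

# Data files, each tagged with the partition value recorded at write time.
files = [
    {"path": "a.parquet", "partition": hours_transform(datetime(2025, 1, 1, 9, tzinfo=timezone.utc))},
    {"path": "b.parquet", "partition": hours_transform(datetime(2025, 1, 1, 10, tzinfo=timezone.utc))},
    {"path": "c.parquet", "partition": hours_transform(datetime(2025, 1, 1, 11, tzinfo=timezone.utc))},
]

def plan_scan(predicate_after):
    """Plan `WHERE event_time > X`: keep files whose partition hour can match."""
    cutoff = hours_transform(predicate_after)
    return [f["path"] for f in files if f["partition"] >= cutoff]

# WHERE event_time > 2025-01-01 10:30 -- only the last two files are read.
print(plan_scan(datetime(2025, 1, 1, 10, 30, tzinfo=timezone.utc)))  # ['b.parquet', 'c.parquet']
```

Because the transform, not a materialized column, defines the partitioning, the table can later switch to `days(event_time)` (partition evolution) without rewriting old data.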
Apache Hudi
Optimized for incremental processing with copy-on-write and merge-on-read table types. Supports record-level upserts and deletes on data lakes.
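The trade-off between the two table types can be sketched for a single upsert: copy-on-write pays at write time by rewriting the affected base file, while merge-on-read appends the change to a delta log and pays at read time by merging log and base. This is a conceptual sketch with dicts standing in for base files and log blocks, not Hudi's on-disk format.

```python
# Copy-on-write vs merge-on-read for an upsert of record key "k2".
base_file = {"k1": {"qty": 1}, "k2": {"qty": 2}}
upsert = {"k2": {"qty": 99}}

# Copy-on-write: the write rewrites the base file; reads stay trivial.
cow_file = {**base_file, **upsert}

# Merge-on-read: the write appends to a delta log; reads merge base + log.
mor_log = [upsert]

def mor_read(base, log):
    """Merge the base file with row-level changes from the delta log."""
    merged = dict(base)
    for batch in log:
        merged.update(batch)
    return merged

# Both table types expose the same logical state, at different costs.
assert cow_file == mor_read(base_file, mor_log)
print(cow_file["k2"])  # {'qty': 99}
```

Copy-on-write suits read-heavy tables with infrequent updates; merge-on-read suits write-heavy ingestion, with periodic compaction folding the log back into base files.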
Comparison of Lakehouse Formats
| Feature | Delta Lake | Iceberg | Hudi |
|---------|-----------|---------|------|
| ACID transactions | Yes | Yes | Yes |
| Schema evolution | Yes | Yes (full) | Yes |
| Time travel | Yes | Yes | Yes |
| Hidden partitioning | No | Yes | No |
| Record-level updates | MERGE | MERGE | Native upsert |
| Partition evolution | Limited | Full | Limited |
| Primary ecosystem | Databricks | Multi-engine | AWS/Uber |
Choosing an Analytical System
| Scenario | Recommended |
|----------|------------|
| Embedded analytics, local files | DuckDB |
| Real-time event analytics | ClickHouse, Druid |
| Cloud data warehouse (managed) | Snowflake, BigQuery, Redshift |
| Data lake with warehouse features | Delta Lake + Spark, Iceberg + Trino |
| Sub-second dashboards | Druid, ClickHouse, BigQuery BI Engine |