A common table expression (CTE) is a named, query-scoped subquery you declare with WITH. CTEs are the analyst's most important readability tool — they let you decompose a 100-line query into named steps that read like prose.
From nested subquery to readable pipeline
The same query, two ways. First as a nested subquery — works, but you have to read it inside-out:
SELECT country, AVG(monthly_spend) AS avg_spendFROM (SELECT customer_id, country,SUM(amount) AS monthly_spendFROM (SELECT customer_id, country, DATE_TRUNC('month', order_date) AS m, amountFROM orders JOIN customers USING (customer_id)) rawGROUP BY customer_id, country) per_custGROUP BY country;
Now the same logic as a CTE pipeline — reads top-to-bottom like steps in a recipe:
WITH enriched_orders AS (SELECT customer_id, country, DATE_TRUNC('month', order_date) AS m, amountFROM ordersJOIN customers USING (customer_id)),customer_monthly AS (SELECT customer_id, country, SUM(amount) AS monthly_spendFROM enriched_ordersGROUP BY customer_id, country)SELECT country, AVG(monthly_spend) AS avg_spendFROM customer_monthlyGROUP BY country;
Naming is everything
A well-named CTE is documentation. 'enriched_orders' tells the reader (and reviewer, and future-you) what this step does. Anonymous subqueries don't get to do that.
Materialisation: CTE vs subquery, perf considerations
Historically (and still in older Postgres), CTEs were 'optimisation fences' — the engine materialised the CTE result before the outer query, even when inlining would have been faster. Postgres 12+, MySQL 8+, and most modern engines inline CTEs by default when there are no side effects.
- Postgres 12+ — CTEs are inlined unless WITH … AS MATERIALIZED is specified, or the CTE is referenced more than once, or there's a recursive or non-deterministic component
- MySQL 8+ — inlines aggressively
- Snowflake / BigQuery / DuckDB — CTEs are pure sugar, inlined transparently
- Older Postgres (≤11) — CTEs always materialised; that meant 'WITH x AS (slow_query) SELECT * FROM x' could be 10× slower than the equivalent subquery
When materialisation helps
If a CTE is referenced multiple times in the outer query and is expensive to compute, an explicit WITH … AS MATERIALIZED can be a perf win — the engine evaluates once and reuses. The default inlined behaviour would compute it twice.
Multi-CTE analytical pipelines
The dbt-style 'one CTE per logical step' pattern is the working analyst's house style. Each CTE has one job; the final SELECT is usually a one-liner that pulls from the last CTE.
WITH active_users AS (SELECT user_idFROM usersWHERE deleted_at IS NULL AND last_seen_at > NOW() - INTERVAL '30 days'),revenue_30d AS (SELECT user_id, SUM(amount) AS spend_30dFROM ordersWHERE order_date > NOW() - INTERVAL '30 days'GROUP BY user_id),feature_use AS (SELECT user_id, COUNT(DISTINCT feature) AS features_usedFROM eventsWHERE event_type = 'feature_used' AND event_date > NOW() - INTERVAL '30 days'GROUP BY user_id)SELECT a.user_id,COALESCE(r.spend_30d, 0) AS spend_30d,COALESCE(f.features_used, 0) AS features_usedFROM active_users aLEFT JOIN revenue_30d r USING (user_id)LEFT JOIN feature_use f USING (user_id);
Exercise
You have three tables: customers(id, signup_date, country), orders(customer_id, order_date, amount), churn_events(customer_id, churned_at). Write a multi-CTE query that returns: country, total customers, lifetime spend per customer (avg), and 90-day churn rate (% of customers signed up >90 days ago who have a churn_event). Use one CTE per logical step.