EXPLAIN tells you what the planner intends to do. EXPLAIN ANALYZE actually runs the query and tells you what it did, with real timings and row counts. The actual is what you read; the estimate is what you compare against.
Reading a plan top-to-bottom
Plans are trees — children execute first and feed rows upward to the parent. The top operator is the last thing that runs. The convention in most engines is that indentation shows the tree.
EXPLAIN (ANALYZE, BUFFERS)SELECT c.country, COUNT(*)FROM orders o JOIN customers c ON o.customer_id = c.idWHERE o.order_date > '2025-01-01'GROUP BY c.country;
HashAggregate (cost=1854.32..1854.45 rows=12 width=20) (actual time=42.1..42.2 rows=12 loops=1)Group Key: c.country-> Hash Join (cost=83.40..1684.21 rows=34022 width=12) (actual time=1.2..38.4 rows=34000 loops=1)Hash Cond: (o.customer_id = c.id)-> Index Scan using orders_date_idx on orders o (cost=0.42..950.31 rows=34022 width=4) (actual time=0.1..15.1 rows=34000 loops=1)Index Cond: (order_date > '2025-01-01'::date)-> Hash (cost=42.00..42.00 rows=3300 width=12) (actual time=1.0..1.0 rows=3300 loops=1)-> Seq Scan on customers c (cost=0.00..42.00 rows=3300 width=12)Planning Time: 0.21 msExecution Time: 42.5 ms
- Read the bottommost (most indented) lines first — that's what runs first
- Each node shows: estimated cost range (startup..total), estimated rows, actual time (startup..total), actual rows, loops
- Compare rows estimate vs actual. If they differ by >10×, statistics are wrong or skewed
- Time is cumulative — the parent's time includes its children
Scan types
- Seq Scan — read every row of the table. Right when the filter is unselective (>5% of rows) or no index exists
- Index Scan — walk an index to find matching rows, then fetch each row from the heap. Right for selective filters
- Index Only Scan — walk an index and never touch the heap because the index has all needed columns ('covering index'). Fastest read pattern
- Bitmap Heap Scan — build a bitmap of matching tuple IDs from one or more indexes, then read the heap in physical order. Right for medium-selectivity filters or multi-index AND/OR
Join algorithms
- Nested Loop — for each row in the outer, scan/probe the inner. Right when the outer is tiny or the inner has a perfect index
- Hash Join — build a hash table on the inner, probe with the outer. Right for two large tables with no useful indexes on the join keys. Spills to disk if hash > work_mem
- Merge Join — both sides pre-sorted on the join key; merge in linear time. Right when both inputs are already sorted (e.g., from an index)
The four canonical plan problems
- Seq Scan on a big table with a selective filter — missing index. Fix: CREATE INDEX on the filter column
- Nested Loop with millions of outer rows — wrong choice for big-on-big. Fix: ANALYZE; if still wrong, ensure both sides have stats and consider a join hint
- Sort spill to disk — operator exceeds work_mem. Fix: bump work_mem, or eliminate the sort (use an index that already provides order)
- Rows estimate 10× off — stale or insufficient statistics. Fix: ANALYZE, increase statistics target, or extended statistics for correlated columns
BUFFERS gives you the I/O picture
EXPLAIN (ANALYZE, BUFFERS) shows shared_hit (cached pages read) and shared_read (disk reads). A query with 1000 buffer reads is doing 8MB of I/O. If that number is large relative to the table size, you're not benefiting from cache or you're missing an index.
Exercise
Here's a plan: 'Seq Scan on transactions (cost=0..5.2M rows=50M actual time=8500..23000 rows=51234567 loops=1) Filter: (status='pending' AND created_at > '2025-04-01'). Rows Removed by Filter: 48,765,433.' The query has been running daily for months and just got slow this week. (1) Read this plan in plain English. (2) Diagnose the issue. (3) Propose three fixes ranked by effort.