Skip to content
Module 01 of 850 min readAdvanced

How a SQL engine executes a query

Parse → plan → execute. Cost-based optimisation. Why stale statistics break good queries.

13%

Listen along

Read “How a SQL engine executes a query” aloud

Plays in your browser using on-device text-to-speech — nothing leaves the page.

Learning objectives

By the end of this module, you should be able to:

  • 01Describe the path of a query from SQL text to result set: parse → plan → execute
  • 02Explain the role of the cost-based optimiser and table statistics
  • 03Recognise when statistics are stale and the planner is making bad choices

Every SQL engine is doing the same three things behind your SELECT: parsing the text into a tree, planning the cheapest way to execute it, and running the plan. Advanced SQL starts when you stop treating that as a black box.

The parse → plan → execute lifecycle

  1. Parse — the SQL text is tokenised and built into a syntax tree. Errors at this stage are syntax errors.
  2. Rewrite — view definitions are expanded, simple algebraic simplifications are applied.
  3. Plan — the optimiser enumerates candidate plans (different join orders, scan types, aggregation strategies) and assigns each a cost estimate based on table statistics. The cheapest plan wins.
  4. Execute — the chosen plan runs. The engine streams rows through the operators (scan, join, sort, aggregate) and produces the result.

Cost-based optimisation

Modern engines (Postgres, MySQL, Snowflake, BigQuery, Trino) are cost-based — they compare candidate plans using estimated cost. Cost is a unit-less number combining I/O cost, CPU cost, and memory cost. The plan with the lowest cost is chosen.

Cost estimates depend on table statistics: row counts, column histograms, NDV (number of distinct values), null fractions, correlation between physical row order and column values. The planner reads these from pg_statistic (Postgres), information_schema.tables (MySQL), or equivalent catalogue tables.

Stale statistics = bad plans

If statistics say a table has 1,000 rows but it actually has 10M, the planner will think a sequential scan is cheap and skip the index. Symptom: a query that 'suddenly got slow' after a big data load. Fix: ANALYZE the table (Postgres) or ANALYZE TABLE (MySQL). Auto-vacuum/auto-analyze handles this in most cases, but heavy bulk loads can outrun it.

What the optimiser actually decides

  • Scan type per table: sequential scan, index scan, index-only scan, bitmap heap scan
  • Join order: A JOIN B JOIN C vs A JOIN (B JOIN C) etc. Theoretical N! combinations; the planner uses heuristics + dynamic programming
  • Join algorithm per pair: nested loop, hash join, merge join
  • Aggregation strategy: hash aggregate vs sort-then-aggregate (GroupAggregate)
  • Whether to use parallel workers (most modern engines)
  • Whether intermediate results spill to disk (sort, hash, aggregate spills)

The 'plan you got vs the plan you wanted' problem

When a query is slow, the question is rarely 'should I add more memory?' — it's 'why did the planner make a bad choice?'. The five usual suspects:

  1. Stale statistics (most common) — ANALYZE
  2. Skewed data — one value covers 90% of rows; histogram-based planning gets it wrong. Fix: extended statistics, query hints in some engines, or rewrite
  3. Missing index — sequential scan when you wanted index lookup
  4. Suboptimal join order — usually surfaces with a 10+ join query; planner heuristics give up. Fix: rewrite to influence order, or set join_collapse_limit (Postgres)
  5. Parameter sniffing (SQL Server) / generic plans (Postgres) — a prepared statement gets a plan optimised for one parameter that's wrong for another. Fix: force a re-plan or use literal values

Exercise

A query that was fast yesterday is now timing out today. Nothing about the query changed. Walk through the diagnostic sequence — what do you check first, second, third? Why in that order?

Key takeaways

  • Three phases: parse → plan → execute. The plan phase is where wins are made and lost.
  • Cost-based optimisers depend on accurate table statistics — keep ANALYZE current
  • When a query 'suddenly got slow', check statistics freshness before anything else

Further reading

  1. 01

    PostgreSQL Internals

    Hironobu SUZUKIThe definitive free deep-dive into the Postgres query planner. Chapters 3–4 cover statistics and cost estimation.

Loading progress…
LeadAfrikPublic Economics Hub