Aggregation is how you turn rows into summaries: counts, sums, averages, min/max. The verb is GROUP BY, and once it clicks, half your daily reporting work writes itself.
Aggregate functions without GROUP BY
SELECT COUNT(*), SUM(total_amount), AVG(total_amount)FROM orders;
Without GROUP BY, an aggregate collapses the entire table to one row. Three numbers: total count of orders, total revenue, average order value.
GROUP BY
SELECT country, COUNT(*), SUM(total_amount)FROM orders oJOIN customers c ON c.id = o.customer_idGROUP BY countryORDER BY 3 DESC;
Now we get one row per country, with the count and revenue for that country. ORDER BY 3 means 'sort by the third column' — a shortcut. The result is a country-level rollup, sorted by revenue descending.
The GROUP BY rule
Every column in your SELECT must either be (a) inside an aggregate function or (b) listed in the GROUP BY. Otherwise the database doesn't know which value to show — there are many rows per group.
Common aggregates
- COUNT(*) — every row, including NULLs
- COUNT(column) — non-NULL values only
- COUNT(DISTINCT column) — unique values
- SUM(column) — total
- AVG(column) — mean (NULLs ignored)
- MIN(column), MAX(column) — extremes
HAVING — filtering after aggregation
SELECT country, COUNT(*) AS order_countFROM orders oJOIN customers c ON c.id = o.customer_idGROUP BY countryHAVING COUNT(*) > 100ORDER BY order_count DESC;
WHERE filters rows before grouping. HAVING filters groups after. If you try to use COUNT(*) in WHERE, the database rejects it — at the WHERE stage, the count doesn't exist yet.
Exercise
Find the top 5 customers by lifetime revenue, including their email and total spent.