Skip to content
Module 01 of 840 min readIntermediate

Reporting vs analysis — the four shapes

Recognise which of the four query shapes a question wants before you write the SELECT.

13%

Listen along

Read “Reporting vs analysis — the four shapes” aloud

Plays in your browser using on-device text-to-speech — nothing leaves the page.

Learning objectives

By the end of this module, you should be able to:

  • 01Separate 'reporting SQL' from 'analytical SQL' — different jobs, different shapes
  • 02Frame an analytical question before you write the first SELECT
  • 03Recognise the four query shapes that cover 80% of working-analyst output

Foundations SQL gets you to the point where you can answer a closed question — 'how many active customers do we have?', 'what was last month's revenue?'. Intermediate SQL is where you start answering open questions: 'why is retention falling?', 'which campaign actually drove conversion?', 'what does this number look like normalised against the same period last year?'.

The four shapes

Working analysts spend almost all their time in one of four query shapes. Recognise them and you can plan a query before you write it.

  1. Reporting query — flatten a fact table down to summary numbers (revenue by month, signups by region). Usually GROUP BY + aggregates.
  2. Cohort query — group entities (users, customers, loans) by an entry date, then track their behaviour through time. Always a JOIN to an events table + window or pivoted aggregates.
  3. Funnel query — for each entity, find whether/when it reached each step in a sequence. Usually CASE + MIN/MAX over events + a pivot to wide.
  4. Time-series query — sequence values along a timestamp, compute moving averages / rolling sums / period-over-period change. Window functions over ORDER BY date.

The unlock

Once you can recognise which of the four shapes a question wants, you can plan the joins, grouping, and window frames before typing. Beginners write queries forward (FROM → SELECT). Analysts write them backwards (what's the output shape → what produces it).

Frame the question before the SELECT

A useful pre-query ritual: write the desired output as a tiny example table on paper or in a comment block. Three columns and three rows. If you can't fake the output, you don't yet understand the question. This single habit eliminates half the rewrite cycles in your first year.

sql
-- Question: monthly active customers by country, last 12 months.
-- Output shape (faked):
-- month | country | mac
-- 2025-01-01 | Kenya | 1,240
-- 2025-01-01 | Uganda | 412
-- 2025-02-01 | Kenya | 1,310
-- ...
-- That tells me: GROUP BY month + country, COUNT DISTINCT user, WHERE event_date in last 12 months.

Exercise

Take this brief: 'Marketing wants to know if the new referral programme is actually growing the user base, or just cannibalising organic signups.' (1) What's the actual question hidden inside? (2) Fake the output table you'd need to answer it. (3) Which of the four shapes is this, and why?

Key takeaways

  • Most analyst queries fall into one of four shapes — reporting, cohort, funnel, time-series
  • Write the faked output table before the SELECT — if you can't sketch it, you don't yet understand the question
  • The shift from beginner to intermediate is planning the query backwards from the output, not forwards from the tables

Further reading

  1. 01

    SQL for Data Analysis

    Cathy Tanimura · O'Reilly · 2021The closest thing to a canonical intermediate textbook. Tanimura's framing of 'time-series, cohort, text, anomaly' patterns is the inspiration for this module.

Loading progress…
LeadAfrikPublic Economics Hub