Skip to content
Module 02 of 855 min readIntermediate

Window functions in depth

ROW_NUMBER vs RANK vs DENSE_RANK, LAG/LEAD, FIRST_VALUE, frame specifications.

25%

Listen along

Read “Window functions in depth” aloud

Plays in your browser using on-device text-to-speech — nothing leaves the page.

Learning objectives

By the end of this module, you should be able to:

  • 01Distinguish ROW_NUMBER, RANK, DENSE_RANK — and pick the right one
  • 02Use LAG/LEAD/FIRST_VALUE/LAST_VALUE to compute differences from a baseline
  • 03Read and write window-frame specifications (ROWS vs RANGE BETWEEN)

Window functions are SQL's superpower. They let you compute a per-row value that depends on other rows in a 'window' — without collapsing rows the way GROUP BY does. The Foundations course introduced SUM() OVER (). This module gets you fluent in the rest of the toolkit.

ROW_NUMBER vs RANK vs DENSE_RANK

All three assign a per-row position within an ordered partition. The difference is in how they handle ties.

sql
SELECT name, score,
ROW_NUMBER() OVER (ORDER BY score DESC) AS rn,
RANK() OVER (ORDER BY score DESC) AS rk,
DENSE_RANK() OVER (ORDER BY score DESC) AS dr
FROM students;
-- name score rn rk dr
-- Aisha 95 1 1 1
-- Bilal 95 2 1 1
-- Chen 90 3 3 2
-- Dera 85 4 4 3
  • ROW_NUMBER — unique 1, 2, 3, … even when values tie (assigned arbitrarily unless you add a tiebreaker to ORDER BY)
  • RANK — same rank for ties; the next row jumps the count (1, 1, 3 instead of 1, 1, 2)
  • DENSE_RANK — same rank for ties; the next row is consecutive (1, 1, 2)

Which one when?

Use ROW_NUMBER for de-duplication ('keep the most recent row per customer'). Use RANK when leaderboards must skip positions after ties. Use DENSE_RANK when computing a small number of distinct tiers (top-1, top-2, top-3 with ties counted as one tier).

LAG, LEAD, FIRST_VALUE, LAST_VALUE

These let a row peek at neighbouring rows. The killer use case is period-over-period change without a self-join.

sql
SELECT month, revenue,
LAG(revenue, 1) OVER (ORDER BY month) AS prev_month,
revenue - LAG(revenue, 1) OVER (ORDER BY month) AS mom_change,
(revenue * 1.0 / LAG(revenue, 1) OVER (ORDER BY month) - 1) AS mom_pct
FROM monthly_revenue;

LAG(col, n) returns the value n rows before the current row in the window. LEAD(col, n) returns n rows after. FIRST_VALUE returns the first row's value in the window — useful for 'difference from a baseline'. LAST_VALUE has a frame gotcha covered below.

Frames: ROWS BETWEEN vs RANGE BETWEEN

Every window function operates over a 'frame' — a subset of rows within the partition. Default frame for most engines is 'all rows from the start of the partition up to and including the current row.' For LAST_VALUE that surprises people because it gives you the current row, not the last.

sql
-- 3-row trailing moving average
SELECT date, value,
AVG(value) OVER (
ORDER BY date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3
FROM daily_metrics;
-- LAST_VALUE that actually gives the last row in the partition
SELECT date, value,
LAST_VALUE(value) OVER (
ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS partition_last
FROM daily_metrics;
  • ROWS frame — counts physical rows. BETWEEN 2 PRECEDING AND CURRENT ROW = a 3-row sliding window.
  • RANGE frame — value-based; rows with the same ORDER BY value are treated as a single position. BETWEEN '7 days' PRECEDING AND CURRENT ROW gives a calendar-window, not a row-count window.
  • Default frame is RANGE UNBOUNDED PRECEDING — the reason LAST_VALUE 'doesn't work as expected' without an explicit frame.

Always specify the frame for LAST_VALUE and any moving aggregate

Half the bugs I've seen with window functions come from relying on the default frame. If you write a moving average or a 'last value in the window' query, spell out the frame explicitly. Future-you will thank you.

Exercise

You have a table sales(customer_id, sale_date, amount). Write a query that returns, for each sale: the customer's running total to date, the running average, the rank of this sale by amount among that customer's sales, and the days since their previous sale. Result should keep one row per sale.

Key takeaways

  • ROW_NUMBER / RANK / DENSE_RANK differ in tie-handling — pick by intent, not habit
  • LAG/LEAD eliminate self-joins for period-over-period analysis
  • Always spell out the frame (ROWS BETWEEN …) for moving aggregates and LAST_VALUE

Further reading

  1. 01

    Modern SQL: Use the Index, Luke

    Markus WinandWinand's site is the definitive reference for window-function semantics across dialects. Bookmark it.

  2. 02

    PostgreSQL Documentation: Window Functions

    PostgreSQL Global Development Group

Loading progress…
LeadAfrikPublic Economics Hub