Skip to content
Module 08 of 1245 min readBeginner

CASE expressions, strings, and dates

Conditional logic, text wrangling, and time arithmetic — the daily tools of cleaning real data.

67%

Listen along

Read “CASE expressions, strings, and dates” aloud

Plays in your browser using on-device text-to-speech — nothing leaves the page.

Real datasets are messy. CASE expressions, string functions, and date arithmetic are the daily tools for cleaning, classifying, and time-bucketing.

CASE WHEN

sql
SELECT first_name,
CASE
WHEN total_spent > 1000 THEN 'high'
WHEN total_spent > 100 THEN 'mid'
ELSE 'low'
END AS segment
FROM customers;

If/else inside a query. Often used inside aggregations to count specific categories:

sql
SELECT
COUNT(*) AS total,
COUNT(CASE WHEN country = 'Kenya' THEN 1 END) AS kenya_count,
AVG(CASE WHEN country = 'Kenya' THEN total_amount END) AS kenya_avg_amount
FROM orders;

String functions

  • LOWER(s), UPPER(s) — case conversion
  • TRIM(s), LTRIM, RTRIM — strip whitespace
  • CONCAT(a, b) or a || b — string concatenation
  • SUBSTRING(s FROM x FOR y) — extract substring
  • REPLACE(s, old, new) — find-and-replace
  • LENGTH(s) — character count
  • POSITION(sub IN s) — find substring location
  • REGEXP_REPLACE(s, pattern, replacement) — full regex (Postgres)

Date functions

sql
-- Truncate to month: 2024-03-15 → 2024-03-01
SELECT date_trunc('month', created_at) AS month, COUNT(*)
FROM orders
GROUP BY 1
ORDER BY 1;
-- Difference in days
SELECT id, AGE(NOW(), created_at) AS days_since_signup
FROM customers;

Different databases use different functions (date_trunc in Postgres, strftime in SQLite, DATE_TRUNC in BigQuery). The concepts are universal: truncate to a period, compute differences, extract parts (year, month, dow).

Always store timestamps in UTC

Time zone confusion is a top-10 source of bugs in analytics. Store everything in UTC, convert at display time. If you have to mix zones, use TIMESTAMPTZ in Postgres and be explicit.

Exercise

Bucket customers into 'new' (<30 days), 'active' (<180 days), or 'inactive' based on their last order date.

Loading progress…
LeadAfrikPublic Economics Hub