Skip to content
Module 03 of 1255 min readMixed

The data model — star schema and relationships

Fact and dimension tables, one-to-many vs many-to-many, bi-directional filtering and its dangers. Why analysts who skip modelling fight DAX forever.

25%

Listen along

Read “The data model — star schema and relationships” aloud

Plays in your browser using on-device text-to-speech — nothing leaves the page.

Learning objectives

By the end of this module, you should be able to:

  • 01Build a star schema with one fact table and multiple dimension tables
  • 02Set up relationships with the correct cardinality and filter direction
  • 03Recognise when bi-directional relationships are appropriate (rarely) and dangerous (usually)

Power BI rewards clean data modelling more than any other BI tool. DAX is built on the assumption that you have a properly normalised model — fact tables in the middle, dimension tables around them, relationships connecting them. Analysts who skip this step fight DAX forever. Analysts who get it right write DAX that 'just works'.

The star schema

  • Fact table: the transactions / events. Sales, orders, page views, sensor readings. Many rows, few columns of measures, many foreign keys to dimensions.
  • Dimension tables: the 'who, what, where, when'. Customer, product, store, date. Few rows (relatively), descriptive columns.
  • Relationships: foreign-key joins. Sales.customer_id → Customer.id. One-to-many: many sales per customer.

Cardinality and filter direction

Every relationship has a cardinality and a filter direction. Cardinality: one-to-many is the standard. Many-to-many is allowed but introduces complexity. One-to-one is rare and usually means you should merge the tables.

Filter direction: single (the default) means dimension filters propagate to fact. Bi-directional means filters also propagate back from fact to dimension. Bi-directional is occasionally useful (e.g., 'show only customers who have ordered'), but it creates ambiguous filter contexts in DAX and is the #1 cause of mysterious calculation bugs.

The Date dimension

Power BI's time intelligence functions (SAMEPERIODLASTYEAR, TOTALYTD, DATESBETWEEN) require a proper Date dimension table — a table with one row per date, contiguous, marked as a date table. The auto-generated 'Date Hierarchy' on date columns isn't enough; build a real Date table in Power Query or use Marco Russo's standard M script. This is mandatory infrastructure for any serious financial model in Power BI.

The flat-table trap

Many beginners load a single big CSV and try to build everything on it. Performance is poor, DAX is hard, and refresh times are long. Always normalise: in Power Query, split the CSV into a fact table and dimension tables. Even a tiny dataset benefits — the model relationships are what DAX uses to reason about filter context.

Exercise

Take any flat CSV with 5+ columns. In Power Query, normalise it: identify the dimensions (date, customer, product, etc.) and split each into a separate query that contains only the unique values + an ID column. Replace the original columns in the fact table with the IDs. Set up relationships. Build a simple chart. Was DAX easier to write than it would have been on the flat table?

Key takeaways

  • Star schema: one fact table, many dimension tables. The right shape for almost all Power BI work.
  • Default to single-direction filtering. Bi-directional has hidden gotchas.
  • A clean model makes DAX simple; a flat 'one big table' makes DAX a nightmare.
Loading progress…
LeadAfrikPublic Economics Hub