The data model — star schema and relationships — Power BI Module 3

Power BI rewards clean data modelling more than any other BI tool. DAX is built on the assumption that you have a properly normalised model — fact tables in the middle, dimension tables around them, relationships connecting them. Analysts who skip this step fight DAX forever. Analysts who get it right write DAX that 'just works'.

The star schema

Fact table: the transactions / events. Sales, orders, page views, sensor readings. Many rows, few columns of measures, many foreign keys to dimensions.
Dimension tables: the 'who, what, where, when'. Customer, product, store, date. Few rows (relatively), descriptive columns.
Relationships: foreign-key joins. Sales.customer_id → Customer.id. One-to-many: many sales per customer.

Cardinality and filter direction

Every relationship has a cardinality and a filter direction. Cardinality: one-to-many is the standard. Many-to-many is allowed but introduces complexity. One-to-one is rare and usually means you should merge the tables.

Filter direction: single (the default) means dimension filters propagate to fact. Bi-directional means filters also propagate back from fact to dimension. Bi-directional is occasionally useful (e.g., 'show only customers who have ordered'), but it creates ambiguous filter contexts in DAX and is the #1 cause of mysterious calculation bugs.

The Date dimension

Power BI's time intelligence functions (SAMEPERIODLASTYEAR, TOTALYTD, DATESBETWEEN) require a proper Date dimension table — a table with one row per date, contiguous, marked as a date table. The auto-generated 'Date Hierarchy' on date columns isn't enough; build a real Date table in Power Query or use Marco Russo's standard M script. This is mandatory infrastructure for any serious financial model in Power BI.

The flat-table trap

Many beginners load a single big CSV and try to build everything on it. Performance is poor, DAX is hard, and refresh times are long. Always normalise: in Power Query, split the CSV into a fact table and dimension tables. Even a tiny dataset benefits — the model relationships are what DAX uses to reason about filter context.

Exercise

Take any flat CSV with 5+ columns. In Power Query, normalise it: identify the dimensions (date, customer, product, etc.) and split each into a separate query that contains only the unique values + an ID column. Replace the original columns in the fact table with the IDs. Set up relationships. Build a simple chart. Was DAX easier to write than it would have been on the flat table?