Skip to content
Module 09 of 1235 min readBeginner

Tables, keys, and a tiny bit of normalization

How to read a schema, why foreign keys matter, and when denormalization is a feature not a bug.

75%

Listen along

Read “Tables, keys, and a tiny bit of normalization” aloud

Plays in your browser using on-device text-to-speech — nothing leaves the page.

You don't need to design databases to query them — but you do need to read schemas. Five concepts that show up in every real database you'll touch.

Tables, rows, columns

Already covered, but worth restating: a table is a typed grid. Each column has a declared data type (int, text, timestamp, decimal, boolean). Some columns can be NULL; others are NOT NULL.

Primary keys

A column (or combination) that uniquely identifies each row. Almost always called id. The primary key is what foreign keys point to.

Foreign keys

A column that references another table's primary key. orders.customer_id is a foreign key into customers.id. Foreign keys are the structural reason JOINs work — they encode the relationship 'this order belongs to that customer'.

Foreign keys vs JOIN keys

A foreign key is a database-level constraint that prevents orphan rows. The JOIN syntax doesn't require it — you can join any two columns. But a well-designed schema declares foreign keys, and you should expect to JOIN on them.

Normalization in one paragraph

The classical idea: don't repeat data. Store customer info in customers; in orders, store only customer_id. To get the customer's name on an order, JOIN. Versus denormalized: store the customer's name on every order row. The second is faster to read but a nightmare to keep consistent. Real systems compromise: normalized for transactional tables, denormalized for analytics tables.

Reading a schema diagram

  • Boxes are tables; lines are foreign-key relationships
  • A line with '1' on one end and 'N' on the other is one-to-many (one customer has many orders)
  • A line with 'N' on both ends is many-to-many — usually means a join table sits between them
  • Bold or PK-marked columns are primary keys

When you start at a new job, your first afternoon should be reading the schema. It tells you what questions are askable and how.

Exercise

A SACCO is designing the database schema for its new loan management system. The system needs to track: (a) members (one row per member), (b) loan products (e.g. salary advance, school fees loan, business loan), (c) loan applications, (d) approved loans, (e) loan repayments, (f) guarantors per loan. Sketch the relational schema: what tables, what columns, what keys, what relationships? Identify which relationships are one-to-many and which are many-to-many, and where you would need a junction table.

Loading progress…
LeadAfrikPublic Economics Hub