Module 02 of 12 · 50 min read · Mixed

Connecting to data — extracts vs live, joins, blends, relationships

Live connections vs extracts (.hyper), joins vs data blends vs the new relationship model (2020.2+), and when to pick which.

Learning objectives

By the end of this module, you should be able to:

  • 01 Pick between a live connection and an extract for any data source
  • 02 Choose joins, blends, or relationships for combining multiple tables
  • 03 Recognise when data should be reshaped in Prep vs in Desktop

Tableau supports 100+ data sources. The connection decisions that matter are independent of the source — live vs extract, the shape of your join, and where to do the cleaning. Get these right and the rest of the work flows; get them wrong and you'll spend hours debugging.

Live vs extract

A live connection queries the source database every time a view renders or a filter changes. An extract (.hyper file) snapshots the data into Tableau's columnar engine, which is usually much faster for analysis than querying the source on every interaction. The choice:

  • Live: real-time data needed (operational dashboards, monitoring), source can handle the query load, security policy demands it.
  • Extract: most analyst work. Faster, less load on the source, works offline, supports incremental refresh.
  • Hybrid: extract for the workbook's main analysis, live for one widget that needs current data.
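The difference between the two modes can be sketched outside Tableau. The snippet below is a minimal illustration, not Tableau's implementation: an in-memory SQLite database stands in for the source, and the `orders` table, the function names, and the use of `order_id` as the incremental key are all assumptions made for the example.

```python
import sqlite3

# Hypothetical source database standing in for any data source.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 100.0), (2, 250.0)])


def live_total(conn):
    """Live: query the source every time the number is needed."""
    return conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]


def take_extract(conn):
    """Extract: copy the rows once into a local snapshot."""
    return conn.execute("SELECT order_id, amount FROM orders").fetchall()


def incremental_refresh(conn, extract):
    """Append only rows with an order_id above the highest one already held."""
    last_id = max((row[0] for row in extract), default=0)
    new_rows = conn.execute(
        "SELECT order_id, amount FROM orders WHERE order_id > ?", (last_id,)
    ).fetchall()
    return extract + new_rows


extract = take_extract(source)

# A new order arrives in the source after the snapshot was taken.
source.execute("INSERT INTO orders VALUES (3, 50.0)")

live_value = live_total(source)                      # sees the new order
stale_value = sum(amount for _, amount in extract)   # does not, until refreshed
extract = incremental_refresh(source, extract)
refreshed_value = sum(amount for _, amount in extract)

print(live_value, stale_value, refreshed_value)  # 400.0 350.0 400.0
```

The snapshot answers from local data without touching the source, which is where the speed and reduced load come from; the cost is that it stays stale until the next refresh.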

Joins vs blends vs relationships

The relationship revolution (2020.2)

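Before 2020.2, combining tables meant choosing either a physical join or a blend. A join fixed a single level of detail for the whole data source, so rows from the less granular table were duplicated and their measures overcounted unless you worked around it by pre-aggregating or using level-of-detail calculations. A blend avoided that but was slow and limited. Relationships introduced 'noodle' connections that are logical, not physical: Tableau decides the right join and level of aggregation per visualisation based on which fields are used. Almost always pick relationships now. Use explicit joins only for cartesian or other non-standard combinations.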
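The duplication problem is easy to reproduce with plain SQL. The sketch below uses an in-memory SQLite database with hypothetical `customers` and `orders` tables (names, columns, and values are invented for illustration). It compares joining first and aggregating second against aggregating each table at its own level of detail. It imitates the idea behind relationships and is not the SQL that Tableau generates.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, credit_limit REAL);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ama', 1000.0), (2, 'Kofi', 500.0);
INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 200.0), (12, 1, 50.0), (13, 2, 75.0);
""")

# Join first, aggregate second: each customer row repeats once per order,
# so the customer-level measure is counted three times for customer 1.
joined_credit = db.execute("""
    SELECT SUM(c.credit_limit)
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
""").fetchone()[0]

# Aggregate each table at its own level of detail: no duplication.
true_credit = db.execute("SELECT SUM(credit_limit) FROM customers").fetchone()[0]
order_total = db.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Combine the two levels safely: roll orders up to one row per customer,
# then join that summary to the customer table.
per_customer = db.execute("""
    SELECT c.name, c.credit_limit, t.total_amount
    FROM customers c
    JOIN (SELECT customer_id, SUM(amount) AS total_amount
          FROM orders GROUP BY customer_id) t
      ON t.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(joined_credit)  # 3500.0 (overcounted)
print(true_credit)    # 1500.0
print(order_total)    # 425.0
print(per_customer)   # [('Ama', 1000.0, 350.0), ('Kofi', 500.0, 75.0)]
```

The overcounted 3500.0 is the kind of result a fixed join could produce; the per-table aggregates are what a relationship is meant to give you without manual workarounds.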

Tableau Prep

When data is messy (multiple files to union, columns that need pivoting, derived calculations across multiple tables), Tableau Prep is the right tool. It offers drag-and-drop ETL with a visual flow. Output typically goes to a .hyper extract that Desktop consumes, although Prep can also write to other targets such as CSV files or a published data source. Many teams use Prep upstream of all their dashboards to enforce consistent definitions ('what is a customer?', 'what does revenue mean?').
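To make the two reshaping steps concrete, here is a rough Python analogy of a union followed by a columns-to-rows pivot. It is only a sketch of what those steps do to the data, not Prep's API: the file contents, the `region` tag, and the function names are hypothetical.

```python
import csv
import io

# Hypothetical regional files with months spread across columns (wide layout).
FILE_NORTH = "product,jan,feb\nwidget,10,12\ngadget,5,7\n"
FILE_SOUTH = "product,jan,feb\nwidget,3,4\n"


def read_rows(text, region):
    """Parse one CSV and tag each row with the file it came from."""
    return [dict(row, region=region) for row in csv.DictReader(io.StringIO(text))]


def union(*tables):
    """Union: stack the rows of several tables that share the same columns."""
    return [row for table in tables for row in table]


def pivot_to_rows(rows, value_columns, name_field, value_field):
    """Columns-to-rows pivot: one output row per (input row, value column)."""
    out = []
    for row in rows:
        keep = {k: v for k, v in row.items() if k not in value_columns}
        for col in value_columns:
            out.append({**keep, name_field: col, value_field: int(row[col])})
    return out


unioned = union(read_rows(FILE_NORTH, "north"), read_rows(FILE_SOUTH, "south"))
tidy = pivot_to_rows(unioned, ["jan", "feb"], "month", "units")

for row in tidy:
    print(row)
# First row: {'product': 'widget', 'region': 'north', 'month': 'jan', 'units': 10}
```

The tidy result has one row per product, region, and month, which is the shape Desktop handles most easily: `month` becomes a dimension and `units` a single measure.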

Exercise

Connect Tableau to two related CSVs — for example, orders and customers, or transactions and accounts. Set up a relationship between them. Build two views: one that aggregates at the customer level, one at the order level. Does the relationship 'just work'? What would have required a more careful join in pre-2020.2 Tableau?

Key takeaways

  • Extracts (.hyper) are usually faster for analysis; live connections show current data, which suits operational dashboards.
  • Relationships (2020.2+) are usually correct; joins are for when you specifically need cartesian or other non-standard combinations.
  • Don't fight messy data in Desktop — fix it in Prep or upstream.
LeadAfrik Public Economics Hub