tidyr reshapes data between wide and long formats. The tidy data principle is that each variable has its own column, each observation its own row, and each value its own cell. Most analysis is easier on tidy (long) data; most reporting is easier on wide data.
pivot_longer — wide to long
library(tidyr)
# Wide: bank as columns
wide <- data.frame(
  month = c("2024-01", "2024-02"),
  KCB = c(0.13, 0.14),
  Equity = c(0.12, 0.13)
)
# Long: bank as a value
long <- wide |>
  pivot_longer(
    cols = c(KCB, Equity),
    names_to = "bank",
    values_to = "rate"
  )
pivot_wider — long to wide
long |>
  pivot_wider(
    names_from = bank,
    values_from = rate
  )
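One way to convince yourself that pivot_wider undoes pivot_longer is a round-trip check. The sketch below rebuilds the same small wide table so it runs on its own; the name back is just an illustrative choice, and `!month` is used as a shorthand for "every column except month".

```r
library(tidyr)

wide <- data.frame(
  month  = c("2024-01", "2024-02"),
  KCB    = c(0.13, 0.14),
  Equity = c(0.12, 0.13)
)

# Wide to long: every column except month becomes a (bank, rate) pair
long <- wide |>
  pivot_longer(cols = !month, names_to = "bank", values_to = "rate")

# Long back to wide
back <- long |>
  pivot_wider(names_from = bank, values_from = rate)

# Convert to a plain data frame before comparing, in case the result is a tibble
all.equal(as.data.frame(back), wide)
```

If the round trip preserved every column and value, the comparison returns TRUE.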
separate — split a column
df <- data.frame(date = c("2024-01", "2024-02"))
df |> separate(date, into = c("year", "month"), sep = "-")
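Since tidyr 1.3.0, separate() is marked as superseded in favour of the separate_wider_* family. As a hedged alternative, the same split might be written with separate_wider_delim(); this sketch assumes a tidyr version that includes it, and the name parts is illustrative.

```r
library(tidyr)

df <- data.frame(date = c("2024-01", "2024-02"))

# Assumes tidyr >= 1.3.0, where separate_wider_delim() is available
parts <- df |>
  separate_wider_delim(date, delim = "-", names = c("year", "month"))

# The new columns are character; convert explicitly if numbers are needed
parts$year <- as.integer(parts$year)
```

The month column is left as character here so that the leading zero in "01" is kept.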
Joining tables
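library(dplyr)
# banks and deposits are assumed to be data frames that share a bank column
left_join(banks, deposits, by = "bank")   # keep all rows of banks
inner_join(banks, deposits, by = "bank")  # only rows that match in both
full_join(banks, deposits, by = "bank")   # all rows from both tables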
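To see how the three joins differ, it helps to run them on tables that only partly overlap. The banks and deposits tables below are hypothetical, made up purely for illustration (including the BankX and BankY names and all values); only the join functions themselves come from dplyr.

```r
library(dplyr)

# Hypothetical tables for illustration only
banks <- data.frame(
  bank = c("KCB", "Equity", "BankX"),
  id   = c(1, 2, 3)
)
deposits <- data.frame(
  bank     = c("KCB", "Equity", "BankY"),
  deposits = c(100, 90, 50)
)

left  <- left_join(banks, deposits, by = "bank")   # 3 rows; BankX has NA deposits
inner <- inner_join(banks, deposits, by = "bank")  # 2 rows: KCB and Equity
full  <- full_join(banks, deposits, by = "bank")   # 4 rows; BankY has NA id
```

Counting rows with nrow() after each join is a quick sanity check that the join kept or dropped what you expected.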
When to use long vs wide
Long format is easier for ggplot2 (faceting, grouping by variable) and for group_by + summarise, and it is more flexible in general. Wide format is easier for human reading, spreadsheet exports, and matrix operations. Most pipelines convert to long, analyse, then convert back to wide for output.
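The long, analyse, wide pattern can be sketched in a single pipeline. This is a minimal illustration using the small rate table from earlier; the mean per bank stands in for whatever analysis is actually needed, and summary_wide and mean_rate are illustrative names.

```r
library(tidyr)
library(dplyr)

wide <- data.frame(
  month  = c("2024-01", "2024-02"),
  KCB    = c(0.13, 0.14),
  Equity = c(0.12, 0.13)
)

summary_wide <- wide |>
  # 1. Convert to long so bank becomes a grouping variable
  pivot_longer(cols = !month, names_to = "bank", values_to = "rate") |>
  # 2. Analyse: here, the mean rate per bank
  group_by(bank) |>
  summarise(mean_rate = mean(rate)) |>
  # 3. Convert back to wide for reporting: one column per bank
  pivot_wider(names_from = bank, values_from = mean_rate)
```

The result is a one-row table with one column per bank, which is the shape a report or spreadsheet export usually wants.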
Exercise
Pivot a wide data frame with months as rows and banks (KCB, Equity) as columns into long format with bank and rate columns.