lm() is R's built-in function for fitting linear regressions by ordinary least squares. Its output is rich with information (coefficients, standard errors, R-squared, the F-test, residual diagnostics), and the broom package turns that output into tidy data frames you can pipe straight into dplyr.
Fitting an OLS regression
model <- lm(lending_rate ~ deposit_rate, data = bankrates)
summary(model)
The formula syntax y ~ x is the heart of R's modelling interface. lm parses the formula, builds the design matrix, fits OLS, and returns an lm object.
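To make those steps concrete, the sketch below walks through them by hand. It is an illustration only: the data frame sim is simulated (its values are made up and are not the bankrates data), and the normal equations stand in for the QR decomposition that lm uses internally.

```r
# Simulated stand-in data (hypothetical values, not the real bankrates)
set.seed(1)
sim <- data.frame(deposit_rate = runif(50, 0.01, 0.08))
sim$lending_rate <- 0.03 + 1.2 * sim$deposit_rate + rnorm(50, sd = 0.005)

f <- lending_rate ~ deposit_rate

# Step 1: the formula becomes a design matrix (intercept column + predictor)
X <- model.matrix(f, data = sim)
y <- sim$lending_rate

# Step 2: OLS via the normal equations (lm itself uses a QR decomposition)
beta_hand <- solve(crossprod(X), crossprod(X, y))

# Step 3: lm does all of the above and returns an object of class "lm"
fit <- lm(f, data = sim)
class(fit)

# The hand-computed coefficients should match lm's (expect TRUE)
all.equal(unname(coef(fit)), as.vector(beta_hand))
```

The normal equations are fine for a small, well-conditioned example like this one; for real work, lm's QR-based fit is the numerically safer route.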
Multiple predictors
model <- lm(lending_rate ~ deposit_rate + month, data = bankrates)
# Interactions: * means main effects + interaction; : means interaction only
lm(y ~ x * z, data = df) # x + z + x:z
lm(y ~ x:z, data = df) # x:z only
Reading summary(model)
- Coefficients: estimate, std error, t value, Pr(>|t|)
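- Residual standard error: square root of the residual sum of squares divided by the residual degrees of freedom (n minus the number of estimated coefficients)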
- Multiple R-squared: fraction of variance explained
- F-statistic: joint test that all slope coefficients (everything except the intercept) equal 0
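- Stars: significance codes marking p-value thresholds (*** below 0.001, ** below 0.01, * below 0.05, . below 0.1). Handy for a quick read, but selecting or reporting results by their stars is a common route to p-hacking in published work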
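One way to check your reading of these quantities is to recompute them from the residuals. The sketch below does that on simulated data (the data frame sim is hypothetical, not the bankrates data) and compares each value with the corresponding component of the summary object.

```r
# Simulated stand-in data (hypothetical values, not the real bankrates)
set.seed(1)
sim <- data.frame(deposit_rate = runif(50, 0.01, 0.08))
sim$lending_rate <- 0.03 + 1.2 * sim$deposit_rate + rnorm(50, sd = 0.005)

fit <- lm(lending_rate ~ deposit_rate, data = sim)
s <- summary(fit)

rss <- sum(residuals(fit)^2)                               # residual sum of squares
tss <- sum((sim$lending_rate - mean(sim$lending_rate))^2)  # total sum of squares
df_res <- df.residual(fit)                                 # n - number of coefficients
k <- length(coef(fit)) - 1                                 # number of slope coefficients

rse_hand <- sqrt(rss / df_res)                 # residual standard error
r2_hand  <- 1 - rss / tss                      # multiple R-squared
f_hand   <- ((tss - rss) / k) / (rss / df_res) # F-statistic

# Each comparison should return TRUE
all.equal(s$sigma, rse_hand)
all.equal(s$r.squared, r2_hand)
all.equal(unname(s$fstatistic["value"]), f_hand)

# The coefficient table itself is a plain numeric matrix
s$coefficients
```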
broom — tidying regression output
library(broom)
tidy(model) # data frame: term, estimate, std.error, statistic, p.value
glance(model) # one-row data frame: r.squared, adj.r.squared, p.value, etc.
augment(model) # original data + fitted values + residuals
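Because these are ordinary data frames, the usual dplyr verbs apply to them. The following is a small sketch of that workflow, assuming broom and dplyr are installed; the data frame sim is simulated for illustration and is not the bankrates data.

```r
library(broom)
library(dplyr)

# Simulated stand-in data (hypothetical values, not the real bankrates)
set.seed(1)
sim <- data.frame(deposit_rate = runif(50, 0.01, 0.08))
sim$lending_rate <- 0.03 + 1.2 * sim$deposit_rate + rnorm(50, sd = 0.005)

fit <- lm(lending_rate ~ deposit_rate, data = sim)

# Coefficient table with 95% confidence intervals, slopes only
slopes <- tidy(fit, conf.int = TRUE) %>%
  filter(term != "(Intercept)") %>%
  select(term, estimate, conf.low, conf.high)
slopes

# The three observations with the largest absolute residuals
worst_fit <- augment(fit) %>%
  arrange(desc(abs(.resid))) %>%
  head(3)
worst_fit
```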
Predict and confidence intervals
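# newdata must contain every predictor in the model; this call assumes the
# single-predictor fit lm(lending_rate ~ deposit_rate, data = bankrates)
predict(model, newdata = data.frame(deposit_rate = 0.06))
predict(model, interval = "confidence")
predict(model, interval = "prediction") # wider: for new observations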
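The difference between the two interval types is easiest to see side by side at a single point. The sketch below uses simulated data (sim is hypothetical, not the bankrates data): the confidence interval covers the mean response at that deposit rate, while the prediction interval also accounts for the noise in an individual new observation.

```r
# Simulated stand-in data (hypothetical values, not the real bankrates)
set.seed(1)
sim <- data.frame(deposit_rate = runif(50, 0.01, 0.08))
sim$lending_rate <- 0.03 + 1.2 * sim$deposit_rate + rnorm(50, sd = 0.005)

fit <- lm(lending_rate ~ deposit_rate, data = sim)
new_obs <- data.frame(deposit_rate = 0.06)

# Both calls return a one-row matrix with columns fit, lwr, upr
conf_int <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)
pred_int <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)

rbind(confidence = conf_int[1, ], prediction = pred_int[1, ])

# Same point estimate, but the prediction interval is wider
conf_width <- conf_int[1, "upr"] - conf_int[1, "lwr"]
pred_width <- pred_int[1, "upr"] - pred_int[1, "lwr"]
pred_width > conf_width
```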
Robust standard errors
Base lm reports classical standard errors, which assume constant error variance. For heteroskedasticity-consistent (HC) robust SEs, use the sandwich and lmtest packages: coeftest(model, vcov = vcovHC(model, type = 'HC3')).
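To show what the robust estimator is doing, the sketch below computes HC0 and HC3 standard errors in base R using the standard sandwich formula, without the packages. It is an illustration under stated assumptions: the data are simulated with noise that grows with deposit_rate, and hc_vcov is a hypothetical helper written for this example, not a function from sandwich or lmtest.

```r
# Simulated heteroskedastic data (hypothetical values, not the real bankrates)
set.seed(1)
n <- 200
sim <- data.frame(deposit_rate = runif(n, 0.01, 0.08))
sim$lending_rate <- 0.03 + 1.2 * sim$deposit_rate +
  rnorm(n, sd = 0.1 * sim$deposit_rate)  # noise grows with deposit_rate

fit <- lm(lending_rate ~ deposit_rate, data = sim)

X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)            # leverage of each observation
bread <- solve(crossprod(X))   # (X'X)^-1

# Hypothetical helper: sandwich estimator bread %*% X' diag(w) X %*% bread
hc_vcov <- function(w) bread %*% crossprod(X * sqrt(w)) %*% bread

se_classical <- sqrt(diag(vcov(fit)))
se_hc0 <- sqrt(diag(hc_vcov(e^2)))               # HC0: raw squared residuals
se_hc3 <- sqrt(diag(hc_vcov(e^2 / (1 - h)^2)))   # HC3: leverage-adjusted residuals

cbind(se_classical, se_hc0, se_hc3)
```

If sandwich is installed, sqrt(diag(sandwich::vcovHC(fit, type = "HC3"))) should agree with se_hc3; in practice, prefer the packaged version over a hand-rolled one.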
Exercise
Fit lm(lending_rate ~ deposit_rate) on bankrates and print summary().