Module 02 of 12 · 40 min read · Beginner

Atomic types and coercion

Numeric, integer, character, logical. NA. Implicit coercion rules and how they bite.


Learning objectives

By the end of this module, you should be able to:

  • 01 Identify R's five everyday atomic types: numeric, integer, character, logical, complex
  • 02 Handle NA values correctly, using is.na() and the na.rm argument
  • 03 Predict and prevent implicit coercion bugs
  • 04 Distinguish typeof() from class() and know when each matters

R has a small set of atomic types and a long list of rules about how they coerce into each other. Memorising the coercion order is the single most useful five-minute investment in your R fluency.

The five atomic types

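  • numeric (double): real numbers, the default for any number you type, so 5 is a double, not an integer
  • integer: whole numbers, requested explicitly with the L suffix: 5L
  • character: strings, in double or single quotes
  • logical: TRUE or FALSE (T and F also work, but they are ordinary variables that can be reassigned, so prefer the full names)
  • complex: complex numbers such as 2+3i, rare in applied work

A sixth atomic type, raw, stores bytes and almost never appears in data analysis.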
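
To make the list concrete, the short sketch below asks R for the storage type of one value of each kind. The `values` and `types` names are illustrative only.

```r
# One example value per atomic type from the list above.
values <- list(3.14, 5L, "text", TRUE, 2+3i)

# typeof() reports how each value is stored.
types <- vapply(values, typeof, character(1))
print(types)   # "double" "integer" "character" "logical" "complex"

# A number typed without the L suffix is a double, even if it looks whole.
is.integer(5)    # FALSE
is.integer(5L)   # TRUE
```

Note that typeof() reports "double" where the list says numeric; the two words refer to the same storage type.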

NA — the explicit missing value

NA is R's representation of missing data. It is contagious: almost any arithmetic or comparison involving NA produces NA, so mean(c(1, 2, NA)) is NA, not 1.5. To skip NAs, pass na.rm = TRUE: mean(c(1, 2, NA), na.rm = TRUE) gives 1.5.

r
x <- c(1, 2, NA, 4, 5)
mean(x) # NA
mean(x, na.rm = TRUE) # 3
is.na(x) # FALSE FALSE TRUE FALSE FALSE
sum(is.na(x)) # 1 — count of missing values
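
A related trap is testing for missing values with ==. The sketch below reuses the same x to show why is.na() is the tool for the job; the `observed` name is illustrative.

```r
x <- c(1, 2, NA, 4, 5)

# Comparing with == propagates the missingness instead of detecting it.
NA == NA   # NA
x == NA    # NA NA NA NA NA

# is.na() is the reliable test; negate it to keep only the observed values.
observed <- x[!is.na(x)]
observed   # 1 2 4 5

# Other summary functions accept na.rm as well.
sum(x, na.rm = TRUE)   # 12
max(x, na.rm = TRUE)   # 5

# A bare NA is logical, but it is coerced to fit the vector it joins.
typeof(NA)         # "logical"
typeof(c(1, NA))   # "double"
```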

Implicit coercion

An atomic vector can hold only one type, so when R combines values of different types in a vector, it coerces them all to the most permissive type present. The order, from most to least permissive: character > complex > numeric > integer > logical.

r
c(1, 2, 3, "text") # all coerced to character: "1" "2" "3" "text"
c(1, 2, TRUE, FALSE) # logical -> numeric: 1 2 1 0
# Explicit coercion
as.numeric("123") # 123
as.character(456) # "456"
as.logical(c(0, 1, 2)) # FALSE TRUE TRUE
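
Two consequences of these rules come up constantly in practice. The sketch below shows logicals being counted as 0/1, and an explicit coercion that cannot succeed. The `passed`, `entries`, and `parsed` names and their values are made up for illustration.

```r
# Logicals coerce to 0/1 in arithmetic, which makes counts and proportions easy.
passed <- c(TRUE, FALSE, TRUE, TRUE)
sum(passed)    # 3
mean(passed)   # 0.75

# Explicit coercion that cannot succeed yields NA with a warning, not an error.
# suppressWarnings() is used here only to keep the example output quiet.
entries <- c("12", "7.5", "n/a")
parsed <- suppressWarnings(as.numeric(entries))
parsed          # 12, 7.5, NA
is.na(parsed)   # FALSE FALSE TRUE

# Integer and double combine to double, following the coercion order.
typeof(c(1L, 2.5))   # "double"
```

Because a failed conversion produces NA rather than stopping, it is worth counting the NAs after any as.numeric() call on text data.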

The coercion that bites

When you read a CSV in which even one row has a stray text value in a numeric column, R imports the entire column as character. Any subsequent mean() on that column returns NA with a warning. Always check class() of every column after import, for example with sapply(df, class) or str(df) for a data frame df.
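
The following is a minimal sketch of that failure and one way to recover from it. The CSV contents, the column names, and the "unknown" placeholder are hypothetical; read.csv() is given the text directly so the example runs without a file.

```r
# Hypothetical CSV contents: the third income value is text, not a number.
csv_text <- "id,income
1,1200
2,950
3,unknown
4,1430"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Check every column's class straight after import.
col_classes <- sapply(df, class)
col_classes   # id: "integer", income: "character"

# Locate the entries that fail numeric conversion before overwriting anything.
income_num <- suppressWarnings(as.numeric(df$income))
bad_rows <- which(is.na(income_num) & !is.na(df$income))
df$income[bad_rows]   # "unknown"

# Convert once the offending values are understood; they become NA.
df$income <- income_num
mean(df$income, na.rm = TRUE)
```

If the stray value turns out to be a known missing-data code, passing it through the na.strings argument of read.csv() maps it to NA at import, and the column stays numeric.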

typeof vs class

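typeof() returns the underlying storage type. class() returns the class attribute, which may be user-assigned, or an implicit class when no attribute is set. For integer, character, and logical vectors the two agree, but a double has typeof "double" and class "numeric". For objects such as data frames and fitted models, class() is what method dispatch uses.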
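
A few side-by-side calls illustrate the difference. The objects below (`df`, `f`, `d`) are small made-up examples.

```r
# Double is the atomic type where the two functions disagree.
typeof(1.5)   # "double"
class(1.5)    # "numeric"
typeof(1L)    # "integer"
class(1L)     # "integer"

# A data frame is stored as a list; its class attribute drives method dispatch.
df <- data.frame(x = 1:3)
typeof(df)    # "list"
class(df)     # "data.frame"

# A factor is stored as integer codes with a class attribute on top.
f <- factor(c("low", "high", "low"))
typeof(f)     # "integer"
class(f)      # "factor"

# A Date is a double underneath.
d <- as.Date("2024-01-15")
typeof(d)     # "double"
class(d)      # "Date"
```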

Exercise

Compute the mean of c(10, 20, NA, 30, 40), removing NAs.

Key takeaways

  • NA is contagious: any arithmetic with NA returns NA. Use na.rm = TRUE to skip
  • Coercion order: character > complex > numeric > integer > logical
  • A single text value in a numeric column will coerce the entire column — always check class() after import
  • Explicit coercion: as.numeric(), as.character(), as.logical() — never rely on implicit

Further reading

  1. The R Inferno

    Patrick Burns · 2011

    A short, funny tour of R's quirks and traps that every R analyst should read once.

LeadAfrik Public Economics Hub