Video Notes
All data in R has a type (what kind of thing it is) and a structure (how it’s organized).
Data Types
Here are some common data types you’ll encounter:
Numeric - Numbers, with or without decimals (R's default for any number you type)
score <- 3.14
class(score) # "numeric"
Integer - Whole numbers; add an "L" after the number to make one
count <- 5L
class(count) # "integer"
Character - Text or strings (always in quotes)
name <- "James"
class(name) # "character"
Logical - TRUE or FALSE values
adult <- TRUE
class(adult) # "logical"
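To check these types yourself, here is a minimal sketch. The variable names are illustrative only. It also shows that a whole number typed without the "L" suffix is still stored as numeric, and that the is.*() functions can test for a type directly.

```r
# Illustrative values; the names are made up for this example
whole <- 5        # no "L" suffix, so R stores this as numeric
whole_int <- 5L   # the "L" suffix makes it an integer

class(whole)      # "numeric"
class(whole_int)  # "integer"

# is.*() functions test for a type and return TRUE or FALSE
is.numeric(whole_int)   # TRUE: integers also count as numeric
is.character("James")   # TRUE
is.logical(TRUE)        # TRUE
```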
FYI
R also includes complex and raw data types, though these are rarely needed in typical analysis.
Why Data Types Matter
Data types determine how R handles your data. For example, you can’t add text to numbers:
age <- 18
name <- "James"
age + name # Error: non-numeric argument to binary operator
Converting types
You can convert between types using functions like as.numeric(), as.character(), and as.logical().
Example:
age <- "18"
class(age) # "character"
age <- as.numeric(age)
class(age) # "numeric"
This is especially helpful when cleaning imported data - for example, when numeric values were read in as text.
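Conversion does not always succeed. The sketch below assumes a hypothetical column of ages that was imported as text (raw_ages is a made-up name) and shows what happens to a value that cannot be read as a number.

```r
# Hypothetical ages that were imported as text
raw_ages <- c("18", "21", "n/a")

# Values that cannot be parsed become NA, and R prints a warning
ages <- as.numeric(raw_ages)
ages          # 18 21 NA
class(ages)   # "numeric"

# Other conversions
as.character(18)     # "18"
as.logical("TRUE")   # TRUE
as.logical("yes")    # NA: "yes" is not a recognized logical string
```

Checking for NA values after a conversion (for example with is.na(ages)) is a quick way to spot entries that need cleaning.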
Data Structures
Data structures describe how data is organized.
The most common one-dimensional structure is the vector, which stores a sequence of values that all share the same type. Vectors are created using R’s c() (combine) function:
scores <- c(1.5, 2.3, 5.0, 4.3, 6.5) # Numeric vector
names <- c("James", "Elliot", "Damien") # Character vector
congruent_trials <- c(TRUE, TRUE, FALSE, FALSE, TRUE) # Logical vector
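Because a vector holds a single type, mixing types inside c() does not raise an error. R quietly converts every element to the most flexible type present. A short sketch, with variable names chosen only for illustration:

```r
# Mixing text with other types turns everything into character
mixed <- c(1.5, "apple", TRUE)
class(mixed)   # "character"
mixed          # "1.5" "apple" "TRUE"

# Logical values become 1 (TRUE) and 0 (FALSE) when combined with numbers
flags <- c(10, TRUE, FALSE)
class(flags)   # "numeric"
flags          # 10 1 0

# Individual elements are accessed by position, starting at 1
scores <- c(1.5, 2.3, 5.0, 4.3, 6.5)
scores[2]        # 2.3
length(scores)   # 5
```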
A list can hold elements of different types, including other vectors or lists:
my_list <- list(1.5, "apple", TRUE, c(1, 2, 3)) # Mixed elements
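Getting elements back out of a list works a little differently from a vector. The sketch below reuses my_list from above. The person list and its element names are assumptions added for illustration.

```r
my_list <- list(1.5, "apple", TRUE, c(1, 2, 3))

# Double brackets extract the element itself
my_list[[2]]          # "apple"
class(my_list[[2]])   # "character"

# Single brackets return a smaller list instead
class(my_list[2])     # "list"

# Reach into the vector stored as the fourth element
my_list[[4]][2]       # 2

# Elements can also be named and accessed with $
person <- list(name = "James", age = 18)
person$age            # 18
```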
A factor is a special type of vector used to organize categorical data. It stores the underlying data as integers and a set of labels called levels that describe what those integers represent. Example:
ratings <- factor(c("easy", "easy", "hard", "medium"))
str(ratings) # Factor w/ 3 levels "easy","hard",..: 1 1 2 3
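To see the two parts of a factor separately, here is a minimal sketch. By default R sorts the levels alphabetically, which puts "hard" before "medium". The levels argument sets the order explicitly. The name ratings_by_difficulty is made up for this example.

```r
ratings <- factor(c("easy", "easy", "hard", "medium"))

levels(ratings)       # "easy" "hard" "medium"
as.integer(ratings)   # 1 1 2 3 (the underlying integer codes)

# Set the level order explicitly with the levels argument
ratings_by_difficulty <- factor(
  c("easy", "easy", "hard", "medium"),
  levels = c("easy", "medium", "hard")
)
levels(ratings_by_difficulty)       # "easy" "medium" "hard"
as.integer(ratings_by_difficulty)   # 1 1 3 2
```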
For multi-dimensional data, R provides more complex data structures such as arrays, matrices, and data frames. Data frames are the most commonly used of these in data analysis, so I’ve devoted a separate guide to those here: [COMING SOON].