Combining data elements

Data types refer to individual elements of information. Those elements combine into different data structures. Technically, all elements of information (even individual ones) in R are vectors, and there are different forms of vectors.
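To see the point that even individual values are vectors, the short sketch below inspects a single number and a single string. The variable names are illustrative only and are not part of the tutorial.

```r
# A single value is stored as a vector of length one
single_number <- 42
single_word <- "hello"

is.vector(single_number)  # TRUE
length(single_number)     # 1

is.vector(single_word)    # TRUE
length(single_word)       # 1
```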

Atomic vectors are homogeneous, meaning that they contain a single data type. Lists are heterogeneous, meaning that they can contain multiple data types. We will refer to one-dimensional atomic vectors simply as vectors. In terms of lists, we will primarily work with data frames, which are rectangular lists. Tibbles are a special form of data frame.
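The difference between homogeneous and heterogeneous structures can be sketched as follows. The object names (mixed_vector, mixed_list, small_df) are made up for illustration.

```r
# c() builds an atomic vector, so mixed inputs are converted to one type
mixed_vector <- c(1, "two", TRUE)
typeof(mixed_vector)          # "character"

# list() lets each element keep its own type
mixed_list <- list(1, "two", TRUE)
sapply(mixed_list, typeof)    # "double" "character" "logical"

# A data frame is a list whose columns all have the same length
small_df <- data.frame(name = c("a", "b"), score = c(90, 85))
is.list(small_df)             # TRUE
```

Here every element of mixed_vector ends up as text, while the list keeps the number, the string, and the logical value as they were.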

Vectors

Vectors can include numeric, character, or logical data, but they can only contain a single data type. The simplest way to create a vector is by using the c() function.

Create a vector called cast that includes the words Kenan, Punkie, and Molly in that order.
cast <- c("Kenan", "Punkie", "Molly")

Combine the vectors weekdays and weekend to create a new vector called week that starts with Monday.

weekdays <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
weekend <- c("Saturday", "Sunday")
week <- c(weekdays, weekend)

Sequences

We can create sequences of numbers with seq() or the colon operator (:). For example, seq(from = 0, to = 100, by = 25) returns the quartile cut points on a 0 to 100 scale, and 1:3 returns 1, 2, 3. Give these examples a try.
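For reference, the two examples from this paragraph look like this when stored and printed. The variable names are illustrative only.

```r
# seq() with an explicit start, end, and step size
quartile_points <- seq(from = 0, to = 100, by = 25)
quartile_points   # 0 25 50 75 100

# The colon operator counts in steps of 1
one_to_three <- 1:3
one_to_three      # 1 2 3
```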

Write the code to produce a sequence from 0 to 1 in increments of 0.05 (include all argument names).
seq(from = 0, to = 1, by = 0.05)
Write the code to produce a sequence from 10 down to 0 in steps of 1, first using seq() (include all argument names), then using :. Replace each _ with your answer.
seq(from = _, to = _, by = _)
_:_
seq(from = 10, to = 0, by = -1)
10:0

Repetitions

Sometimes, you need to create a repetition of values, e.g., when creating a column of experimental conditions. You can use the rep() function to either repeat single values or vectors of values. Vectors can be repeated either as a whole vector (times argument) or each element of the vector can be repeated (each argument).

conditions <- c("Control", "Treatment A", "Treatment B")
rep(conditions, times = 3)
## [1] "Control"     "Treatment A" "Treatment B" "Control"     "Treatment A"
## [6] "Treatment B" "Control"     "Treatment A" "Treatment B"
rep(conditions, each = 3)
## [1] "Control"     "Control"     "Control"     "Treatment A" "Treatment A"
## [6] "Treatment A" "Treatment B" "Treatment B" "Treatment B"
Repeat the entire myvector 10 times.
myvector <- 1:5
rep(myvector, times = 10)
Repeat each element of myvector 10 times.
myvector <- 1:5
rep(myvector, each = 10)
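The examples above repeat whole vectors. As a small sketch of the single-value case mentioned earlier, and of how times and each change the order but not the length of the result, consider the following. The names control_only, whole, and by_element are illustrative.

```r
# Repeating a single value
control_only <- rep("Control", times = 3)
control_only   # "Control" "Control" "Control"

# times and each give the same length but a different order
short_vector <- 1:2
whole <- rep(short_vector, times = 3)       # 1 2 1 2 1 2
by_element <- rep(short_vector, each = 3)   # 1 1 1 2 2 2

length(whole) == length(by_element)         # TRUE
```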

Dimensions

We can use length() to find the length of a vector and dim() to get the dimensions of a data frame. We can also use nrow() and ncol() to get the number of rows and columns (respectively) for data frames.
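A minimal sketch of these four functions, using a made-up vector (scores) and a made-up data frame (grades):

```r
# length() counts the elements of a vector
scores <- c(88, 92, 79, 95)
length(scores)   # 4

# dim(), nrow(), and ncol() describe a data frame
grades <- data.frame(
  student = c("A", "B", "C"),
  score = c(88, 92, 79)
)
dim(grades)      # 3 2  (rows, then columns)
nrow(grades)     # 3
ncol(grades)     # 2
```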

Indexing

Extracting subsets of vector or data frame elements involves using the index operator []. For data frames, the first number represents the row and the second represents the column. For instance, mydf[2, 5] extracts the value from the second row and the fifth column. To extract several elements at once, use sequences or vectors to subset multiple rows and/or columns: mydf[1:2, c(3, 4, 5)]. Leave the row index empty to select all rows, or the column index empty to select all columns: mydf[, 2] returns the entire second column.

Here, we create a data frame.

mydf <- data.frame(matrix(1:25, ncol = 5))
names(mydf) <- letters[1:5]
mydf
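The indexing patterns described above can be tried on this data frame. Because this mydf has only five columns, the sketch uses column 5 as its highest column index.

```r
# Rebuild the data frame so the example runs on its own
mydf <- data.frame(matrix(1:25, ncol = 5))
names(mydf) <- letters[1:5]

# Single value: row 2, column 5
mydf[2, 5]              # 22

# Several rows and columns at once (returns a smaller data frame)
mydf[1:2, c(3, 4, 5)]

# Empty row index: every row of column 2
mydf[, 2]               # 6 7 8 9 10

# Empty column index: every column of row 3
mydf[3, ]
```

Note that matrix() fills column by column, so column e holds 21 through 25 and its second entry is 22.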

Quiz

Wrap-up

Congratulations, you finished the tutorial!

To get credit for this assignment, replace my name with the first name that you submitted in the course introduction form in the code below and click Run Code to generate the text for you to submit to Canvas.

# replace my name below with your first name (surrounded by quotes)
first_name <- "Jeff"
generate_text(first_name)

Assignment complete!

Great! Copy that code into Canvas, and you're all set for this tutorial.

Data structures