Data validation
Validating that your data are as you expect is a critical step before processing them. Minimally, you should inspect your data sets with str() or glimpse() or look at them in the Environment tab. You can build validation rules to check your data with assertions and testing.
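As a rough illustration of what assertion-based validation rules can look like, here is a minimal sketch in base R using stopifnot() on the built-in airquality data. The specific rules (expected column names, month and day ranges) are assumptions chosen for illustration, not rules prescribed by this tutorial.

```r
# Minimal validation sketch using base R assertions on the built-in airquality data.
data(airquality)

# Structural checks: the object is a data frame with the columns we expect.
expected_cols <- c("Ozone", "Solar.R", "Wind", "Temp", "Month", "Day")
stopifnot(
  is.data.frame(airquality),
  identical(names(airquality), expected_cols),
  is.numeric(airquality$Wind)
)

# Range checks (illustrative rules): months run May through September, days are 1 to 31.
stopifnot(
  all(airquality$Month %in% 5:9),
  all(airquality$Day >= 1 & airquality$Day <= 31),
  !anyNA(airquality$Temp)
)

# Some columns legitimately contain missing values, so count them rather than fail.
na_counts <- colSums(is.na(airquality))
print(na_counts)
```

If any rule is violated, stopifnot() stops with an error naming the failed condition, which makes it a lightweight option for checks at the top of a script.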
Summarizing data
There are many different ways to summarize data in R. head() is a way to see the first few rows of a data set using base R, whereas glimpse() is the tidyverse way. Try both head() and glimpse() on the airquality data set (built into base R).
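One way to try both functions is sketched below. It assumes glimpse() comes from the {dplyr} package and falls back to base R's str() if {dplyr} is not installed; the variable name first_rows is just for illustration.

```r
# head() returns the first six rows by default.
first_rows <- head(airquality)
print(first_rows)

# glimpse() prints one line per column; fall back to str() if dplyr is unavailable.
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::glimpse(airquality)
} else {
  str(airquality)
}
```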
Next, run the summary() function on the airquality data.
summary(airquality)
Validating with {dataReporter}
The {dataReporter} package provides a convenient way of generating a data dictionary (which the package calls a codebook) along with an overview of your data. Install and load {dataReporter}. Run makeCodebook() on toyData to explore this data set.
Wrap-up
Congratulations, you finished the tutorial!
To get credit for this assignment, replace my name with the first name that you submitted in the course introduction form in the code below and click Run Code to generate the text for you to submit to Canvas.
# replace my name below with your first name (surrounded by quotes)
first_name <- "Jeff"
generate_text(first_name)
Assignment complete!
Great! Copy that code into Canvas, and you're all set for this tutorial.