New Variables
We might want to create new variables because we want to:
flag missing values, e.g. NA
create categorical values from quantitative ones, for example by cutting numbers into ranges
apply transformations
Get some data to summarize
We will use data from the following page: https://data.baltimorecity.gov/Culture-Arts/Restaurants/k5ry-ef3g
Run this code in order to download sample data for this tutorial.
if (!file.exists("./data")) { dir.create("./data") }
url <- "https://data.baltimorecity.gov/api/views/k5ry-ef3g/rows.csv?accessType=DOWNLOAD"
download.file(url, destfile="./data/restaurants.csv", method="curl")
data <- read.csv("./data/restaurants.csv")
How to create a sequence
Sometimes we need to create a sequence, for example to index the rows of a data set.
> seq(1, 10, by=2)
[1] 1 3 5 7 9
Create new column by subsetting
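A minimal sketch of creating a column by subsetting, using the %in% operator to mark rows whose value appears in a chosen set. The small data frame below is a hypothetical stand-in for the downloaded file, and the column name neighborhood and the neighborhoods picked are assumptions made for illustration.

```r
# Hypothetical stand-in for the restaurants data (placeholder names)
data <- data.frame(
  name = c("A", "B", "C", "D"),
  neighborhood = c("Downtown", "Fells Point", "Canton", "Roland Park"),
  stringsAsFactors = FALSE
)

# TRUE where the restaurant is in one of the chosen neighborhoods
data$nearMe <- data$neighborhood %in% c("Roland Park", "Homeland")

# Count how many rows fall in each group
table(data$nearMe)
```

The new column is logical, so it can be used directly to filter rows, e.g. data[data$nearMe, ].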
Create binary column
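A binary column flags rows that meet a condition. The sketch below marks zip codes that are negative and therefore invalid. The toy data frame and the column name zipWrong are hypothetical.

```r
# Hypothetical stand-in for the restaurants data, with one invalid zip code
data <- data.frame(zipCode = c(21201, 21231, -21226, 21211))

# TRUE for invalid (negative) zip codes, FALSE otherwise
data$zipWrong <- ifelse(data$zipCode < 0, TRUE, FALSE)

# Cross-check the new column against the condition it was built from
table(data$zipWrong, data$zipCode < 0)
```

Because the condition is already logical, data$zipCode < 0 alone would give the same column. ifelse() is more useful when the two outcomes are other values, such as labels.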
Create categorical values
We can categorize zipCode values into quartile ranges: 0-25%, 25-50%, 50-75% and 75-100%.
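One way to do this is to pass the output of quantile() as the breaks for cut(). This is a sketch on a hypothetical toy data frame, and the column name zipGroups is an assumption. Setting include.lowest = TRUE is a choice made here so that the smallest value is not left out of the first interval.

```r
# Hypothetical stand-in for the restaurants data
data <- data.frame(
  zipCode = c(21201, 21202, 21205, 21211, 21213, 21217, 21224, 21230)
)

# quantile() returns the 0%, 25%, 50%, 75% and 100% cut points by default
breaks <- quantile(data$zipCode)

# cut() assigns each value to the interval it falls in and returns a factor
data$zipGroups <- cut(data$zipCode, breaks = breaks, include.lowest = TRUE)

# Number of rows in each quartile group
table(data$zipGroups)
```

Without include.lowest = TRUE, the minimum value would be assigned NA, because the intervals are open on the left by default.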
An easier way to do that is to use the Hmisc package.
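A sketch using cut2() from Hmisc, where the g argument asks for a number of quantile groups so the breaks do not have to be computed by hand. It assumes Hmisc is installed, and the code is skipped otherwise. The toy data frame is hypothetical.

```r
# Hypothetical stand-in for the restaurants data
data <- data.frame(
  zipCode = c(21201, 21202, 21205, 21211, 21213, 21217, 21224, 21230)
)

# Run only if Hmisc is available: install.packages("Hmisc") if it is not
if (requireNamespace("Hmisc", quietly = TRUE)) {
  # g = 4 splits the values into four quantile groups
  data$zipGroups <- Hmisc::cut2(data$zipCode, g = 4)
  print(table(data$zipGroups))
}
```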
Create factor values
zipCode is stored as an integer, and we might want to turn it into a factor.
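A minimal sketch of the conversion with factor(), on a hypothetical toy data frame. The column name zcf is an assumption.

```r
# Hypothetical stand-in for the restaurants data
data <- data.frame(zipCode = c(21201L, 21231L, 21201L, 21211L))

# Each distinct zip code becomes one level of the factor
data$zcf <- factor(data$zipCode)

class(data$zcf)    # "factor"
levels(data$zcf)   # "21201" "21211" "21231"
```

Treating zip codes as a factor makes it clear that they are labels, so functions such as table() or summary() count them instead of averaging them.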
Levels of factor variables
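By default, factor levels are ordered alphabetically. The sketch below sets the order explicitly with the levels argument, changes the reference level with relevel(), and shows the integer codes behind the factor. The vector yesno is a made-up example.

```r
# Made-up example vector
yesno <- c("yes", "no", "no", "yes", "no")

# Set the level order explicitly; the default would be "no", "yes"
yesnofac <- factor(yesno, levels = c("yes", "no"))
levels(yesnofac)       # "yes" "no"

# Integer codes follow the level order: "yes" is 1, "no" is 2
as.numeric(yesnofac)   # 1 2 2 1 2

# relevel() moves the chosen level to the first position
releveled <- relevel(yesnofac, ref = "no")
levels(releveled)      # "no" "yes"
```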
Mutate function
The mutate function adds a new variable to a data frame and returns the result as a new data frame.
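A sketch using mutate() from dplyr. This is an assumption, since the text does not say which package is meant, and plyr provides a function of the same name. The code is skipped if dplyr is not installed, and the toy data frame and the names data2 and zipWrong are hypothetical.

```r
# Hypothetical stand-in for the restaurants data
data <- data.frame(
  name = c("A", "B", "C", "D"),
  zipCode = c(21201, 21231, -21226, 21211)
)

# Run only if dplyr is available: install.packages("dplyr") if it is not
if (requireNamespace("dplyr", quietly = TRUE)) {
  # mutate() returns a copy with the extra column; data itself is unchanged
  data2 <- dplyr::mutate(data, zipWrong = zipCode < 0)
  print(data2)
}
```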
Transformations
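A few common base R transformations, shown on a single made-up value. All of them are vectorized, so they can be applied to a whole column in the same way.

```r
x <- 3.14159

abs(-x)                # absolute value: 3.14159
sqrt(16)               # square root: 4
ceiling(x)             # round up: 4
floor(x)               # round down: 3
round(x, digits = 2)   # round to 2 decimal places: 3.14
signif(x, digits = 2)  # round to 2 significant digits: 3.1
log(exp(1))            # natural logarithm: 1
log2(8)                # base-2 logarithm: 3
log10(1000)            # base-10 logarithm: 3
exp(0)                 # exponential: 1
```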
More functions here: http://www.statmethods.net/management/functions.html