Summarizing
Get some data to summarize
We will use data from the following page: https://data.baltimorecity.gov/Culture-Arts/Restaurants/k5ry-ef3g
Run this code in order to download sample data for this tutorial.
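The download step can be sketched as below. This is a minimal sketch, not the original code: fetch_restaurants is a hypothetical helper, the export URL is an assumption based on the usual "rows.csv" export pattern for such portals and may no longer resolve, and the data/restaurants.csv destination is an arbitrary choice.

```r
# Hypothetical helper: download the CSV export and read it into a data frame.
# ASSUMPTION: the export URL below follows the portal's usual "rows.csv"
# pattern; it is not confirmed by the page and may have moved.
fetch_restaurants <- function(
    url = "https://data.baltimorecity.gov/api/views/k5ry-ef3g/rows.csv?accessType=DOWNLOAD",
    dest = file.path("data", "restaurants.csv")) {
  # Create the target directory if it does not exist yet.
  dir.create(dirname(dest), showWarnings = FALSE, recursive = TRUE)
  # Fetch the file; "wb" avoids line-ending conversion on Windows.
  download.file(url, destfile = dest, mode = "wb")
  # Parse the CSV, keeping text columns as character vectors.
  read.csv(dest, stringsAsFactors = FALSE)
}

# Usage (needs network access):
# restData <- fetch_restaurants()
```

The helper is defined but not called here, so the rest of the tutorial can also be followed offline with any data frame named restData.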
Inspect the data to confirm that it loaded and is not empty.
Check size in bytes.
View the data as a table.
View first few rows.
View last few rows.
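The inspection steps above might look like the following sketch. The restData data frame here is a small invented stand-in (the names and values are made up; only the column names follow the dataset), and object.size reports the in-memory size, which is an assumption about what "size in bytes" refers to.

```r
# Invented toy data standing in for the downloaded restaurants table.
restData <- data.frame(
  name = c("Cafe A", "Diner B", "Grill C", "Bistro D", "Pub E", "Deli F"),
  zipCode = c(21212, 21213, 21212, 21231, 21213, 21226),
  councilDistrict = c(2, 4, NA, 1, 13, 10),
  policeDistrict = c("NORTHERN", "EASTERN", "NORTHERN",
                     "SOUTHEASTERN", "EASTERN", "SOUTHERN"),
  stringsAsFactors = FALSE
)

# Make sure there is at least something: rows and columns.
dim(restData)

# Size of the object in memory, printed in bytes.
size_bytes <- object.size(restData)
print(size_bytes, units = "B")

# Spreadsheet-style viewer; only works in an interactive session.
# View(restData)

# First and last few rows.
first_rows <- head(restData, n = 3)
last_rows <- tail(restData, n = 3)
print(first_rows)
print(last_rows)
```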
Get summary for each attribute.
Return types and more info about data.
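As a rough illustration of these two steps, summary gives per-column statistics and str shows the type of each column along with a preview of its values. The data frame below is invented toy data, not the real download.

```r
# Invented toy data standing in for the downloaded restaurants table.
restData <- data.frame(
  name = c("Cafe A", "Diner B", "Grill C", "Bistro D", "Pub E", "Deli F"),
  zipCode = c(21212, 21213, 21212, 21231, 21213, 21226),
  councilDistrict = c(2, 4, NA, 1, 13, 10),
  policeDistrict = c("NORTHERN", "EASTERN", "NORTHERN",
                     "SOUTHEASTERN", "EASTERN", "SOUTHERN"),
  stringsAsFactors = FALSE
)

# Per-column summary: min/quartiles/mean/max for numeric columns,
# length and class for character columns, plus NA counts where present.
col_summary <- summary(restData)
print(col_summary)

# Structure: number of rows and columns, each column's type, first values.
str(restData)
```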
See the quantiles of the values in a specific column.
Get the quantiles at chosen probabilities.
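A possible version of the quantile steps, using invented toy values for councilDistrict. The choice of column and of the probabilities 0.5, 0.75 and 0.9 is an assumption made for illustration.

```r
# Invented toy data: one numeric column with a missing value.
restData <- data.frame(
  name = c("Cafe A", "Diner B", "Grill C", "Bistro D", "Pub E", "Deli F"),
  councilDistrict = c(2, 4, NA, 1, 13, 10),
  stringsAsFactors = FALSE
)

# Default quantiles (0%, 25%, 50%, 75%, 100%).
# na.rm = TRUE is needed because quantile() stops with an error on NA values.
q_default <- quantile(restData$councilDistrict, na.rm = TRUE)
print(q_default)

# Quantiles at chosen probabilities.
q_custom <- quantile(restData$councilDistrict,
                     probs = c(0.5, 0.75, 0.9), na.rm = TRUE)
print(q_custom)
```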
Create a frequency table of the values in a column. Setting the useNA option to "ifany" makes the table include a count of missing values whenever there are any.
Make two dimensional table.
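The table steps could be written roughly as follows. The data is an invented toy sample, and which columns to tabulate is an assumption.

```r
# Invented toy data standing in for the downloaded restaurants table.
restData <- data.frame(
  zipCode = c(21212, 21213, 21212, 21231, 21213, 21226),
  councilDistrict = c(2, 4, NA, 1, 13, 10)
)

# One-dimensional frequency table of a column.
zip_counts <- table(restData$zipCode)
print(zip_counts)

# useNA = "ifany" adds an <NA> entry when the column has missing values.
district_counts <- table(restData$councilDistrict, useNA = "ifany")
print(district_counts)

# Two-dimensional table: rows are council districts, columns are zip codes.
# Rows with NA are dropped here because useNA defaults to "no".
two_way <- table(restData$councilDistrict, restData$zipCode)
print(two_way)
```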
Check for missing values. The following command returns the number of NA values in a column.
The same check, but returning TRUE or FALSE instead of a count.
Check whether every value in a column satisfies a condition.
Check all columns and get the count of NA values for each column.
Convert the command above into a single command that returns TRUE or FALSE instead of counts.
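One way to write these five checks, shown on invented toy data. The column choices and the zipCode > 0 condition are assumptions for illustration.

```r
# Invented toy data standing in for the downloaded restaurants table.
restData <- data.frame(
  name = c("Cafe A", "Diner B", "Grill C", "Bistro D", "Pub E", "Deli F"),
  zipCode = c(21212, 21213, 21212, 21231, 21213, 21226),
  councilDistrict = c(2, 4, NA, 1, 13, 10),
  stringsAsFactors = FALSE
)

# Number of NA values in one column.
na_count <- sum(is.na(restData$councilDistrict))

# Same check as a single TRUE/FALSE answer.
has_na <- any(is.na(restData$councilDistrict))

# Does every value satisfy a condition?
all_positive <- all(restData$zipCode > 0)

# NA count for each column.
na_per_column <- colSums(is.na(restData))

# Single TRUE/FALSE: TRUE only when no column contains an NA.
no_na_anywhere <- all(colSums(is.na(restData)) == 0)

print(na_count)
print(has_na)
print(all_positive)
print(na_per_column)
print(no_na_anywhere)
```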
Find all values that fulfill a condition.
The same as above, but with multiple values. The values are combined with OR, so an entry matches if it equals any of them.
Filter rows by a condition and return the matching rows themselves (not just counts).
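A sketch of the matching and filtering steps using the %in% operator on invented toy data. The zip codes tested are arbitrary examples.

```r
# Invented toy data standing in for the downloaded restaurants table.
restData <- data.frame(
  name = c("Cafe A", "Diner B", "Grill C", "Bistro D", "Pub E", "Deli F"),
  zipCode = c(21212, 21213, 21212, 21231, 21213, 21226),
  stringsAsFactors = FALSE
)

# Count how many entries match a single value.
one_zip <- table(restData$zipCode %in% c(21212))
print(one_zip)

# Several values: an entry matches if it equals any of them (OR).
two_zips <- table(restData$zipCode %in% c(21212, 21213))
print(two_zips)

# Use the logical vector as a row index to get the matching rows themselves.
matching_rows <- restData[restData$zipCode %in% c(21212, 21213), ]
print(matching_rows)
```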
Cross tabs. The following example is nonsensical as an analysis, but it breaks the data down by policeDistrict and councilDistrict and sums the zipCode values in each cell.
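The cross tab described above can be sketched with xtabs, where the left-hand side of the formula is the value to sum and the right-hand side lists the grouping columns. The data below is invented and has no missing values, to keep the sums easy to check.

```r
# Invented toy data standing in for the downloaded restaurants table.
restData <- data.frame(
  zipCode = c(21212, 21213, 21212, 21231, 21213, 21226),
  councilDistrict = c(2, 4, 2, 1, 4, 10),
  policeDistrict = c("NORTHERN", "EASTERN", "NORTHERN",
                     "SOUTHEASTERN", "EASTERN", "SOUTHERN"),
  stringsAsFactors = FALSE
)

# Sum zipCode within every policeDistrict x councilDistrict combination.
# Combinations that never occur get a sum of 0.
xt <- xtabs(zipCode ~ policeDistrict + councilDistrict, data = restData)
print(xt)
```

Summing zip codes has no real meaning; a count or a measured quantity on the left-hand side would be the more typical use.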