Loading Data Files in R
Supported Formats
R can load the following data file formats:
Compressed CSV (.csv.gz)
Parquet (.parquet)
Download the Data Files
Download and save the data files to a suitable location. In the examples that follow, the data has been saved to C:\data.
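Backslashes must be doubled inside R string literals, which is why the paths in the examples are written as "C:\\data\\...". As an alternative, the paths can be built with file.path(), which joins components with forward slashes that R also accepts on Windows. The following is a minimal sketch, assuming the files sit under C:\data as described above; the variable names are illustrative only.

```r
# Base directory where the data files were saved (assumed location; adjust as needed).
data_dir <- "C:/data"

# file.path() joins the components with "/" so no backslash escaping is needed.
date_csv_path <- file.path(data_dir, "date", "date.csv.gz")
date_parquet_path <- file.path(data_dir, "date", "date.parquet")

print(date_csv_path)
print(date_parquet_path)
```

Either form of path can then be passed to the read functions shown in the sections that follow.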
Loading Compressed (Gzip) CSV Data Files
The steps below show how to load compressed (Gzip) CSV data files in R.
Read the entire compressed (Gzip) CSV data file directly into a data frame.
df_date <- read.csv("C:\\data\\date\\date.csv.gz")
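read.csv() decompresses Gzip files transparently, so no separate unzip step is needed. The sketch below illustrates this with a small, made-up sample written to a temporary file, so it can be run without the real data; the column names and values are hypothetical. It also shows the nrows argument, which may be useful for previewing a large file before loading all of it.

```r
# Create a small sample data frame (made-up values, for illustration only).
sample_df <- data.frame(
  date_key = c(20170101L, 20170102L),
  day_name = c("Sunday", "Monday")
)

# Write it out as a Gzip-compressed CSV in a temporary location.
tmp_path <- tempfile(fileext = ".csv.gz")
con <- gzfile(tmp_path, "w")
write.csv(sample_df, con, row.names = FALSE)
close(con)

# Preview only the first data row of the compressed file.
df_preview <- read.csv(tmp_path, nrows = 1)

# Read the whole compressed file into a data frame.
df_all <- read.csv(tmp_path)
str(df_all)
```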
Loading Parquet Data Files
The steps below show how to load Parquet data files in R.
Install the arrow package.
install.packages("arrow")
Load the arrow package.
library(arrow)
Read the Parquet data file into a data frame.
df_date <- read_parquet("C:\\data\\date\\date.parquet")
df_anonymised_mot_test_result_info <- read_parquet("C:\\data\\anonymised_mot_test_result_info\\anonymised_mot_test_result_info.parquet")
Read a subset of the columns from the Parquet data file into a data frame.
df_mot_results_2017 <- read_parquet("C:\\data\\anonymised_mot_test_result\\anonymised_mot_test_result_2017.parquet", col_select = c("drv_anonymised_mot_test_date_key", "drv_anonymised_mot_test_result_info_key"))
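Because Parquet stores data by column, selecting columns at read time avoids loading the columns that are not needed. The sketch below shows col_select on a small, made-up Parquet file written to a temporary location, so it can be run without the real data. It assumes the arrow package is already installed; the sample values and the third column name are hypothetical.

```r
library(arrow)

# Small sample data frame (made-up values, for illustration only).
sample_df <- data.frame(
  drv_anonymised_mot_test_date_key = c(20170101L, 20170102L),
  drv_anonymised_mot_test_result_info_key = c(1L, 2L),
  other_column = c("a", "b")  # hypothetical column that will not be read back
)

# Write the sample to a temporary Parquet file.
tmp_path <- tempfile(fileext = ".parquet")
write_parquet(sample_df, tmp_path)

# Select columns by name, as in the example above.
df_by_name <- read_parquet(
  tmp_path,
  col_select = c("drv_anonymised_mot_test_date_key",
                 "drv_anonymised_mot_test_result_info_key")
)

# col_select also accepts tidy selection helpers such as starts_with().
df_by_prefix <- read_parquet(tmp_path, col_select = starts_with("drv_"))

print(names(df_by_name))
print(names(df_by_prefix))
```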
Using R for Data Analysis
Guidance on how to analyse data in R is beyond the scope of this documentation.
You may find the following helpful: