Loading Data Files in R

Supported Formats

R can load the following data file formats:

  • Compressed CSV (.csv.gz)

  • Parquet (.parquet)

Download the Data Files

Download and save the data files to a suitable location. In the examples that follow, the data has been saved to C:\data.
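As a small sketch of how the paths used in the examples can be built, the snippet below assembles them with file.path() instead of typing escaped backslashes. R accepts forward slashes in Windows paths. The variable names (data_dir, date_csv, date_parquet) are illustrative and not part of the original guide.

```r
# Base directory used in the examples in this guide.
data_dir <- "C:/data"

# file.path() joins the parts with "/", which R also accepts on Windows.
date_csv <- file.path(data_dir, "date", "date.csv.gz")
date_parquet <- file.path(data_dir, "date", "date.parquet")

# Check that a file is where you expect it before trying to load it.
file.exists(date_csv)
```

If file.exists() returns FALSE, check the download location before moving on to the loading steps.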

Loading Compressed (Gzip) CSV Data Files

The following steps show how to load compressed (Gzip) CSV data files in R.

Read the entire compressed (Gzip) CSV data file directly into a data frame.

df_date <- read.csv("C:\\data\\date\\date.csv.gz")
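The sketch below illustrates the same call on a small, self-contained example, so it can be run without the downloaded data. It writes a made-up table to a temporary .csv.gz file and reads it back. The column names (date_key, year) are hypothetical. It also shows the standard read.csv arguments nrows and colClasses, which can be useful for previewing a large file.

```r
# Write a small gzip-compressed CSV to a temporary file (made-up data).
tmp_csv <- tempfile(fileext = ".csv.gz")
con <- gzfile(tmp_csv, open = "wt")
write.csv(
  data.frame(date_key = 1:3, year = c(2017L, 2017L, 2018L)),
  con,
  row.names = FALSE
)
close(con)

# read.csv() reads gzip-compressed files directly; no separate
# decompression step is needed.
df_all <- read.csv(tmp_csv)

# Preview only the first two rows and declare the column types up front.
df_head <- read.csv(tmp_csv, nrows = 2, colClasses = c("integer", "integer"))
```

Reading a few rows first is a cheap way to check column names and types before loading a large file in full.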

Loading Parquet Data Files

The following steps show how to load Parquet data files in R.

Install the arrow package.

install.packages("arrow")

Load the arrow package.

library(arrow)
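If the script may be run more than once, the install and load steps can be combined so that arrow is only installed when it is missing. This is a minimal sketch; the repos value is an assumption (the CRAN cloud mirror) and can be replaced with any mirror you prefer.

```r
# Install arrow only if it is not already available.
# The repos URL is an assumption: the CRAN cloud mirror.
if (!requireNamespace("arrow", quietly = TRUE)) {
  install.packages("arrow", repos = "https://cloud.r-project.org")
}

# Attach the package so read_parquet() can be called directly.
library(arrow)
```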

Read the Parquet data file into a data frame.

df_date <- read_parquet("C:\\data\\date\\date.parquet")
df_anonymised_mot_test_result_info <- read_parquet("C:\\data\\anonymised_mot_test_result_info\\anonymised_mot_test_result_info.parquet")

Read a subset of the columns from the Parquet data file into a data frame.

df_mot_results_2017 <- read_parquet("C:\\data\\anonymised_mot_test_result\\anonymised_mot_test_result_2017.parquet", col_select = c("drv_anonymised_mot_test_date_key", "drv_anonymised_mot_test_result_info_key"))
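To illustrate column selection without the downloaded data, the sketch below writes a small, made-up table to a temporary Parquet file and reads back a subset of its columns. The column names are hypothetical. Besides a character vector of names, col_select also accepts a tidy selection such as ends_with(), which is shown as a second variant.

```r
library(arrow)

# Write a small Parquet file to a temporary location (made-up data).
tmp_parquet <- tempfile(fileext = ".parquet")
write_parquet(
  data.frame(
    test_date_key = 1:3,
    result_info_key = c(10L, 20L, 30L),
    note = c("a", "b", "c")
  ),
  tmp_parquet
)

# Select columns by name, as in the example above.
df_named <- read_parquet(
  tmp_parquet,
  col_select = c("test_date_key", "result_info_key")
)

# Select columns with a tidy selection helper instead of listing each name.
df_keys <- read_parquet(tmp_parquet, col_select = ends_with("_key"))
```

Because Parquet stores data by column, reading only the columns you need generally avoids loading the rest of the file into memory.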

Using R for Data Analysis

Guidance on how to analyse data in R is beyond the scope of this documentation.

You may find the following helpful:
