R Excel Files
Microsoft Excel is the spreadsheet program which stores data in the .xls and .xlsx format. R has the facility to read directly from these files using some excel specific packages.
Some excel packages are: XLConnect, xlsx, gdata, RODBC, RExcel, etc. Here, we will use the xlsx package.
Install xlsx package
Xlsx package provides all necessary tools to interact with Excel files from R. Many of the benefits of the xlsx package come from being able to export and format excel files from R. Following command is used to install the xlsx package in R:
install.packages("xlsx")
Verify and Load the “xlsx” Package
Following command is used to verify and load the package:
# Verify the package is installed. any(grepl("xlsx",installed.packages())) # Load the library into R workspace. library("xlsx")
Output:
TRUE package ‘rJava’ successfully unpacked and MD5 sums checked package ‘xlsxjars’ successfully unpacked and MD5 sums checked package ‘xlsx’ successfully unpacked and MD5 sums checked
Creating an Excel File
Create two worksheets in excel. Use the default name sheet1 and sheet2. Give the name of excel is Student.xlsx.
Sheet1:
Roll_no name Age addmission_date dept 1 Nikita 24 2018-01-01 IT 2 Deep 29 2015-09-23 CS 3 Aryan 21 2017-11-15 Finance 4 Nidhi 25 2015-03-27 HR 5 Aman 27 2013-05-21 IT
And sheet2:
name city Nikita Bangalore Deep Delhi Aryan Bhopal Nidhi Raipur Aman Noida
Reading the Excel File
In R, read.xlsx() function is used to read the excel file. The result will store as a data frame in the R environment.
Example:
Let's see an example to read the first worksheet using a sheet index or name:
# using sheet name read.xlsx("Student.xlsx", sheetName = "Sheet1") # using sheet index read.xlsx("Student.xlsx", sheetIndex = 1)
Output:
Roll_no name Age addmission_date dept 1 1 Nikita 24 2018-01-01 IT 2 2 Deep 29 2015-09-23 CS 3 3 Aryan 21 2017-11-15 Finance 4 4 Nidhi 25 2015-03-27 HR 5 5 Aman 27 2013-05-21 IT Roll_no name Age addmission_date dept 1 1 Nikita 24 2018-01-01 IT 2 2 Deep 29 2015-09-23 CS 3 3 Aryan 21 2017-11-15 Finance 4 4 Nidhi 25 2015-03-27 HR 5 5 Aman 27 2013-05-21 IT
Read the second worksheet:
# using sheet index read.xlsx("Student.xlsx", sheetIndex = 2)
Output:
name city 1 Nikita Bangalore 2 Deep Delhi 3 Aryan Bhopal 4 Nidhi Raipur 5 Aman Noida
If we want to read data starts further down in the Excel worksheet we can include the startRow argument.
Example:
# reading data below the second line read.xlsx("Student.xlsx", sheetName = "Sheet2", startRow = 3)
Output:
Deep Delhi 1 Aryan Bhopal 2 Nidhi Raipur 3 Aman Noida
We can also specify the number of rows and columns. For this xlsx package has rowIndex and colIndex argument.
Example:
# read in a range of rows read.xlsx("Student.xlsx", sheetName = "Sheet2", rowIndex = 3:5)
Output:
Deep Delhi 1 Aryan Bhopal 2 Nidhi RaipurReference: https://www.datacamp.com/community/tutorials/r-tutorial-read-excel-into-r