R Data Reshaping
In R programming, data reshaping is the process of changing the way data is organized into rows and columns.
Most of the time data processing is done by taking the input data as a data frame.
Sometimes we need the data frame in a different format. R provides many functions to split and merge data frames and to change rows to columns and vice versa.
Joining Columns and Rows in a Data Frame
cbind()
Syntax:
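cbind(x1, x2, ...)
Here, x1 and x2 may be data frames, matrices or vectors.
We use the cbind() function to combine vectors, matrices or data frames by columns. When all arguments are vectors, the result is a matrix, so values of different types are coerced to a common type.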
Example:
name <- c("Deep", "Nikita", "Akansha", "Raj")
cls <- c("BE", "MCA", "MSC", "MCA")
age <- c(28, 24, 25, 30)
# Combine the three vectors above by columns.
# The result is a character matrix, not a data frame.
students <- cbind(name, cls, age)
# Print the matrix.
print(students)
Output:
     name      cls   age 
[1,] "Deep"    "BE"  "28"
[2,] "Nikita"  "MCA" "24"
[3,] "Akansha" "MSC" "25"
[4,] "Raj"     "MCA" "30"
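The quoted "28" in the output above shows that age was converted to text, because a matrix can hold only one type. The following is a minimal sketch of one way to keep the column types, assuming the same name, cls and age vectors. The names students_matrix, students_df and students_df2, and the passed column, are hypothetical and used only for illustration.

```r
name <- c("Deep", "Nikita", "Akansha", "Raj")
cls  <- c("BE", "MCA", "MSC", "MCA")
age  <- c(28, 24, 25, 30)

# cbind() on plain vectors returns a character matrix: age is coerced to text.
students_matrix <- cbind(name, cls, age)
print(typeof(students_matrix))

# data.frame() keeps each column's own type, so age stays numeric.
students_df <- data.frame(name, cls, age)
str(students_df)

# cbind() returns a data frame when at least one argument is a data frame.
# The 'passed' column is made-up illustration data.
students_df2 <- cbind(students_df, passed = c(TRUE, TRUE, FALSE, TRUE))
print(students_df2)
```

If the goal is a data frame, building it with data.frame() from the start avoids the coercion step entirely.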
rbind()
The rbind() function is used to combine vectors, matrices or data frames by rows.
Syntax:
rbind(x1, x2, ...)
Here, x1 and x2 may be a data frame, matrix or vector.
Example:
students <- data.frame(
  name = c("Deep", "Nikita", "Akansha", "Raj"),
  cls = c("BE", "MCA", "MSC", "MCA"),
  age = c(28, 24, 25, 30)
)
# Print the first data frame.
print(students)
new.students <- data.frame(
  name = c("Akash", "Jhon", "Dolly", "Priya"),
  cls = c("BSC", "BCA", "MSC", "MCA"),
  age = c(21, 20, 26, 24)
)
# Print the second data frame.
print(new.students)
# Combine the rows from both data frames.
all.students <- rbind(students, new.students)
# Print the result.
print(all.students)
Output:
     name cls age
1    Deep  BE  28
2  Nikita MCA  24
3 Akansha MSC  25
4     Raj MCA  30
   name cls age
1 Akash BSC  21
2  Jhon BCA  20
3 Dolly MSC  26
4 Priya MCA  24
     name cls age
1    Deep  BE  28
2  Nikita MCA  24
3 Akansha MSC  25
4     Raj MCA  30
5   Akash BSC  21
6    Jhon BCA  20
7   Dolly MSC  26
8   Priya MCA  24
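The example above works because both data frames have the same column names. The sketch below illustrates how rbind() treats column names on data frames: columns are matched by name rather than by position, and a mismatched name raises an error. The data frames reordered and mismatched are hypothetical and exist only for this illustration.

```r
students <- data.frame(
  name = c("Deep", "Nikita"),
  cls = c("BE", "MCA"),
  age = c(28, 24)
)

# Same columns in a different order: rbind() lines them up by name.
reordered <- data.frame(age = 21, name = "Akash", cls = "BSC")
combined <- rbind(students, reordered)
print(combined)

# A differently named column ('class' instead of 'cls') cannot be matched,
# so rbind() signals an error. tryCatch() captures the message here.
mismatched <- data.frame(name = "Jhon", class = "BCA", age = 20)
result <- tryCatch(
  rbind(students, mismatched),
  error = function(e) conditionMessage(e)
)
print(result)
```

Renaming the columns with names() before calling rbind() is one simple way to resolve such a mismatch.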
Merging Data Frame
We can merge two data frames by using the merge() function, which is similar to a join in SQL. Here, both data frames must have at least one common column by which we can join them.
Example:
Let's see an example to merge two data frames by roll_no:
data.frame1 <- data.frame(
  roll_no = c(101, 102, 103, 104),
  name = c("Deep", "Nikita", "Akansha", "Raj"),
  cls = c("BE", "MCA", "MSC", "MCA"),
  age = c(28, 24, 25, 30)
)
# Print the first data frame.
print(data.frame1)
data.frame2 <- data.frame(
  roll_no = c(101, 102, 103, 104),
  dept = c("CS", "IT", "IT", "CS"),
  marks = c(90, 70, 88, 22)
)
# Print the second data frame.
print(data.frame2)
# Merge the two data frames on the common roll_no column.
total <- merge(data.frame1, data.frame2, by = "roll_no")
print(total)
Output:
  roll_no    name cls age
1     101    Deep  BE  28
2     102  Nikita MCA  24
3     103 Akansha MSC  25
4     104     Raj MCA  30
  roll_no dept marks
1     101   CS    90
2     102   IT    70
3     103   IT    88
4     104   CS    22
  roll_no    name cls age dept marks
1     101    Deep  BE  28   CS    90
2     102  Nikita MCA  24   IT    70
3     103 Akansha MSC  25   IT    88
4     104     Raj MCA  30   CS    22
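In the example above every roll_no appears in both data frames, so nothing is dropped. When the keys only partly overlap, the all, all.x and all.y arguments of merge() control which rows are kept, much like inner, left and full outer joins in SQL. The sketch below assumes two small hypothetical data frames, left and right, with made-up values.

```r
# Hypothetical data: roll_no 101 has no marks, roll_no 105 has no name.
left <- data.frame(
  roll_no = c(101, 102, 103),
  name = c("Deep", "Nikita", "Akansha")
)
right <- data.frame(
  roll_no = c(102, 103, 105),
  marks = c(70, 88, 64)
)

# Default behaviour: keep only keys present in both (inner join).
inner <- merge(left, right, by = "roll_no")
print(inner)

# all.x = TRUE keeps every row of the first data frame (left join);
# missing values from the second data frame become NA.
left_join <- merge(left, right, by = "roll_no", all.x = TRUE)
print(left_join)

# all = TRUE keeps every row from both data frames (full outer join).
full <- merge(left, right, by = "roll_no", all = TRUE)
print(full)
```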
Transpose
We can transpose a matrix or data frame by using the t() function.
Example:
a <- matrix(c(1:9), nrow = 3, byrow = TRUE)
a
x <- t(a)
x
Output:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
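The example above transposes a matrix. Applied to a data frame, t() first converts it to a matrix, so the result is a matrix as well, and mixed column types are coerced to character. A minimal sketch, assuming a small hypothetical data frame named scores:

```r
# Hypothetical two-row data frame with row names, used only for illustration.
scores <- data.frame(x1 = c(5, 2), x2 = c(6, 1), row.names = c("a", "b"))

# t() converts the data frame to a matrix before transposing it.
scores_t <- t(scores)
print(scores_t)
print(is.matrix(scores_t))

# Convert back explicitly if a data frame is needed afterwards.
scores_t_df <- as.data.frame(scores_t)
print(scores_t_df)

# With mixed column types, every value is coerced to character.
mixed_t <- t(data.frame(name = "Deep", age = 28))
print(typeof(mixed_t))
```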
Melting and Casting
We can reshape the data in multiple steps in order to convert the input data into the required format.
We generally melt the data so that each row is converted into a unique id-variable combination. Then we cast this data into the desired format. The functions used to do this are melt() and cast() from the reshape package.
melt()
Example:
# Install the reshape package.
install.packages("reshape")
# Create a data frame.
data1 <- data.frame(
  id = c(1, 1, 2, 2),
  rank = c(1, 2, 1, 2),
  x1 = c(5, 2, 6, 7),
  x2 = c(6, 1, 5, 8)
)
# Load the reshape package.
library(reshape)
# Melt the data, keeping id and rank as identifier columns.
mdata <- melt(data1, id = c("id", "rank"))
print(mdata)
Output:
  id rank variable value
1  1    1       x1     5
2  1    2       x1     2
3  2    1       x1     6
4  2    2       x1     7
5  1    1       x2     6
6  1    2       x2     1
7  2    1       x2     5
8  2    2       x2     8
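To make the melted layout less opaque, the following sketch builds the same long format by hand using only base R. It is an illustration of what the output above contains, not how melt() is implemented. The names measure and long are hypothetical.

```r
data1 <- data.frame(
  id = c(1, 1, 2, 2),
  rank = c(1, 2, 1, 2),
  x1 = c(5, 2, 6, 7),
  x2 = c(6, 1, 5, 8)
)

# Columns to stack; every other column is treated as an identifier.
measure <- c("x1", "x2")

# Repeat the identifier columns once per measured column, then stack
# the measured values underneath each other.
long <- data.frame(
  id = rep(data1$id, times = length(measure)),
  rank = rep(data1$rank, times = length(measure)),
  variable = rep(measure, each = nrow(data1)),
  value = unlist(data1[measure], use.names = FALSE)
)
print(long)
```

Each of the four input rows contributes one output row per measured column, which is why the melted data has eight rows.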
cast()
Example:
Let's cast the melted data to evaluate mean:
# Cast the melted data, taking the mean of each variable per id and per rank.
idmeans <- cast(mdata, id ~ variable, mean)
rankmeans <- cast(mdata, rank ~ variable, mean)
print(idmeans)
print(rankmeans)
Output:
  id  x1  x2
1  1 3.5 3.5
2  2 6.5 6.5
  rank  x1  x2
1    1 5.5 5.5
2    2 4.5 4.5
Reference: https://data-flair.training/blogs/r-data-reshaping-function-package/