R Factors
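Factors are R data objects used to categorize data and store it as levels. A factor can be created from a numeric or a character vector; internally, R stores the values as integer codes that point to a set of character levels. Factors are widely used in data analysis and statistical modeling to represent categorical variables. The factor() function creates a factor.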
Example 1:
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
fdata
Output:
[1] Physics Maths Physics Bio Bio Maths Physics Maths Bio
Levels: Bio Maths Physics
Example 2:
d <- c(100, 400, 200, 500, 100, 400, 300, 200, 100, 500, 400, 500, 100)
ndata <- factor(d)
ndata
Output:
[1] 100 400 200 500 100 400 300 200 100 500 400 500 100
Levels: 100 200 300 400 500
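To make the "stored as levels" idea concrete, the following is a minimal sketch that inspects a numeric factor with base R functions. The input is a shortened version of the vector from Example 2, and the variable name original is only illustrative. It also shows a common pitfall: as.integer() returns the internal level codes, not the original numbers.

```r
# Shortened version of the vector from Example 2
d <- c(100, 400, 200, 500, 100)
ndata <- factor(d)

levels(ndata)      # the distinct values, stored as character strings
nlevels(ndata)     # the number of levels
as.integer(ndata)  # the internal integer codes, not the original numbers
table(ndata)       # how often each level occurs

# To recover the original numbers, convert to character first
original <- as.numeric(as.character(ndata))
original
```

Converting through as.character() first is what turns the level labels back into numbers; calling as.numeric() directly on the factor would return the codes instead.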
Accessing the Elements of a Factor
Accessing the elements of a factor works much like accessing the elements of a vector.
Example:
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
fdata
# access 4th element
fdata[4]
# access 2nd and 4th element
fdata[c(2,4)]
Output:
[1] Physics Maths Physics Bio Bio Maths Physics Maths Bio
Levels: Bio Maths Physics
[1] Bio
Levels: Bio Maths Physics
[1] Maths Bio
Levels: Bio Maths Physics
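Because indexing follows vector rules, negative and logical indices work as well. The sketch below reuses the same data; the names bio_only and dropped are illustrative. Note that a subset keeps all of the original levels unless they are removed explicitly, here with the base R function droplevels().

```r
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)

# everything except the first element
fdata[-1]

# logical indexing: keep only the "Bio" entries
bio_only <- fdata[fdata == "Bio"]
levels(bio_only)   # all three levels are still present

# remove the levels that no longer occur in the subset
dropped <- droplevels(bio_only)
levels(dropped)
```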
How to Modify a Factor
Elements of a factor can be modified using simple assignment. However, we cannot assign a value that is outside the factor’s predefined levels: doing so produces NA and a warning.
Example:
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
# modify 4th element
fdata[4] <- "Maths"
fdata
# cannot assign values outside the levels
fdata[3] <- "Chemistry"
fdata
Output:
[1] Physics Maths Physics Maths Bio Maths Physics Maths Bio
Levels: Bio Maths Physics
Warning message:
In `[<-.factor`(`*tmp*`, 3, value = "Chemistry") :
  invalid factor level, NA generated
[1] Physics Maths <NA> Maths Bio Maths Physics Maths Bio
Levels: Bio Maths Physics
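If a new value really is needed, one option is to extend the levels first and assign afterwards. This is a minimal sketch on a shortened input vector, not the only way to do it:

```r
fdata <- factor(c('Physics','Maths','Physics','Bio'))

# add "Chemistry" to the existing levels before assigning it
levels(fdata) <- c(levels(fdata), "Chemistry")
fdata[3] <- "Chemistry"
fdata
```

With the level in place, the assignment succeeds without a warning and no NA is generated.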
Changing the Order of Levels
We can change the order of the levels in a factor by applying the factor() function again with the new order of levels.
Example:
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
print(fdata)
# Apply the factor function with the required order of the levels.
newData <- factor(fdata, levels = c("Physics","Maths","Bio"))
print(newData)
Output:
[1] Physics Maths Physics Bio Bio Maths Physics Maths Bio
Levels: Bio Maths Physics
[1] Physics Maths Physics Bio Bio Maths Physics Maths Bio
Levels: Physics Maths Bio
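Reordering the levels leaves the values themselves untouched; only the internal integer codes change. The following sketch, on a shortened input vector, illustrates this:

```r
data <- c('Physics','Maths','Physics','Bio')
fdata <- factor(data)
newData <- factor(fdata, levels = c("Physics","Maths","Bio"))

# the values are the same before and after reordering
identical(as.character(fdata), as.character(newData))

# the integer codes follow the level order, so they differ
as.integer(fdata)     # codes relative to Bio, Maths, Physics
as.integer(newData)   # codes relative to Physics, Maths, Bio
```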
Generating Factor Levels
The gl() function generates a factor by specifying the pattern of its levels. Let’s see the syntax of gl():
Syntax:
gl(n, k, labels)
Here,
- n is the number of levels
- k is the number of replications
- labels is a vector of labels for the resulting factor levels
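As a quick check of how n and k interact, here is a small sketch. When labels is omitted, the levels are simply numbered from 1 to n. The length argument used in the second call is part of base R's gl() but is not covered in the syntax above, and the names g1 and g2 are illustrative.

```r
# 2 levels, each repeated 3 times; default labels are "1" and "2"
g1 <- gl(2, 3)
g1

# k = 1 with a longer length makes the levels alternate
g2 <- gl(2, 1, length = 6, labels = c("low", "high"))
g2
```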
Example:
g <- gl(3, 4, labels = c("Nikita", "Deep", "Ayesha"))
print(g)
Output:
[1] Nikita Nikita Nikita Nikita Deep Deep Deep Deep Ayesha Ayesha Ayesha Ayesha
Levels: Nikita Deep Ayesha
Reference: https://www.stat.berkeley.edu/~s133/factors.html