# R Factors

Factors are R data objects used to categorize data and store it as levels. They can be created from both strings and integers. Factors are widely used in data analysis and statistical modeling. The factor() function creates a factor.

Example 1:

```
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
fdata
```

Output:

```
[1] Physics Maths   Physics Bio     Bio     Maths   Physics Maths   Bio
Levels: Bio Maths Physics
```

Example 2:

```
d <- c(100, 400, 200, 500, 100, 400, 300, 200, 100, 500, 400, 500, 100)
ndata <- factor(d)
ndata
```

Output:

```
[1] 100 400 200 500 100 400 300 200 100 500 400 500 100
Levels: 100 200 300 400 500
```
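Internally, a factor keeps one integer code per element plus a character vector of level labels. The following is a minimal sketch that reuses `ndata` from Example 2 to inspect both parts; the variable names `codes` and `original` are only illustrative.

```r
d <- c(100, 400, 200, 500, 100, 400, 300, 200, 100, 500, 400, 500, 100)
ndata <- factor(d)

# The distinct categories, sorted and stored as character labels
levels(ndata)    # "100" "200" "300" "400" "500"
nlevels(ndata)   # 5

# Each element is stored as an integer code pointing into levels(ndata)
codes <- as.integer(ndata)
codes            # 1 4 2 5 1 4 3 2 1 5 4 5 1

# To recover the original numbers, convert via the labels, not the codes
original <- as.numeric(as.character(ndata))
identical(original, d)   # TRUE
```

Note that `as.integer(ndata)` returns the codes, not the original values, which is why the conversion goes through `as.character()` first.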

## Accessing the Elements of a Factor

Elements of a factor are accessed the same way as elements of a vector, using square-bracket indexing.

Example:

```
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
fdata
# access 3rd element
fdata[3]
# access 2nd and 4th element
fdata[c(2,4)]
```

Output:

```
[1] Physics Maths   Physics Bio     Bio     Maths   Physics Maths   Bio
Levels: Bio Maths Physics
[1] Physics
Levels: Bio Maths Physics
[1] Maths Bio
Levels: Bio Maths Physics
```
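Because factors are indexed like vectors, negative and logical indices also work. The sketch below illustrates both, along with the `drop` argument of factor subsetting; the names `rest`, `bio`, and `bio_only` are only illustrative.

```r
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)

# Negative index: everything except the first element
rest <- fdata[-1]
length(rest)       # 8

# Logical index: keep only the elements equal to "Bio"
bio <- fdata[fdata == "Bio"]
length(bio)        # 3
levels(bio)        # "Bio" "Maths" "Physics" (unused levels are kept)

# drop = TRUE discards levels that no longer occur in the subset
bio_only <- fdata[fdata == "Bio", drop = TRUE]
levels(bio_only)   # "Bio"
```

A subset keeps all the original levels by default, so `drop = TRUE` is worth considering when the unused levels would clutter later summaries.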

## How to Modify a Factor

Elements of a factor can be modified with simple assignment. However, the new value must be one of the factor's predefined levels; assigning any other value produces NA and a warning.

Example:

```
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
# modify 4th element
fdata[4] <- "Maths"
fdata
# cannot assign values outside the levels
fdata[3] <- "Chemistry"
fdata
```

Output:

```
[1] Physics Maths   Physics Maths   Bio     Maths   Physics Maths   Bio
Levels: Bio Maths Physics
Warning message:
In `[<-.factor`(`*tmp*`, 3, value = "Chemistry") :
  invalid factor level, NA generated
[1] Physics Maths   <NA>    Maths   Bio     Maths   Physics Maths   Bio
Levels: Bio Maths Physics
```
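If a new category is genuinely needed, one common approach is to extend the levels before assigning. The following is a minimal sketch of that idea, reusing the same data; it is one option rather than the only way to do it.

```r
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)

# Extend the set of levels first; existing elements are unchanged
levels(fdata) <- c(levels(fdata), "Chemistry")

# Now the assignment succeeds without a warning
fdata[3] <- "Chemistry"
as.character(fdata[3])   # "Chemistry"
levels(fdata)            # "Bio" "Maths" "Physics" "Chemistry"
anyNA(fdata)             # FALSE
```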

## Changing the Order of Levels

We can change the order of the levels in a factor by applying the factor() function again and passing the new order in the levels argument.

Example:

```
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
print(fdata)
# Apply the factor function with the required order of the levels.
newData <- factor(fdata, levels = c("Physics", "Maths", "Bio"))
print(newData)
```

Output:

```
[1] Physics Maths   Physics Bio     Bio     Maths   Physics Maths   Bio
Levels: Bio Maths Physics
[1] Physics Maths   Physics Bio     Bio     Maths   Physics Maths   Bio
Levels: Physics Maths Bio
```
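Reordering the levels leaves the values untouched but changes the underlying integer codes and the order used by summaries. The sketch below checks this and also shows `relevel()`, which may be handy when only one level needs to move to the front.

```r
data <- c('Physics','Maths','Physics','Bio','Bio','Maths','Physics','Maths','Bio')
fdata <- factor(data)
newData <- factor(fdata, levels = c("Physics", "Maths", "Bio"))

# The values are unchanged; only the level order (and so the codes) differs
identical(as.character(fdata), as.character(newData))   # TRUE
as.integer(fdata)[1:4]     # 3 2 3 1  (Bio = 1, Maths = 2, Physics = 3)
as.integer(newData)[1:4]   # 1 2 1 3  (Physics = 1, Maths = 2, Bio = 3)

# Summaries such as table() follow the level order
names(table(newData))      # "Physics" "Maths" "Bio"

# relevel() moves a single level to the front and keeps the rest in order
levels(relevel(fdata, ref = "Maths"))   # "Maths" "Bio" "Physics"
```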

## Generating Factor Levels

The gl() function generates a factor by specifying the pattern of its levels.

Syntax:

`gl(n, k, labels)`

Here,

• n is the number of levels
• k is the number of times each level is repeated
• labels is a vector of labels for the resulting factor levels

Example:

```
g <- gl(3, 4, labels = c("Nikita", "Deep", "Ayesha"))
print(g)
```

Output:

```
[1] Nikita Nikita Nikita Nikita Deep   Deep   Deep   Deep   Ayesha Ayesha Ayesha Ayesha
Levels: Nikita Deep Ayesha
```
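Besides the three arguments listed above, gl() also accepts `length` and `ordered`. The following sketch shows how they behave; the labels "Low" and "High" and the names `g1`, `g2`, and `g3` are made up for illustration.

```r
# Default labels are the integers 1..n
g1 <- gl(3, 2)
as.character(g1)   # "1" "1" "2" "2" "3" "3"

# length repeats the pattern until the requested number of elements is reached
g2 <- gl(2, 1, length = 6, labels = c("Low", "High"))
as.character(g2)   # "Low" "High" "Low" "High" "Low" "High"

# ordered = TRUE produces an ordered factor: Low < High
g3 <- gl(2, 3, labels = c("Low", "High"), ordered = TRUE)
is.ordered(g3)     # TRUE
g3[1] < g3[6]      # TRUE
```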

Reference: https://www.stat.berkeley.edu/~s133/factors.html