Chomsky's Normal Form - Automata

Chomsky's Normal Form (CNF)

In context free grammar, the left-hand side of production rules contains only one variable, and right side may contain any number of variables or terminals in production rule. The production rules in context free grammar are in the form

A -> a, where A is a variable, and a is string of any symbols from (V ? T)*

If we want to impose restriction on the right side of production rule, then context free grammar is said to be in a “normal form”.

In Chomsky Normal Form, there are restrictions on the length of right-hand side, and type of symbols is used in right hand side of production rules.

A -> BC

A -> a

In this, A, B and C are variables, and a is terminal

Steps to convert Context Free Grammar to Chomsky's Normal Form (CNF)

  • If start variable S occurs on right side of any production rule, then create a new start symbol S’, and add as a new production i.e., S’ -> S, in given production rules.
  • Remove or eliminate all Null Productions in given production rules.
  • Remove or eliminate all Unit productions in given production rules.
  • In this step, find out the productions, which have more than two variables in right hand side. Repeat this step for all productions having 2 or more symbols in right hand side.
  • If the right side of any production is of form A ->aB, where a is terminal and A, B are non-terminals, then production is replaced by A -> XB and X -> a.

Example:

Convert the following Context Free Grammar to Chomsky's Normal Form (CNF)

S -> ASA

S -> aB

A -> B

A -> S

A -> ?

B -> b

B -> ?

Solution:

Step 1: S is the start variable in any production rule. In above given production rules, S variable appears in R.H.S, then we can add a new start symbol S’ and add a new Production S’ -> S in above given production.

S’ -> S

S -> ASA

S -> aB

A -> B

A -> S

A -> ?

B -> b

B -> ?

Step 2: Remove the null productions i.e.

A -> ? and B -> ?

After removing the null productions B -> ?

S’ -> S

S -> ASA

S -> aB

S -> a

A -> B

A -> S

A -> B

A -> ?

B -> b

Now removing the null production A -> ?, and after removing the null productions A -> ?. The new production is:

S’ -> S

S -> ASA

S -> aB

S -> a

S -> AS

S -> SA

S -> S

A -> B

A -> S

B -> b

Step 3: Remove Unit productions. i.e.

S -> S

S’ -> S

A -> B

A -> S

After removing the unit production S -> S, the new production is:

S’ -> S

S -> ASA

S -> aB

S -> a

S -> AS

S -> SA

A -> B

A -> S

B -> b

After removing the unit production A -> B, the new production is:

S’ -> ASA|aB|a|AS|SA

S -> ASA|aB|a|AS|SA

A -> b|S

B -> b

After removing the unit production A -> S, the new production is:

S’ -> ASA | aB | a| AS | SA

S -> ASA | aB | a| AS | SA

A -> b ASA | aB | a| AS | SA

B -> b

Step 4: Now, find out the productions that has more than two variables in R.H.S.

S’ -> ASA

S -> ASA

A -> ASA

After removing this production, the new production we get

S’ -> AX | aB | a| AS|SA

S -> AX | aB |a |AS|SA

A -> b| AX | ASA|aB|a|AS|SA

B -> b

X -> AX

Step 5: Now change the productions:

S’ -> aB

S -> aB

A -> aB

Finally, we get the production after converting the given Context Free Grammar to Chomsky's Normal Form (CNF)

S’ -> AX | YB| aB | a| AS|SA

S -> AX | YB | aB |a |AS|SA

A -> b| AX | YB | ASA| aB |a| AS |SA

B -> b

X -> SA

Y -> a

This is the required Context Normal Form(CNF) for given Context Free Grammar (CFG).