LR Parser Compiler Design

LR Parser

LR parsing is a type of bottom-up parsing that is used to parse the large class of grammars. Here "L" stands for left-to-right scanning of the input "R" stands for constructing a right-most derivation in reverse.

LR parser is divided into four categories:

LR (1) Parsing

LR (1) parsing involves various steps as follows:

  • Write a CFG for a given input string.
  • Check if the grammar is ambiguous or not.
  • Add augmented production in the grammar.
  • Create a canonical collection of LR (0) items.
  • Draw DFD.
  •  LR (1) parsing table construction.

Augmented Grammar

If X is a grammar with start symbol A, then X’ will be the augmented grammar for X, which is the grammar with new start symbol X' and production X’ => X. Augmented production helps the parser to identify when the parsing action should be stopped and announce the acceptance of input.

Example:

S => AA

A => aA | b

The augmented grammar for the above grammar will be

S’ => S

A => AA

A => aA | b

Canonical Collection of LR(0) Item

An LR(0) item of a grammar G is a production of G with a dot at some position on the right-side of the production. Thus production S => XYZ yields four items:

S => .XYZ

S => X.YZ

S => XY.Z

S => XYZ.

The collection of LR(0) items is known as canonical LR(0) collection, which is helpful in constructing deterministic finite automata to make parsing decisions. In the LR (0), we need to put the reduce node in the entire row.

Example

Grammar

S => AA

A => aA | b

Add an augmented grammar in a given grammar.

S’ =>S

S => AA

A => aA | b

Take the LR(0) item for all the production. The first production is the starting production.

S’ => .S

S => .AA

A => .aA

A => .b

I0 State

Add starting production to the I0 State and Compute the Closure.

S’ => .S

Since "." is on the left side of the symbol. It means we have not seen anything on the right-hand side. So all productions starting with S will be added into I0 State because "." is followed by the non-terminal. So, the modified I0 State will be:

S’ => .S

S => .AA

Here we can see that “.” is on the left of the variable A. So all productions starting with A will be added in I0 State because "•" is followed by the non-terminal. So, the modified I0 State becomes.

        S’ => .S

        S => .AA

        A => .aA

        A => .b

I1 = Here we have applied Goto on (I0 S) = S’ => S.

Here, "." is on the right side of the symbol, so the production is reduced so close the state.

I1 becomes S’ => S.

I2 = we have applied Goto on (I0 A) = (S => A.A)

We apply closure, and we will get all the A production with "." in the beginning. So I2 becomes

         S => A.A

         A => .aA

         A => .b

Goto on (I2,a) = Closure (A ? a•A) = (same as I3)

Goto on  (I2, b) = Closure (A ? b•) = (same as I4)

I3=Goto on (I0,a) = Closure (A ? a•A)

Add productions starting with A in I3.

A => a.A

A => .aA

A => .b

Goto on (I3, a) = Closure (A ? a•A) = (same as I3)

Goto on (I3, b) = Closure (A ? b•) = (same as I4)

I4 = Go to on (I0, b) = closure (A ? b•) = A ? b•

I5 = Go to on(I2, A) = Closure (S ? AA•) = SA ? A•

I6 = Go to on (I3, A) = Closure (A ? aA•) = A ? Aa•

DFA for a given production

LR(0) Table

  • Shift Move: When a state is going to some other state on a terminal, then it is a shift move.
  • Goto Move: When a state is going to some other state on a variable, then it is a Goto moves.
  • In LR(0) parsing table, whenever any state having a final item in the particular row, we will write reduce node completely.
StatesActionGo to
 a                 b                  $A                               S
I0S3               S4                2                                 1
I1                                  accept 
I2S3              S45
I3S3               S46
I4r3                r3                r3 
I5r1                 r1               r1 
I6r2                 r2                r2 

Explanation

  • Applying Goto on I0 using S is going to I1, so we will write it as 1.
  • Applying Goto on I0 using A is going to I2, so we will write it as 2.
  • Applying Goto on I2 using A is going to I5, so we will write it as 5.
  • Applying Goto on I3 using A is going to I6, so we will write it as 6.
  • Applying Goto on I0, I2, and I3 using a are going to I3, so we will write it as S3, which is shift 3.
  • Applying Goto on I0, I2, and I3 using b are going to I4, so we will write it as S4, which is shift 4.
  • I4, I5, and I6 state contain the final item because the ‘•’ is in the right-most end.

Productions are numbered as follows:

S => AA       (1)

A => aA        (2)

A => b           (3)

  • I1 state has the final item which drives(S` ? S•) because "." is on the right side of the variable. So action {I1, $} = Accept.
  • I4 state has the final item which drives A ? b•, and this production is corresponding to the production number 3, so we will write the entire row as r3.
  • I5 state has the final item which drives S ? AA•, and this production is the same as the production number 1, so we have written the entire row as r1.
  • I6 state has the final item, which drives A ? aA•, and this is the same as the production number 2, so we have written the entire row as r2.