Cal11 calculator

How to Calculate Follow in Compiler Design

Reviewed by Calculator Editorial Team

In compiler design, the FOLLOW set is a crucial concept used in parsing algorithms like LL(1) parsing. It represents the set of terminals that can appear immediately to the right of a non-terminal in any valid sentence of the grammar. Understanding how to calculate FOLLOW is essential for designing efficient parsers.

What is the FOLLOW Set?

The FOLLOW set for a non-terminal A in a grammar is defined as the set of terminals that can appear immediately to the right of A in any valid derivation of the grammar. In other words, it's the set of terminals that can follow A in any sentence generated by the grammar.

FOLLOW sets are particularly important in LL(1) parsing, where they help determine whether a particular production rule can be used during parsing. The FOLLOW set for the start symbol of a grammar always includes the end-of-input marker ($).

Key Points:

  • FOLLOW sets are used in LL(1) parsing to resolve parsing conflicts
  • They help determine which production rule to apply during parsing
  • The FOLLOW set for the start symbol always includes $

How to Calculate FOLLOW

Calculating FOLLOW sets involves several steps that are applied iteratively until no more terminals can be added to any FOLLOW set. Here's the step-by-step process:

  1. Initialize FOLLOW(S) = {$} where S is the start symbol
  2. For each production rule A → αBβ:
    • Add FIRST(β) to FOLLOW(B)
    • If β can derive ε, add FOLLOW(A) to FOLLOW(B)
  3. Repeat step 2 until no more terminals can be added to any FOLLOW set

Formal Definition:

For a grammar G with productions P, the FOLLOW set for a non-terminal A is defined as:

FOLLOW(A) = {a | S ⇒* αAaβ, a ∈ T ∪ {$}, α, β ∈ (T ∪ N)*}

This process continues until a fixed point is reached where no more terminals can be added to any FOLLOW set.

Worked Example

Let's consider the following simple grammar:

S → aAd | bBc

A → a | ε

B → b | ε

We'll calculate FOLLOW sets for all non-terminals:

  1. Initialize FOLLOW(S) = {$}
  2. For S → aAd:
    • FIRST(d) = {d} → FOLLOW(A) = {d}
    • FIRST(d) cannot derive ε → no change
  3. For S → bBc:
    • FIRST(c) = {c} → FOLLOW(B) = {c}
    • FIRST(c) cannot derive ε → no change
  4. For A → a:
    • No change to FOLLOW(A)
  5. For A → ε:
    • FOLLOW(A) = FOLLOW(S) = {$}
  6. For B → b:
    • No change to FOLLOW(B)
  7. For B → ε:
    • FOLLOW(B) = FOLLOW(S) = {$}

The final FOLLOW sets are:

  • FOLLOW(S) = {$}
  • FOLLOW(A) = {d, $}
  • FOLLOW(B) = {c, $}

FAQ

What is the difference between FIRST and FOLLOW sets?
The FIRST set contains the first terminals that can appear in any string derived from a non-terminal, while the FOLLOW set contains the terminals that can appear immediately after a non-terminal in any valid derivation.
When is the FOLLOW set for a non-terminal empty?
The FOLLOW set for a non-terminal is empty if the non-terminal cannot appear in any valid derivation of the grammar. This typically happens with non-terminals that are not reachable from the start symbol.
How are FOLLOW sets used in LL(1) parsing?
In LL(1) parsing, FOLLOW sets help resolve parsing conflicts by determining which production rule to apply when multiple rules are possible. The FOLLOW set for a non-terminal indicates which terminals can follow it, helping the parser make the correct choice.
What happens if a grammar has left recursion?
Left recursion in a grammar can cause problems when calculating FOLLOW sets because it can lead to infinite loops in the derivation process. Left recursion should be eliminated before calculating FOLLOW sets.