Calculating FOLLOW Sets for a Context-Free Grammar in C++

The FOLLOW set is a fundamental concept in compiler design and formal language theory. It is the set of terminals that can appear immediately to the right of a given non-terminal in some sentential form of the grammar. This calculator helps you compute FOLLOW sets for a context-free grammar, for example when implementing a parser in C++.

What is the FOLLOW set in context-free grammar?

The FOLLOW set of a non-terminal symbol A in a context-free grammar is the set of terminals that can appear immediately to the right of A in some sentential form derived from the start symbol. If A can be the last symbol of a sentential form, FOLLOW(A) also contains the end-of-input marker ($). The FOLLOW set is crucial for constructing parsing tables in top-down parsers like LL(1) parsers.

Key properties of the FOLLOW set:

The FOLLOW set of the start symbol always contains the end-of-input marker ($)
For any production A → αBβ, every terminal in FIRST(β) is added to the FOLLOW set of B (ε itself is never a member of a FOLLOW set)
If β is empty or can derive the empty string (ε), then the FOLLOW set of A is added to the FOLLOW set of B

The FOLLOW set is distinct from the FIRST set, which represents the set of terminals that can appear at the beginning of strings derived from a given non-terminal.

How to calculate the FOLLOW set

Calculating the FOLLOW set involves these steps:

Initialize the FOLLOW set of the start symbol with the end-of-input marker ($)
For each production in the grammar, apply the following rules:
- If there's a production A → αBβ, add everything in FIRST(β) except ε to FOLLOW(B)
- If there's a production A → αB, or a production A → αBβ where β can derive ε, add FOLLOW(A) to FOLLOW(B)
Repeat the process until no more changes occur to any FOLLOW set

These rules compute the following set:

FOLLOW(A) = { t | S ⇒* αAβ and t is a terminal in FIRST(β) } ∪ { $ | S ⇒* αA }

This process is typically implemented using an algorithm that iteratively applies these rules until a fixed point is reached.
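The fixed-point iteration can be sketched in C++ as follows. This is a minimal illustration, not a definitive implementation, and it rests on assumptions that are not part of the text above: every grammar symbol is a single char, uppercase letters are non-terminals, an empty right-hand side stands for ε, and '#' is reserved as an internal ε marker. The names Production, firstOf, computeFirst and computeFollow are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Assumed encoding: one char per symbol, uppercase = non-terminal,
// anything else = terminal, empty rhs = epsilon production.
struct Production {
    char lhs;
    std::string rhs;
};
using Grammar = std::vector<Production>;
using SetMap = std::map<char, std::set<char>>;

const char EPS = '#';  // internal epsilon marker (must not be used as a terminal)
const char END = '$';  // end-of-input marker

bool isNonTerminal(char c) {
    return std::isupper(static_cast<unsigned char>(c)) != 0;
}

// FIRST of a symbol string; contains EPS if the whole string can derive epsilon.
std::set<char> firstOf(const std::string& s, const SetMap& first) {
    std::set<char> out;
    for (char c : s) {
        if (!isNonTerminal(c)) { out.insert(c); return out; }
        bool nullable = false;
        auto it = first.find(c);
        if (it != first.end()) {
            for (char t : it->second) {
                if (t == EPS) nullable = true; else out.insert(t);
            }
        }
        if (!nullable) return out;  // this symbol blocks everything after it
    }
    out.insert(EPS);  // every symbol was nullable (or s was empty)
    return out;
}

// Iterates until no FIRST set grows any further.
SetMap computeFirst(const Grammar& g) {
    SetMap first;
    for (const auto& p : g) first[p.lhs];
    for (bool changed = true; changed;) {
        changed = false;
        for (const auto& p : g) {
            std::set<char> add = firstOf(p.rhs, first);
            auto& target = first[p.lhs];
            std::size_t before = target.size();
            target.insert(add.begin(), add.end());
            if (target.size() != before) changed = true;
        }
    }
    return first;
}

// Applies the two FOLLOW rules to every production until a fixed point.
SetMap computeFollow(const Grammar& g, char start) {
    assert(isNonTerminal(start));
    SetMap first = computeFirst(g);
    SetMap follow;
    for (const auto& p : g) follow[p.lhs];
    follow[start].insert(END);
    for (bool changed = true; changed;) {
        changed = false;
        for (const auto& p : g) {
            for (std::size_t i = 0; i < p.rhs.size(); ++i) {
                char b = p.rhs[i];
                if (!isNonTerminal(b)) continue;
                // beta is everything to the right of B in A -> alpha B beta
                std::set<char> fb = firstOf(p.rhs.substr(i + 1), first);
                bool betaNullable = fb.erase(EPS) > 0;
                auto& target = follow[b];
                std::size_t before = target.size();
                target.insert(fb.begin(), fb.end());       // FIRST(beta) without epsilon
                if (betaNullable) {
                    const std::set<char> src = follow[p.lhs];  // copy: b may equal lhs
                    target.insert(src.begin(), src.end()); // FOLLOW(A) into FOLLOW(B)
                }
                if (target.size() != before) changed = true;
            }
        }
    }
    return follow;
}
```

As a usage example under the same assumptions, the hypothetical grammar S → Ab, A → a | ε would be written as `Grammar g = {{'S', "Ab"}, {'A', "a"}, {'A', ""}};`, and `computeFollow(g, 'S')` returns a map in which FOLLOW(S) = {$} and FOLLOW(A) = {b}. Comparing set sizes before and after each insertion is a simple way to detect the fixed point; a worklist would avoid rescanning productions that cannot change, at the cost of more bookkeeping.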

Worked example

Consider the following context-free grammar:

S → aAd | bBc
A → a | ε
B → b | ε

Calculating the FOLLOW sets:

Initialize FOLLOW(S) = {$}
From S → aAd, add FIRST(d) = {d} to FOLLOW(A)
From S → bBc, add FIRST(c) = {c} to FOLLOW(B)
The productions A → a, A → ε, B → b and B → ε have no non-terminal on their right-hand side, so they add nothing to any FOLLOW set
Because the terminals d and c cannot derive ε, FOLLOW(S) is not propagated to FOLLOW(A) or FOLLOW(B)

The final FOLLOW sets are:

FOLLOW(S) = {$}
FOLLOW(A) = {d}
FOLLOW(B) = {c}

FAQ

What is the difference between FIRST and FOLLOW sets?

The FIRST set contains the terminals that can appear at the beginning of strings derived from a non-terminal, while the FOLLOW set contains the terminals that can appear immediately after the non-terminal in any sentential form.

When is the FOLLOW set of a non-terminal empty?

The FOLLOW set of a non-terminal is empty only if the non-terminal never appears in any sentential form. This typically happens with non-terminals that are not reachable from the start symbol.

How is the FOLLOW set used in parsing?

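In LL(1) parsing, the FOLLOW set decides when to apply a production whose right-hand side can derive ε. With non-terminal A on top of the parse stack, a production A → α is chosen if the lookahead token is in FIRST(α), or if α can derive ε and the lookahead token is in FOLLOW(A). A grammar is LL(1) only if this rule never places two productions in the same cell of the parsing table.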
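The role of FOLLOW in an LL(1) table can be sketched in C++ as follows. This is an illustration under stated assumptions rather than a complete parser generator: the Rule record, the Table alias and buildTable are hypothetical names, symbols are single chars, and the FIRST information for each right-hand side and the FOLLOW sets are assumed to have been computed beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical input record: one production plus the precomputed
// information an LL(1) table builder needs about its right-hand side.
struct Rule {
    char lhs;
    std::string rhs;            // empty string = epsilon
    std::set<char> firstOfRhs;  // terminals that can start rhs (epsilon excluded)
    bool rhsNullable;           // true if rhs can derive epsilon
};

// Cell (non-terminal, lookahead) -> indices of the rules predicted there.
using Table = std::map<std::pair<char, char>, std::vector<std::size_t>>;

// A cell that ends up with more than one index is an LL(1) conflict.
Table buildTable(const std::vector<Rule>& rules,
                 const std::map<char, std::set<char>>& follow) {
    Table table;
    for (std::size_t i = 0; i < rules.size(); ++i) {
        const Rule& r = rules[i];
        // Predict the rule on every terminal that can start its rhs.
        for (char t : r.firstOfRhs) table[{r.lhs, t}].push_back(i);
        // If the rhs can vanish, also predict it on every token in FOLLOW(lhs).
        if (r.rhsNullable) {
            auto it = follow.find(r.lhs);
            assert(it != follow.end());        // precondition: FOLLOW(lhs) was supplied
            if (it == follow.end()) continue;  // defensive when asserts are disabled
            for (char t : it->second) table[{r.lhs, t}].push_back(i);
        }
    }
    return table;
}
```

For the hypothetical grammar S → Ab, A → a | ε with FOLLOW(S) = {$} and FOLLOW(A) = {b}, the rule A → ε lands only in cell (A, b), so the parser drops A from the stack exactly when the lookahead is b. Storing a vector per cell, rather than a single index, keeps conflicts visible instead of silently overwriting an earlier entry.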