Calculate The First and Follow Sets for The Following Grammar

Calculating FIRST and FOLLOW sets is essential for constructing parsing tables in compiler design. These sets help determine the validity of productions in a context-free grammar. This guide explains how to compute these sets and provides a calculator to perform the calculations.

What are FIRST and FOLLOW sets?

In formal language theory, FIRST and FOLLOW sets are used to analyze context-free grammars. These sets help in constructing parsing tables for top-down parsers like LL(1) parsers.

The FIRST set of a string or non-terminal symbol is the set of terminals that begin the strings derived from that symbol. The FOLLOW set of a non-terminal is the set of terminals that can appear immediately to the right of that non-terminal in any sentential form of the grammar.

How to calculate FIRST sets

To calculate the FIRST set for a non-terminal A:

If A produces ε (the empty string), then add ε to FIRST(A).
For each production A → X₁X₂...Xₙ:
- Add FIRST(X₁) to FIRST(A), excluding ε.
- If FIRST(X₁) contains ε, add FIRST(X₂), excluding ε, and continue this process until you reach a symbol whose FIRST set doesn't contain ε.
- If all symbols X₁ through Xₙ have ε in their FIRST sets, then add ε to FIRST(A).

Note: The FIRST set of a terminal symbol is the symbol itself.

How to calculate FOLLOW sets

To calculate the FOLLOW set for a non-terminal A:

Place $ (end of input marker) in FOLLOW(S), where S is the start symbol.
For each production that has A in it, consider the form B → αAβ:
- Add FIRST(β) to FOLLOW(A), excluding ε.
- If FIRST(β) contains ε, add FOLLOW(B) to FOLLOW(A).

Note: The FOLLOW set of the start symbol always contains the end marker $.

Example calculation

Consider the following grammar:

S → aBc | bA A → d | ε B → b | ε

Using the calculator on the right, you can compute the FIRST and FOLLOW sets for this grammar.

FAQ

What is the difference between FIRST and FOLLOW sets?

The FIRST set contains the terminals that can appear at the beginning of strings derived from a non-terminal. The FOLLOW set contains the terminals that can appear immediately after a non-terminal in any sentential form.

When is the empty string ε included in a FIRST set?

ε is included in the FIRST set of a non-terminal if that non-terminal can derive the empty string through one or more productions.

How do I know when to stop calculating FIRST sets?

You stop when no more terminals can be added to the FIRST set for any non-terminal. This typically happens when you've processed all productions and can't find any new terminals to add.