How to Calculate FIRST and FOLLOW Sets of a Grammar

Understanding FIRST and FOLLOW sets is crucial for parsing algorithms in computer science. These sets tell a predictive parser which production of a context-free grammar to apply based on the next input token. This guide explains how to calculate them step by step, with a worked example.

What Are FIRST and FOLLOW Sets?

In formal grammar theory, FIRST and FOLLOW sets describe which terminals can appear at the start of, or immediately after, the strings derived from a grammar symbol. A parser uses them to choose the next production from a single token of lookahead.

Key Concepts:

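FIRST(X) - The set of terminals that begin the strings derived from symbol X, plus ε if X can derive the empty string.
FOLLOW(X) - The set of terminals that can appear immediately after non-terminal X in some sentential form, plus the end-of-input marker $ if X can appear at the very end.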

These sets are essential for constructing parsing tables in top-down parsers like LL(1) parsers. They expose conflicts, where two productions of the same non-terminal compete for the same lookahead token, and so show whether the grammar can be parsed predictively.

Calculating FIRST Sets

The FIRST set for a symbol X is defined as follows:

FIRST(X) = {a | X ⇒* a...} ∪ {ε if X ⇒* ε}

Where a is a terminal symbol and X ⇒* a... means X derives a string starting with a. The empty string ε belongs to FIRST(X) only when X can derive it.

Steps to Calculate FIRST Sets:

For each terminal symbol, FIRST(X) = {X}.
For each non-terminal symbol, apply the following rules:
- If X → a..., where a is a terminal, then add a to FIRST(X).
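- If X → Y1Y2...Yk, then add FIRST(Y1) to FIRST(X), excluding ε (empty string). If Y1 can derive ε, also add FIRST(Y2) excluding ε, and continue through Y3, ..., Yk for as long as every preceding symbol can derive ε.
- If X → ε is a production, or all of Y1, ..., Yk can derive ε, then add ε to FIRST(X).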
Repeat until no more changes occur.

This iterative process continues until the FIRST sets stabilize, meaning no new elements can be added to any FIRST set.
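
The steps above can be sketched in Python as a fixed-point loop. This is a minimal illustration, not a definitive implementation: the dictionary layout of the grammar, the function name compute_first, and the use of an empty list for an ε-production are all assumptions made for the sketch. It walks each production from left to right and moves past a symbol only when that symbol can derive ε.

```python
EPS = "ε"  # marker for the empty string (a convention chosen for this sketch)


def compute_first(grammar):
    """Return the FIRST set of every non-terminal in `grammar`.

    `grammar` maps each non-terminal to a list of productions. Each
    production is a list of symbols, and [] stands for an ε-production.
    Any symbol that is not a key of `grammar` is treated as a terminal.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # repeat until no set grows
        changed = False
        for head, productions in grammar.items():
            for body in productions:
                before = len(first[head])
                all_nullable = True
                for sym in body:
                    if sym not in grammar:          # terminal: FIRST is itself
                        first[head].add(sym)
                        all_nullable = False
                        break
                    first[head] |= first[sym] - {EPS}
                    if EPS not in first[sym]:       # cannot look past this symbol
                        all_nullable = False
                        break
                if all_nullable:                    # empty body or every Yi nullable
                    first[head].add(EPS)
                if len(first[head]) != before:
                    changed = True
    return first


# The small grammar S → aB | bA, A → a | ε, B → b | ε.
grammar = {
    "S": [["a", "B"], ["b", "A"]],
    "A": [["a"], []],
    "B": [["b"], []],
}
first = compute_first(grammar)
for nt in grammar:
    print(nt, sorted(first[nt]))
```

The loop stops because each pass can only add symbols to finite sets, so eventually a full pass changes nothing.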

Calculating FOLLOW Sets

The FOLLOW set for a non-terminal X is defined as follows:

FOLLOW(X) = {a | S ⇒* αXaβ, a ∈ Vt} ∪ {$ if S ⇒* αX}

Where S is the start symbol, α and β are any strings of grammar symbols, a is a terminal, and $ is the end-of-input marker. Unlike FIRST sets, FOLLOW sets never contain ε.

Steps to Calculate FOLLOW Sets:

Initialize FOLLOW(S) = {$}, where $ is the end-of-input marker.
For each production rule A → αBβ, where B is a non-terminal:
- Add FIRST(β) to FOLLOW(B), excluding ε.
- If β can derive ε, or β is empty (as in A → αB), add FOLLOW(A) to FOLLOW(B).
Repeat until no more changes occur.

This process continues until the FOLLOW sets stabilize, meaning no new elements can be added to any FOLLOW set.
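
A minimal Python sketch of these steps follows. It assumes the FIRST sets are already known and supplies them by hand for the grammar S → aB | bA, A → a | ε, B → b | ε. The names compute_follow and first_of_sequence and the dictionary layout are hypothetical choices for illustration only.

```python
EPS, END = "ε", "$"  # empty-string and end-of-input markers (sketch conventions)


def first_of_sequence(symbols, first):
    """FIRST of a string of symbols, given FIRST sets of the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                          # string is empty or fully nullable
    return result


def compute_follow(grammar, first, start):
    """Return the FOLLOW set of every non-terminal in `grammar`."""
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:                           # repeat until no set grows
        changed = False
        for head, productions in grammar.items():
            for body in productions:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue             # FOLLOW applies to non-terminals only
                    beta_first = first_of_sequence(body[i + 1:], first)
                    new = beta_first - {EPS}
                    if EPS in beta_first:    # β is empty or can derive ε
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


grammar = {
    "S": [["a", "B"], ["b", "A"]],
    "A": [["a"], []],
    "B": [["b"], []],
}
# FIRST sets entered by hand for this grammar (an input to the sketch).
first = {"S": {"a", "b"}, "A": {"a", EPS}, "B": {"b", EPS}}
follow = compute_follow(grammar, first, start="S")
for nt in grammar:
    print(nt, sorted(follow[nt]))
```

Treating an empty β as "can derive ε" lets one code path cover both A → αBβ with nullable β and A → αB.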

Example Calculation

Consider the following grammar:

S → aB | bA

A → a | ε

B → b | ε

Calculating FIRST Sets:

Symbol	FIRST Set
S	{a, b}
A	{a, ε}
B	{b, ε}

Calculating FOLLOW Sets:

Symbol	FOLLOW Set
S	{$}
A	{$}
B	{$}

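This example demonstrates how FIRST and FOLLOW sets are calculated for a simple grammar. S is the start symbol, so FOLLOW(S) contains $. A and B each appear only at the end of a production for S, so nothing follows them except whatever follows S, which gives FOLLOW(A) = FOLLOW(B) = {$}.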
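
To show how these sets feed an LL(1) parsing table, here is a hedged Python sketch that fills the table for the example grammar from the FIRST and FOLLOW sets listed above. The function names build_ll1_table and first_of_sequence, and the representation of the table as a dictionary keyed by (non-terminal, lookahead), are assumptions made for illustration. A production A → α goes into the cell for every terminal in FIRST(α), and into the cells for FOLLOW(A) when α can derive ε.

```python
EPS = "ε"  # marker for the empty string (a convention chosen for this sketch)

grammar = {
    "S": [["a", "B"], ["b", "A"]],
    "A": [["a"], []],            # [] stands for the ε-production
    "B": [["b"], []],
}
# Sets copied from the example tables.
first = {"S": {"a", "b"}, "A": {"a", EPS}, "B": {"b", EPS}}
follow = {"S": {"$"}, "A": {"$"}, "B": {"$"}}


def first_of_sequence(symbols, first):
    """FIRST of a string of symbols, given FIRST sets of the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_ll1_table(grammar, first, follow):
    """Return (table, conflicts) for a hypothetical LL(1) table layout."""
    table, conflicts = {}, []
    for head, productions in grammar.items():
        for body in productions:
            body_first = first_of_sequence(body, first)
            lookaheads = body_first - {EPS}
            if EPS in body_first:            # body can vanish: use FOLLOW(head)
                lookaheads |= follow[head]
            for terminal in lookaheads:
                key = (head, terminal)
                if key in table:             # two productions want the same cell
                    conflicts.append(key)
                table[key] = body
    return table, conflicts


table, conflicts = build_ll1_table(grammar, first, follow)
for (head, terminal), body in sorted(table.items()):
    print(f"M[{head}, {terminal}] = {head} → {' '.join(body) or EPS}")
print("conflicts:", conflicts)
```

An empty conflict list indicates that no cell received two productions, which is the condition for the grammar to be LL(1).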

FAQ

What is the difference between FIRST and FOLLOW sets?: FIRST sets contain the terminals that can start strings derived from a symbol, while FOLLOW sets contain the terminals that can appear immediately after a symbol in any sentential form.
When is the empty string (ε) included in a FIRST set?: ε is included in FIRST(X) if X can derive the empty string, meaning X may contribute no symbols at all wherever it appears in a production.
How are FIRST and FOLLOW sets used in parsing?: They are used to construct parsing tables for top-down parsers like LL(1). For each non-terminal and lookahead token, the table records which production to apply, so the parser never needs to backtrack.
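Can FIRST and FOLLOW sets be calculated for ambiguous grammars?: Yes. The sets are well defined for any context-free grammar, ambiguous or not. An ambiguous grammar is never LL(1), however, so the parsing table built from the sets will contain at least one cell with more than one production.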
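What happens if a grammar contains left recursion?: Left recursion does not prevent the calculation. The iterative method still converges, because each pass only adds symbols to finite sets, although a naive recursive computation of FIRST would loop forever. The real problem is for LL(1) parsing: a rule such as A → Aα | β always produces a conflict in the parsing table, so left recursion must be removed before the table is built.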