Week 5.1

CourseCompilers & Interpreters
SemesterS1 2022

Week 5.1

🏷️ Replace this section with some keywords that can be used for searching, tagging and the indexing of this page.

Sections

🌱 Links to all Heading1 Blocks in this page.

  1. Section Number One

1.0 - First-Sets and Follow Sets

  • In predictive RDP, we make choices between alternatives based on the current token.
  • To do this, we need to know:
    • First Set What tokens each alternative can begin with
    • If a construct is nullable, in order to choose between recognising the empty string or a non-empty string, what symbols can follow the construct
      • Do we match the empty sequence of symbols or was there a syntax error?
      • To know this, we need to know what the Follow Set of a non-terminal symbol is

1.1 - First Set

  • The first set for a construct α\alpha, i.e. First(α)\text{First}(\alpha) is a set, and records:
    • The set of terminal symbols α\alpha can start with, and
    • If α\alpha is nullable it contains the empty string ϵ\epsilon to indicate that.
  • It should be emphasised that the ϵ\epsilon in a First Set is not recording that the construct can begin with the empty string, but rather that the whole construct can match the empty string (i.e. is nullable)
  • Including ϵ\epsilon in the first set is a bit confusing.
    • It is there purely to indicate that the construct is nullable, and hence the first set encodes the two pieces of information
      1. The terminal symbols that can start the construct
      2. Whether or not it is nullable.

1.2 - First Set - Formal Definition

  • For these definitions, α,β\alpha,\beta and γ\gamma are assumed to be possible empty sequences of terminal and non-terminal symbols.

  • If α\alpha is not nullable, its first set is the set of terminal symbols that can start α\alpha

    • i.e. those terminal symbols “a” such that α\alpha can derive a sequence beginning with “a”

    First(α)={a:Terminal  (βαaβ)}\text{First}(\alpha)=\{a:\text{Terminal}\ |\ (\exists\beta\bullet\alpha\overset{*}{\Rightarrow a\beta)\}}

  • If α\alpha is nullable, First(α\alpha) includes ϵ\epsilon, i.e. if aϵa\overset{*}{\Rightarrow}\epsilon, then we have:

    First(α)={a:Terminal  (βαaβ)}{ϵ}\text{First}(\alpha)=\{a :\text{Terminal}\ |\ (\exists\beta\bullet\alpha\overset{*}{\Rightarrow}a\beta)\}\cup\{\epsilon\}

  • Note that in this context, ϵ\epsilon is not a terminal symbol; its presence in a first set merely indicates that α\alpha is nullable.

1.3 - Calculating First Sets

  • Let:

    aaa terminal symbol
    α1,α2,,αn\alpha_1,\alpha_2, \cdots,\alpha_nstring of (terminal and non-terminal) symbols)
    AAnon-terminal symbol defined by a single production
    $A\rightarrow\alpha_1\alpha_2

Then:

First(ϵ)={ϵ}\text{First}(\epsilon)=\{\epsilon\}

First(a)={a}\text{First}(a)=\{a\}

First(α1α2αn)=First(α1)First(α2)First(αn)\text{First}(\alpha_1|\alpha_2|\cdots|\alpha_n)=\text{First}(\alpha_1)\cup\text{First}(\alpha_2)\cup\cdots\cup\text{First}(\alpha_n)

First(A)=First(α1α2αn)\text{First}(A)=\text{First}(\alpha_1|\alpha_2|\cdots|\alpha_n)

  • (is nullable, cannot be used to derive a set that start with any non-terminal symbol)
  • Note that if the first set of any α1,α2,,αn\alpha_1, \alpha_2,\cdots,\alpha_n contain ϵ\epsilon then First(α1α2αn)\text{First}(\alpha_1|\alpha_2|\cdots|\alpha_n) must contain ϵ\epsilon

If any of the alternatives is nullable, its first set will contain ϵ\epsilon, and hence the fisrt set of the set of alternatives will contain ϵ\epsilon

1.3.1 - Examples of Calculating First Sets (alternatives)

First(ac)=First(a)First(c)={a}{c}={a,c}\def\first{\text{First}} \begin{aligned} \first(a|c)&=\first(a)\cup\first(c)\\ &=\{a\}\cup\{c\}\\ &=\{a,c\} \end{aligned}

Given that the production for AA is AacA\rightarrow a|c then:

First(A)=First(ac)={a,c}\def\first{\text{First}} \begin{aligned} \first(A)&=\first(a|c)\\ &=\{a,c\} \end{aligned}

Then, First(A)\text{First}(A) is just the RHS of its production (which is what we’ve calculated above

Given the only production for BB is BbϵB\rightarrow b|\epsilon then:

First(B)=First(bϵ)=First(b)First(ϵ)={b,ϵ}\def\first{\text{First}} \begin{aligned}\first(B)&=\first(b|\epsilon)\\ &=\first(b)\cup\first(\epsilon)\\ &=\{b,\epsilon\}\end{aligned}

Then, First(B)\text{First}(B) is just the RHS of its production, which is computed as follows.

Note that First(ϵ)={ϵ}\text{First}(\epsilon)=\{\epsilon\} as ϵ\epsilon is nullable.


1.3.2 - Calculating First-Sets of Symbols.

Let S1,S2,,SnS_1,S_2,\cdots,S_n be (terminal or non-terminal) symbols, then

  • Since S1S_1is the first symbol in the first set that we’re trying to compute, the first set must contain the terminal symbols that can start S1S_1

  • If S1S_1 is nullable, then it will also contain all of the non-terminal symbols that can start S2S_2(and this pattern can repeat until the end of the sequence, to SnS_n

  • We also need to know whether a sequence of symbols is nullable - the sequence will only be nullable if all of the symbols within the sequence are nullable.

\def\first{\text{First}} \def\e{\epsilon} \def\eset{{\e}} \begin{aligned}\first(S_1\ S_2\ \cdots\ S_j\ \cdots\ S_n)&=\ &\first(S_1)-\eset \ &\cup\first(S_2)-\eset {\color{#949494}\ \ \ \ \ \ \ \ \ \text{If } S_1 \text{ is nullable}}\ & \cdots\ &\cup\first(S_j) - {\epsilon} {\color{#949494}\ \ \ \ \ \ \ \ \ \text{If } S_1\ S_2 \cdots S_{j-1} \text{ is nullable}}\ &\cup\cdots\ &\first(S_n)-\eset {\color{#949494}\ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{If } S_1\ S_2 \cdots S_{n-1} \text{ is nullable}}\ &\cup\eset{\color{#949494}\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \text{If } S_1\ S_2 \cdots S_{n} \text{ is nullable}}\

\end{aligned}

**Example of Calculating First-Sets of Sequences** 🌱 Let the only production for $A$ be $A\rightarrow a\ d\ |\ \epsilon$, then the first set is: $$\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \begin{aligned} \first(A)&=\first(a\ d\ |\ \epsilon)\\ &=\first(a\ d)\cup\first(\e)\\ &=\first(a)\cup\first(\e)\\ &=\first(a)\cup\eset\\ &=\{a,\e\} \end{aligned}

Replace AA with the RHS of its production

Separate out the alternative

\def\first{\text{First}} \def\eset{\text{\{\epsilon\}}} \def\e{\epsilon} \first(a\ d)=\first(a)$ as $a$ is not nullable $\def\first{\text{First}} \def\eset{\text{\{\epsilon\}}}$$\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \first(\e)=\eset

However, as A is nullable,

First(A b)=First(A){ϵ}First(b)={a,ϵ}{ϵ}{b}={a,b}\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \begin{aligned} \first(A\ b)&=\first(A)-\eset\cup\first(b)\\ &=\{a,\e\}-\eset\cup\{b\}\\ &=\{a,b\} \end{aligned}

Since AA is nullable, its first set also contains all of the terminal symbols bb can start with.

Example II of Calculating First-Sets of Sequences

Let the following be the only productions for A,BA, B and CC.

Aa  ϵBb  ϵCc  ϵ\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\prod{\rightarrow} A\prod a\or\e\\ B\prod b\or\e\\ C\prod c\or\e\\

Then each of A,B,CA, B, C is nullable, and hence

First(A B C)=First(A){ϵ}  First(B){ϵ}  First(C){ϵ}{ϵ}={a,ϵ}{ϵ}  {b,ϵ}{ϵ}  {c,ϵ}{ϵ}  {ϵ}={a}  {b}  {c}  {ϵ}={a,b,c,ϵ}\def\first{\text{First}}\def\e{\epsilon}\def\eset{\{\e\}}\def\or{\ |\ } \def\union{\ \cup\ } \begin{aligned} \first(A\ B\ C) &= \first(A)-\eset\union\first(B)-\eset\union\first(C)-\eset\cup\eset\\ &=\{a,\e\}-\eset\union\{b,\e\}-\eset\union\{c,\e\}-\eset\union\eset\\ &=\{a\}\union\{b\}\union\{c\}\union\{\e\}\\ &=\{a,b,c,\e\} \end{aligned}

1.3.3 - First Sets for Optionals, Repetitions and Groups (EBNF)

  • The first sets of optionals, repetitions and groups are straightforward:

    First([S])=First(S)  {ϵ}First({S})=First(S)  {ϵ}First((S))=First(S)\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \first([S])=\first(S)\union\eset\\ \first(\{S\})=\first(S)\union\eset\\ \first((S))=\first(S)

1.4 - Calculating First-Sets Algorithmically

  • To calculate the first set for a syntactic construct, we need to know the first sets for the non-terminals occurring in it
  • Hence, we start by showing how to calculate the first sets of all the non-terminals in a grammar
  • We start with the first sets for all non-terminals being the empty set, and note that the first set for every terminal symbol a"``a" is the singleton set {a}\{a\}
  • We then make a pass over all productions in a grammar, considering all alternatives and process as follows.
    • In this process, we assume that our grammar is written in plain BNF form.
    • If there is a production of the form Nϵ\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \def\prod{\rightarrow} N\prod\e, we add ϵ\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \e to the first set for N\def\first{\text{First}}\def\e{\epsilon}\def\eset{\{\e\}}\def\or{\ |\ }\def\union{\ \cup\ } N to indicate that it is nullable.
    • If there is a production of the form NS1 S2  Sn\def\first{\text{First}}\def\e{\epsilon}\def\eset{\{\e\}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} N\prod S_1\ S_2\ \cdots\ S_n, then for each i1ni\in1\cdots n, if for all j11i1j\in1 \cdots 1-i-1, SjS_j is nullable, we add the current first set for SjS_j minus ϵ\epsilon to the first set of NN
    • If every construct S1,,SnS_1,\cdots,S_n is nullable, we add ϵ\epsilon to the first set of NN
  • After making a complete pass in which we process every alternative right side for all productions, we repeat the process of making a pass, but start with the first sets computed so far, rather than the initial first sets.
    • This pass may or may not extend some first sets
    • If no first sets are modified in the pass, we are finished.
    • Otherwise, we repeat the process.
  • Because all the first sets are finite, and each time we decide to repeat the process at least one first set must have been extended by at least one symbol, the whole process must terminate.
    • The above iterative algorithm is a common approach to calculating inductively defined sets.

1.4.1 - Calculating First Sets Algorithmically

  • Consider the following productions that define a grammar:

    AB x  CBC y  DCD z  ϵDA w\def\first{\text{First}}\def\e{\epsilon}\def\eset{\{\e\}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} A\prod B\ x\or C\\ B\rightarrow C\ y \or D\\ C \rightarrow D\ z\or\e\\ D\rightarrow A\ w

    • Which of these non-terminal symbols are nullable?
      1. CC is nullable as CD zϵ\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \def\prod{\rightarrow} C\prod D\ z | \e
      2. AA is nullable as AB x  C\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \def\prod{\rightarrow}A\rightarrow B\ x\or C and CD zϵ\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \def\prod{\rightarrow} C\prod D\ z | \e
    • From this, we know that ϵFirst(A)\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \def\prod{\rightarrow} \epsilon\in\first(A) and ϵFirst(C)\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \def\prod{\rightarrow} \epsilon\in\first(C)

  1. Initially, set all first sets to be the empty set, {ϵ}\def\first{\text{First}} \def\e{\epsilon} \def\eset{\{\e\}} \def\or{\ |\ } \def\union{\ \cup\ } \def\prod{\rightarrow} \eset

    1
    A{}{}\{\}
    B{}\{\}
    C{}\{\}
    D{}\{\}
  2. First Pass

    1. Consider the production $$ \def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} A\prod B\ x\or C

fromthisweknowthat - from this we know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)-\eset\in\first(A), \first©-\eset\in\first(A)

- Consider the first alternative, $B\ x$ - We can’t update or change anything since we don’t know what

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)

is, so we skip it for this iteration - Consider the second alternative $C$ - we also can’t update or change anything since we don’t know what

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

is so we skip it for this iteration - At this stage, we don’t even know that $C$ is nullable, and therefore can’t deduce that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \epsilon\in\first(A)

2.Considertheproduction2. Consider the production

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} B\prod C\ y \or D

fromthisweknowthat - from this we know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©-\eset\in\first(B)

and and

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)-\eset\in\first(B)

Asin(2a),werealisethatwedontknowwhat - As in (2a), we realise that we don’t know what

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

or or

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)

is,sowecantupdatethefirstsets3.Considerthethirdproduction is, so we can’t update the first sets 3. Consider the third production

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} C\prod D\ z\or\e

fromthisweknowthat - from this we know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \eset\in\first©

and and

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)-\eset\in\first©

- Since we know that $C$ is nullable, we can add

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \eset

to to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

4.Considerthefourthproduction4. Consider the fourth production

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} D\prod\ A\ w

fromthisweknowthat - from this we know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)-\eset\in\first(D)

Asin(2a)and(2b)wedontknowwhat- As in (2a) and (2b) we don’t know what

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)

is,sowecantupdate is, so we can’t update

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)

| | 1 | | --- | --- | | A | ${}$$\{\}$ | | B | $\{\}$ | | C |

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} {\e}

| | D | $\{\}$ | 3. **Second Pass** - There is a “next round” as a first set changed in the last iteration -

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

1.Considerthefirstproduction,1. Consider the first production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} A\prod B\ x\or C

- We look at the first alternative in the production, $B\ x$ - we can’t do anything as

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)

is still undefined - We look at the second alternative in the production $C$ - we know that $C$ is nullable, and therefore

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow}

    \e\in\first(C)

Therefore,wecanadd- Therefore, we can add

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \e

to to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)

2.Considerthesecondproduction,2. Consider the second production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} B\prod C\ y \or D

- We look at the first alternative in the production, $C\ y$ - We don’t know anything about

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

besides the fact that $C$ is nullable. - Based on this, we know that $y$ can start $B$ - we can add $\def\first{\text{First}}\def\e{\epsilon}\def\eset{\{\e\}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow}y\in\first(B)$ - We look at the second alternative in the production, $D$ - Since

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)

isstillunknown,wecantaddanythingto is still unknown, we can’t add anything to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)

3.Considerthethirdproduction,3. Consider the third production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} C\prod D\ z\or\e

- Consider the first alternative in the production, $D\ z$ - Since

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)

isstillunknown,wecantaddanythingto is still unknown, we can’t add anything to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

(we already know that $C$ is nullable and have already added $\epsilon$ to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

- Consider the second alternative in the production - The symbol $\epsilon$ has already been added to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

tosignifythatitisnullable.4.Considerthefourthproduction, to signify that it is nullable. 4. Consider the fourth production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} D\prod\ A\ w

Weknowthat- We know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)\in\first(D)

, and we have $\epsilon$ in

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)

- There is a case where $A=\epsilon$, and $D$ produces $w$ - Therefore, we can add $\def\first{\text{First}}\def\e{\epsilon}\def\eset{\{\e\}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow}w\in\first(D)$ | | 1 | 2 | | --- | --- | --- | | A | ${}$$\{\}$ | $\{\epsilon\}$ | | B | $\{\}$ | $\{y\}$ | | C |

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} {\e}

| $\{\}$ | | D | $\{\}$ | $\{w\}$ | 4. **Third Pass** - There is a “next round” as we updated the first sets of A, B and D in the last iteration 1. Consider the first production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} A\prod B\ x\or C

- Consider the first alternative, $B\ x$ - we now know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} y\in\first(B)

sowecanadd so we can add

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} y\in\first(A)

- Since $B$ is not nullable, we don’t include $x$ - Consider the second alternative, $C$ - we’ve already added $\epsilon$ to signify that $C$ is nullable. 2. Consider the second production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} B\prod C\ y \or D

- Consider the first alternative $C\ y$ - we set

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)

to be anything $C$ can start with - Since $C$ is nullable, we set it to $y$ - Consider the second alternative $D$ - we add

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first{D}\in\first(B)

which means that we add $w$ to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)

3.Considerthethirdproduction,3. Consider the third production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} C\prod D\ z\or\e

- Consider the first alternative of $C$ - we now know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)

contains $w$, so we add that to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

- We also don’t add $z$ as $D$ is not nullable. 4. Consider the fourth production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} D\prod\ A\ w

Weknowthat- We know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)-\eset

now contains $y$, and since the production starts with $A$, we can add that to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)

| | 1 | 2 | 3 | | --- | --- | --- | --- | | A | ${}$$\{\}$ | $\{\epsilon\}$ | $\{\epsilon,y\}$ | | B | $\{\}$ | $\{y\}$ | $\{y,w\}$ | | C |

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} {\e}

| $\{\}$ | $\{\epsilon,w\}$ | | D | $\{\}$ | $\{w\}$ | $\{w,y\}$ | 5. Fourth **Pass** - There is a “next round” as we updated the first sets of B, C and D in the last iteration 1. Consider the first production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} A\prod B\ x\or C

Weknowthat- We know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)=(\first(B)-\eset)\union(\first©-\eset)

Wenowseethat- We now see that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)

now contains $w$, so we add that to our first set - Since $C$ is nullable, we add $\epsilon$ to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)

2.Considerthesecondproduction,2. Consider the second production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} B\prod C\ y \or D

- Consider the first alternative $C\ y$ - we set

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)

to be anything $C$ can start with - Since $C$ is nullable, we set it to $y$ - Consider the second alternative $D$ - we add

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first{D}\in\first(B)

which means that we add $w$ to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(B)

3.Considerthethirdproduction,3. Consider the third production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} C\prod D\ z\or\e

- Consider the first alternative, $D\ z$ which now has $y$ - we add this to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first©

4.Considerthefourthproduction,4. Consider the fourth production,

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} D\prod\ A\ w

Weknowthat- We know that

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(A)-\eset

now contains $y$, and since the production starts with $A$, we can add that to

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} \first(D)

| | 1 | 2 | 3 | 4 | | --- | --- | --- | --- | --- | | A | ${}$$\{\}$ | $\{\epsilon\}$ | $\{\epsilon,y\}$ | $\{\epsilon, y, w\}$ | | B | $\{\}$ | $\{y\}$ | $\{y,w\}$ | $\{y,w\}$ | | C |

\def\first{\text{First}}\def\e{\epsilon}\def\eset{{\e}}\def\or{\ |\ }\def\union{\ \cup\ }\def\prod{\rightarrow} {\e}

| $\{\}$ | $\{\epsilon,w\}$ | $\{\epsilon,w,y\}$ | | D | $\{\}$ | $\{w\}$ | $\{w,y\}$ | $\{w,y\}$ | 6. Fifth Pass - There is a “next pass” as the first sets of A, C and D were updated in the previous iteration - {we actually perform an iteration and realise nothing changes, so we terminate at the end of this loop} ### 1.4.2 - Calculating First Sets - Formal Algorithmic Definition - We allow $\epsilon$ as a symbol for calculating first sets: ``` Map ⟨Symbol, Set⟨Symbol⟩⟩ first; calculateFirst() // Set all the first sets as empty to begin with for (Symbol N : g.nonterminals) first(N) := {}; do // Loop for passes // saveFirst are first sets from previous iteration saveFirst := first; // full copy, for (p : g.productions) // Update the first set of the non-terminal on the LHS of the production first(p.lhs) := first(p.lhs) ∪ firstSeq(p.rhs) // Keep updating until the first sets don't change while (first ≠ saveFirst) // value equality ``` - This process computes the least fixed-point of the relation firstSeq(p.rhs) ⊆ first(p.lhs) simultaneously for all productions p in the grammar. - Note that the firstSeq method returns the first set of all of the symbols in the production - The right side of $N\rightarrow\epsilon$ is represented by sequence of length 0 ``` /* * The firstSeq takes in the RHS of a production, s (a list of symbols) and returns * the first set for that sequence of symbols, based on the current first sets * (the current first sets are acccessed using the first(...) method). * * The firstSeq method looks at each symbol from left to right until we: * 1. Get to the end of the sequence of symbols * 2. Find a symbol that isn't nullable. */ Set ⟨Symbol⟩ firstSeq(List⟨Symbol⟩ s) i := 0; nullable := true; f := {}; // First set for the current symbol // If the sequence isn't empty (or we haven't reached the end, and we haven't // found a nullable symbol, look at the next symbol. while i < length(s) && nullable if s(i) ∈ g.terminals // Update the first set for the terminal symbol. f := f ∪ {s(i)}; // Add s(i) to f nullable =:= false // Terminal symbol not nullable, // return first set calculated so far. else: // s(i) ∈ g.nonterminal // If the symbol is a non-terminal, we update it with what we've calculated // so far using the symbol that we're currently looking at (s(i)) f := f ∪ (first(s(i)) - {ε}); nullable := (ε ∈ first(s(i))); // Symbol is nullable if epsilon in first set i := i + 1; if nullable: f := f ∪ {ε}; // Augment first set with epsilon return f ```