PL0 Grammars

CourseCompilers & Interpreters
SemesterS1 2022

PL0 Grammars

1.0 - PL0 Grammar

  • The grammar of PL0 is defined using Extended Backus-Naur form(Extended BNF)
    • Extended Backus-Naur form A language for writing grammars / syntaxes of programming languages
    • BNF was the product of extensive research of both programming languages and natural language.

1.1 - PL0 Grammar Rules

🌱 This content is from the PL0 Concrete Syntax pdf. (Ch3 - Grammar p 2-3)

  • The concrete syntax of PL0 is given in Extended Backus-Naur form (EBNF)

  • Each grammar rule / production consists of the name of the construct (a non-terminal symbol) followed by an β†’ followed by an EBNF term which defines the valid forms of the construct

    Non-Terminal Symbolβ†’EBNF Term\text{\color{Orange}Non-Terminal Symbol} \rightarrow \text{\color{Lime}EBNF Term}

  • The EBNF term can be one of the following:

    • A terminal symbol or lexical token (as defined here)
    • A non-terminal symbol
    • A sequence of EBNF terms separated by β€œ|” indicating a set of alternative forms for the construct (essentially a choice, or logical OR in which one is chosen)
    • An EBNF term in parentheses, β€œ(” and β€œ)” indicating grouping
    • An EBNF term in square brackets, β€œ[” and β€œ]” indicating an optional (0 or 1) occurrence;
    • An EBNF term in (curly) braces, β€œ{” and β€œ}” indicating a repetition of zero or more occurrences

1.2 - PL0 Grammar Definition

🌱 Generated using

Programβ†’Block EOF\text{\color{lightblue}Program} \rightarrow \text{\color{lightgreen}Block EOF}

A Program produces a Block followed by an EOF symbol.

Blockβ†’{Declaration} CompoundStatement\text{\color{lightblue}Block} \rightarrow \text{\color{lightgreen}\{Declaration\} CompoundStatement}

A Block produces 0 or more Declarations followed by a CompoundStatement

Declarationβ†’ConstDefList | TypeDefList | VarDeclList | ProcedureDef\text{\color{lightblue}Declaration} \rightarrow \text{\color{lightgreen}ConstDefList | TypeDefList | VarDeclList | ProcedureDef}

A Declaration can either be for a ConstDefList, TypeDefList, VarDeclList or ProcedureDef

ConstDefList A list of constant declarations

TypeDefList A list of type declarations

VarDeclList A list of variable declarations

ProcedureDef A procedure definition

ConstDefListβ†’KW_CONST ConstDef {ConstDef}\text{\color{lightblue}ConstDefList} \rightarrow \text{\color{lightgreen}KW\_CONST ConstDef \{ConstDef\}}

A ConstDefList produces KW_CONST followed by one or more ConstDefs

Note here that KW_CONST is one of our Lexical Tokens from last lecture.

ConstDefβ†’IDENTIFIER EQUALS Constant SEMICOLONConstantβ†’NUMBER | IDENTIFIER | MINUS ConstantTypeDefListβ†’KW TYPE TypeDef {TypeDef }TypeDefβ†’IDENTIFIER EQUALS Type SEMICOLONTypeβ†’TypeIdentifier | SubrangeTypeTypeIdentifierβ†’IDENTIFIERSubrangeTypeβ†’LBRACKET Constant RANGE Constant RBRACKETVarDeclListβ†’KW VAR VarDecl {VarDecl}VarDeclβ†’IDENTIFIER COLON TypeIdentifier SEMICOLONProcedureDefβ†’ProcedureHead EQUALS Block SEMICOLONProcedureHeadβ†’KW PROCEDURE IDENTIFIER LPAREN FormalParameters RPARENFormalParametersβ†’CompoundStatementβ†’KW BEGIN StatementList KW ENDStatementListβ†’Statement {SEMICOLON Statement}Statementβ†’Assignment | CallStatement | ReadStatement | WriteStatement|Assignmentβ†’LValue ASSIGN ConditionCallStatementβ†’KW CALL IDENTIFIER LPAREN ActualParameters RPARENActualParametersβ†’ReadStatementβ†’KW READ LValueWriteStatementβ†’KW WRITE ExpWhileStatementβ†’KW WHILE Condition KW DO StatementIfStatementβ†’KW IF Condition KW THEN Statement KW ELSE StatementConditionβ†’RelConditionTo be extendedRelConditionβ†’Exp [ RelOp Exp ]RelOpβ†’EQUALS | NEQUALS | LESS | GREATER | LEQUALS | GEQUALSExpβ†’[PLUS | MINUS] Term {(PLUS | MINUS) Term}Termβ†’Factor {(TIMES | DIVIDE) Factor}Factorβ†’LPAREN Condition RPAREN | NUMBER | LValueLValueβ†’IDENTIFIER\text{\color{lightblue}ConstDef} \rightarrow \text{\color{lightgreen}IDENTIFIER EQUALS Constant SEMICOLON}\\\text{\color{lightblue}Constant} \rightarrow \text{\color{lightgreen}NUMBER | IDENTIFIER | MINUS Constant}\\\text{\color{lightblue}TypeDefList} \rightarrow \text{\color{lightgreen}KW TYPE TypeDef \{TypeDef \}}\\\text{\color{lightblue}TypeDef} \rightarrow \text{\color{lightgreen}IDENTIFIER EQUALS Type SEMICOLON}\\\text{\color{lightblue}Type} \rightarrow \text{\color{lightgreen}TypeIdentifier | SubrangeType}\\\text{\color{lightblue}TypeIdentifier} \rightarrow \text{\color{lightgreen}IDENTIFIER}\\\text{\color{lightblue}SubrangeType} \rightarrow \text{\color{lightgreen}LBRACKET Constant RANGE Constant RBRACKET}\\\text{\color{lightblue}VarDeclList} \rightarrow \text{\color{lightgreen}KW VAR VarDecl \{VarDecl\}}\\\text{\color{lightblue}VarDecl} \rightarrow \text{\color{lightgreen}IDENTIFIER COLON TypeIdentifier SEMICOLON}\\\text{\color{lightblue}ProcedureDef} \rightarrow \text{\color{lightgreen}ProcedureHead EQUALS Block SEMICOLON}\\\text{\color{lightblue}ProcedureHead} \rightarrow \text{\color{lightgreen}KW PROCEDURE IDENTIFIER LPAREN FormalParameters RPAREN}\\\text{\color{lightblue}FormalParameters} \rightarrow \text{\color{lightgreen}}\\\text{\color{lightblue}CompoundStatement} \rightarrow \text{\color{lightgreen}KW BEGIN StatementList KW END}\\\text{\color{lightblue}StatementList} \rightarrow \text{\color{lightgreen}Statement \{SEMICOLON Statement\}}\\\text{\color{lightblue}Statement} \rightarrow \text{\color{lightgreen}Assignment | CallStatement | ReadStatement | WriteStatement|}\\\text{\color{lightblue}Assignment} \rightarrow \text{\color{lightgreen}LValue ASSIGN Condition}\\\text{\color{lightblue}CallStatement} \rightarrow \text{\color{lightgreen}KW CALL IDENTIFIER LPAREN ActualParameters RPAREN}\\\text{\color{lightblue}ActualParameters} \rightarrow \text{\color{lightgreen}}\\\text{\color{lightblue}ReadStatement} \rightarrow \text{\color{lightgreen}KW READ LValue}\\\text{\color{lightblue}WriteStatement} \rightarrow \text{\color{lightgreen}KW WRITE Exp}\\\text{\color{lightblue}WhileStatement} \rightarrow \text{\color{lightgreen}KW WHILE Condition KW DO Statement}\\\text{\color{lightblue}IfStatement} \rightarrow \text{\color{lightgreen}KW IF Condition KW THEN Statement KW ELSE Statement}\\\text{\color{lightblue}Condition} \rightarrow \text{\color{lightgreen}RelCondition}\text{\color{gray}To be extended}\\\text{\color{lightblue}RelCondition} \rightarrow \text{\color{lightgreen}Exp [ RelOp Exp ]}\\\text{\color{lightblue}RelOp} \rightarrow \text{\color{lightgreen}EQUALS | NEQUALS | LESS | GREATER | LEQUALS | GEQUALS}\\\text{\color{lightblue}Exp} \rightarrow \text{\color{lightgreen}[PLUS | MINUS] Term \{(PLUS | MINUS) Term\}}\\\text{\color{lightblue}Term} \rightarrow \text{\color{lightgreen}Factor \{(TIMES | DIVIDE) Factor\}}\\\text{\color{lightblue}Factor} \rightarrow \text{\color{lightgreen}LPAREN Condition RPAREN | NUMBER | LValue}\\\text{\color{lightblue}LValue} \rightarrow \text{\color{lightgreen}IDENTIFIER}

  • LValue (Left Value) something that could appear on the left hand side of an assignment statement.
  • We want to specify that certain operators have a higher precedence (e.g. multiplication has higher precedence than addition)
    • Need a slightly more verbose grammar to specify this.
    • Are these operators left-associative or right-associative?

1.3 - Parsing, Generating a Concrete Syntax Tree

🌱 This information is from the PL0 Compiler Example PowerPoint

if x<0 then z:= -x else z:=x\text{if x<0 then z:= -x else z:=x}

  • The grammar of the language can be used to extract the structure of the program statement (from its sequence of lexical tokens)

  • That structure can be described using a concrete syntax tree.

  • Our code can be parsed as the following lexical tokens:

    KW_IF, IDENTIFIER("x"), LESS NUMBER(0), KW_THEN, IDENTIFIER("z"), ASSIGN, MINUS, IDENTIFIER("x"), KW_ELSE, IDENTIFIER("z") ASSIGN, IDENTIFIER("x")
    

  • From this Concrete Syntax Tree, we see that the leaf nodes give our tokens exactly (when performing an in-order traversal of all leaf nodes).

  • The Concrete Syntax Tree contains information about the structure of the program

  • When performing an in-order traversal of all of the leaf nodes, we see that we get the

    • In-Order Traversal Pattern

      Untitled

  • This tree contains a large amount of single-child nodes - the information is very sparse and we can compact it down a lot.

    • We don’t need all of this information to type-check, interpret or generate code for the program

1.4 - Abstract Syntax Tree

🌱 This content is from the PL0 Compiler Data Structures pdf.

4_PL0_Compiler_Data_Structures.pdf

  • The Abstract Syntax Tree is represented in the compiler by three main Java classes (and their various subtypes)

    1. DeclNode
    2. StatementNode
    3. ExpNodes
  • The DeclListNode abstract class groups the Abstract Syntax Tree nodes for the procedures into a list - each entry in the list represents a single procedure and includes the procedure’s symbol table entry, and a BlockNode representing it’s body

    DeclNode() with subclasses
        DeclListNode(List decls)
        ProcedureNode(SymEntry.ProcedureEntry procEntry, BlockNode body)
    
  • The StatementNode abstract class represents a statement in the abstract syntax tree: it has a subclass for each kind of statement.

    • ErrorNode caters for erroneous statements.
    • All statement nodes contain a Location which is the location in the source code corresponding to the node.
    • BlockNode represents the body of a procedure (or the main program)
    StatementNode(Location loc) with subclasses
        ErrorNode() # Store the errors in the AST to provide the programmer with 
                    # as much information as possible :)
        BlockNode(DeclListNode procedures, StatementNode body, Scope blockLocals)
        AssignmentNode(ExpNode lvalue, ExpNode e)
        WriteNode(ExpNode e)
        CallNode(String id)
        ListNode(List sl)
        IfNode(ExpNode cond, StatementNode s1, StatementNode s2) # Conditional node
        WhileNode(ExpNode cond, StatementNode s)
        
    
  • Suppose we want to construct the Abstract Syntax Tree of the following expression:

    if x < 0 then z := -x else z := x
    
  • We know that our node is going to be a binary comparison node.

Untitled

ExpNode(Location loc, Type t) with subclasses
    ConstNode(int value)
    IdentifierNode(String id)
    BinaryNode(Operator op, ExpNode left, ExpNode right)
  • We have now populated the BinaryNode with it’s children
  • By constructing the AST we have essentially constructed the CST and summarised the information in a much less verbose manner.

Untitled

Untitled