PL0 Grammars
1.0 - PL0 Grammar
- The grammar of PL0 is defined using
Extended Backus-Naur form(Extended BNF) Extended Backus-Naur form A language for writing grammars / syntaxes of programming languages- BNF was the product of extensive research of both
programming languages and natural language.
1.1 - PL0 Grammar Rules
π± This content is from the PL0 Concrete Syntax pdf. (Ch3 - Grammar p 2-3)
The concrete syntax of PL0 is given in Extended Backus-Naur form (EBNF)
Each grammar rule / production consists of the name of the construct (a non-terminal symbol) followed by an β followed by an EBNF term which defines the valid forms of the construct
Non-Terminal SymbolβEBNF Term
The EBNF term can be one of the following:
- A terminal symbol or lexical token (as defined here)
- A non-terminal symbol
- A sequence of EBNF terms separated by β|β indicating a set of alternative forms for the construct (essentially a choice, or logical OR in which one is chosen)
- An EBNF term in parentheses, β(β and β)β indicating grouping
- An EBNF term in square brackets, β[β and β]β indicating an optional (0 or 1) occurrence;
- An EBNF term in (curly) braces, β{β and β}β indicating a repetition of zero or more occurrences
1.2 - PL0 Grammar Definition
π± Generated using
ProgramβBlock EOF
A Program produces a Block followed by an EOF symbol.
Blockβ{Declaration} CompoundStatement
A Block produces 0 or more Declarations followed by a CompoundStatement
DeclarationβConstDefList | TypeDefList | VarDeclList | ProcedureDef
A Declaration can either be for a ConstDefList, TypeDefList, VarDeclList or ProcedureDef
ConstDefList A list of constant declarations
TypeDefList A list of type declarations
VarDeclList A list of variable declarations
ProcedureDef A procedure definition
ConstDefListβKW_CONST ConstDef {ConstDef}
A ConstDefList produces KW_CONST followed by one or more ConstDefs
Note here that KW_CONST is one of our Lexical Tokens from last lecture.
ConstDefβIDENTIFIER EQUALS Constant SEMICOLONConstantβNUMBER | IDENTIFIER | MINUS ConstantTypeDefListβKW TYPE TypeDef {TypeDef }TypeDefβIDENTIFIER EQUALS Type SEMICOLONTypeβTypeIdentifier | SubrangeTypeTypeIdentifierβIDENTIFIERSubrangeTypeβLBRACKET Constant RANGE Constant RBRACKETVarDeclListβKW VAR VarDecl {VarDecl}VarDeclβIDENTIFIER COLON TypeIdentifier SEMICOLONProcedureDefβProcedureHead EQUALS Block SEMICOLONProcedureHeadβKW PROCEDURE IDENTIFIER LPAREN FormalParameters RPARENFormalParametersβCompoundStatementβKW BEGIN StatementList KW ENDStatementListβStatement {SEMICOLON Statement}StatementβAssignment | CallStatement | ReadStatement | WriteStatement|AssignmentβLValue ASSIGN ConditionCallStatementβKW CALL IDENTIFIER LPAREN ActualParameters RPARENActualParametersβReadStatementβKW READ LValueWriteStatementβKW WRITE ExpWhileStatementβKW WHILE Condition KW DO StatementIfStatementβKW IF Condition KW THEN Statement KW ELSE StatementConditionβRelConditionTo be extendedRelConditionβExp [ RelOp Exp ]RelOpβEQUALS | NEQUALS | LESS | GREATER | LEQUALS | GEQUALSExpβ[PLUS | MINUS] Term {(PLUS | MINUS) Term}TermβFactor {(TIMES | DIVIDE) Factor}FactorβLPAREN Condition RPAREN | NUMBER | LValueLValueβIDENTIFIER
LValue (Left Value) something that could appear on the left hand side of an assignment statement.- We want to specify that certain operators have a higher precedence (e.g. multiplication has higher precedence than addition)
- Need a slightly more verbose grammar to specify this.
- Are these operators left-associative or right-associative?
1.3 - Parsing, Generating a Concrete Syntax Tree
π± This information is from the PL0 Compiler Example PowerPoint
if x<0 then z:= -x else z:=x
The grammar of the language can be used to extract the structure of the program statement (from its sequence of lexical tokens)
That structure can be described using a concrete syntax tree.
Our code can be parsed as the following lexical tokens:
KW_IF, IDENTIFIER("x"), LESS NUMBER(0), KW_THEN, IDENTIFIER("z"), ASSIGN, MINUS, IDENTIFIER("x"), KW_ELSE, IDENTIFIER("z") ASSIGN, IDENTIFIER("x")
From this Concrete Syntax Tree, we see that the leaf nodes give our tokens exactly (when performing an in-order traversal of all leaf nodes).
The Concrete Syntax Tree contains information about the structure of the program
When performing an in-order traversal of all of the leaf nodes, we see that we get the
This tree contains a large amount of single-child nodes - the information is very sparse and we can compact it down a lot.
- We donβt need all of this information to type-check, interpret or generate code for the program
1.4 - Abstract Syntax Tree
π± This content is from the PL0 Compiler Data Structures pdf.
4_PL0_Compiler_Data_Structures.pdf
The Abstract Syntax Tree is represented in the compiler by three main Java classes (and their various subtypes)
DeclNodeStatementNodeExpNodes
The DeclListNode abstract class groups the Abstract Syntax Tree nodes for the procedures into a list - each entry in the list represents a single procedure and includes the procedureβs symbol table entry, and a BlockNode representing itβs body
DeclNode() with subclasses
DeclListNode(List decls)
ProcedureNode(SymEntry.ProcedureEntry procEntry, BlockNode body)
The StatementNode abstract class represents a statement in the abstract syntax tree: it has a subclass for each kind of statement.
ErrorNode caters for erroneous statements.- All statement nodes contain a Location which is the location in the source code corresponding to the node.
BlockNode represents the body of a procedure (or the main program)
StatementNode(Location loc) with subclasses
ErrorNode() # Store the errors in the AST to provide the programmer with
# as much information as possible :)
BlockNode(DeclListNode procedures, StatementNode body, Scope blockLocals)
AssignmentNode(ExpNode lvalue, ExpNode e)
WriteNode(ExpNode e)
CallNode(String id)
ListNode(List sl)
IfNode(ExpNode cond, StatementNode s1, StatementNode s2) # Conditional node
WhileNode(ExpNode cond, StatementNode s)
Suppose we want to construct the Abstract Syntax Tree of the following expression:
if x < 0 then z := -x else z := x
We know that our node is going to be a binary comparison node.

ExpNode(Location loc, Type t) with subclasses
ConstNode(int value)
IdentifierNode(String id)
BinaryNode(Operator op, ExpNode left, ExpNode right)
- We have now populated the
BinaryNode with itβs children - By constructing the AST we have essentially constructed the CST and summarised the information in a much less verbose manner.

