Parse Tree / Syntax Tree

Chester introduced me to this idea as he was working on a wrapper for Scipy (basically a Symbolab from the terminal).

See Context-Free Language first.

Formally, a context-free grammar is a four-tuple where:

  • is a finite set of non-terminal symbols
  • is a finite set of terminal symbols (the alphabet) • is a finite set of productions, also called rules • is the start non-terminal

We construct our parse tree right alongside running the CYK Parser.

A grammar is ambiguous if there exist two different parse trees for some word.

  • Ambiguity is undesirable for programming languages!
  • In our CS241E class, the grammar provided for the Lacs language is unambiguous and does enforce standard order of operations rules.

Example

For example, see LACS Language grammar (syntactic specification).

Consider the following Context-Free Grammar

\text{op} &\rightarrow \text{+ | - | * | /} \end{aligned}

Here would be an example of a parse tree

Derivation steps

Syntax Trees

A Concrete Syntax Tree (CST) is a detailed representation of a program’s source code, capturing every element including parentheses, semicolons, and the like. It’s primarily used for syntax validation in the parsing phase of a compiler.

An Abstract Syntax Tree (AST), on the other hand, is a simplified version of the CST that only represents the hierarchy and semantics of the source code, leaving out non-essential details. It’s used in the analysis phase for things like semantic checking and code generation.

Concepts