What's the difference between parse tree and AST?

Question

Are they generated by different phases of a compiling process? Or are they just different names for the same thing?

A parse tree is the direct result of applying your grammar, artifacts included (you can write infinitely many grammars for the same language). An AST reduces the parse tree to the form closest to the language itself. Several grammars for the same language will give different parse trees but should result in the same AST. (You can also reduce different scripts, i.e. different parse trees from the same grammar, to the same AST.) — Guillaume86, Aug 29 '12 at 14:38
This SO answer discusses the difference in detail: http://stackoverflow.com/a/1916687/120163 — Ira Baxter, Sep 27 '16 at 17:52
Possible duplicate of [What's the difference between parse trees and abstract syntax trees?](https://stackoverflow.com/questions/5967888/whats-the-difference-between-parse-trees-and-abstract-syntax-trees) — curiousdannii, Sep 10 '18 at 04:58

Guy Coder · Accepted Answer · 2016-10-20T16:59:49.740

100

This is based on the Expression Evaluator grammar by Terence Parr.

The grammar for this example:

grammar Expr002;

options 
{
    output=AST;
    ASTLabelType=CommonTree; // type of $stat.tree ref etc...
}

prog    :   ( stat )+ ;

stat    :   expr NEWLINE        -> expr
        |   ID '=' expr NEWLINE -> ^('=' ID expr)
        |   NEWLINE             ->
        ;

expr    :   multExpr (( '+'^ | '-'^ ) multExpr)*
        ; 

multExpr
        :   atom ('*'^ atom)*
        ; 

atom    :   INT 
        |   ID
        |   '('! expr ')'!
        ;

ID      : ('a'..'z' | 'A'..'Z' )+ ;
INT     : '0'..'9'+ ;
NEWLINE : '\r'? '\n' ;
WS      : ( ' ' | '\t' )+ { skip(); } ;

Input

x=1
y=2
3*(x+y)

Parse Tree

The parse tree is a concrete representation of the input. The parse tree retains all of the information of the input. The empty boxes represent whitespace, i.e. end of line.

[Image: parse tree of the input]

AST

The AST is an abstract representation of the input. Notice that parens are not present in the AST because the associations are derivable from the tree structure.

[Image: AST of the input]
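To make the contrast concrete without the images, here is a minimal Python sketch of the last input line, `3*(x+y)`. The parse tree is written out by hand following the rule names in the grammar above, and a small function collapses it into an AST. The names `parse_tree`, `to_ast`, `DROPPED` and the tuple encoding are my own assumptions for illustration. ANTLR itself builds the AST from the `^` and `!` operators in the grammar, not from a post-pass like this.

```python
# Hand-written parse tree for the line "3*(x+y)\n".
# Interior nodes are (rule_name, [children]); leaves are token strings.
parse_tree = (
    "stat", [
        ("expr", [
            ("multExpr", [
                ("atom", ["3"]),
                "*",
                ("atom", [
                    "(",
                    ("expr", [
                        ("multExpr", [("atom", ["x"])]),
                        "+",
                        ("multExpr", [("atom", ["y"])]),
                    ]),
                    ")",
                ]),
            ]),
        ]),
        "\n",
    ],
)

# Tokens that carry no meaning once the tree shape is known.
DROPPED = {"(", ")", "\n"}


def to_ast(node):
    """Collapse a parse-tree node into an AST node (operator, left, right)."""
    if isinstance(node, str):
        return node  # a token that survives becomes a leaf
    _rule, children = node
    # Drop punctuation tokens, convert everything else recursively.
    kept = [
        to_ast(child)
        for child in children
        if not (isinstance(child, str) and child in DROPPED)
    ]
    # kept is now: operand (operator operand)*; fold it left to right.
    # A rule with a single child simply disappears from the tree.
    result = kept[0]
    for i in range(1, len(kept), 2):
        result = (kept[i], result, kept[i + 1])
    return result


print(to_ast(parse_tree))
```

The `stat`, `expr`, `multExpr` and `atom` wrappers, the parentheses and the newline are all gone from the result; only the operators and their operands remain, and the grouping that the parentheses expressed is carried by the nesting.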

For a more thorough explanation, see Compilers and Compiler Generators, pg. 23,
or Abstract Syntax Trees on pg. 21 of Syntax and Semantics of Programming Languages.

edited Oct 20 '16 at 16:59

answered Mar 25 '12 at 22:14

Guy Coder


6

How do you derive the AST from the parse tree? What's the method of simplifying a parse tree into an AST? – CMCDragonkai Feb 15 '15 at 08:54
3

There is no specific algorithm to derive the AST from the parse tree. What goes into the AST is more of a personal preference but must contain enough info to accomplish the task. I excluded the parens from the AST by using the ANTLR [! operator](https://theantlrguy.atlassian.net/wiki/display/ANTLR3/Tree+construction) in the grammar since they are not needed, but by default ANTLR would have included them. I think of the parse tree as giving you everything whether you need it or not, and the AST as giving you the bare minimum. Remember that you will traverse the trees a lot, so size matters. – Guy Coder Jan 05 '16 at 15:48
2

You mean like CST (concrete syntax tree) vs AST (abstract syntax tree)? – CMCDragonkai Jan 07 '16 at 13:36
Semantic actions/rules embedded in a parser's or parser generator's syntax files are the usual way to perform semantic analysis and build an AST, while the parse tree is rarely, if ever, constructed or used by user code, except perhaps to verify parser correctness. – Jun 15 '18 at 22:30
Of interest: [Abstract semantic graph](https://en.wikipedia.org/wiki/Abstract_semantic_graph) – Guy Coder Jun 25 '18 at 13:20

score 16 · Answer 2 · edited Sep 01 '17 at 21:45

From what I understand, the AST focuses more on the abstract relationships between the components of source code, while the parse tree focuses on the actual implementation of the grammar utilized by the language, including the nitpicky details. They are definitely not the same, since another term for "parse tree" is "concrete syntax tree".

I found this page which attempts to resolve this exact question.

score 11 · Answer 3 · edited Oct 10 '18 at 09:50

The DSL book by Martin Fowler explains this nicely. The AST contains only the 'useful' elements that will be used for further processing, while the parse tree contains all the artifacts (spaces, brackets, ...) from the original document you parse.

edited Oct 10 '18 at 09:50

nitarshs


answered Feb 17 '11 at 08:36

Wim Deblauwe


score 4 · Answer 4 · answered Aug 04 '15 at 14:26

Take the Pascal assignment Age:= 42;

The (concrete) syntax tree would look just like the source code. Below, each node is shown in brackets: [Age][:=][42][;]

An abstract syntax tree would look like this: [=][Age][42]

The assignment becomes a node with 2 elements, Age and 42. The idea is that you can execute the assignment.

Also note that the Pascal syntax disappears. Thus it is possible to have more than one language generate the same AST. This is useful for cross-language script engines.
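As a rough illustration of that last point, here is a small Python sketch in which two different surface syntaxes produce the same AST node, which can then be executed. Everything in it (`Assign`, `PASCAL`, `PYTHONISH`, `parse_with`, `execute`) is a hypothetical name of mine, and the regular expressions only handle "identifier assigned an integer literal", so treat it as a toy, not a real front end.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Assign:
    """AST node for an assignment: no ':=', '=' or ';' survives here."""
    target: str
    value: int


# Two toy "front ends" that accept only <identifier> <assign-op> <integer>.
PASCAL = re.compile(r"\s*([A-Za-z_]\w*)\s*:=\s*(\d+)\s*;\s*")
PYTHONISH = re.compile(r"\s*([A-Za-z_]\w*)\s*=\s*(\d+)\s*")


def parse_with(pattern, source):
    """Parse one assignment with the given syntax and return its AST node."""
    match = pattern.fullmatch(source)
    if match is None:
        raise SyntaxError(f"cannot parse {source!r}")
    name, number = match.groups()
    return Assign(name, int(number))


def execute(node, env):
    """Run an Assign node against a dict that plays the role of memory."""
    env[node.target] = node.value
    return env


from_pascal = parse_with(PASCAL, "Age:= 42;")
from_other = parse_with(PYTHONISH, "Age = 42")
print(from_pascal == from_other)
print(execute(from_pascal, {}))
```

Once both inputs have been reduced to `Assign`, the code that executes the node does not need to know which language the statement came from.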

score 1 · Answer 5 · answered Nov 25 '14 at 12:56

In a parse tree, interior nodes are nonterminals and leaves are terminals. In an (abstract) syntax tree, interior nodes are operators and leaves are operands.
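One practical consequence of the "operators inside, operands at the leaves" shape is that a tree walker can evaluate the AST directly. Below is a minimal Python sketch, assuming an AST encoded as nested `(operator, left, right)` tuples with integers and variable names as leaves. That encoding and the names `OPS` and `evaluate` are assumptions of mine, not something prescribed by any parser generator.

```python
import operator

# Map each operator node label to the function that implements it.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(node, env):
    """Evaluate an AST given as nested (operator, left, right) tuples."""
    if isinstance(node, tuple):
        # Interior node: an operator applied to two evaluated subtrees.
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        # Leaf operand: a variable name looked up in the environment.
        return env[node]
    # Leaf operand: an integer literal.
    return node


# AST for 3*(x+y); the parentheses exist only as nesting.
ast = ("*", 3, ("+", "x", "y"))
print(evaluate(ast, {"x": 1, "y": 2}))  # 9
```

A parse tree for the same expression would force the walker to step through grammar-rule nodes and skip punctuation tokens before reaching anything it can compute with.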

answered Nov 25 '14 at 12:56

Roshani Patel

