Newest Viewed Downloaded

Parsing Giuseppe Attardi Università di Pisa

Parsing Giuseppe Attardi Università di Pisa

Parsing To extract the grammatical structure of a program, where: sentence = program words = tokens For further information: Aho, Sethi, Ullman, “Compilers: Principles, Techniques, and Tools” (a.k.a, the “Dragon Book”)

Outline of coverage Context-free grammars Parsing Tabular Parsing Methods One pass Top-down Bottom-up Yacc

Grammatical structure of program function-def name arguments stmt-list main stmt expression operator expression expression variable string cout << “hello, world\n” ()

Context-free languages Grammatical structure defined by context-free grammar statement  labeled-statement | expression-statement | compound-statement labeled-statement  ident : statement | case constant-expression : statement compound-statement  { declaration-list statement-list } terminal non-terminal “Context-free” = only one non-terminal in left-part

Parse trees Parse tree = tree labeled with grammar symbols, such that: If node is labeled A, and its children are labeled x1...xn, then there is a production A x1...xn “Parse tree from A” = root labeled with A “Complete parse tree” = all leaves labeled with tokens

Parse trees and sentences Frontier of tree = labels on leaves (in left-to-right order) Frontier of tree from S is a sentential form Frontier of a complete tree from S is a sentence L E a L ; E “Frontier”

Example G: L L ; E | E E a | b Syntax trees from start symbol (L): a a;E a;b;b L E a L E a L ; E L E a L ; E b L E b ; Sentential forms:

Derivations Alternate definition of sentence: Given ,  in V*, say  is a derivation step if ’’’ and  = ’’’ , where A  is a production  is a sentential form iff there exists a derivation (sequence of derivation steps) S ( alternatively, we say that S* ) Two definitions are equivalent, but note that there are many derivations corresponding to each parse tree

Another example H: L E ; L | E E a | b L E a L E a L ; E L E a L ; E b L E b ;

Ambiguity A sentence can have more than one parse tree A grammar is ambiguous if there is a sentence with more than one parse tree Example 1 E  E+E | E*E | id E E E E E id id id + * E E E E E id id id + *

Notes If e then if b then d else f { int x; y = 0; } a.b.c = d; Id -> s | s.id E -> E + T -> E + T + T -> T + T + T -> id + T + T -> id + T * id + T -> id + id * id + T -> id + id * id + id

Ambiguity Ambiguity is a function of the grammar rather than the language Certain ambiguous grammars may have equivalent unambiguous ones

Grammar Transformations Grammars can be transformed without affecting the language generated Three transformations are discussed next: Eliminating Ambiguity Eliminating Left Recursion (i.e. productions of the form AA  ) Left Factoring

Eliminating Ambiguity Sometimes an ambiguous grammar can be rewritten to eliminate ambiguity For example, the grammar of Example 1 can be written as follows: E  E +T | T T  T * id | id The language generated by this grammar is the same as that generated by the previous grammar. Both generate id(+id|*id)* However, this grammar is not ambiguous

Eliminating Ambiguity (Cont.) One advantage of this grammar is that it represents the precedence between operators. In the parsing tree, products appear nested within additions E T T E id + * id T id

Eliminating Ambiguity (Cont.) An example of ambiguity in a programming language is the dangling else Consider S  if b then S else S | if b then S | a

Eliminating Ambiguity (Cont.) When there are two nested ifs and only one else.. S if b then S else S a if b then S a S if b then S if b then S else S a a

Eliminating Ambiguity (Cont.) In most languages (including C++ and Java), each else is assumed to belong to the nearest if that is not already matched by an else. This association is expressed in the following (unambiguous) grammar: S  Matched | Unmatched Matched  if b then Matched else Matched | a Unmatched  if b then S | if b then Matched else Unmatched

Eliminating Ambiguity (Cont.) Ambiguity is a property of the grammar It is undecidable whether a context free grammar is ambiguous The proof is done by reduction to Post’s correspondence problem Although there is no general algorithm, it is possible to isolate certain constructs in productions which lead to ambiguous grammars

Showing 1 - 20 of 75 items Details

Name: 
parsing
Author: 
N/A
Company: 
N/A
Description: 
Parsing Giuseppe Attardi Università di Pisa
Tags: 
pars | grammar | product | input | exp | token | ambigu | expr
Created: 
9/25/2009 8:53:05 AM
Slides: 
75
Views: 
0
Downloads: 
1
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap