Some important factors from "Concepts of Programming Languages (global edition) 11ed. - Robert W. Sebesta"
:Chapter 3, 4
⬥context-free(one meaning for one lexeme) VS context-sensitive(one word having various meanings regarding the various situations)
⬥BNF(Backus–Naur form)
The derivation of the program from above is as follows:
<program> => begin <stmt_list> end
=> begin <stmt> ; <stmt_list> end
=> begin <var> = <expression> ; <stmt_list> end
=> begin A = <expression> ; <stmt_list> end
=> begin A = <var> + <var> ; <stmt_list> end
=> begin A = B + <var> ; <stmt_list> end
=> begin A = B + C ; <stmt_list> end
=> begin A = B + ; <stmt> end
=> begin A = B + C ; <var> = <expression> end
=> begin A = B + C ; B = <expression> end
=> begin A = B + C ; B = <var> end
=> begin A = B + C ; B = C end
(cf) Derivations that use this order of replacement are called leftmost derivations.
⬥Hierarchical structures are called parse trees.
From grammar of Example 3.2, through leftmost derivation, the following can be generated:
<assign> => <id> = <expr>
=> A = <expr>
=> A = <id> * <expr>
=> A = B * <expr>
=> A = B * ( <expr>)
=> A = B * ( <id> + <expr>)
=> A = B * ( A + <expr>)
=> A = B * ( A + <id>)
=> A = B * ( A + C )
Therefore, the following parse tree can be created:
There can be a unique parse tree or an ambiguous parse tree according to the grammar.
A unique parse tree will have an associativity of operaters(either lefts or right associativity)
⬥lexical analysis: find a lexeme --> add a token --> transfer to syntax analyzer
+ 3 more factors(s skipping comments and white space outside lexemes, create symbol table, error-detect)
⬥State transition diagram: similar to the finite automata from maths
⬥Parsing: top-down(preorder/recursive-descent parser, LL) VS bottom-up(postorder, LR)
⬥Syntax analysis purpose:
-check the input program to determine whether it is syntactically correct(=grammar checking)
-produce a complete parse tree(parsing)