Info Of Bison

| 分类 Develop  | 标签 bison  parser 

1 简介

Bison 是一个生成解释器的生成器,用于将上下文无关的语法转换成为特定的解释表。 Bison 可以兼容 yacc。

grammer_bison_language.png

2 Tutorial Sections

2.1 Bison 基本概念

2.1.1 Languages and Context-Free Grammars

  1. Context-Free Grammer
    • Context-Free Grammer 是一系列的语法规则,最常见的表达形式为 Backus-Naur Form 即巴克斯范式(BNF)。
    • Context-Free Grammer 有很多子类,但 Bison 为 LR 做了更多的优化。
    • LR : Deterministric LR.
    • GLR: Generalized LR.
  2. Symbols: 任意的词法单元及其组合被称为 符号。
    • NonTernimal Symbols : 由更小的 symbol 根据语法规则组合而成。
    • Terminal Symbols : 无法继续细分的 symbols,又名 Token Types .
  3. Start Symbol

    One nonterminal symbol must be distinguished as the special one which defines a complete utterance in the language. It is called the "start symbol".

    The Bison parser reads a sequence of tokens as its input, and groups the tokens using the grammar rules. If the input is valid, the end result is that the entire token sequence reduces to a single grouping whose symbol is the grammar's start symbol.

2.1.2 From Formal Rules to Bison Input

  1. Representation of symbols
    • Non-terminal Symbol
      • also known as identifier
      • should be in lower case, such as `expr', `stmt'.
    • Ternminal Symbol
      • also known as `token type'
      • should be upper case, such as `INTEGER', `IF'.
      • keyword `error' is reserved by Bison.
  2. Example:
    1: %token RETURN;
    2: stmt: RETURN expr ';' ;
    

2.1.3 Semantic Values

Each token in a Bison grammar has both a token type and a "semantic value".

  1. Token Type
    • The token type is a terminal symbol defined in the grammar, such as `INTEGER', `IDENTIFIER' or `',''.
    • It tells everything you need to know to decide where the token may validly appear and how to group it with other tokens.
    • The grammar rules know nothing about tokens except their types.
  2. Semantic Value

    The semantic value has all the rest of the information about the meaning of the token, such as the value of an integer, or the name of an identifier. (A token such as `','' which is just punctuation doesn't need to have any semantic value.)

2.1.4 Semantic Actions

In order to be useful, a program must do more than parse input; it must also produce some output based on the input. In a Bison grammar, a grammar rule can have an "action" made up of C statements. Each time the parser recognizes a match for that rule, the action is executed.

For example, here is a rule that says an expression can be the sum of two subexpressions:

1: expr: expr '+' expr   { $$ = $1 + $3; } ;

The action says how to produce the semantic value of the sum expression from the values of the two subexpressions.

2.1.5 Stages in Using Bison

  1. Grammar specification:
    • Provide Grammar file.
    • Provide Lexcial analyzer
    • Write Controlling function
    • Write error-reporting routines.
  2. Turn source code into executable app:
    • Run Bison to produce parser.
    • Compile code output by Bison and other files.
    • Link object files to produce finished product.

2.1.6 The Overall Layout of a Bison Grammar

Bison grammar file looks like:

%{
PROLOGUE
%}

BISON DECLARATIONS

%%
GRAMMAR RULES
%%
EPILOGUE

其中:

  1. %%%{ %}

    section 的分界,界定不同的 section.

  2. prologue 序言

    用于引入所需的头文件,定义错误处理函数,定义自己的宏等等。

  3. Bison declarations

    declare the names of the terminal and nonterminal symbols, and may also describe operator precedence and the data types of semantic values of various symbols.

  4. grammar rules

    define how to construct each nonterminal symbol from its parts.

  5. epilogue

    contain any code you want to use. Often the definitions of functions declared in the prologue go here. In a simple program, all the rest of the program can go here.


上一篇     下一篇