- This directory contains data needed by Bison.
- # Directory content
- ## Skeletons
- Bison skeletons: the general shapes of the different parser kinds, that are
- specialized for specific grammars by the bison program.
- Currently, the supported skeletons are:
- - yacc.c
- It used to be named bison.simple. It corresponds to C Yacc-compatible
- LALR(1) parsers.
- - lalr1.cc
- Produces a C++ parser class.
- - lalr1.java
- Produces a Java parser class.
- - glr.c
- A Generalized LR C parser based on Bison's LALR(1) tables.
- - glr.cc
- A Generalized LR C++ parser. Actually a C++ wrapper around glr.c.
- These skeletons are the only ones supported by the Bison team. Because the
- interface between skeletons and the bison program is not finished, *we are
- not bound to it*. In particular, Bison is not mature enough for us to
- consider that "foreign skeletons" are supported.
- ## m4sugar
- This directory contains M4sugar, a sort of extended library for M4, which
- is used by Bison to instantiate the skeletons.
- ## xslt
- This directory contains XSLT programs that transform Bison's XML output into
- various formats.
- - bison.xsl
- A library of routines used by the other XSLT programs.
- - xml2dot.xsl
- Conversion into GraphViz's dot format.
- - xml2text.xsl
- Conversion into text.
- - xml2xhtml.xsl
- Conversion into XHTML.
- # Implementation note about the skeletons
- "Skeleton" in Bison parlance means "backend": a skeleton is fed by the bison
- executable with LR tables, facts about the symbols, etc., and it generates
- the output (say parser.cc, parser.hh, location.hh, etc.). Skeletons are only
- in charge of generating the parser and its auxiliary files; they do not
- generate the XML output, the parser.output reports, or the graphical
- rendering.
- The bits of information passed from bison to the backend are named
- "muscles". Muscles are passed to M4 via its standard input, as a set of
- m4 definitions. To see them, use `--trace=muscles`.
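
To make the idea of "a set of m4 definitions" concrete, here is a minimal Python sketch that renders a few name/value pairs as m4 definitions. The `b4_` prefix, the double quoting of the value, and the two muscle names are assumptions made for illustration; the output of `--trace=muscles` remains the authoritative reference for what bison really emits.

```python
def render_muscles(muscles):
    """Render a mapping of muscle names to values as m4 definitions.

    Hypothetical helper: the "b4_" prefix and the [[...]] quoting are
    assumed conventions, not taken from the bison sources.
    """
    lines = []
    for name, value in muscles.items():
        # One definition per muscle, in insertion order.
        lines.append("m4_define([b4_%s], [[%s]])" % (name, value))
    return "\n".join(lines) + "\n"


# Illustrative muscles only; real names and values come from bison.
text = render_muscles({"tokens_number": 5, "file_name": "parse.y"})
print(text, end="")
```

A skeleton would then read such definitions and expand them while generating the parser files.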
- Except for muscles, whose names are generated by bison, the skeletons have
- no constraint at all on the macro names: there is no technical or
- theoretical limitation. As long as you generate the output, you can do
- what you want. However, it would be a bad idea if, say, the C and C++
- skeletons used different approaches and had completely different
- implementations. That would be a maintenance nightmare.
- Below, we document some of the macros that we use in several of the
- skeletons. If you are to write a new skeleton, please, implement them for
- your language. Overall, be sure to follow the same patterns as the existing
- skeletons.
- ## Symbols
- ### `b4_symbol(NUM, FIELD)`
- In order to unify the handling of the various aspects of symbols (tag, type
- name, whether terminal, etc.), bison.exe defines one macro per (token,
- field), where field can be `has_id`, `id`, etc.: see
- `prepare_symbols_definitions()` in `src/output.c`.
- The macro `b4_symbol(NUM, FIELD)` gives access to the following FIELDS:
- - `has_id`: 0 or 1.
- Whether the symbol has an id.
- - `id`: string
- If has_id, the id (prefixed by api.token.prefix if defined), otherwise
- defined as empty. Guaranteed to be usable as a C identifier.
- - `tag`: string.
- A representation of the symbol. Can be 'foo', 'foo.id', '"foo"' etc.
- - `user_number`: integer
- The external number as used by yylex. Can be the ASCII code for a
- character token, some number chosen by bison, or some user number in the
- case of %token FOO <NUM>. Corresponds to yychar in yacc.c.
- - `is_token`: 0 or 1
- Whether this is a terminal symbol.
- - `number`: integer
- The internal number (computed from the external number by yytranslate).
- Corresponds to yytoken in yacc.c. This is the same number that serves as
- key in b4_symbol(NUM, FIELD).
- In bison, symbols are first assigned increasing numbers in order of
- appearance (but tokens first, then nterms). After grammar reduction,
- unused nterms are then renumbered to appear last (i.e., first tokens, then
- used nterms and finally unused nterms). This final number NUM is the one
- contained in this field, and it is the one used as key in `b4_symbol(NUM,
- FIELD)`.
- The code of the rule actions, however, is emitted before we know what
- symbols are unused, so they use the original numbers. To avoid confusion,
- they actually use "orig NUM" instead of just "NUM". bison also emits
- definitions for `b4_symbol(orig NUM, number)` that map from original
- numbers to the new ones. `b4_symbol` also resolves `orig NUM` for the
- other fields, i.e., `b4_symbol(orig 42, tag)` would return the tag of the
- symbol whose original number was 42.
- `has_type`: 0 or 1
- Whether the symbol has a semantic value.
- - `type_tag`: string
- When api.value.type=union, the generated name for the union member.
- yytype_INT etc. for symbols that has_id, otherwise yytype_1 etc.
- - `type`
- If it has a semantic value, its type tag, or, if variants are used,
- its type.
- In the case of api.value.type=union, type is the real type (e.g. int).
- - `has_printer`: 0, 1
- - `printer`: string
- - `printer_file`: string
- - `printer_line`: integer
- If the symbol has a printer, everything about it.
- - `has_destructor`, `destructor`, `destructor_file`, `destructor_line`
- Likewise.
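
The numbering scheme described under `number` can be sketched in a few lines of Python. This is a simplified model, not Bison's implementation: the function name, the symbol names, and the `used` set are hypothetical, and the special symbols a real grammar contains are left out. It shows symbols numbered in order of appearance with tokens first, unused nterms moved last after reduction, and the resulting "orig NUM" to final number mapping.

```python
def renumber(symbols, used):
    """Model the two numbering passes.

    symbols: list of (name, is_token) pairs in order of appearance,
             with unique names.
    used:    names of the nterms that survive grammar reduction.
    Returns (final_names, orig_to_final).
    """
    tokens = [s for s in symbols if s[1]]
    nterms = [s for s in symbols if not s[1]]
    # First pass: order of appearance, but tokens first, then nterms.
    original = tokens + nterms
    # Second pass: tokens, then used nterms, then unused nterms.
    final = (tokens
             + [s for s in nterms if s[0] in used]
             + [s for s in nterms if s[0] not in used])
    # Plays the role of b4_symbol(orig NUM, number).
    orig_to_final = {num: final.index(sym) for num, sym in enumerate(original)}
    return [name for name, _ in final], orig_to_final


final_names, orig_to_final = renumber(
    [("exp", False), ("NUM", True), ("unused", False),
     ("PLUS", True), ("term", False)],
    used={"exp", "term"},
)
print(final_names)
print(orig_to_final)
```

In this model, looking up a field through an original number amounts to mapping it through `orig_to_final` first and then using the final number as the key.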
- ### `b4_symbol_value(VAL, [SYMBOL-NUM], [TYPE-TAG])`
- Expansion of $$, $1, $<TYPE-TAG>3, etc.
- The semantic value from a given VAL.
- - `VAL`: some semantic value storage (typically a union). e.g., `yylval`
- - `SYMBOL-NUM`: the symbol number from which we extract the type tag.
- `TYPE-TAG`: the type tag forced by the user with `<TYPE-TAG>`.
- The result can be used safely: it is put in parens to avoid nasty precedence
- issues.
- ### `b4_lhs_value(SYMBOL-NUM, [TYPE])`
- Expansion of `$$` or `$<TYPE>$`, for symbol `SYMBOL-NUM`.
- ### `b4_rhs_data(RULE-LENGTH, POS)`
- The data corresponding to the symbol `#POS`, where the current rule has
- `RULE-LENGTH` symbols on RHS.
- ### `b4_rhs_value(RULE-LENGTH, POS, SYMBOL-NUM, [TYPE])`
- Expansion of `$<TYPE>POS`, where the current rule has `RULE-LENGTH` symbols
- on RHS.
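
One way to read the `RULE-LENGTH` and `POS` arguments is as an offset from the top of the parser stack. The Python sketch below assumes a layout in which the RHS symbols of the rule being reduced occupy the top `RULE-LENGTH` entries of the stack, so symbol `#POS` sits at offset `POS - RULE-LENGTH` from the top. The function name and the stack contents are hypothetical, and other skeletons are free to organize their stacks differently.

```python
def rhs_data(stack, rule_length, pos):
    """Return the stack entry for RHS symbol #pos (1-based).

    Assumes the last rule_length entries of stack are the RHS symbols,
    in order, of the rule being reduced.
    """
    if not 1 <= pos <= rule_length <= len(stack):
        raise IndexError("pos must be within the rule, rule within the stack")
    top = len(stack) - 1
    # Offset from the top is pos - rule_length: 0 for the last symbol,
    # negative for the earlier ones.
    return stack[top + pos - rule_length]


stack = ["a", "b", "c", "d"]       # "d" is the top of the stack
first = rhs_data(stack, 3, 1)      # like $1 in a rule with three RHS symbols
last = rhs_data(stack, 3, 3)       # like $3 in the same rule
print(first, last)
```

Passing the rule length explicitly is what lets the same macro serve rules of any size without tracking where the RHS begins.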
- -----
- Local Variables:
- mode: markdown
- fill-column: 76
- ispell-dictionary: "american"
- End:
- Copyright (C) 2002, 2008-2015, 2018-2019 Free Software Foundation, Inc.
- This file is part of GNU Bison.
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program. If not, see <http://www.gnu.org/licenses/>.