This directory contains data needed by Bison.

# Directory content

## Skeletons
Bison skeletons: the general shapes of the different parser kinds, which are
specialized for specific grammars by the bison program.

Currently, the supported skeletons are:

- yacc.c
  It used to be named bison.simple: it corresponds to C Yacc
  compatible LALR(1) parsers.

- lalr1.cc
  Produces a C++ parser class.

- lalr1.java
  Produces a Java parser class.

- glr.c
  A Generalized LR C parser based on Bison's LALR(1) tables.

- glr.cc
  A Generalized LR C++ parser. Actually a C++ wrapper around glr.c.

These skeletons are the only ones supported by the Bison team. Because the
interface between skeletons and the bison program is not finished, *we are
not bound to it*. In particular, Bison is not mature enough for us to
consider that "foreign skeletons" are supported.
## m4sugar
This directory contains M4sugar, sort of an extended library for M4, which
is used by Bison to instantiate the skeletons.

## xslt
This directory contains XSLT programs that transform Bison's XML output into
various formats.

- bison.xsl
  A library of routines used by the other XSLT programs.

- xml2dot.xsl
  Conversion into GraphViz's dot format.

- xml2text.xsl
  Conversion into text.

- xml2xhtml.xsl
  Conversion into XHTML.
# Implementation note about the skeletons

"Skeleton" in Bison parlance means "backend": a skeleton is fed by the bison
executable with LR tables, facts about the symbols, etc., and it generates
the output (say parser.cc, parser.hh, location.hh, etc.). Skeletons are only
in charge of generating the parser and its auxiliary files; they do not
generate the XML output, the parser.output reports, or the graphical
rendering.

The bits of information passed from bison to the backend are named
"muscles". Muscles are passed to M4 via its standard input as a set of
m4 definitions. To see them, use `--trace=muscles`.

Except for muscles, whose names are generated by bison, the skeletons have
no constraint at all on the macro names: there is no technical/theoretical
limitation; as long as you generate the output, you can do what you want.
However, of course, it would be a bad idea if, say, the C and C++
skeletons used different approaches and had completely different
implementations. That would be a maintenance nightmare.

Below, we document some of the macros that we use in several of the
skeletons. If you are to write a new skeleton, please implement them for
your language. Overall, be sure to follow the same patterns as the existing
skeletons.
## Symbols

### `b4_symbol(NUM, FIELD)`
In order to unify the handling of the various aspects of symbols (tag, type
name, whether terminal, etc.), bison.exe defines one macro per (symbol,
field), where field can be `has_id`, `id`, etc.: see
`prepare_symbols_definitions()` in `src/output.c`.

The macro `b4_symbol(NUM, FIELD)` gives access to the following FIELDS:

- `has_id`: 0 or 1.
  Whether the symbol has an id.

- `id`: string
  If has_id, the id (prefixed by api.token.prefix if defined), otherwise
  defined as empty. Guaranteed to be usable as a C identifier.

- `tag`: string.
  A representation of the symbol. Can be 'foo', 'foo.id', '"foo"' etc.

- `user_number`: integer
  The external number as used by yylex. Can be the ASCII code when the
  token is a character, some number chosen by bison, or some user number in
  the case of `%token FOO <NUM>`. Corresponds to yychar in yacc.c.

- `is_token`: 0 or 1
  Whether this is a terminal symbol.

- `number`: integer
  The internal number (computed from the external number by yytranslate).
  Corresponds to yytoken in yacc.c. This is the same number that serves as
  key in `b4_symbol(NUM, FIELD)`.

  In bison, symbols are first assigned increasing numbers in order of
  appearance (but tokens first, then nterms). After grammar reduction,
  unused nterms are then renumbered to appear last (i.e., first tokens, then
  used nterms and finally unused nterms). This final number NUM is the one
  contained in this field, and it is the one used as key in
  `b4_symbol(NUM, FIELD)`.

  The code of the rule actions, however, is emitted before we know what
  symbols are unused, so the actions use the original numbers. To avoid
  confusion, they actually use "orig NUM" instead of just "NUM". bison also
  emits definitions for `b4_symbol(orig NUM, number)` that map from original
  numbers to the new ones. `b4_symbol` actually resolves `orig NUM` for the
  other fields too, i.e., `b4_symbol(orig 42, tag)` would return the tag of
  the symbol whose original number was 42.
- `has_type`: 0, 1
  Whether the symbol has a semantic value.

- `type_tag`: string
  When api.value.type=union, the generated name for the union member.
  yytype_INT etc. for symbols with has_id set, otherwise yytype_1 etc.

- `type`
  If it has a semantic value, its type tag, or, if variants are used,
  its type.
  In the case of api.value.type=union, type is the real type (e.g. int).

- `has_printer`: 0, 1
- `printer`: string
- `printer_file`: string
- `printer_line`: integer
  If the symbol has a printer, everything about it.

- `has_destructor`, `destructor`, `destructor_file`, `destructor_line`
  Likewise.
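
The renumbering described for the `number` field can be pictured with a small C sketch. It is only an illustration of the ordering rule (tokens, then used nterms, then unused nterms), not Bison's actual implementation: `struct symbol`, `renumber` and the `used` flag are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical symbol record, for illustration only.  */
struct symbol
{
  const char *tag;
  bool is_token;
  bool used;   /* false for nterms found useless by grammar reduction */
};

/* Fill ORIG_TO_NEW[0..N-1] with the final number of each symbol:
   tokens first, then used nterms, then unused nterms.  Within each
   group the original order is preserved.  */
void
renumber (const struct symbol *syms, int n, int *orig_to_new)
{
  int next = 0;
  for (int i = 0; i < n; ++i)
    if (syms[i].is_token)
      orig_to_new[i] = next++;
  for (int i = 0; i < n; ++i)
    if (!syms[i].is_token && syms[i].used)
      orig_to_new[i] = next++;
  for (int i = 0; i < n; ++i)
    if (!syms[i].is_token && !syms[i].used)
      orig_to_new[i] = next++;
  assert (next == n);   /* every symbol received exactly one number */
}
```

In this picture, `orig_to_new[42]` plays the role of `b4_symbol(orig 42, number)`: it maps an original number to the final one.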
### `b4_symbol_value(VAL, [SYMBOL-NUM], [TYPE-TAG])`
Expansion of `$$`, `$1`, `$<TYPE-TAG>3`, etc.

The semantic value from a given VAL.
- `VAL`: some semantic value storage (typically a union), e.g., `yylval`.
- `SYMBOL-NUM`: the symbol number from which we extract the type tag.
- `TYPE-TAG`: the type tag the user forced with `<TYPE-TAG>`.

The result can be used safely: it is put in parens to avoid nasty precedence
issues.
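
As a rough C analogue of why the parentheses matter, consider the sketch below. The real expansion is performed by M4, not by the C preprocessor, and `union value`, `SYMBOL_VALUE` and `twice_top` are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Hypothetical semantic value type, standing in for a %union.  */
union value
{
  int ival;
  double dval;
};

/* Parenthesized member access: both the storage expression and the
   whole result are wrapped, so the expansion stays a single operand
   whatever expression is passed as VAL.  */
#define SYMBOL_VALUE(Val, TypeTag) ((Val).TypeTag)

/* Return twice the int stored at the top of a value stack.  Without
   the inner parentheses, `*top.ival` would parse as `*(top.ival)`
   and fail to compile.  */
int
twice_top (const union value *top)
{
  assert (top);
  return 2 * SYMBOL_VALUE (*top, ival);
}
```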
### `b4_lhs_value(SYMBOL-NUM, [TYPE])`
Expansion of `$$` or `$<TYPE>$`, for symbol `SYMBOL-NUM`.

### `b4_rhs_data(RULE-LENGTH, POS)`
The data corresponding to the symbol `#POS`, where the current rule has
`RULE-LENGTH` symbols on RHS.

### `b4_rhs_value(RULE-LENGTH, POS, SYMBOL-NUM, [TYPE])`
Expansion of `$<TYPE>POS`, where the current rule has `RULE-LENGTH` symbols
on RHS.
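
To see why both `RULE-LENGTH` and `POS` are needed, here is a minimal C sketch. It assumes the usual Yacc-style layout in which the value stack pointer designates the last RHS symbol of the rule being reduced, so that symbol `POS` sits at offset `POS - RULE-LENGTH`. The function names and the `int` value stack are hypothetical simplifications.

```c
#include <assert.h>

/* Offset, relative to the top of the value stack, of the POS-th RHS
   symbol of a rule whose RHS has RULE_LENGTH symbols.  */
int
rhs_offset (int rule_length, int pos)
{
  assert (pos <= rule_length);   /* cannot look past the stack top */
  return pos - rule_length;
}

/* Action of "exp: exp '+' exp" on a hypothetical stack of ints whose
   top element is *TOP: computes $1 + $3.  */
int
reduce_add (const int *top)
{
  return top[rhs_offset (3, 1)] + top[rhs_offset (3, 3)];
}
```

With a three-symbol rule, `$3` is at offset 0 (the top of the stack) and `$1` at offset -2.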
-----

Local Variables:
mode: markdown
fill-column: 76
ispell-dictionary: "american"
End:

Copyright (C) 2002, 2008-2015, 2018-2019 Free Software Foundation, Inc.

This file is part of GNU Bison.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.