===============
pycparser v2.21
===============

.. image:: https://github.com/eliben/pycparser/workflows/pycparser-tests/badge.svg
  :align: center
  :target: https://github.com/eliben/pycparser/actions

----

.. contents::
    :backlinks: none

.. sectnum::

Introduction
============

What is pycparser?
------------------

**pycparser** is a parser for the C language, written in pure Python. It is a
module designed to be easily integrated into applications that need to parse
C source code.

What is it good for?
--------------------

Anything that needs C code to be parsed. The following are some uses for
**pycparser**, taken from real user reports:

* C code obfuscator
* Front-end for various specialized C compilers
* Static code checker
* Automatic unit-test discovery
* Adding specialized extensions to the C language

One of the most popular uses of **pycparser** is in the `cffi
<https://cffi.readthedocs.io/en/latest/>`_ library, which uses it to parse the
declarations of C functions and types in order to auto-generate FFIs.

**pycparser** is unique in the sense that it's written in pure Python - a very
high level language that's easy to experiment with and tweak. To people familiar
with Lex and Yacc, **pycparser**'s code will be simple to understand. It also
has no external dependencies (except for a Python interpreter), making it very
simple to install and deploy.

Which version of C does pycparser support?
------------------------------------------

**pycparser** aims to support the full C99 language (according to the standard
ISO/IEC 9899). Some features from C11 are also supported, and patches to support
more are welcome.

**pycparser** supports very few GCC extensions, but it's fairly easy to set
things up so that it parses code with a lot of GCC-isms successfully. See the
`FAQ <https://github.com/eliben/pycparser/wiki/FAQ>`_ for more details.

What grammar does pycparser follow?
-----------------------------------

**pycparser** very closely follows the C grammar provided in Annex A of the C99
standard (ISO/IEC 9899).

How is pycparser licensed?
--------------------------

`BSD license <https://github.com/eliben/pycparser/blob/master/LICENSE>`_.

Contact details
---------------

For reporting problems with **pycparser** or submitting feature requests, please
open an `issue <https://github.com/eliben/pycparser/issues>`_, or submit a
pull request.

Installing
==========

Prerequisites
-------------

* **pycparser** was tested on Python 2.7, 3.4-3.6, on both Linux and
  Windows. It should work on any later version (in both the 2.x and 3.x lines)
  as well.

* **pycparser** has no external dependencies. The only non-stdlib library it
  uses is PLY, which is bundled in ``pycparser/ply``. The current PLY version is
  3.10, retrieved from `<http://www.dabeaz.com/ply/>`_

Note that **pycparser** (and PLY) uses docstrings for grammar specifications.
Python installations that strip docstrings (such as when using the Python
``-OO`` option) will fail to instantiate and use **pycparser**. You can try to
work around this problem by making sure the PLY parsing tables are pre-generated
in normal mode; this isn't an officially supported/tested mode of operation,
though.

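As a minimal sketch of detecting this situation early, an application could
check the interpreter flags before creating a parser. The check relies only on
the standard ``sys.flags.optimize`` value; the helper name
``docstrings_stripped`` is made up for this illustration and is not part of
**pycparser**.

.. code-block:: python

    import sys


    def docstrings_stripped():
        """Return True when the interpreter was started with -OO."""
        # sys.flags.optimize is 0 normally, 1 under -O and 2 under -OO.
        return sys.flags.optimize >= 2


    if docstrings_stripped():
        print("Docstrings are stripped (-OO); pycparser is likely to fail.")
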
Installation process
--------------------

Installing **pycparser** is very simple. Once you download and unzip the
package, you just have to execute the standard ``python setup.py install``. The
setup script will then place the ``pycparser`` module into ``site-packages`` in
your Python's installation library.

Alternatively, since **pycparser** is listed in the `Python Package Index
<https://pypi.org/project/pycparser/>`_ (PyPI), you can install it using your
favorite Python packaging/distribution tool, for example with::

    > pip install pycparser

Known problems
--------------

* Some users who've installed a new version of **pycparser** over an existing
  version ran into a problem using the newly installed library. This has to do
  with parse tables staying around as ``.pyc`` files from the older version. If
  you see unexplained errors from **pycparser** after an upgrade, remove it (by
  deleting the ``pycparser`` directory in your Python's ``site-packages``, or
  wherever you installed it) and install again.

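When chasing such an upgrade problem, it helps to know which copy of
**pycparser** Python actually imports. The sketch below prints the version and
the package directory (the one to delete before reinstalling). The helper name
``describe_installation`` is hypothetical; ``__version__`` and ``__file__`` are
ordinary module attributes.

.. code-block:: python

    import os

    import pycparser


    def describe_installation():
        """Return (version, directory) of the pycparser that Python imports."""
        package_dir = os.path.dirname(os.path.abspath(pycparser.__file__))
        return pycparser.__version__, package_dir


    version, package_dir = describe_installation()
    print("pycparser", version, "is installed in", package_dir)
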
Using
=====

Interaction with the C preprocessor
-----------------------------------

In order to be compilable, C code must be preprocessed by the C preprocessor -
``cpp``. ``cpp`` handles preprocessing directives like ``#include`` and
``#define``, removes comments, and performs other minor tasks that prepare the C
code for compilation.

For all but the most trivial snippets of C code, **pycparser**, like a C
compiler, must receive preprocessed C code in order to function correctly. If
you import the top-level ``parse_file`` function from the **pycparser** package
and call it with ``use_cpp=True``, it will run ``cpp`` for you, as long as it's
in your PATH, or you provide a path to it.

Note also that you can use ``gcc -E`` or ``clang -E`` instead of ``cpp``. See
the ``using_gcc_E_libc.py`` example for more details. Windows users can download
and install a binary build of Clang for Windows `from this website
<http://llvm.org/releases/download.html>`_.

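A minimal sketch of calling ``parse_file`` with ``gcc -E`` as the preprocessor
follows. It assumes ``gcc`` is on your PATH and that you pass the location of a
directory with parseable headers. The helper names and the
``-D__attribute__(x)=`` define (one way to hide a common GCC extension from the
parser) are illustrative choices, not requirements.

.. code-block:: python

    from pycparser import parse_file


    def build_cpp_args(include_dir):
        """Return preprocessor arguments for use with ``gcc`` or ``clang``."""
        return [
            "-E",                    # stop after the preprocessing stage
            "-I" + include_dir,      # where #include <...> files are looked up
            "-D__attribute__(x)=",   # make __attribute__((...)) expand to nothing
        ]


    def preprocess_and_parse(filename, include_dir, cpp_path="gcc"):
        """Preprocess ``filename`` with ``cpp_path`` and return the parsed AST."""
        return parse_file(
            filename,
            use_cpp=True,
            cpp_path=cpp_path,
            cpp_args=build_cpp_args(include_dir),
        )

The arguments are passed as a list, so no shell quoting is needed. A call could
look like ``preprocess_and_parse("hello.c", "utils/fake_libc_include")``, where
``hello.c`` is a placeholder file name.
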
What about the standard C library headers?
------------------------------------------

C code almost always ``#include``\s various header files from the standard C
library, like ``stdio.h``. While (with some effort) **pycparser** can be made to
parse the standard headers from any C compiler, it's much simpler to use the
provided "fake" standard includes for C11 in ``utils/fake_libc_include``. These
are standard C header files that contain only the bare necessities to allow
valid parsing of the files that use them. As a bonus, since they're minimal,
they can significantly improve the performance of parsing large C files.

The key point to understand here is that **pycparser** doesn't really care about
the semantics of types. It only needs to know whether some token encountered in
the source is a previously defined type. This is essential in order to be able
to parse C correctly.

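The point can be illustrated with a small sketch: the same declaration is
rejected when the type name is unknown and accepted once any ``typedef`` for it
has been seen, even a placeholder one. The helper ``parses`` is made up for
this example.

.. code-block:: python

    from pycparser.c_parser import CParser, ParseError


    def parses(source):
        """Return True if pycparser accepts ``source``, False on a parse error."""
        try:
            CParser().parse(source, filename="<string>")
            return True
        except ParseError:
            return False


    # size_t was never declared, so the parser sees two plain identifiers
    # in a row and cannot treat the line as a declaration.
    print(parses("size_t n;"))

    # A placeholder typedef is enough; the real definition is irrelevant.
    print(parses("typedef int size_t; size_t n;"))

This is what the fake headers provide: simple ``typedef``\s for the names that
real headers define in compiler-specific ways.
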
See `this blog post
<https://eli.thegreenplace.net/2015/on-parsing-c-type-declarations-and-fake-headers>`_
for more details.

Note that the fake headers are not included in the ``pip`` package nor installed
via ``setup.py`` (`#224 <https://github.com/eliben/pycparser/issues/224>`_).

Basic usage
-----------

Take a look at the |examples|_ directory of the distribution for a few examples
of using **pycparser**. These should be enough to get you started. Please note
that most realistic C code samples would require running the C preprocessor
before passing the code to **pycparser**; see the previous sections for more
details.

.. |examples| replace:: ``examples``
.. _examples: examples

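For a first taste, here is a minimal sketch that parses a snippet from a string
and lists the functions it defines. The snippet contains no preprocessor
directives or comments, so it can be parsed directly. The names ``SOURCE``,
``FuncDefCollector`` and ``function_names`` are made up for this illustration.

.. code-block:: python

    from pycparser import c_ast, c_parser

    SOURCE = """
    int add(int a, int b) { return a + b; }
    int main(void) { return add(1, 2); }
    """


    class FuncDefCollector(c_ast.NodeVisitor):
        """Collect the names of all function definitions in an AST."""

        def __init__(self):
            self.names = []

        def visit_FuncDef(self, node):
            # node.decl is the Decl node carrying the function's name and type.
            self.names.append(node.decl.name)


    def function_names(source):
        """Parse ``source`` and return the defined function names in order."""
        ast = c_parser.CParser().parse(source, filename="<string>")
        collector = FuncDefCollector()
        collector.visit(ast)
        return collector.names


    print(function_names(SOURCE))

Calling ``show()`` on the returned AST prints the whole tree, which is a quick
way to see how a construct is represented.
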
Advanced usage
--------------

The public interface of **pycparser** is well documented with comments in
``pycparser/c_parser.py``. For a detailed overview of the various AST nodes
created by the parser, see ``pycparser/_c_ast.cfg``.

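As a small sketch of how the node descriptions map onto real objects, the code
below parses one declaration and walks through its nodes by attribute. The
variable names are arbitrary; the node classes and attributes are the ones
listed in ``_c_ast.cfg``.

.. code-block:: python

    from pycparser import c_ast
    from pycparser.c_parser import CParser

    ast = CParser().parse("int x = 42;", filename="<string>")

    # FileAST.ext holds the top-level declarations and definitions.
    decl = ast.ext[0]

    print(type(decl).__name__)       # node class of the declaration
    print(decl.name)                 # declared identifier
    print(type(decl.type).__name__)  # node wrapping the declared type
    print(decl.type.type.names)      # list of type specifier names
    print(decl.init.value)           # initializer, kept as source text
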
There's also a `FAQ available here <https://github.com/eliben/pycparser/wiki/FAQ>`_.

In any case, you can always drop me an `email <eliben@gmail.com>`_ for help.

Modifying
=========

There are a few points to keep in mind when modifying **pycparser**:

* The code for **pycparser**'s AST nodes is automatically generated from a
  configuration file - ``_c_ast.cfg``, by ``_ast_gen.py``. If you modify the AST
  configuration, make sure to re-generate the code. This can be done by running
  the ``_build_tables.py`` script from the ``pycparser`` directory.

* Make sure you understand the optimized mode of **pycparser** - for that you
  must read the docstring in the constructor of the ``CParser`` class. For
  development you should create the parser without optimizations, so that it
  will regenerate the Yacc and Lex tables when you change the grammar.

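A sketch of creating a parser without optimizations is shown below. It assumes
the ``CParser`` constructor keywords of this release (``lex_optimize``,
``yacc_optimize`` and ``taboutputdir``); check the constructor docstring for
the authoritative list. The helper name ``make_dev_parser`` and the use of a
temporary directory for generated tables are choices made for this example.

.. code-block:: python

    import tempfile

    from pycparser.c_parser import CParser


    def make_dev_parser(table_dir):
        """Create a parser that rebuilds its tables when the grammar changed."""
        return CParser(
            lex_optimize=False,
            yacc_optimize=False,
            taboutputdir=table_dir,  # where regenerated tables are written
        )


    with tempfile.TemporaryDirectory() as table_dir:
        parser = make_dev_parser(table_dir)
        ast = parser.parse("int x;", filename="<string>")
        print(ast.ext[0].name)

Creating a parser this way is noticeably slower than the default optimized
mode, so it is best kept to development sessions.
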
Package contents
================

Once you unzip the ``pycparser`` package, you'll see the following files and
directories:

README.rst:
  This README file.

LICENSE:
  The pycparser license

setup.py:
  Installation script

examples/:
  A directory with some examples of using **pycparser**

pycparser/:
  The **pycparser** module source code.

tests/:
  Unit tests.

utils/fake_libc_include:
  Minimal standard C library include files that should allow parsing of any C
  code. Note that these headers now include C11 code, so they may not work when
  the preprocessor is configured to an earlier C standard (like ``-std=c99``).

utils/internal/:
  Internal utilities for my own use. You probably don't need them.

Contributors
============

Some people have contributed to **pycparser** by opening issues on bugs they've
found and/or submitting patches. The list of contributors is in the CONTRIBUTORS
file in the source distribution. After **pycparser** moved to GitHub I stopped
updating this list because GitHub does a much better job at tracking
contributions.