GAST, daou naer!
================

A generic AST to represent Python2 and Python3's Abstract Syntax Tree (AST).

GAST provides a compatibility layer between the AST of various Python versions,
as produced by ``ast.parse`` from the standard ``ast`` module.

Basic Usage
-----------

.. code:: python

    >>> import ast, gast
    >>> code = open('file.py').read()
    >>> tree = ast.parse(code)
    >>> gtree = gast.ast_to_gast(tree)
    >>> ... # process gtree
    >>> tree = gast.gast_to_ast(gtree)
    >>> ... # do stuff specific to tree

API
---

``gast`` provides the same API as the ``ast`` module. All functions and classes
from the ``ast`` module are also available in the ``gast`` module, and work
accordingly on the ``gast`` tree.

Three notable exceptions:

1. ``gast.parse`` directly produces a GAST node. It's equivalent to running
   ``gast.ast_to_gast`` on the output of ``ast.parse``.

2. ``gast.dump`` dumps the ``gast`` common representation, not the original
   one.

3. ``gast.gast_to_ast`` and ``gast.ast_to_gast`` can be used to convert
   from one ast to the other, back and forth.

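As a rough illustration of the three points above, the sketch below parses a small snippet both ways, compares the results with ``gast.dump``, and converts the tree back so it can be compiled. The ``source`` string and its ``add`` function are made up for the example, and it assumes ``gast`` is installed for the running interpreter.

```python
import ast
import gast

# Hypothetical snippet, used only for this example.
source = "def add(a, b):\n    return a + b\n"

# 1. gast.parse is described as ast.parse followed by gast.ast_to_gast.
gtree = gast.parse(source)
same_tree = gast.ast_to_gast(ast.parse(source))

# 2. gast.dump works on the gast representation, so both dumps should agree.
dumps_agree = gast.dump(gtree) == gast.dump(same_tree)

# 3. Convert back to the interpreter's own AST, then compile and run it.
tree = gast.gast_to_ast(gtree)
namespace = {}
exec(compile(tree, "<gast-example>", "exec"), namespace)
result = namespace["add"](2, 3)
```

Only the converted ``tree`` is handed to ``compile``; the ``gast`` tree is meant for analysis and transformation.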
Version Compatibility
---------------------

GAST is tested using ``tox`` and Travis on the following Python versions:

- 2.7
- 3.4
- 3.5
- 3.6
- 3.7
- 3.8
- 3.9
- 3.10
- 3.11

AST Changes
-----------

Python3
*******

The AST used by GAST is the same as the one used in Python3.9, with the
notable exception of ``ast.arg`` being replaced by an ``ast.Name`` with an
``ast.Param`` context.

For minor versions before 3.9, please note that ``ExtSlice`` and ``Index`` are
not used.

For minor versions before 3.8, please note that ``Ellipsis``, ``Num``, ``Str``,
``Bytes`` and ``NameConstant`` are represented as ``Constant``.

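To make the ``ast.arg`` difference concrete, here is a small sketch comparing how the standard module and ``gast`` represent the same parameters. The function ``f`` and its signature are invented for the example, and it assumes ``gast`` is installed on a Python 3 interpreter.

```python
import ast
import gast

# Hypothetical function definition, used only for this example.
source = "def f(x, *rest): pass"

std_args = ast.parse(source).body[0].args
gast_args = gast.parse(source).body[0].args

# The standard module wraps each parameter in an ``arg`` node...
std_kind = type(std_args.args[0]).__name__

# ...while gast uses a Name node carrying a Param context.
param = gast_args.args[0]
param_kind = type(param).__name__
param_ctx = type(param.ctx).__name__

# The same holds for the variadic parameter.
vararg_name = gast_args.vararg.id
```

Code that walks ``gast`` trees can therefore treat parameters like any other ``Name`` and tell them apart through ``ctx``.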
Python2
*******

To cope with Python3 features, several nodes from the Python2 AST are extended
with some new attributes/children, or represented by different nodes:

- ``Module`` nodes have a ``type_ignores`` attribute.

- ``FunctionDef`` nodes have a ``returns`` attribute and a ``type_comment``
  attribute.

- ``ClassDef`` nodes have a ``keywords`` attribute.

- ``With``'s ``context_expr`` and ``optional_vars`` fields are held in a
  ``withitem`` object.

- ``For`` nodes have a ``type_comment`` attribute.

- ``Raise``'s ``type``, ``inst`` and ``tback`` fields are held in a single
  ``exc`` field, using the transformation ``raise E, V, T => raise E(V).with_traceback(T)``.

- ``TryExcept`` and ``TryFinally`` nodes are merged in the ``Try`` node.

- ``arguments`` nodes have ``kwonlyargs`` and ``kw_defaults`` attributes.

- ``Call`` nodes lose their ``starargs`` attribute, replaced by an
  argument wrapped in a ``Starred`` node. They also lose their ``kwargs``
  attribute, wrapped in a ``keyword`` node with the identifier set to
  ``None``, as done in Python3.

- ``comprehension`` nodes have an ``is_async`` attribute (that is always set
  to 0).

- ``Ellipsis``, ``Num`` and ``Str`` nodes are represented as ``Constant``.

- ``Subscript`` nodes that don't have any ``Slice`` node as ``slice`` attribute (and
  ``Ellipsis`` are not ``Slice`` nodes) no longer hold an ``ExtSlice`` but an
  ``Index(Tuple(...))`` instead.

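Several of the target shapes in this list are the Python 3 ones, so they can be inspected with the standard ``ast`` module on a Python 3 interpreter. The sketch below does only that: it does not run the Python 2 conversion itself, and the names ``f``, ``E``, ``V`` and ``T`` are placeholders.

```python
import ast

# Call: *args becomes a Starred argument, **kwargs a keyword whose arg is None.
call = ast.parse("f(*args, **kwargs)", mode="eval").body
star_arg = call.args[0]
double_star = call.keywords[0]

# Raise: the Python 3 spelling of ``raise E, V, T`` keeps everything in ``exc``.
raise_stmt = ast.parse("raise E(V).with_traceback(T)").body[0]
exc_call = raise_stmt.exc

# Try: except and finally clauses live on a single Try node.
try_source = "try:\n    pass\nexcept E:\n    pass\nfinally:\n    pass\n"
try_stmt = ast.parse(try_source).body[0]
```

Here ``star_arg`` is a ``Starred`` node, ``double_star.arg`` is ``None``, ``exc_call`` is a call to the ``with_traceback`` attribute, and ``try_stmt`` carries both ``handlers`` and ``finalbody``.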
Pitfalls
********

- In Python3, ``None``, ``True`` and ``False`` are parsed as ``Constant``
  while they are parsed as regular ``Name`` in Python2.

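One way to cope with this difference is a helper that accepts both spellings. The sketch below is an assumption about how such a check could look, not part of ``gast``: ``spells_singleton`` is a hypothetical name, and it matches on the node class name so that it runs on a plain ``ast`` tree as well.

```python
import ast

SINGLETONS = {"None": None, "True": True, "False": False}


def spells_singleton(node, name):
    """Return True if node is the literal `name` in either representation."""
    kind = type(node).__name__
    if kind == "Constant":
        # Python 3 form: Constant(value=None / True / False).
        return node.value is SINGLETONS[name]
    if kind == "Name":
        # Python 2 form: a regular Name whose id is the literal's spelling.
        return node.id == name
    return False


# Python 3 parses the literal as a Constant...
py3_none = ast.parse("None", mode="eval").body
# ...while a Python 2 tree would hold a Name, built by hand here.
py2_style_none = ast.Name(id="None", ctx=ast.Load())
```

The identity test (``is``) keeps ``Constant(1)`` from being mistaken for ``True``.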
ASDL
****

This closely matches the one from https://docs.python.org/3/library/ast.html#abstract-grammar, with a few
trade-offs to cope with legacy ASTs.

.. code::

    -- ASDL's six builtin types are identifier, int, string, bytes, object, singleton

    module Python
    {
        mod = Module(stmt* body, type_ignore *type_ignores)
            | Interactive(stmt* body)
            | Expression(expr body)
            | FunctionType(expr* argtypes, expr returns)

            -- not really an actual node but useful in Jython's typesystem.
            | Suite(stmt* body)

        stmt = FunctionDef(identifier name, arguments args,
                           stmt* body, expr* decorator_list, expr? returns,
                           string? type_comment)
              | AsyncFunctionDef(identifier name, arguments args,
                                 stmt* body, expr* decorator_list, expr? returns,
                                 string? type_comment)

              | ClassDef(identifier name,
                         expr* bases,
                         keyword* keywords,
                         stmt* body,
                         expr* decorator_list)
              | Return(expr? value)

              | Delete(expr* targets)
              | Assign(expr* targets, expr value, string? type_comment)
              | AugAssign(expr target, operator op, expr value)
              -- 'simple' indicates that we annotate simple name without parens
              | AnnAssign(expr target, expr annotation, expr? value, int simple)

              -- not sure if bool is allowed, can always use int
              | Print(expr? dest, expr* values, bool nl)

              -- use 'orelse' because else is a keyword in target languages
              | For(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
              | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
              | While(expr test, stmt* body, stmt* orelse)
              | If(expr test, stmt* body, stmt* orelse)
              | With(withitem* items, stmt* body, string? type_comment)
              | AsyncWith(withitem* items, stmt* body, string? type_comment)

              | Match(expr subject, match_case* cases)

              | Raise(expr? exc, expr? cause)
              | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
              | TryStar(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
              | Assert(expr test, expr? msg)

              | Import(alias* names)
              | ImportFrom(identifier? module, alias* names, int? level)

              -- Doesn't capture requirement that locals must be
              -- defined if globals is
              -- still supports use as a function!
              | Exec(expr body, expr? globals, expr? locals)

              | Global(identifier* names)
              | Nonlocal(identifier* names)
              | Expr(expr value)
              | Pass | Break | Continue

              -- XXX Jython will be different
              -- col_offset is the byte offset in the utf8 string the parser uses
              attributes (int lineno, int col_offset)

              -- BoolOp() can use left & right?
        expr = BoolOp(boolop op, expr* values)
             | NamedExpr(expr target, expr value)
             | BinOp(expr left, operator op, expr right)
             | UnaryOp(unaryop op, expr operand)
             | Lambda(arguments args, expr body)
             | IfExp(expr test, expr body, expr orelse)
             | Dict(expr* keys, expr* values)
             | Set(expr* elts)
             | ListComp(expr elt, comprehension* generators)
             | SetComp(expr elt, comprehension* generators)
             | DictComp(expr key, expr value, comprehension* generators)
             | GeneratorExp(expr elt, comprehension* generators)
             -- the grammar constrains where yield expressions can occur
             | Await(expr value)
             | Yield(expr? value)
             | YieldFrom(expr value)
             -- need sequences for compare to distinguish between
             -- x < 4 < 3 and (x < 4) < 3
             | Compare(expr left, cmpop* ops, expr* comparators)
             | Call(expr func, expr* args, keyword* keywords)
             | Repr(expr value)
             | FormattedValue(expr value, int? conversion, expr? format_spec)
             | JoinedStr(expr* values)
             | Constant(constant value, string? kind)

             -- the following expression can appear in assignment context
             | Attribute(expr value, identifier attr, expr_context ctx)
             | Subscript(expr value, slice slice, expr_context ctx)
             | Starred(expr value, expr_context ctx)
             | Name(identifier id, expr_context ctx, expr? annotation,
                    string? type_comment)
             | List(expr* elts, expr_context ctx)
             | Tuple(expr* elts, expr_context ctx)

              -- col_offset is the byte offset in the utf8 string the parser uses
              attributes (int lineno, int col_offset)

        expr_context = Load | Store | Del | AugLoad | AugStore | Param

        slice = Slice(expr? lower, expr? upper, expr? step)
              | ExtSlice(slice* dims)
              | Index(expr value)

        boolop = And | Or

        operator = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift
                     | RShift | BitOr | BitXor | BitAnd | FloorDiv

        unaryop = Invert | Not | UAdd | USub

        cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn

        comprehension = (expr target, expr iter, expr* ifs, int is_async)

        excepthandler = ExceptHandler(expr? type, expr? name, stmt* body)
                        attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

        arguments = (expr* args, expr* posonlyargs, expr? vararg, expr* kwonlyargs,
                     expr* kw_defaults, expr? kwarg, expr* defaults)

        -- keyword arguments supplied to call (NULL identifier for **kwargs)
        keyword = (identifier? arg, expr value)

        -- import name with optional 'as' alias.
        alias = (identifier name, identifier? asname)
                attributes (int lineno, int col_offset, int? end_lineno, int? end_col_offset)

        withitem = (expr context_expr, expr? optional_vars)

        match_case = (pattern pattern, expr? guard, stmt* body)

        pattern = MatchValue(expr value)
                | MatchSingleton(constant value)
                | MatchSequence(pattern* patterns)
                | MatchMapping(expr* keys, pattern* patterns, identifier? rest)
                | MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns)

                | MatchStar(identifier? name)
                -- The optional "rest" MatchMapping parameter handles capturing extra mapping keys

                | MatchAs(pattern? pattern, identifier? name)
                | MatchOr(pattern* patterns)

                 attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

        type_ignore = TypeIgnore(int lineno, string tag)
    }