GAST, daou naer!
================

A generic AST to represent Python2 and Python3's Abstract Syntax Tree (AST).

GAST provides a compatibility layer between the AST of various Python versions,
as produced by ``ast.parse`` from the standard ``ast`` module.

Basic Usage
-----------

.. code:: python

    >>> import ast, gast
    >>> code = open('file.py').read()
    >>> tree = ast.parse(code)
    >>> gtree = gast.ast_to_gast(tree)
    >>> ... # process gtree
    >>> tree = gast.gast_to_ast(gtree)
    >>> ... # do stuff specific to tree

API
---

``gast`` provides the same API as the ``ast`` module. All functions and classes
from the ``ast`` module are also available in the ``gast`` module, and work
accordingly on the ``gast`` tree.

Three notable exceptions:

1. ``gast.parse`` directly produces a GAST node. It's equivalent to running
   ``gast.ast_to_gast`` on the output of ``ast.parse``.

2. ``gast.dump`` dumps the ``gast`` common representation, not the original
   one.

3. ``gast.gast_to_ast`` and ``gast.ast_to_gast`` can be used to convert
   from one AST to the other, back and forth.

Version Compatibility
---------------------

GAST is tested using ``tox`` and Travis on the following Python versions:

- 2.7
- 3.4
- 3.5
- 3.6
- 3.7
- 3.8
- 3.9
- 3.10
- 3.11

AST Changes
-----------

Python3
*******

The AST used by GAST is the same as the one used in Python3.9, with the
notable exception of ``ast.arg`` being replaced by an ``ast.Name`` with an
``ast.Param`` context.

For minor versions before 3.9, please note that ``ExtSlice`` and ``Index`` are
not used.

For minor versions before 3.8, please note that ``Ellipsis``, ``Num``, ``Str``,
``Bytes`` and ``NameConstant`` are represented as ``Constant``.

Python2
*******

To cope with Python3 features, several nodes from the Python2 AST are extended
with some new attributes/children, or represented by different nodes:

- ``Module`` nodes have a ``type_ignores`` attribute.

- ``FunctionDef`` nodes have a ``returns`` attribute and a ``type_comment``
  attribute.

- ``ClassDef`` nodes have a ``keywords`` attribute.

- ``With``'s ``context_expr`` and ``optional_vars`` fields are held in a
  ``withitem`` object.

- ``For`` nodes have a ``type_comment`` attribute.

- ``Raise``'s ``type``, ``inst`` and ``tback`` fields are held in a single
  ``exc`` field, using the transformation
  ``raise E, V, T => raise E(V).with_traceback(T)``.

- ``TryExcept`` and ``TryFinally`` nodes are merged into the ``Try`` node.

- ``arguments`` nodes have ``kwonlyargs`` and ``kw_defaults`` attributes.

- ``Call`` nodes lose their ``starargs`` attribute, which is replaced by an
  argument wrapped in a ``Starred`` node. They also lose their ``kwargs``
  attribute, which is wrapped in a ``keyword`` node with the identifier set to
  ``None``, as done in Python3.

- ``comprehension`` nodes have an ``is_async`` attribute (that is always set
  to 0).

- ``Ellipsis``, ``Num`` and ``Str`` nodes are represented as ``Constant``.

- A ``Subscript`` whose ``slice`` attribute contains no ``Slice`` node (an
  ``Ellipsis`` is not a ``Slice`` node) no longer holds an ``ExtSlice`` but an
  ``Index(Tuple(...))`` instead.

Pitfalls
********

- In Python3, ``None``, ``True`` and ``False`` are parsed as ``Constant``
  while they are parsed as regular ``Name`` in Python2.

ASDL
****

This closely matches the one from
https://docs.python.org/3/library/ast.html#abstract-grammar, with a few
trade-offs to cope with legacy ASTs.

.. code::

    -- ASDL's six builtin types are identifier, int, string, bytes, object, singleton

    module Python
    {
        mod = Module(stmt* body, type_ignore* type_ignores)
            | Interactive(stmt* body)
            | Expression(expr body)
            | FunctionType(expr* argtypes, expr returns)

            -- not really an actual node but useful in Jython's typesystem.
            | Suite(stmt* body)

        stmt = FunctionDef(identifier name, arguments args,
                           stmt* body, expr* decorator_list, expr? returns,
                           string? type_comment)
             | AsyncFunctionDef(identifier name, arguments args,
                                stmt* body, expr* decorator_list, expr? returns,
                                string? type_comment)

             | ClassDef(identifier name,
                        expr* bases,
                        keyword* keywords,
                        stmt* body,
                        expr* decorator_list)
             | Return(expr? value)

             | Delete(expr* targets)
             | Assign(expr* targets, expr value, string? type_comment)
             | AugAssign(expr target, operator op, expr value)
             -- 'simple' indicates that we annotate simple name without parens
             | AnnAssign(expr target, expr annotation, expr? value, int simple)

             -- not sure if bool is allowed, can always use int
             | Print(expr? dest, expr* values, bool nl)

             -- use 'orelse' because else is a keyword in target languages
             | For(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
             | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse, string? type_comment)
             | While(expr test, stmt* body, stmt* orelse)
             | If(expr test, stmt* body, stmt* orelse)
             | With(withitem* items, stmt* body, string? type_comment)
             | AsyncWith(withitem* items, stmt* body, string? type_comment)

             | Match(expr subject, match_case* cases)

             | Raise(expr? exc, expr? cause)
             | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
             | TryStar(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
             | Assert(expr test, expr? msg)

             | Import(alias* names)
             | ImportFrom(identifier? module, alias* names, int? level)

             -- Doesn't capture requirement that locals must be
             -- defined if globals is
             -- still supports use as a function!
             | Exec(expr body, expr? globals, expr? locals)

             | Global(identifier* names)
             | Nonlocal(identifier* names)
             | Expr(expr value)
             | Pass | Break | Continue

             -- XXX Jython will be different
             -- col_offset is the byte offset in the utf8 string the parser uses
             attributes (int lineno, int col_offset)

             -- BoolOp() can use left & right?
        expr = BoolOp(boolop op, expr* values)
             | NamedExpr(expr target, expr value)
             | BinOp(expr left, operator op, expr right)
             | UnaryOp(unaryop op, expr operand)
             | Lambda(arguments args, expr body)
             | IfExp(expr test, expr body, expr orelse)
             | Dict(expr* keys, expr* values)
             | Set(expr* elts)
             | ListComp(expr elt, comprehension* generators)
             | SetComp(expr elt, comprehension* generators)
             | DictComp(expr key, expr value, comprehension* generators)
             | GeneratorExp(expr elt, comprehension* generators)
             -- the grammar constrains where yield expressions can occur
             | Await(expr value)
             | Yield(expr? value)
             | YieldFrom(expr value)
             -- need sequences for compare to distinguish between
             -- x < 4 < 3 and (x < 4) < 3
             | Compare(expr left, cmpop* ops, expr* comparators)
             | Call(expr func, expr* args, keyword* keywords)
             | Repr(expr value)
             | FormattedValue(expr value, int? conversion, expr? format_spec)
             | JoinedStr(expr* values)
             | Constant(constant value, string? kind)

             -- the following expression can appear in assignment context
             | Attribute(expr value, identifier attr, expr_context ctx)
             | Subscript(expr value, slice slice, expr_context ctx)
             | Starred(expr value, expr_context ctx)
             | Name(identifier id, expr_context ctx, expr? annotation,
                    string? type_comment)
             | List(expr* elts, expr_context ctx)
             | Tuple(expr* elts, expr_context ctx)

             -- col_offset is the byte offset in the utf8 string the parser uses
             attributes (int lineno, int col_offset)

        expr_context = Load | Store | Del | AugLoad | AugStore | Param

        slice = Slice(expr? lower, expr? upper, expr? step)
              | ExtSlice(slice* dims)
              | Index(expr value)

        boolop = And | Or

        operator = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift
                 | RShift | BitOr | BitXor | BitAnd | FloorDiv

        unaryop = Invert | Not | UAdd | USub

        cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn

        comprehension = (expr target, expr iter, expr* ifs, int is_async)

        excepthandler = ExceptHandler(expr? type, expr? name, stmt* body)
                        attributes (int lineno, int col_offset,
                                    int? end_lineno, int? end_col_offset)

        arguments = (expr* args, expr* posonlyargs, expr? vararg, expr* kwonlyargs,
                     expr* kw_defaults, expr? kwarg, expr* defaults)

        -- keyword arguments supplied to call (NULL identifier for **kwargs)
        keyword = (identifier? arg, expr value)

        -- import name with optional 'as' alias.
        alias = (identifier name, identifier? asname)
                attributes (int lineno, int col_offset,
                            int? end_lineno, int? end_col_offset)

        withitem = (expr context_expr, expr? optional_vars)

        match_case = (pattern pattern, expr? guard, stmt* body)

        pattern = MatchValue(expr value)
                | MatchSingleton(constant value)
                | MatchSequence(pattern* patterns)
                | MatchMapping(expr* keys, pattern* patterns, identifier? rest)
                | MatchClass(expr cls, pattern* patterns, identifier* kwd_attrs, pattern* kwd_patterns)

                | MatchStar(identifier? name)
                -- The optional "rest" MatchMapping parameter handles capturing extra mapping keys

                | MatchAs(pattern? pattern, identifier? name)
                | MatchOr(pattern* patterns)

                attributes (int lineno, int col_offset, int end_lineno, int end_col_offset)

        type_ignore = TypeIgnore(int lineno, string tag)
    }
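
The Python3 notes and the pitfall described earlier can be observed with the
standard ``ast`` module alone. The sketch below is only an illustration and
does not use ``gast``: it assumes Python 3.8 or later, and the helper name
``literal_node_types`` is hypothetical. It shows that literals, ``None``,
``True`` and ``False`` are all parsed as ``Constant``, and that the standard
parser represents function parameters as ``ast.arg`` nodes, the node type that
GAST replaces by ``Name`` with a ``Param`` context.

.. code:: python

    import ast

    def literal_node_types(source):
        """Return the node class name of every expression statement in source."""
        module = ast.parse(source)
        return [
            type(stmt.value).__name__
            for stmt in module.body
            if isinstance(stmt, ast.Expr)
        ]

    # One expression statement per line: singletons, a number, a string,
    # a bytes literal and an ellipsis (assumes Python 3.8 or later).
    names = literal_node_types("None\nTrue\nFalse\n42\n'text'\nb'raw'\n...\n")
    print(names)

    # Function parameters as produced by the standard parser.
    func = ast.parse("def f(a, b=1): pass").body[0]
    param_types = [type(param).__name__ for param in func.args.args]
    print(param_types)

Code that walks a tree can therefore test ``isinstance(node, ast.Constant)``
once instead of handling each legacy literal node separately.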