README.rst 8.5 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236
  1. Gast, Beniget!
  2. ==============
  3. Beniget is a collection of Compile-time analyse on Python Abstract Syntax Tree(AST).
  4. It's a building block to write static analyzer or compiler for Python.
  5. Beniget relies on `gast <https://pypi.org/project/gast/>`_ to provide a cross
  6. version abstraction of the AST, effectively working across all Python 3 versions greater than 3.6.
  7. API
  8. ---
  9. Basically Beniget provides three analyse:
  10. - ``beniget.Ancestors`` that maps each node to the list of enclosing nodes;
  11. - ``beniget.DefUseChains`` that maps each node to the list of definition points in that node;
  12. - ``beniget.UseDefChains`` that maps each node to the list of possible definition of that node.
  13. See sample usages and/or run ``pydoc beniget`` for more information :-).
  14. Sample Usages
  15. -------------
  16. Detect unused imports
  17. *********************
  18. This is a very basic usage: look for def without any use, and warn about them, focusing on imported values.
  19. .. code:: python
  20. >>> import beniget, gast as ast
  21. # parse some simple statements
  22. >>> code = "from math import cos, sin; print(cos(3))"
  23. >>> module = ast.parse(code)
  24. # compute the def-use chains at module level
  25. >>> duc = beniget.DefUseChains()
  26. >>> duc.visit(module)
  27. # grab the import statement
  28. >>> imported = module.body[0].names
  29. # inspect the users of each imported name
  30. >>> for name in imported:
  31. ... ud = duc.chains[name]
  32. ... if not ud.users():
  33. ... print("Unused import: {}".format(ud.name()))
  34. Unused import: sin
  35. *NOTE*: Due to the dynamic nature of Python, one can fool this analysis by
  36. calling the ``eval`` function, eventually through an indirection, or by performing a lookup
  37. into ``globals()``.
  38. Find all functions marked with a given decorator
  39. ************************************************
  40. Let's assume we've got a ``@nice`` decorator applied to some functions. We can traverse the users
  41. of this decorator to find which functions are decorated.
  42. .. code:: python
  43. # parse some simple statements
  44. >>> code = """
  45. ... nice = lambda x: x
  46. ... @nice
  47. ... def aw(): pass
  48. ... def some(): pass"""
  49. >>> module = ast.parse(code)
  50. # compute the def-use chains at module level
  51. >>> duc = beniget.DefUseChains()
  52. >>> duc.visit(module)
  53. # analysis to find parent of a node
  54. >>> ancestors = beniget.Ancestors()
  55. >>> ancestors.visit(module)
  56. # find the nice definition
  57. >>> nice = [d for d in duc.locals[module] if d.name() == "nice"][0]
  58. # walkthrough its users
  59. >>> for use in nice.users():
  60. ... # we're interested in the parent of the decorator
  61. ... parents = ancestors.parents(use.node)
  62. ... # direct parent of the decorator is the function
  63. ... fdef = parents[-1]
  64. ... print(fdef.name)
  65. aw
  66. Gather attributes of ``self``
  67. *****************************
  68. This analysis gathers all attributes of a class, by going through all methods and checking
  69. the users of the first method parameter, investigating the one used in attribute lookup.
  70. .. code:: python
  71. >>> import gast as ast
  72. >>> import beniget
  73. >>> class Attributes(ast.NodeVisitor):
  74. ...
  75. ... def __init__(self, module_node):
  76. ... # compute the def-use of the module
  77. ... self.chains = beniget.DefUseChains()
  78. ... self.chains.visit(module_node)
  79. ... self.users = set() # all users of `self`
  80. ... self.attributes = set() # attributes of current class
  81. ...
  82. ... def visit_ClassDef(self, node):
  83. ... # walk methods and fill users of `self`
  84. ... for stmt in node.body:
  85. ... if isinstance(stmt, ast.FunctionDef):
  86. ... self_def = self.chains.chains[stmt.args.args[0]]
  87. ... self.users.update(use.node for use in self_def.users())
  88. ... self.generic_visit(node)
  89. ...
  90. ... def visit_Attribute(self, node):
  91. ... # any attribute of `self` is registered
  92. ... if node.value in self.users:
  93. ... self.attributes.add(node.attr)
  94. >>> code = "class My(object):\n def __init__(self, x): self.x = x"
  95. >>> module = ast.parse(code)
  96. >>> classdef = module.body[0]
  97. >>> attr = Attributes(module)
  98. >>> attr.visit(classdef)
  99. >>> list(attr.attributes)
  100. ['x']
  101. *NOTE*: This is *not* an alias analysis, so assigning ``self`` to another variable, or
  102. setting it in a tuple is not captured by this analysis. It's still possible to write such an
  103. a analysis using def-use chains though ;-)
  104. Compute the identifiers captured by a function
  105. **********************************************
  106. In Python, inner functions (and lambdas) can capture identifiers defined in the outer scope.
  107. This analysis computes such identifiers by registering each identifier defined in the function,
  108. then walking through all loaded identifier and checking whether it's local or not.
  109. .. code:: python
  110. >>> import gast as ast
  111. >>> import beniget
  112. >>> class Capture(ast.NodeVisitor):
  113. ...
  114. ... def __init__(self, module_node):
  115. ... # initialize def-use chains
  116. ... self.chains = beniget.DefUseChains()
  117. ... self.chains.visit(module_node)
  118. ... self.users = set() # users of local definitions
  119. ... self.captured = set() # identifiers that don't belong to local users
  120. ...
  121. ... def visit_FunctionDef(self, node):
  122. ... # initialize the set of node using a local variable
  123. ... for def_ in self.chains.locals[node]:
  124. ... self.users.update(use.node for use in def_.users())
  125. ... self.generic_visit(node)
  126. ...
  127. ... def visit_Name(self, node):
  128. ... # register load of identifiers not locally definied
  129. ... if isinstance(node.ctx, ast.Load):
  130. ... if node not in self.users:
  131. ... self.captured.add(node.id)
  132. >>> code = 'def foo(x):\n def bar(): return x\n return bar'
  133. >>> module = ast.parse(code)
  134. >>> inner_function = module.body[0].body[0]
  135. >>> capture = Capture(module)
  136. >>> capture.visit(inner_function)
  137. >>> list(capture.captured)
  138. ['x']
  139. Compute the set of instructions required to compute a function
  140. **************************************************************
  141. This is actually very similar to the computation of the closure, but this time
  142. let's use the UseDef chains combined with the ancestors.
  143. .. code:: python
  144. >>> import gast as ast
  145. >>> import beniget
  146. >>> class CaptureX(ast.NodeVisitor):
  147. ...
  148. ... def __init__(self, module_node, fun):
  149. ... self.fun = fun
  150. ... # initialize use-def chains
  151. ... du = beniget.DefUseChains()
  152. ... du.visit(module_node)
  153. ... self.chains = beniget.UseDefChains(du)
  154. ... self.ancestors = beniget.Ancestors()
  155. ... self.ancestors.visit(module_node)
  156. ... self.external = list()
  157. ... self.visited_external = set()
  158. ...
  159. ... def visit_Name(self, node):
  160. ... # register load of identifiers not locally defined
  161. ... if isinstance(node.ctx, ast.Load):
  162. ... uses = self.chains.chains[node]
  163. ... for use in uses:
  164. ... try:
  165. ... parents = self.ancestors.parents(use.node)
  166. ... except KeyError:
  167. ... return # a builtin
  168. ... if self.fun not in parents:
  169. ... parent = self.ancestors.parentStmt(use.node)
  170. ... if parent not in self.visited_external:
  171. ... self.visited_external.add(parent)
  172. ... self.external.append(parent)
  173. ... self.rec(parent)
  174. ...
  175. ... def rec(self, node):
  176. ... "walk definitions to find their operands's def"
  177. ... if isinstance(node, ast.Assign):
  178. ... self.visit(node.value)
  179. ... # TODO: implement this for AugAssign etc
  180. >>> code = 'a = 1; b = [a, a]; c = len(b)\ndef foo():\n return c'
  181. >>> module = ast.parse(code)
  182. >>> function = module.body[3]
  183. >>> capturex = CaptureX(module, function)
  184. >>> capturex.visit(function)
  185. >>> # the three top level assignments have been captured!
  186. >>> list(map(type, capturex.external))
  187. [<class 'gast.gast.Assign'>, <class 'gast.gast.Assign'>, <class 'gast.gast.Assign'>]
  188. Acknowledgments
  189. ---------------
  190. Beniget is in Pierre Augier's debt, for he triggered the birth of beniget and provided
  191. countless meaningful bug reports and advices. Trugarez!