README.rst 8.5 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237
  1. Gast, Beniget!
  2. ==============
  3. Beniget is a collection of Compile-time analyse on Python Abstract Syntax Tree(AST).
  4. It's a building block to write static analyzer or compiler for Python.
  5. Beniget relies on `gast <https://pypi.org/project/gast/>`_ to provide a cross
  6. version abstraction of the AST, effectively working on both Python2 and
  7. Python3.
  8. API
  9. ---
  10. Basically Beniget provides three analyse:
  11. - ``beniget.Ancestors`` that maps each node to the list of enclosing nodes;
  12. - ``beniget.DefUseChains`` that maps each node to the list of definition points in that node;
  13. - ``beniget.UseDefChains`` that maps each node to the list of possible definition of that node.
  14. See sample usages and/or run ``pydoc beniget`` for more information :-).
  15. Sample Usages
  16. -------------
  17. Detect unused imports
  18. *********************
  19. This is a very basic usage: look for def without any use, and warn about them, focusing on imported values.
  20. .. code:: python
  21. >>> import beniget, gast as ast
  22. # parse some simple statements
  23. >>> code = "from math import cos, sin; print(cos(3))"
  24. >>> module = ast.parse(code)
  25. # compute the def-use chains at module level
  26. >>> duc = beniget.DefUseChains()
  27. >>> duc.visit(module)
  28. # grab the import statement
  29. >>> imported = module.body[0].names
  30. # inspect the users of each imported name
  31. >>> for name in imported:
  32. ... ud = duc.chains[name]
  33. ... if not ud.users():
  34. ... print("Unused import: {}".format(ud.name()))
  35. Unused import: sin
  36. *NOTE*: Due to the dynamic nature of Python, one can fool this analysis by
  37. calling the ``eval`` function, eventually through an indirection, or by performing a lookup
  38. into ``globals()``.
  39. Find all functions marked with a given decorator
  40. ************************************************
  41. Let's assume we've got a ``@nice`` decorator applied to some functions. We can traverse the users
  42. of this decorator to find which functions are decorated.
  43. .. code:: python
  44. # parse some simple statements
  45. >>> code = """
  46. ... nice = lambda x: x
  47. ... @nice
  48. ... def aw(): pass
  49. ... def some(): pass"""
  50. >>> module = ast.parse(code)
  51. # compute the def-use chains at module level
  52. >>> duc = beniget.DefUseChains()
  53. >>> duc.visit(module)
  54. # analysis to find parent of a node
  55. >>> ancestors = beniget.Ancestors()
  56. >>> ancestors.visit(module)
  57. # find the nice definition
  58. >>> nice = [d for d in duc.locals[module] if d.name() == "nice"][0]
  59. # walkthrough its users
  60. >>> for use in nice.users():
  61. ... # we're interested in the parent of the decorator
  62. ... parents = ancestors.parents(use.node)
  63. ... # direct parent of the decorator is the function
  64. ... fdef = parents[-1]
  65. ... print(fdef.name)
  66. aw
  67. Gather attributes of ``self``
  68. *****************************
  69. This analysis gathers all attributes of a class, by going through all methods and checking
  70. the users of the first method parameter, investigating the one used in attribute lookup.
  71. .. code:: python
  72. >>> import gast as ast
  73. >>> import beniget
  74. >>> class Attributes(ast.NodeVisitor):
  75. ...
  76. ... def __init__(self, module_node):
  77. ... # compute the def-use of the module
  78. ... self.chains = beniget.DefUseChains()
  79. ... self.chains.visit(module_node)
  80. ... self.users = set() # all users of `self`
  81. ... self.attributes = set() # attributes of current class
  82. ...
  83. ... def visit_ClassDef(self, node):
  84. ... # walk methods and fill users of `self`
  85. ... for stmt in node.body:
  86. ... if isinstance(stmt, ast.FunctionDef):
  87. ... self_def = self.chains.chains[stmt.args.args[0]]
  88. ... self.users.update(use.node for use in self_def.users())
  89. ... self.generic_visit(node)
  90. ...
  91. ... def visit_Attribute(self, node):
  92. ... # any attribute of `self` is registered
  93. ... if node.value in self.users:
  94. ... self.attributes.add(node.attr)
  95. >>> code = "class My(object):\n def __init__(self, x): self.x = x"
  96. >>> module = ast.parse(code)
  97. >>> classdef = module.body[0]
  98. >>> attr = Attributes(module)
  99. >>> attr.visit(classdef)
  100. >>> list(attr.attributes)
  101. ['x']
  102. *NOTE*: This is *not* an alias analysis, so assigning ``self`` to another variable, or
  103. setting it in a tuple is not captured by this analysis. It's still possible to write such an
  104. a analysis using def-use chains though ;-)
  105. Compute the identifiers captured by a function
  106. **********************************************
  107. In Python, inner functions (and lambdas) can capture identifiers defined in the outer scope.
  108. This analysis computes such identifiers by registering each identifier defined in the function,
  109. then walking through all loaded identifier and checking whether it's local or not.
  110. .. code:: python
  111. >>> import gast as ast
  112. >>> import beniget
  113. >>> class Capture(ast.NodeVisitor):
  114. ...
  115. ... def __init__(self, module_node):
  116. ... # initialize def-use chains
  117. ... self.chains = beniget.DefUseChains()
  118. ... self.chains.visit(module_node)
  119. ... self.users = set() # users of local definitions
  120. ... self.captured = set() # identifiers that don't belong to local users
  121. ...
  122. ... def visit_FunctionDef(self, node):
  123. ... # initialize the set of node using a local variable
  124. ... for def_ in self.chains.locals[node]:
  125. ... self.users.update(use.node for use in def_.users())
  126. ... self.generic_visit(node)
  127. ...
  128. ... def visit_Name(self, node):
  129. ... # register load of identifiers not locally definied
  130. ... if isinstance(node.ctx, ast.Load):
  131. ... if node not in self.users:
  132. ... self.captured.add(node.id)
  133. >>> code = 'def foo(x):\n def bar(): return x\n return bar'
  134. >>> module = ast.parse(code)
  135. >>> inner_function = module.body[0].body[0]
  136. >>> capture = Capture(module)
  137. >>> capture.visit(inner_function)
  138. >>> list(capture.captured)
  139. ['x']
  140. Compute the set of instructions required to compute a function
  141. **************************************************************
  142. This is actually very similar to the computation of the closure, but this time
  143. let's use the UseDef chains combined with the ancestors.
  144. .. code:: python
  145. >>> import gast as ast
  146. >>> import beniget
  147. >>> class CaptureX(ast.NodeVisitor):
  148. ...
  149. ... def __init__(self, module_node, fun):
  150. ... self.fun = fun
  151. ... # initialize use-def chains
  152. ... du = beniget.DefUseChains()
  153. ... du.visit(module_node)
  154. ... self.chains = beniget.UseDefChains(du)
  155. ... self.ancestors = beniget.Ancestors()
  156. ... self.ancestors.visit(module_node)
  157. ... self.external = list()
  158. ... self.visited_external = set()
  159. ...
  160. ... def visit_Name(self, node):
  161. ... # register load of identifiers not locally defined
  162. ... if isinstance(node.ctx, ast.Load):
  163. ... uses = self.chains.chains[node]
  164. ... for use in uses:
  165. ... try:
  166. ... parents = self.ancestors.parents(use.node)
  167. ... except KeyError:
  168. ... return # a builtin
  169. ... if self.fun not in parents:
  170. ... parent = self.ancestors.parentStmt(use.node)
  171. ... if parent not in self.visited_external:
  172. ... self.visited_external.add(parent)
  173. ... self.external.append(parent)
  174. ... self.rec(parent)
  175. ...
  176. ... def rec(self, node):
  177. ... "walk definitions to find their operands's def"
  178. ... if isinstance(node, ast.Assign):
  179. ... self.visit(node.value)
  180. ... # TODO: implement this for AugAssign etc
  181. >>> code = 'a = 1; b = [a, a]; c = len(b)\ndef foo():\n return c'
  182. >>> module = ast.parse(code)
  183. >>> function = module.body[3]
  184. >>> capturex = CaptureX(module, function)
  185. >>> capturex.visit(function)
  186. >>> # the three top level assignments have been captured!
  187. >>> list(map(type, capturex.external))
  188. [<class 'gast.gast.Assign'>, <class 'gast.gast.Assign'>, <class 'gast.gast.Assign'>]
  189. Acknowledgments
  190. ---------------
  191. Beniget is in Pierre Augier's debt, for he triggered the birth of beniget and provided
  192. countless meaningful bug reports and advices. Trugarez!