=====================
Contributing to SciPy
=====================

This document aims to give an overview of how to contribute to SciPy. It
tries to answer commonly asked questions, and provide some insight into how the
community process works in practice. Readers who are familiar with the SciPy
community and are experienced Python coders may want to jump straight to the
`git workflow`_ documentation.

There are a lot of ways you can contribute:

- Contributing new code
- Fixing bugs and other maintenance work
- Improving the documentation
- Reviewing open pull requests
- Triaging issues
- Working on the `scipy.org`_ website
- Answering questions and participating on the scipy-dev and scipy-user
  `mailing lists`_.

Contributing new code
=====================

If you have been working with the scientific Python toolstack for a while, you
probably have some code lying around of which you think "this could be useful
for others too". Perhaps it's a good idea then to contribute it to SciPy or
another open source project. The first question to ask is then, where does
this code belong? That question is hard to answer here, so we start with a
more specific one: *what code is suitable for putting into SciPy?*

Almost all of the new code added to SciPy has in common that it's potentially
useful in multiple scientific domains and it fits in the scope of existing
SciPy submodules. In principle new submodules can be added too, but this is
far less common. For code that is specific to a single application, there may
be an existing project that can use the code. Some scikits (`scikit-learn`_,
`scikit-image`_, `statsmodels`_, etc.) are good examples here; they have a
narrower focus and because of that more domain-specific code than SciPy.

Now if you have code that you would like to see included in SciPy, how do you
go about it? After checking that your code can be distributed in SciPy under a
compatible license (see the FAQ for details), the first step is to discuss it
on the scipy-dev mailing list. All new features, as well as changes to
existing code, are discussed and decided on there. You can, and probably
should, start this discussion before your code is finished.

Assuming the outcome of the discussion on the mailing list is positive and you
have a function or piece of code that does what you need it to do, what next?
Before code is added to SciPy, it at least has to have good documentation, unit
tests and correct code style.

1. Unit tests
    In principle you should aim to create unit tests that exercise all the code
    that you are adding. This gives some degree of confidence that your code
    runs correctly, including on Python versions, hardware and operating
    systems that you don't have available yourself. An extensive description
    of how to write unit tests is given in the NumPy `testing guidelines`_.

2. Documentation
    Clear and complete documentation is essential in order for users to be able
    to find and understand the code. Documentation for individual functions
    and classes -- which includes at least a basic description, type and
    meaning of all parameters and return values, and usage examples in
    `doctest`_ format -- is put in docstrings. Those docstrings can be read
    within the interpreter, and are compiled into a reference guide in html and
    pdf format. Higher-level documentation for key (areas of) functionality is
    provided in tutorial format and/or in module docstrings. A guide on how to
    write documentation is given in `how to document`_.

3. Code style
    Uniformity of style in which code is written is important to others trying
    to understand the code. SciPy follows the standard Python guidelines for
    code style, `PEP8`_. In order to check that your code conforms to PEP8,
    you can use the `pep8 package`_ style checker. Most IDEs and text editors
    have settings that can help you follow PEP8, for example by converting
    tabs to four spaces. Using `pyflakes`_ to check your code is also a good
    idea.

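As a rough illustration of the first two requirements, the sketch below pairs a function carrying a docstring in the NumPy documentation style with two small tests. The function ``relative_error`` and its behavior are hypothetical and exist only for this example. It uses the standard library only so that it runs anywhere; tests inside SciPy itself would normally rely on the helpers described in the `testing guidelines`_.

```python
import math


def relative_error(approx, exact):
    """
    Compute the relative error of an approximation.

    Parameters
    ----------
    approx : float
        The approximate value.
    exact : float
        The reference value. Must be nonzero.

    Returns
    -------
    err : float
        ``abs(approx - exact) / abs(exact)``.

    Examples
    --------
    >>> relative_error(1.5, 2.0)
    0.25

    """
    if exact == 0:
        raise ValueError("exact must be nonzero")
    return abs(approx - exact) / abs(exact)


def test_relative_error_basic():
    # Compare floats with a tolerance rather than with exact equality.
    assert math.isclose(relative_error(1.5, 2.0), 0.25, rel_tol=1e-12)


def test_relative_error_zero_reference():
    # Invalid input should raise instead of silently returning inf or nan.
    try:
        relative_error(1.0, 0.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

A test runner such as pytest collects functions whose names start with ``test_``; the ``Examples`` section of the docstring doubles as a doctest.
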
At the end of this document a checklist is given that may help to check if your
code fulfills all requirements for inclusion in SciPy.

Another question you may have is: *where exactly do I put my code*? To answer
this, it is useful to understand how the SciPy public API (application
programming interface) is defined. For most modules the API is two levels
deep, which means your new function should appear as
``scipy.submodule.my_new_func``. ``my_new_func`` can be put in an existing or
new file under ``/scipy/<submodule>/``. Its name is added to the ``__all__``
list in that file (which lists all public functions in the file), and those
public functions are then imported in ``/scipy/<submodule>/__init__.py``. Any
private functions/classes should have a leading underscore (``_``) in their
name. A more detailed description of what the public API of SciPy is, is given
in `SciPy API`_.

Once you think your code is ready for inclusion in SciPy, you can send a pull
request (PR) on GitHub. We won't go into the details of how to work with git
here; this is described well in the `git workflow`_ section of the NumPy
documentation and on the `Github help pages`_. When you send the PR for a new
feature, be sure to also mention this on the scipy-dev mailing list. This can
prompt interested people to help review your PR. Assuming that you already got
positive feedback on the general idea of your code/feature, the purpose
of the code review is to ensure that the code is correct, efficient and meets
the requirements outlined above. In many cases the code review happens
relatively quickly, but it's possible that it stalls. If you have addressed
all feedback already given, it's perfectly fine to ask on the mailing list
again for review (after a reasonable amount of time, say a couple of weeks, has
passed). Once the review is completed, the PR is merged into the "master"
branch of SciPy.

The above describes the requirements and process for adding code to SciPy. It
doesn't yet answer the question of how exactly decisions are made. The
basic answer is: decisions are made by consensus, by everyone who chooses to
participate in the discussion on the mailing list. This includes developers,
other users and yourself. Aiming for consensus in the discussion is important
-- SciPy is a project by and for the scientific Python community. In those
rare cases where agreement cannot be reached, the maintainers of the module
in question can decide the issue.

Contributing by helping maintain existing code
==============================================

The previous section talked specifically about adding new functionality to
SciPy. A large part of that discussion also applies to maintenance of existing
code. Maintenance means fixing bugs, improving code quality, documenting
existing functionality better, adding missing unit tests, keeping
build scripts up-to-date, etc. The SciPy `issue list`_ contains all
reported bugs, build/documentation issues, etc. Fixing issues
helps improve the overall quality of SciPy, and is also a good way
of getting familiar with the project. You may also want to fix a bug because
you ran into it and need the function in question to work correctly.

The discussion on code style and unit testing above applies equally to bug
fixes. It is usually best to start by writing a unit test that shows the
problem, i.e. a test that should pass but doesn't. Once you have that, you can
fix the code so that the test does pass. That should be enough to send a PR
for this issue. Unlike when adding new code, discussing this on the mailing
list may not be necessary - if the old behavior of the code is clearly
incorrect, no one will object to having it fixed. It may be necessary to add
a warning or deprecation message for the changed behavior. This should be
part of the review process.

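The test-first approach and a deprecation message can be sketched as follows. Everything here is hypothetical: ``clip_fraction``, its ``strict`` keyword and the bug described in the comments were made up for the example and do not correspond to any SciPy function.

```python
import warnings


def clip_fraction(x, lower=0.0, upper=1.0, strict=None):
    """Clip `x` to the interval [lower, upper].

    `strict` is a hypothetical keyword that no longer has any effect.
    """
    if strict is not None:
        warnings.warn("the `strict` keyword is deprecated and ignored",
                      DeprecationWarning, stacklevel=2)
    # The fix: an imagined buggy version returned `lower` for x > upper.
    return max(lower, min(x, upper))


def test_clip_above_upper():
    # Regression test: written first, it failed before the fix above.
    assert clip_fraction(1.5) == 1.0


def test_strict_is_deprecated():
    # The changed behavior should be announced with a warning.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        clip_fraction(0.5, strict=True)
    assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```
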
.. note::

   Pull requests that *only* change code style, e.g. fixing some PEP8 issues in
   a file, are discouraged. Such PRs are often not worth cluttering the git
   annotate history, and take reviewer time that may be better spent in other
   ways. Code style cleanups of code that is touched as part of a functional
   change are fine, however.

Reviewing pull requests
=======================

Reviewing open pull requests (PRs) is very welcome, and a valuable way to help
increase the speed at which the project moves forward. If you have specific
knowledge/experience in a particular area (say "optimization algorithms" or
"special functions") then reviewing PRs in that area is especially valuable -
sometimes PRs with technical code have to wait for a long time to get merged
due to a shortage of appropriate reviewers.

We encourage everyone to get involved in the review process; it's also a
great way to get familiar with the code base. Reviewers should ask
themselves some or all of the following questions:

- Was this change adequately discussed (relevant for new features and changes
  in existing behavior)?
- Is the feature scientifically sound? Algorithms may be known to work based on
  literature; otherwise, a closer look at correctness is valuable.
- Is the intended behavior clear under all conditions (e.g. unexpected inputs
  like empty arrays or nan/inf values)?
- Does the code meet the quality, test and documentation expectations outlined
  under `Contributing new code`_?

If we do not know you yet, consider introducing yourself.

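For the question about behavior under unexpected inputs, a reviewer might look for tests along the lines of the sketch below. ``safe_mean`` is a hypothetical function and the chosen edge-case behavior (raise on empty input, propagate nan and inf) is just one reasonable convention; what matters is that the behavior is documented and tested.

```python
import math


def safe_mean(values):
    """Mean of a sequence of floats (hypothetical example function).

    Documented edge cases: an empty input raises ValueError, and nan or
    inf values in the input propagate to the result.
    """
    values = list(values)
    if not values:
        raise ValueError("safe_mean requires at least one value")
    return sum(values) / len(values)


def test_empty_input_raises():
    try:
        safe_mean([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")


def test_nan_propagates():
    assert math.isnan(safe_mean([1.0, float("nan")]))


def test_inf_is_preserved():
    assert safe_mean([1.0, float("inf")]) == float("inf")
```
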
Other ways to contribute
========================

There are many ways to contribute other than contributing code.

Triaging issues (investigating bug reports for validity and possible actions to
take) is also a useful activity. SciPy has many hundreds of open issues;
closing invalid ones and correctly labeling valid ones (ideally with some first
thoughts in a comment) allows prioritizing maintenance work and finding related
issues easily when working on an existing function or submodule.

Participating in discussions on the scipy-user and scipy-dev `mailing lists`_ is
a contribution in itself. Everyone who writes to those lists with a problem or
an idea would like to get responses, and writing such responses makes the
project and community function better and appear more welcoming.

The `scipy.org`_ website contains a lot of information on both SciPy the
project and SciPy the community, and it can always use a new pair of hands.
The sources for the website live in their own separate repo:
https://github.com/scipy/scipy.org

Recommended development setup
=============================

Since SciPy contains parts written in C, C++, and Fortran that need to be
compiled before use, make sure you have the necessary compilers and Python
development headers installed. Having compiled code also means that importing
SciPy from the development sources needs some additional steps, which are
explained below.

First fork a copy of the main SciPy repository on GitHub onto your own
account and then create your local repository via::

    $ git clone git@github.com:YOURUSERNAME/scipy.git scipy
    $ cd scipy
    $ git remote add upstream https://github.com/scipy/scipy.git

To build the development version of SciPy and then run tests, spawn an
interactive shell with the Python import paths properly set up, and so on,
do one of::

    $ python runtests.py -v
    $ python runtests.py -v -s optimize
    $ python runtests.py -v -t scipy.special.tests.test_basic::test_xlogy
    $ python runtests.py --ipython
    $ python runtests.py --python somescript.py
    $ python runtests.py --bench

This builds SciPy first, so the first time it may take some time. If
you specify ``-n``, the tests are run against the version of SciPy (if
any) found on the current PYTHONPATH.

.. note::

   If you run into a build issue, more detailed build documentation can be
   found in :doc:`building/index` and at
   https://github.com/scipy/scipy/tree/master/doc/source/building

Using ``runtests.py`` is the recommended approach to running tests.
There are also a number of alternatives to it, for example an in-place
build or installing to a virtualenv. See the FAQ below for details.

Some of the tests in SciPy are very slow and need to be separately
enabled. See the FAQ below for details.

SciPy structure
===============

All SciPy modules should follow the conventions below. Here, a
*SciPy module* is defined as a Python package, say ``yyy``, that is
located in the scipy/ directory.

* Ideally, each SciPy module should be as self-contained as possible.
  That is, it should have minimal dependencies on other packages or
  modules. Even dependencies on other SciPy modules should be kept to
  a minimum. A dependency on NumPy is of course assumed.

* Directory ``yyy/`` contains:

  - A file ``setup.py`` that defines a
    ``configuration(parent_package='',top_path=None)`` function
    for ``numpy.distutils``.

  - A directory ``tests/`` that contains files ``test_<name>.py``
    corresponding to modules ``yyy/<name>{.py,.so,/}``.

* Private modules should be prefixed with an underscore ``_``,
  for instance ``yyy/_somemodule.py``.

* User-visible functions should have good documentation following
  the NumPy documentation style, see `how to document`_.

* The ``__init__.py`` of the module should contain the main reference
  documentation in its docstring. This is connected to the Sphinx
  documentation under ``doc/`` via Sphinx's automodule directive.

  The reference documentation should first give a categorized list of
  the contents of the module using ``autosummary::`` directives, and
  after that explain points essential for understanding the use of the
  module.

  Tutorial-style documentation with extensive examples should be
  separate, and put under ``doc/source/tutorial/``.

See the existing SciPy submodules for guidance.

For further details on NumPy distutils, see:

  https://github.com/numpy/numpy/blob/master/doc/DISTUTILS.rst.txt

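A minimal ``setup.py`` for a pure-Python submodule could look like the sketch below. The package name ``yyy`` is the placeholder used in this section; a submodule with compiled code would add further calls on ``config``, which are not shown. Note that ``numpy.distutils`` is not available on recent Python versions, so this is an illustration of the convention described here rather than something guaranteed to run in every environment.

```python
# Hypothetical scipy/yyy/setup.py for a pure-Python submodule named "yyy".

def configuration(parent_package='', top_path=None):
    # Imported inside the function so that merely importing this file
    # does not require numpy.distutils.
    from numpy.distutils.misc_util import Configuration

    config = Configuration('yyy', parent_package, top_path)
    # Ship the tests/ directory together with the package.
    config.add_data_dir('tests')
    return config
```

Existing submodules typically also end the file with an ``if __name__ == '__main__':`` guard that passes ``configuration(top_path='').todict()`` to ``numpy.distutils.core.setup``.
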
Useful links, FAQ, checklist
============================

Checklist before submitting a PR
--------------------------------

- Are there unit tests with good code coverage?
- Do all public functions have docstrings including examples?
- Is the code style correct (PEP8, pyflakes)?
- Is the commit message `formatted correctly`_?
- Is the new functionality tagged with ``.. versionadded:: X.Y.Z`` (with
  X.Y.Z the version number of the next release - can be found in setup.py)?
- Is the new functionality mentioned in the release notes of the next
  release?
- Is the new functionality added to the reference guide?
- In case of larger additions, is there a tutorial or more extensive
  module-level description?
- In case compiled code is added, is it integrated correctly via setup.py?
- If you are a first-time contributor, did you add yourself to THANKS.txt?
  Please note that this is perfectly normal and desirable - the aim is to
  give every single contributor credit, and if you don't add yourself it's
  simply extra work for the reviewer (or worse, the reviewer may forget).
- Did you check that the code can be distributed under a BSD license?

Useful SciPy documents
----------------------

- The `how to document`_ guidelines
- NumPy/SciPy `testing guidelines`_
- `SciPy API`_
- The `SciPy Roadmap`_
- NumPy/SciPy `git workflow`_
- How to submit a good `bug report`_

FAQ
---

*I based my code on existing Matlab/R/... code I found online, is this OK?*

It depends. SciPy is distributed under a BSD license, so if the code that you
based your code on is also BSD licensed or has a BSD-compatible license (e.g.
MIT, PSF) then it's OK. Code which is GPL or Apache licensed, has no
clear license, requires citation or is free for academic use only can't be
included in SciPy. Therefore if you copied existing code with such a license
or made a direct translation of it to Python, your code can't be included.
If you're unsure, please ask on the scipy-dev mailing list.

*Why is SciPy under the BSD license and not, say, the GPL?*

Like Python, SciPy uses a "permissive" open source license, which allows
proprietary re-use. While this allows companies to use and modify the software
without giving anything back, it is felt that the larger user base results in
more contributions overall, and companies often publish their modifications
anyway, without being required to. See John Hunter's `BSD pitch`_.

*How do I set up a development version of SciPy in parallel to a released
version that I use to do my job/research?*

One simple way to achieve this is to install the released version in
site-packages, by using a binary installer or pip for example, and set
up the development version in a virtualenv. First install
`virtualenv`_ (optionally use `virtualenvwrapper`_), then create your
virtualenv (named scipy-dev here) with::

    $ virtualenv scipy-dev

Now, whenever you want to switch to the virtual environment, you can use the
command ``source scipy-dev/bin/activate``, and ``deactivate`` to exit from the
virtual environment and return to your previous shell. With scipy-dev
activated, first install SciPy's dependencies::

    $ pip install numpy pytest Cython

After that, you can install a development version of SciPy, for example via::

    $ python setup.py install

The installation goes to the virtual environment.

*How do I set up an in-place build for development?*

For development, you can set up an in-place build so that changes made to
``.py`` files take effect without a rebuild. First, run::

    $ python setup.py build_ext -i

Then you need to point your PYTHONPATH environment variable to this directory.
Some IDEs (`Spyder`_ for example) have utilities to manage PYTHONPATH. On Linux
and OSX, you can run the command::

    $ export PYTHONPATH=$PWD

and on Windows::

    $ set PYTHONPATH=/path/to/scipy

Now editing a Python source file in SciPy allows you to immediately
test and use your changes (in ``.py`` files), by simply restarting the
interpreter.

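When a released and a development version coexist, it helps to confirm which copy the interpreter will actually pick up. The helper below is a small sketch using only the standard library; the function name ``package_location`` is made up for this example.

```python
import importlib.util
import os


def package_location(name):
    """Return the directory a top-level package would be imported from,
    or None if it cannot be found on the current import path."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(os.path.abspath(spec.origin))


if __name__ == "__main__":
    # With PYTHONPATH pointing at the source checkout, this should print a
    # path inside that checkout rather than inside site-packages.
    print(package_location("scipy"))
```
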
*Are there any video examples for installing from source, setting up a
development environment, etc...?*

Currently, there are two video demonstrations for Anaconda Python on macOS:

`Anaconda SciPy Dev Part I (macOS)`_ is a four-minute
overview of installing Anaconda, building SciPy from source, and testing
changes made to SciPy from the `Spyder`_ IDE.

`Anaconda SciPy Dev Part II (macOS)`_ shows how to use
a virtual environment to easily switch between the "pre-built version" of SciPy
installed with Anaconda and your "source-built version" of SciPy created
according to Part I.

*Are there any video examples of the basic development workflow?*

`SciPy Development Workflow`_ is a five-minute example of fixing a bug and
submitting a pull request. While it's intended as a followup to
`Anaconda SciPy Dev Part I (macOS)`_ and `Anaconda SciPy Dev Part II (macOS)`_,
the process is similar for other development setups.

*Can I use a programming language other than Python to speed up my code?*

Yes. The languages used in SciPy are Python, Cython, C, C++ and Fortran. All
of these have their pros and cons. If Python really doesn't offer enough
performance, one of those languages can be used. Important concerns when
using compiled languages are maintainability and portability. For
maintainability, Cython is clearly preferred over C/C++/Fortran. Cython and C
are more portable than C++/Fortran. A lot of the existing C and Fortran code
in SciPy is older, battle-tested code that was only wrapped in (but not
specifically written for) Python/SciPy. Therefore the basic advice is: use
Cython. If there are specific reasons why C/C++/Fortran should be preferred,
please discuss those reasons first.

*How do I debug code written in C/C++/Fortran inside SciPy?*

The easiest way to do this is to first write a Python script that
invokes the C code whose execution you want to debug. For instance
``mytest.py``::

    from scipy.special import hyp2f1
    print(hyp2f1(5.0, 1.0, -1.8, 0.95))

Now, you can run::

    gdb --args python runtests.py -g --python mytest.py

If you didn't compile with debug symbols enabled before, remove the
``build`` directory first. While in the debugger::

    (gdb) break cephes_hyp2f1
    (gdb) run

The execution will now stop at the corresponding C function and you
can step through it as usual. Instead of plain ``gdb`` you can of
course use your favourite alternative debugger; run it on the
``python`` binary with arguments ``runtests.py -g --python mytest.py``.

*How do I enable additional tests in SciPy?*

Some of the tests in SciPy's test suite are very slow and not enabled
by default. You can run the full suite via::

    $ python runtests.py -g -m full

This invokes the test suite as ``import scipy; scipy.test("full")``,
which also enables the slow tests.

There is an additional level of very slow tests (several minutes),
which remain disabled even in this case. They can be enabled by setting
the environment variable ``SCIPY_XSLOW=1`` before running the test
suite.

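If you prefer not to change the environment of your interactive shell, the variable can be set only for the child process that runs the tests. The sketch below shows one way to do that from Python; the helper names ``xslow_environment`` and ``full_test_command`` are made up for this example, and the command is the one quoted above.

```python
import os
import sys


def xslow_environment(base=None):
    """Return a copy of `base` (default: os.environ) with SCIPY_XSLOW=1."""
    env = dict(os.environ if base is None else base)
    env["SCIPY_XSLOW"] = "1"
    return env


def full_test_command():
    """Command line for the full test run quoted in the text above."""
    return [sys.executable, "runtests.py", "-g", "-m", "full"]


# Usage from the root of a SciPy checkout (not executed here):
#
#     import subprocess
#     subprocess.run(full_test_command(), env=xslow_environment(), check=True)
```
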
.. _scikit-learn: http://scikit-learn.org
.. _scikit-image: http://scikit-image.org/
.. _statsmodels: https://www.statsmodels.org/
.. _testing guidelines: https://github.com/numpy/numpy/blob/master/doc/TESTS.rst.txt
.. _formatted correctly: https://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html#writing-the-commit-message
.. _how to document: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
.. _bug report: https://scipy.org/bug-report.html
.. _PEP8: https://www.python.org/dev/peps/pep-0008/
.. _pep8 package: https://pypi.python.org/pypi/pep8
.. _pyflakes: https://pypi.python.org/pypi/pyflakes
.. _SciPy API: https://docs.scipy.org/doc/scipy/reference/api.html
.. _SciPy Roadmap: https://scipy.github.io/devdocs/roadmap.html
.. _git workflow: https://docs.scipy.org/doc/numpy/dev/gitwash/
.. _Github help pages: https://help.github.com/articles/set-up-git/
.. _issue list: https://github.com/scipy/scipy/issues
.. _Github: https://github.com/scipy/scipy
.. _scipy.org: https://scipy.org/
.. _scipy.github.com: https://scipy.github.com/
.. _scipy.org-new: https://github.com/scipy/scipy.org-new
.. _documentation wiki: https://docs.scipy.org/scipy/Front%20Page/
.. _SciPy Central: https://web.archive.org/web/20170520065729/http://central.scipy.org/
.. _doctest: https://pymotw.com/3/doctest/
.. _virtualenv: https://virtualenv.pypa.io/
.. _virtualenvwrapper: https://bitbucket.org/dhellmann/virtualenvwrapper/
.. _bsd pitch: http://nipy.sourceforge.net/nipy/stable/faq/johns_bsd_pitch.html
.. _Pytest: https://pytest.org/
.. _mailing lists: https://www.scipy.org/scipylib/mailing-lists.html
.. _Spyder: https://www.spyder-ide.org/
.. _Anaconda SciPy Dev Part I (macOS): https://youtu.be/1rPOSNd0ULI
.. _Anaconda SciPy Dev Part II (macOS): https://youtu.be/Faz29u5xIZc
.. _SciPy Development Workflow: https://youtu.be/HgU01gJbzMY