README.rst 17 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416
  1. The ordereddict module in short
  2. -------------------------------
  3. This is an implementation of an ordered dictionary with Key Insertion
  4. Order (KIO: updates of values do not affect the position of the key),
  5. Key Value Insertion Order (KVIO, an existing key's position is removed
  6. and put at the back). The standard library module OrderedDict, implemented
  7. later, implements a subset of ``ordereddict`` functionality.
  8. Sorted dictionaries are also provided. Currently only with Key Sorted
  9. Order (KSO, no sorting function can be specified, but you can specify a
  10. transform to apply on the key before comparison (e.g. string.lower)).
  11. This package is hosted on BitBucket and installable from PyPI::
  12. pip install ruamel.ordereddict
  13. For Windows there are 32 and 64 bit installable wheels available.
  14. Usage::
  15. from ruamel.ordereddict import ordereddict
  16. kio = ordereddict()
  17. kvio = ordereddict(kvio=True)
  18. # without relax unordered initalisation is not allowed
  19. d = ordereddict({'a':1, 'b': 2}, relax=True)
  20. sd = sorteddict({'a':1, 'b': 2}) # sorteddict is always relaxed
  21. **please note that starting with 0.4.6 you should not import _ordereddict
  22. directly**
  23. This module has been tested under:
  24. ============= ========================= ==========
  25. OS compiler Python
  26. Linux Mint 17 gcc 4.8.4 2.7.13
  27. Windows Visual Studio 2010 2.7.13-32
  28. Windows Visual Studio 2010 2.7.13-64
  29. ============= ========================= ==========
  30. Older versions of this module has been tested under
  31. and I expect those to still work:
  32. ============= ======================== =========
  33. OS compiler Python
  34. Windows XP-64 Visual Studio 2010 2.7.10-32
  35. Windows XP-64 Visual Studio 2010 2.7.10-64
  36. Windows XP-64 Visual Studio 2008 2.6.9-32
  37. Windows XP-64 Visual Studio 2008 2.6.9-64
  38. Linux Mint 17 gcc 4.8.2 2.6.9
  39. Ubuntu 12.04 gcc 4.7.2 2.7.6
  40. Ubuntu 12.04 gcc 4.7.2 2.6.8
  41. Ubuntu 8.04 gcc 4.2.4 2.7.6
  42. Ubuntu 8.04 gcc 4.2.4 2.5.2
  43. Windows XP Visual C++ 2008 Express 2.7.6
  44. Windows 7 64 Windows SDK for Win7 SP1 2.7.6
  45. Ubuntu 12.04 gcc 4.6.3 2.7.3
  46. Ubuntu 8.04 gcc 4.2.4 2.6.4
  47. Ubuntu 8.04 gcc 4.2.4 2.5.2
  48. Ubuntu 8.10 gcc 4.3.2 2.5.4
  49. Ubuntu 8.10 gcc 4.3.2 2.4.6
  50. Ubuntu 7.04 gcc 4.1.2 2.5.1
  51. Ubuntu 7.04 gcc 4.1.2 2.4.4
  52. Ubuntu 6.06 gcc 2.5.1
  53. Windows XP Visual Studio 2003 2.5.1
  54. Windows XP Visual C++ 2008 Express 2.6.5
  55. Windows MingGW 4.7.0 2.7.3
  56. Solaris 10 GCC 4.4.x 2.7.3
  57. ============= ======================== =========
  58. Version 0.4.1 was tested and found working on SuSE Linux Enterprise Server
  59. (GCC 4.1.0 and Intel C/C++ 10.1) by Stuart Stock.
  60. MingGW and Solaris were tested and reported to work by Wladimir with version
  61. 0.4.5
  62. Home
  63. ----------------------------
  64. https://bitbucket.org/ruamel/ordereddict is ordereddict's home on the web.
  65. Clone the repository there if you want to work from the source.
  66. http://www.xs4all.nl/~anthon/Python/ordereddict used to be
  67. ordereddict's home on the web.
  68. There you can still find the links for downloading the older version (0.4.5).
  69. Installation
  70. ------------
  71. .. comment: To install the package you can use::
  72. pip install ruamel.ordereddict
  73. You can clone and checkout the sources, and then run::
  74. python setup.py install
  75. Bugreporting
  76. ------------
  77. If you find any problems, please let me know, but also realise that I
  78. have a spamfilter that catches over 100 emails a day and yours might
  79. get in there unnoticed. So if there is no response within a few days
  80. please try again.
  81. Functionality
  82. -------------
  83. ordereddict has all of the functionality of dict() except that there
  84. is no keyword based initialisation and that you cannot pass a normal
  85. dict to the initialisation of the basic ordereddict (however see the
  86. relax-ed keyword below). sorteddict cannot be initialised from keywords
  87. either, but can be initialised from normal dict (ie. they are always
  88. relaxed).
  89. As you probably would expect .keys(), .values(), .items(),
  90. .iterkeys(), itervalues(), iteritems() and "for i in some_ordereddict"
  91. have elements ordered based on the key insertion order (or key value
  92. insertion order if kvio is specified, or sort order for sorteddict).
  93. ordered/sorteddicts can be pickled.
  94. Some methods have been slightly changed:
  95. - initialisation of ordereddict takes keywords:
  96. - kvio: if set to True, then move an existing key on update
  97. - relax: if set to True, the ordereddict is relaxed for its life regarding
  98. initialisation and/or update from unordered data (read a normal dict).
  99. - initialisation of sorteddict takes keyword:
  100. - key: specifies a function to apply on key (e.g. string.lower)
  101. - .popitem() takes an optional argument (defaulting to -1) indicating which
  102. key/value pair to return (by default the last one available)
  103. - .dict()/.values()/.items()/.iterdict()/.itervalues()/.iteritems()
  104. all take an optional reverse (default False) parameter that gives
  105. the list reversed order resp. iterates in reverse
  106. (the non-iterator can also be done relatively efficient with e.g.
  107. od.dict().reverse() )
  108. - .update(): takes an optional relax=True which allows one time
  109. ordereddict update from normal dictionaries regardless of
  110. initialisation time relax setting.
  111. In addition to that ordereddict and sorteddict have some extra methods:
  112. - .index(key) - gives an integer value that is the index of the key
  113. - .setkeys()/.setvalues()/.setitems(), work like those in the Larosa/Foord
  114. implementation, although they might throw different exceptions:
  115. - setvalues' argument must be an itereable that returns the same number of
  116. items as the length of the ordereddict
  117. - setitems' argument is free in length, it performs a clear and adds
  118. the items in order.
  119. - slice retrieval for all
  120. and ordereddict only also has:
  121. - .setkeys(), works like the one in the Larosa/Foord
  122. implementation. Argument must be an itereable returning a permutation of the
  123. existing keys ( that implies having the same length as the ordereddict)
  124. - .reverse() - reverses the keys in place
  125. - .insert(position, key, value) - this will put a key at a particular position
  126. so that afterwards .index(key) == position, if the key was already there
  127. the original position (and value) is lost to the new position. This often
  128. means moving keys to new positions!
  129. - slice deletion/assignment:
  130. - stepped deletion could be optimized a bit (individual items are deleted
  131. which can require memmoving multiple items)
  132. - assignment only from OrderedDict (with the same length as the slice). This
  133. could also be optimised as I first delete, then insert individual items.
  134. If the assigned items contain keys that are still there after the deletion
  135. 'phase' then retrieving that slice does not always give the original
  136. assigned ordereddict (depending on the position of the items
  137. with those keys in either ordereddict)
  138. - .rename(oldkey, newkey) renames a key, but keeps the items position and value
  139. The new OrderedDict in the standard collections module
  140. ------------------------------------------------------
  141. With Python 3.1 and backported to 2.7 there is an OrderedDict class
  142. available in the collections modules. Raymond Hettinger indicated in
  143. 2009 at EuroPython that he preferred to start from a minimal
  144. OrderedDict instead of using the Larosa/Foord
  145. implementation. Unfortunately the available tests (for the
  146. functionality that the simple collections.OrderedDict supports) were
  147. not used either resulting in preventable bugs like repr initially not
  148. working on recursive OrderedDicts.
  149. ordereddict (and the Larosa/Foord implementation) is essentially
  150. a superset of collections.OrderedDict, but there are a few
  151. differences:
  152. - OrderedDict is by default relax-ed.
  153. - repr of recursive OrderedDict does not give any indication of the
  154. value of the recursive key, as it only displays `...`. ordereddict
  155. displays `ordereddict([...])` as value. Just using the dots like
  156. OrderedDict does is going to be ambiguous as soon as you have two different
  157. types A and B and nest A in B in A or B in B in A.
  158. - some newer build-in functions available in OrderedDict are not
  159. available in ordereddict ( __reversed__, viewkeys, viewvalues, viewitems).
  160. All of the differences can be straightened out in small (70 lines of
  161. Python) OrderedDict wrapper around ordereddict. With this wrapper the
  162. OrderedDict tests in the standard test_collections.py all pass.
  163. Testing
  164. -------
  165. testordereddict.py in the test subdirectory has been used to test the module.
  166. You can use::
  167. python testordereddict
  168. to run the tests (py.test support has been dropped as newer versions
  169. of py.test were not compatible).
  170. There is a somewhat patched copy of the python lib/Test dictionary testing
  171. routines included as well, it fails on the _update test however
  172. because the default is not to use a relaxed ordereddict.
  173. You can run it with::
  174. cd test/unit
  175. python test_dict.py
  176. To Do
  177. -----
  178. - implement Value Sorted Order (VSO: specify value=True for normal
  179. value comparison), or a value rewrite function for VSO ( e.g.
  180. value=string.lower )
  181. - implement Item Sorted Order (ISO): compare value then key ( the other way
  182. around would not make sense with unique keys, but we might have
  183. non-unique values).
  184. - implement slice deletion for sorteddict
  185. - more testing of sorteddict functionality
  186. - speedtest slices
  187. - speedtest sorteddict
  188. - check on the test_update unittest in test_dict.py
  189. To Consider
  190. -----------
  191. - comparing ordereddicts (as per Larosa/Foord)
  192. - implement the whole (optionally) using pointers in the DictObject Items
  193. (Faster on insertion/deletion, slower on accessing slices, makes
  194. implementing algorithms somewhat more difficult), would have to seperate
  195. code for sorteddict as key position determination would be much slower.
  196. - supply a pure Python implementation of exactly the functionality in
  197. ordereddict
  198. - test on older versions (< 2.4) of Python and make portable (if this can
  199. be done without too much clutter) or port.
  200. - test on the Mac
  201. - optimise searching for an item pointer for sorteddict with binary search
  202. (for deletion)
  203. Background information
  204. ----------------------
  205. ordereddict is directly derived from Python's own dictobject.c file.
  206. The extensions and the representation of ordereddicts() are based
  207. on Larosa/Foord's excellent pure Python OrderedDict() module
  208. (http://www.voidspace.org.uk/python/odict.html).
  209. The implemenation adds a vector of pointers to elements to the basic
  210. dictionary structure and keeps this vector compact (and in order) so
  211. indexing is fast. The elements do not know about their position (so
  212. nothing needs to be updated there if that position changes, but then
  213. finding an item's index is expensive. Insertion/deletion is also relatively
  214. expensive in that on average half of the vector of pointers needs to
  215. be memmove-d one position.
  216. There is also a long value for bit info like kvio, relaxed.
  217. The sorteddict structure has an additional 3 pointers of which only
  218. one (sd_key) is currently used (the others are sd_cmp and sd_value).
  219. Speed
  220. -----
  221. Based on some tests with best of 10 iterations of 10000 iterations of various
  222. functions under Ubuntu 7.10 (see test/timeordereddict.py and test/ta.py)::
  223. Results in seconds:
  224. ------------------------------- dict ordereddict Larosa/Ford collections
  225. OrderedDict OrderedDict
  226. empty 0.023 0.025 0.023 0.024
  227. create_empty 0.028 0.031 0.147 0.329
  228. create_five_entry 0.037 0.042 0.384 0.558
  229. create_26_entry 0.187 0.203 1.494 1.602
  230. create_676_entry 5.330 5.574 36.797 34.810
  231. get_keys_from_26_entry 0.209 0.231 1.501 1.762
  232. pop_5_items_26_entry 0.219 0.247 1.952 1.864
  233. pop_26_items_676_entry 7.550 8.127 46.578 41.851
  234. popitem_last_26_entry 0.203 0.225 1.624 1.734
  235. popitem_last_676_entry 5.285 5.534 36.912 34.799
  236. popitem_100_676_entry -------- 5.552 36.577 --------
  237. walk_26_iteritems -------- 0.494 2.792 2.238
  238. ------------------------------- dict ordereddict Larosa/Ford collections
  239. OrderedDict OrderedDict
  240. empty 0.930 1.000 0.950 0.966
  241. create_empty 0.909 1.000 4.728 10.594
  242. create_five_entry 0.892 1.000 9.201 13.374
  243. create_26_entry 0.923 1.000 7.368 7.901
  244. create_676_entry 0.956 1.000 6.601 6.245
  245. get_keys_from_26_entry 0.908 1.000 6.508 7.641
  246. pop_5_items_26_entry 0.888 1.000 7.916 7.559
  247. pop_26_items_676_entry 0.929 1.000 5.732 5.150
  248. popitem_last_26_entry 0.901 1.000 7.222 7.712
  249. popitem_last_676_entry 0.955 1.000 6.670 6.288
  250. popitem_100_676_entry -------- 1.000 6.588 --------
  251. walk_26_iteritems -------- 1.000 5.653 4.532
  252. Why
  253. ---
  254. Because I am orderly ;-O, and because I use dictionaries to
  255. store key/value information read from some text file quite often.
  256. Unfortunately comparing those files with diff when written from
  257. normal dictionaries often obfucates changes because of the reordering
  258. of lines when key/value pairs are added and then written.
  259. I have special routine for YAML files that takes lines like::
  260. - key1: val1
  261. - key2: val3
  262. - key3:
  263. - val3a
  264. - val3b
  265. (i.e. a list of key-value pairs) directly to a single ordered dictionary
  266. and back. (I find it kind of strange to finally have a structured,
  267. human readeable, format that does not try to preserve the
  268. order of key-value pairs so that comparing files is difficult with
  269. 'standard' text tools).
  270. Older versions
  271. --------------
  272. http://www.xs4all.nl/~anthon/Python/ordereddict used to be
  273. ordereddict's home on the web.
  274. There you can still find the links for downloading the older version (0.4.5).
  275. History
  276. -------
  277. ``0.4.13``: 2017-0723
  278. -
  279. | ``0.4.9 2015-08-10``
  280. | typos fixed by Gianfranco Costamagna
  281. |
  282. | ``0.4.8 2015-05-31``
  283. | dependent on ruamel.base
  284. | version number in a single place
  285. | using py.test under tox
  286. | generate wheel for 32/64bit py26/py27 on windows
  287. |
  288. | ``0.4.6 2014-01-18``
  289. | Move to ruamel namespace, hosted on bitbucket, MIT License
  290. | Testing with tox
  291. |
  292. | ``0.4.5 2012-06-17``
  293. | Fix for a bug while inserting last item again beyond last position (reported
  294. | by Volkan Çetin / volki tolki ( cetinv at gmail.com )
  295. | Fix for repeated deletion and insertion fail. Found by and solution provided
  296. | by Darren Dowker (including tests). Also found by Fabio Zadronzy (including
  297. | a less elegant fix).
  298. | applied reindent to .py and astyle to .c files
  299. |
  300. | ``0.4.3 2009-05-11``
  301. | Fix for a bug in slicing SortedDicts.
  302. | Found by, and fix provided by, Migel Anguel (linos.es)
  303. |
  304. | ``0.4.2 2009-03-27``
  305. | Bug found and by Alexandre Andrade and Fabio Zadrozny in
  306. | doing deepcopy
  307. |
  308. | ``0.4.1 2007-11-06``
  309. | Bug found and fixed by Fabio Zadrozny on resizing dictionaries
  310. |
  311. | ``0.4 2007-10-30``
  312. | added pickling, added relaxed initialisation/update (from unordered dicts)
  313. | added KVIO (Key Value Insertion Order ie. key moves to back on update)
  314. | implemented sorteddict, with KSO, Key Sorted Order. You can specify
  315. | a function for key transformation before comparison (such as string.lower)
  316. | sorteddict does not have all of the ordereddict methods as not all make
  317. | sense (eg. slice assignment, rename, setkeys)
  318. |
  319. | ``0.3 2007-10-24``
  320. | added setkeys/setvalues/setitems; slice retrieval, deletion, assignment
  321. | .rename(oldkey, newkey) rename a key keeping same value and position
  322. | .index() of non-existing key now returns ValueError instead of SystemError
  323. | Changed the module name to _ordereddict (from ordereddict), as Jason
  324. | Kirstland probably rightfully suggested that any private implementation
  325. | likely has the (file)name ordereddict.py. A modulename with leading
  326. | underscore seams more common for extension modules anyway.
  327. |
  328. | ``0.2a 2007-10-16``
  329. | Solved the potential GC problem on Windows
  330. |
  331. | ``0.2 2007-10-16``
  332. | First release, with some tests, and possible still a GC problem
  333. | with Windows.
  334. |
  335. | ``0.1 2007-10-..``
  336. | This version was never released. While testing it I was far in writing
  337. | an email to comp.lang.python about why timing with timeit did seem to
  338. | be memory hungry ....
  339. | and then I realised ordereddict had a memory leak %-)