olefile (formerly OleFileIO\_PL)
================================

`olefile <http://www.decalage.info/olefile>`_ is a Python package to
parse, read and write `Microsoft OLE2
files <http://en.wikipedia.org/wiki/Compound_File_Binary_Format>`_ (also
called Structured Storage, Compound File Binary Format or Compound
Document File Format), such as Microsoft Office 97-2003 documents,
vbaProject.bin in MS Office 2007+ files, Image Composer and FlashPix
files, Outlook messages, StickyNotes, several microscopy file formats,
McAfee antivirus quarantine files, etc.

**Quick links:** `Home page <http://www.decalage.info/olefile>`_ -
`Download/Install <https://bitbucket.org/decalage/olefileio_pl/wiki/Install>`_
- `Documentation <https://bitbucket.org/decalage/olefileio_pl/wiki>`_ -
`Report
Issues/Suggestions/Questions <https://bitbucket.org/decalage/olefileio_pl/issues?status=new&status=open>`_
- `Contact the author <http://decalage.info/contact>`_ -
`Repository <https://bitbucket.org/decalage/olefileio_pl>`_ - `Updates
on Twitter <https://twitter.com/decalage2>`_


News
----

Follow all updates and news on Twitter: https://twitter.com/decalage2

- **2015-01-25 v0.42**: improved handling of special characters in
  stream/storage names on Python 2.x (using UTF-8 instead of Latin-1),
  fixed bug in listdir with empty storages.
- 2014-11-25 v0.41: OleFileIO.open and isOleFile now support OLE files
  stored in byte strings, fixed installer for Python 3, added support
  for Jython (Niko Ehrenfeuchter)
- 2014-10-01 v0.40: renamed OleFileIO\_PL to olefile, added initial
  write support for streams >4K, updated doc and license, improved the
  setup script.
- 2014-07-27 v0.31: fixed support for large files with 4K sectors,
  thanks to Niko Ehrenfeuchter, Martijn Berger and Dave Jones. Added
  test scripts from Pillow (by hugovk). Fixed setup for Python 3
  (Martin Panter)
- 2014-02-04 v0.30: now compatible with Python 3.x, thanks to Martin
  Panter who did most of the hard work.
- 2013-07-24 v0.26: added methods to parse stream/storage timestamps,
  improved listdir to include storages, fixed parsing of direntry
  timestamps
- 2013-05-27 v0.25: improved metadata extraction, properties parsing
  and exception handling, fixed `issue
  #12 <https://bitbucket.org/decalage/olefileio_pl/issue/12/error-when-converting-timestamps-in-ole>`_
- 2013-05-07 v0.24: new features to extract metadata (get\_metadata
  method and OleMetadata class), improved getproperties to convert
  timestamps to Python datetime
- 2012-10-09: published
  `python-oletools <http://www.decalage.info/python/oletools>`_, a
  package of analysis tools based on OleFileIO\_PL
- 2012-09-11 v0.23: added support for file-like objects, fixed `issue
  #8 <https://bitbucket.org/decalage/olefileio_pl/issue/8/bug-with-file-object>`_
- 2012-02-17 v0.22: fixed issues #7 (bug in getproperties) and #2
  (added close method)
- 2011-10-20: code hosted on Bitbucket to ease contributions and bug
  tracking
- 2010-01-24 v0.21: fixed support for big-endian CPUs, such as PowerPC
  Macs.
- 2009-12-11 v0.20: small bugfix in OleFileIO.open when filename is not
  plain str.
- 2009-12-10 v0.19: fixed support for 64-bit platforms (thanks to Ben
  G. and Martijn for reporting the bug)
- See the changelog in the source code for more information.


Download/Install
----------------

If you have pip or setuptools installed (pip is included in Python
2.7.9+), you may simply run **pip install olefile** or **easy\_install
olefile** for the first installation.

To update olefile, run **pip install -U olefile**.

Otherwise, see https://bitbucket.org/decalage/olefileio_pl/wiki/Install

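
To confirm that the installation worked, a minimal sketch such as the
following can be run from Python. The helper name
``installed_olefile_version`` is only illustrative, and the code assumes
the package exposes a ``__version__`` attribute (it falls back to
"unknown" if it does not).

```python
def installed_olefile_version():
    """Return the installed olefile version string, or None if it is missing."""
    try:
        import olefile
    except ImportError:
        return None
    # __version__ is assumed to be present; fall back gracefully otherwise.
    return getattr(olefile, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_olefile_version()
    if version is None:
        print("olefile is not installed; try: pip install olefile")
    else:
        print("olefile %s is installed" % version)
```
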

Features
--------

- Parse, read and write any OLE file such as Microsoft Office 97-2003
  legacy document formats (Word .doc, Excel .xls, PowerPoint .ppt,
  Visio .vsd, Project .mpp), Image Composer and FlashPix files, Outlook
  messages, StickyNotes, Zeiss AxioVision ZVI files, Olympus FluoView
  OIB files, etc.
- List all the streams and storages contained in an OLE file
- Open streams as files
- Parse and read property streams, containing metadata of the file
- Portable, pure Python module, no dependencies

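
The features listed above can be sketched roughly as follows. This is an
illustration, not a reference: the helper names (``list_stream_paths``,
``read_stream``, ``describe``) are made up for this example, while
``isOleFile``, ``OleFileIO``, ``listdir``, ``exists``, ``openstream``,
``get_metadata`` and ``close`` are the olefile calls they rely on. See the
API page for the authoritative description.

```python
def list_stream_paths(ole):
    """Return every stream of an open OLE container as a slash-joined path."""
    # listdir() yields each entry as a list of path components,
    # e.g. ['Macros', 'VBA', 'dir'].
    return ["/".join(entry) for entry in ole.listdir()]


def read_stream(ole, path):
    """Return the raw bytes of one stream, or None if it does not exist."""
    if not ole.exists(path):
        return None
    stream = ole.openstream(path)  # behaves like a read-only file
    return stream.read()


def describe(filename):
    """Open an OLE file, print its streams and a little metadata."""
    import olefile  # imported here so the helpers above work without it

    if not olefile.isOleFile(filename):
        raise ValueError("%s is not an OLE2 file" % filename)
    ole = olefile.OleFileIO(filename)
    try:
        for path in list_stream_paths(ole):
            print(path)
        meta = ole.get_metadata()  # parsed from the property streams
        print("author: %r" % (meta.author,))
        print("title: %r" % (meta.title,))
    finally:
        ole.close()
```

For example, ``describe("report.doc")`` (a hypothetical file name) would
print one line per stream, followed by the author and title stored in the
document's summary information, if any.
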

olefile can be used as an independent package or with PIL/Pillow.

olefile is mostly meant for developers. If you are looking for tools to
analyze OLE files or to extract data (especially for security purposes
such as malware analysis and forensics), then please also check my
`python-oletools <http://www.decalage.info/python/oletools>`_, which are
built upon olefile and provide a higher-level interface.


History
-------

olefile is based on the OleFileIO module from
`PIL <http://www.pythonware.com/products/pil/index.htm>`_, the excellent
Python Imaging Library, created and maintained by Fredrik Lundh. The
olefile API is still compatible with PIL, but since 2005 I have improved
the internal implementation significantly, with new features, bugfixes
and a more robust design. From 2005 to 2014 the project was called
OleFileIO\_PL, and in 2014 I changed its name to olefile to celebrate
its 9 years and its new write features.

As far as I know, olefile is the most complete and robust Python
implementation to read MS OLE2 files, portable on several operating
systems. (Please tell me if you know of other similar Python modules.)

Since 2014 olefile/OleFileIO\_PL has been integrated into
`Pillow <http://python-imaging.github.io/>`_, the friendly fork of PIL.
olefile will continue to be improved as a separate project, and new
versions will be merged into Pillow regularly.


Main improvements over the original version of OleFileIO in PIL:
----------------------------------------------------------------

- Compatible with Python 3.x and 2.6+
- Many bug fixes
- Support for files larger than 6.8MB
- Support for 64-bit platforms and big-endian CPUs
- Robust: many checks to detect malformed files
- Runtime option to choose if malformed files should be parsed or raise
  exceptions
- Improved API
- Metadata extraction, stream/storage timestamps (e.g. for document
  forensics)
- Can open file-like objects
- Added setup.py and install.bat to ease installation
- More convenient slash-based syntax for stream paths
- Write features

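
A few of these points (file-like objects, byte strings, and the runtime
option for malformed files) can be illustrated with the sketch below. The
helper names ``has_ole_signature`` and ``open_ole`` are invented for this
example, and ``has_ole_signature`` is a simplified stand-in, not olefile's
own ``isOleFile``. ``open_ole`` assumes the ``raise_defects`` parameter and
the ``DEFECT_INCORRECT`` / ``DEFECT_FATAL`` constants as found in olefile
0.42; check the API page before relying on them.

```python
import io

# Signature found in the first 8 bytes of an OLE2 file.
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def has_ole_signature(source):
    """Check the OLE2 signature of file *data*: bytes or a seekable stream.

    This takes data, not a file name.
    """
    if isinstance(source, bytes):
        return source[:8] == OLE_MAGIC
    start = source.tell()
    header = source.read(8)
    source.seek(start)  # leave the stream where we found it
    return header == OLE_MAGIC


def open_ole(source, strict=False):
    """Open a path, byte string or file-like object with olefile.

    With strict=True, malformed structures raise exceptions instead of
    being tolerated (assumed olefile 0.42 parameter and constants).
    """
    import olefile  # imported lazily so has_ole_signature works without it

    level = olefile.DEFECT_INCORRECT if strict else olefile.DEFECT_FATAL
    return olefile.OleFileIO(source, raise_defects=level)


if __name__ == "__main__":
    data = OLE_MAGIC + b"\x00" * 504
    print(has_ole_signature(data))
    print(has_ole_signature(io.BytesIO(data)))
```

With the slash-based syntax, a nested stream can be addressed as
``ole.openstream('Macros/VBA/dir')`` instead of the list form
``ole.openstream(['Macros', 'VBA', 'dir'])``.
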

Documentation
-------------

Please see the `online
documentation <https://bitbucket.org/decalage/olefileio_pl/wiki>`_ for
more information, especially the `OLE
overview <https://bitbucket.org/decalage/olefileio_pl/wiki/OLE_Overview>`_
and the `API
page <https://bitbucket.org/decalage/olefileio_pl/wiki/API>`_ which
describe how to use olefile in Python applications. A copy of the same
documentation is also provided in the doc subfolder of the olefile
package.


Real-life examples
------------------

A real-life example: `using OleFileIO\_PL for malware analysis and
forensics <http://blog.gregback.net/2011/03/using-remnux-for-forensic-puzzle-6/>`_.

See also `this
paper <https://computer-forensics.sans.org/community/papers/gcfa/grow-forensic-tools-taxonomy-python-libraries-helpful-forensic-analysis_6879>`_
about Python tools for forensics, which features olefile.


License
-------

olefile (formerly OleFileIO\_PL) is copyright (c) 2005-2015 Philippe
Lagadec (`http://www.decalage.info <http://www.decalage.info>`_)

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

- Redistributions of source code must retain the above copyright
  notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


--------------

olefile is based on source code from the OleFileIO module of the Python
Imaging Library (PIL) published by Fredrik Lundh under the following
license:

The Python Imaging Library (PIL) is

- Copyright (c) 1997-2005 by Secret Labs AB
- Copyright (c) 1995-2005 by Fredrik Lundh

By obtaining, using, and/or copying this software and/or its associated
documentation, you agree that you have read, understood, and will comply
with the following terms and conditions:

Permission to use, copy, modify, and distribute this software and its
associated documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appears in all copies,
and that both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Secret Labs AB or the
author not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior permission.

SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR
ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.