  1. Network Working Group A. Gulbrandsen
  2. Request for Comments: 4978 Oryx Mail Systems GmbH
  3. Category: Standards Track August 2007
  4. The IMAP COMPRESS Extension
  5. Status of this Memo
  6. This document specifies an Internet standards track protocol for the
  7. Internet community, and requests discussion and suggestions for
  8. improvements. Please refer to the current edition of the "Internet
  9. Official Protocol Standards" (STD 1) for the standardization state
  10. and status of this protocol. Distribution of this memo is unlimited.
  11. Abstract
  12. The COMPRESS extension allows an IMAP connection to be effectively
  13. and efficiently compressed.
  14. Table of Contents
  15. 1. Introduction and Overview
  16. 2. Conventions Used in This Document
  17. 3. The COMPRESS Command
  18. 4. Compression Efficiency
  19. 5. Formal Syntax
  20. 6. Security Considerations
  21. 7. IANA Considerations
  22. 8. Acknowledgements
  23. 9. References
  24. 9.1. Normative References
  25. 9.2. Informative References
  26. Gulbrandsen Standards Track [Page 1]
  27. RFC 4978 The IMAP COMPRESS Extension August 2007
  28. 1. Introduction and Overview
  29. A server which supports the COMPRESS extension indicates this with
  30. one or more capability names consisting of "COMPRESS=" followed by a
  31. supported compression algorithm name as described in this document.
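As a non-normative illustration, a client might collect the advertised algorithms from a capability list as sketched below. The function name compress_algorithms is an assumption of this sketch and is not defined by this document.

```python
def compress_algorithms(capability_line):
    """Return the algorithm names advertised as COMPRESS=<name>, upper-cased."""
    prefix = "COMPRESS="
    algorithms = []
    # Brackets are turned into spaces so that "[CAPABILITY ...]" response
    # codes and plain "* CAPABILITY ..." lines are handled the same way.
    for token in capability_line.replace("[", " ").replace("]", " ").split():
        if token.upper().startswith(prefix) and len(token) > len(prefix):
            algorithms.append(token[len(prefix):].upper())
    return algorithms


greeting = "* OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]"
print(compress_algorithms(greeting))
```

A server may advertise several COMPRESS capabilities, so the sketch returns a list rather than a single name.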
  32. The goal of COMPRESS is to reduce the bandwidth usage of IMAP.
  33. Compared to PPP compression (see [RFC1962]) and modem-based
  34. compression (see [MNP] and [V42BIS]), COMPRESS offers much better
  35. compression efficiency. COMPRESS can be used together with Transport
  36. Layer Security (TLS) [RFC4346], Simple Authentication and Security
  37. Layer (SASL) encryption, Virtual Private Networks (VPNs), etc.
  38. Compared to TLS compression [RFC3749], COMPRESS has the following
  39. (dis)advantages:
  40. - COMPRESS can be implemented easily both by IMAP servers and
  41. clients.
  42. - IMAP COMPRESS benefits from an intimate knowledge of the IMAP
  43. protocol's state machine, allowing for dynamic and aggressive
  44. optimization of the underlying compression algorithm's parameters.
  45. - When the TLS layer implements compression, any protocol using that
  46. layer can transparently benefit from that compression (e.g., SMTP
  47. and IMAP). COMPRESS is specific to IMAP.
  48. In order to increase interoperation, it is desirable to have as few
  49. different compression algorithms as possible, so this document
  50. specifies only one. The DEFLATE algorithm (defined in [RFC1951]) is
  51. standard, widely available and fairly efficient, so it is the only
  52. algorithm defined by this document.
  53. In order to increase interoperation, IMAP servers that advertise this
  54. extension SHOULD also advertise the TLS DEFLATE compression mechanism
  55. as defined in [RFC3749]. IMAP clients MAY use either COMPRESS or TLS
  56. compression; however, if the client and server support both, it is
  57. RECOMMENDED that the client choose TLS compression.
  58. The extension adds one new command (COMPRESS) and no new responses.
  59. 2. Conventions Used in This Document
  60. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
  61. "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
  62. document are to be interpreted as described in [RFC2119].
  63. Formal syntax is defined by [RFC4234] as modified by [RFC3501].
  66. In the examples, "C:" and "S:" indicate lines sent by the client and
  67. server respectively. "[...]" denotes elision.
  68. 3. The COMPRESS Command
  69. Arguments: Name of compression mechanism: "DEFLATE".
  70. Responses: None
  71. Result: OK - The server will compress its responses and expects the
  72. client to compress its commands.
  73. NO - Compression is already active via another layer.
  74. BAD - Command unknown, invalid or unknown argument, or COMPRESS
  75. already active.
  76. The COMPRESS command instructs the server to use the named
  77. compression mechanism ("DEFLATE" is the only one defined) for all
  78. commands and/or responses after COMPRESS.
  79. The client MUST NOT send any further commands until it has seen the
  80. result of COMPRESS. If the response was OK, the client MUST compress
  81. starting with the first command after COMPRESS. If the server
  82. response was BAD or NO, the client MUST NOT turn on compression.
  83. If the server responds NO because it knows that the same mechanism is
  84. active already (e.g., because TLS has negotiated the same mechanism),
  85. it MUST send COMPRESSIONACTIVE as resp-text-code (see [RFC3501],
  86. Section 7.1), and the resp-text SHOULD say which layer compresses.
  87. If the server issues an OK response, the server MUST compress
  88. starting immediately after the CRLF which ends the tagged OK
  89. response. (Responses issued by the server before the OK response
  90. will, of course, still be uncompressed.) If the server issues a BAD
  91. or NO response, the server MUST NOT turn on compression.
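The client-side rules above can be sketched as a small decision function. This is a minimal illustration, assuming the client has already read the complete tagged response line; the names TAGGED and handle_compress_result are hypothetical and not part of this document.

```python
import re

# <tag> SP (OK / NO / BAD) [SP "[" resp-text-code "]"] ...
TAGGED = re.compile(
    r"^(?P<tag>\S+) (?P<status>OK|NO|BAD)\b(?: \[(?P<code>[^\]]+)\])?",
    re.IGNORECASE,
)


def handle_compress_result(tag, line):
    """Return (enable_compression, active_in_another_layer) for the reply to COMPRESS."""
    m = TAGGED.match(line)
    if m is None or m.group("tag") != tag:
        raise ValueError("not the tagged response for this command")
    status = m.group("status").upper()
    code = (m.group("code") or "").upper()
    if status == "OK":
        # Compression starts with the first command after COMPRESS.
        return True, False
    # NO or BAD: compression MUST NOT be turned on.
    return False, status == "NO" and code == "COMPRESSIONACTIVE"


print(handle_compress_result("c", "c NO [COMPRESSIONACTIVE] DEFLATE active via TLS"))
```

The second element of the result lets a client tell "compression is already provided elsewhere" apart from an ordinary refusal.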
  92. For DEFLATE (as for many other compression mechanisms), the
  93. compressor can trade speed against quality. When decompressing there
  94. isn't much of a tradeoff. Consequently, the client and server are
  95. both free to pick the best reasonable rate of compression for the
  96. data they send.
  97. When COMPRESS is combined with TLS (see [RFC4346]) or SASL (see
  98. [RFC4422]) security layers, the sending order of the three extensions
  99. MUST be first COMPRESS, then SASL, and finally TLS. That is, before
  100. data is transmitted it is first compressed. Second, if a SASL
  101. security layer has been negotiated, the compressed data is then
  102. signed and/or encrypted accordingly. Third, if a TLS security layer
  103. has been negotiated, the data from the previous step is signed and/or
  106. encrypted accordingly. When receiving data, the processing order
  107. MUST be reversed. This ensures that before sending, data is
  108. compressed before it is encrypted, independent of the order in which
  109. the client issues COMPRESS, AUTHENTICATE, and STARTTLS.
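The required ordering can be illustrated with a stack of layers that is traversed forwards when sending and backwards when receiving. This is only a sketch: XorLayer is a hypothetical stand-in for a SASL or TLS security layer and provides no security, and all class and function names are assumptions made for the illustration.

```python
import zlib


class DeflateLayer:
    """Raw DEFLATE in each direction (negative wbits selects the headerless format)."""

    def __init__(self):
        self._out = zlib.compressobj(6, zlib.DEFLATED, -15)
        self._in = zlib.decompressobj(-15)

    def send(self, data):
        # Z_SYNC_FLUSH makes everything written so far decodable by the peer.
        return self._out.compress(data) + self._out.flush(zlib.Z_SYNC_FLUSH)

    def receive(self, data):
        return self._in.decompress(data)


class XorLayer:
    """Hypothetical placeholder for a security layer; NOT real encryption."""

    def __init__(self, key):
        self._key = key

    def send(self, data):
        return bytes(b ^ self._key for b in data)

    receive = send  # XOR with the same key is its own inverse


def send_through(layers, data):
    for layer in layers:            # COMPRESS first, then SASL, then TLS
        data = layer.send(data)
    return data


def receive_through(layers, data):
    for layer in reversed(layers):  # TLS first, then SASL, then COMPRESS
        data = layer.receive(data)
    return data


client = [DeflateLayer(), XorLayer(0x5A), XorLayer(0xA5)]
server = [DeflateLayer(), XorLayer(0x5A), XorLayer(0xA5)]
wire = send_through(client, b"d noop\r\n")
print(receive_through(server, wire))
```

Keeping the compression layer first in the list gives the same ordering regardless of when COMPRESS, AUTHENTICATE, and STARTTLS were issued.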
  110. The following example illustrates how commands and responses are
  111. compressed during a simple login sequence:
  112. S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]
  113. C: a starttls
  114. S: a OK TLS active
  115. From this point on, everything is encrypted.
  116. C: b login arnt tnra
  117. S: b OK Logged in as arnt
  118. C: c compress deflate
  119. S: c OK DEFLATE active
  120. From this point on, everything is compressed before being
  121. encrypted.
  122. The following example demonstrates how a server may refuse to
  123. compress twice:
  124. S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]
  125. [...]
  126. C: c compress deflate
  127. S: c NO [COMPRESSIONACTIVE] DEFLATE active via TLS
  128. 4. Compression Efficiency
  129. This section is informative, not normative.
  130. IMAP poses some unusual problems for a compression layer.
  131. Upstream is fairly simple. Most IMAP clients send the same few
  132. commands again and again, so any compression algorithm that can
  133. exploit repetition works efficiently. The APPEND command is an
  134. exception; clients that send many APPEND commands may want to
  135. surround large literals with flushes in the same way as is
  136. recommended for servers later in this section.
  137. Downstream has the unusual property that several kinds of data are
  138. sent, confusing all dictionary-based compression algorithms.
  141. One type is IMAP responses. These are highly compressible; zlib
  142. using its least CPU-intensive setting compresses typical responses to
  143. 25-40% of their original size.
  144. Another type is email headers. These are equally compressible, and
  145. benefit from using the same dictionary as the IMAP responses.
  146. A third type is email body text. Text is usually fairly short and
  147. includes much ASCII, so the same compression dictionary will do a
  148. good job here, too. When multiple messages in the same thread are
  149. read at the same time, quoted lines etc. can often be compressed
  150. almost to zero.
  151. Finally, attachments (non-text email bodies) are transmitted, either
  152. in binary form or encoded with base-64.
  153. When attachments are retrieved in binary form, DEFLATE may be able to
  154. compress them, but the format of the attachment is usually not IMAP-
  155. like, so the dictionary built while compressing IMAP does not help.
  156. The compressor has to adapt its dictionary from IMAP to the
  157. attachment's format, and then back. A few file formats aren't
  158. compressible at all using deflate, e.g., .gz, .zip, and .jpg files.
  159. When attachments are retrieved in base-64 form, the same problems
  160. apply, but the base-64 encoding adds another problem. 8-bit
  161. compression algorithms such as deflate work well on 8-bit file
  162. formats; however, base-64 turns a file into something resembling 6-bit
  163. bytes, hiding most of the 8-bit file format from the compressor.
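The effect of base-64 on a byte-oriented compressor can be observed with a small experiment. The payload below is synthetic (random words from a tiny vocabulary) and merely stands in for an attachment; the helper name raw_deflate_size is an assumption of this sketch, and the sizes it prints will differ for real attachments.

```python
import base64
import random
import zlib


def raw_deflate_size(data):
    """Size of data after raw DEFLATE at the least CPU-intensive level."""
    c = zlib.compressobj(1, zlib.DEFLATED, -15)
    return len(c.compress(data) + c.flush())


rng = random.Random(0)
vocabulary = [b"invoice", b"quarterly", b"report", b"attached",
              b"regards", b"meeting", b"schedule", b"budget"]
attachment = b" ".join(rng.choice(vocabulary) for _ in range(4000))

binary_size = raw_deflate_size(attachment)
# encodebytes() inserts a line break every 76 characters, as MIME does.
base64_size = raw_deflate_size(base64.encodebytes(attachment))
print(len(attachment), binary_size, base64_size)
```

The same word encodes differently depending on its offset modulo 3, which is one way the 8-bit structure is hidden from the compressor.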
  164. When using the zlib library (see [RFC1951]), the functions
  165. deflateInit2(), deflate(), inflateInit2(), and inflate() suffice to
  166. implement this extension. The windowBits value must be in the range
  167. -8 to -15, or else deflateInit2() uses the wrong format.
  168. deflateParams() can be used to improve compression rate and resource
  169. use. The Z_FULL_FLUSH argument to deflate() can be used to clear the
  170. dictionary (the receiving peer does not need to do anything).
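The same ideas can be tried with Python's zlib module, which wraps the zlib library; a negative wbits argument there plays the role of windowBits. This is a minimal sketch: the helper name send and the sample commands are assumptions made for the illustration.

```python
import zlib

# Negative wbits (-8 to -15) selects raw DEFLATE, without a zlib header
# or trailing checksum.
compressor = zlib.compressobj(1, zlib.DEFLATED, -15)
decompressor = zlib.decompressobj(-15)


def send(data, flush=zlib.Z_SYNC_FLUSH):
    """Compress one chunk and flush so the peer can decode it right away."""
    return compressor.compress(data) + compressor.flush(flush)


wire1 = send(b"a001 NOOP\r\n")
wire2 = send(b"a002 NOOP\r\n", zlib.Z_FULL_FLUSH)  # also clears the dictionary
wire3 = send(b"a003 NOOP\r\n")  # cannot refer back to the earlier commands

# The receiver needs no special handling for either kind of flush.
received = b"".join(decompressor.decompress(w) for w in (wire1, wire2, wire3))
print(received)
```

Flushing after each command or response matters for an interactive protocol: without it, the compressor may hold data back and the peer would wait indefinitely.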
  171. A client can improve downstream compression by implementing BINARY
  172. (defined in [RFC3516]) and using FETCH BINARY instead of FETCH BODY.
  173. In the author's experience, the improvement ranges from 5% to 40%
  174. depending on the attachment being downloaded.
  175. A server can improve downstream compression if it hints to the
  176. compressor that the data type is about to change strongly, e.g., by
  177. sending a Z_FULL_FLUSH at the start and end of large non-text
  178. literals (before and after '*CHAR8' in the definition of literal in
  179. RFC 3501, page 86). Small literals are best left alone. A possible
  180. boundary is 5k.
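A server-side sketch of this hint follows. It assumes the response has already been split into the text before the literal, the literal itself, and the text after it, and it only checks the size of the literal (deciding whether a literal is non-text is left to the caller). The names LARGE_LITERAL and compress_fetch_reply are hypothetical.

```python
import zlib

LARGE_LITERAL = 5 * 1024  # the "possible boundary" suggested above


def compress_fetch_reply(compressor, prefix, literal, suffix):
    """Compress prefix + literal + suffix, isolating a large literal with full flushes."""
    out = compressor.compress(prefix)
    large = len(literal) >= LARGE_LITERAL
    if large:
        # Drop the dictionary built from IMAP responses before the literal.
        out += compressor.flush(zlib.Z_FULL_FLUSH)
    out += compressor.compress(literal)
    if large:
        # Drop the dictionary built from the attachment afterwards.
        out += compressor.flush(zlib.Z_FULL_FLUSH)
    out += compressor.compress(suffix)
    # Make the whole response decodable by the client right away.
    return out + compressor.flush(zlib.Z_SYNC_FLUSH)


compressor = zlib.compressobj(1, zlib.DEFLATED, -15)
decompressor = zlib.decompressobj(-15)
body = bytes(range(256)) * 32  # 8192 bytes standing in for an attachment
wire = compress_fetch_reply(compressor, b"* 1 FETCH (BODY[2] {8192}\r\n", body, b")\r\n")
print(len(wire), len(decompressor.decompress(wire)))
```

Small literals pass through without the extra flushes, so they keep benefiting from the existing dictionary.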
  183. A server can improve the CPU efficiency both of the server and the
  184. client if it adjusts the compression level (e.g., using the
  185. deflateParams() function in zlib) at these points, to avoid trying to
  186. compress incompressible attachments. A very simple strategy is to
  187. change the level to 0 at the start of a literal provided the first
  188. two bytes are either 0x1F 0x8B (as in gzip-compressed files) or
  189. 0xFF 0xD8 (JPEG), and to keep it at 1-5 the rest of the time. More
  190. complex strategies are possible.
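The very simple strategy can be written as a small decision function. The sketch shows only the choice of level; applying it to a running stream is what deflateParams() is for, and Python's zlib module does not wrap that function. The names GZIP_MAGIC, JPEG_MAGIC, and level_for_literal are assumptions of this sketch.

```python
GZIP_MAGIC = b"\x1f\x8b"
JPEG_MAGIC = b"\xff\xd8"


def level_for_literal(literal, normal_level=1):
    """Pick a compression level from the first two bytes of a literal."""
    if literal[:2] in (GZIP_MAGIC, JPEG_MAGIC):
        return 0  # already compressed: store instead of compressing again
    return normal_level  # somewhere in 1-5 the rest of the time


print(level_for_literal(b"\xff\xd8\xff\xe0"), level_for_literal(b"Subject: hello"))
```

Level 0 still produces a valid DEFLATE stream (stored blocks), so the client needs no special handling.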
  191. 5. Formal Syntax
  192. The following syntax specification uses the Augmented Backus-Naur
  193. Form (ABNF) notation as specified in [RFC4234]. This syntax augments
  194. the grammar specified in [RFC3501]. [RFC4234] defines SP and
  195. [RFC3501] defines command-auth, capability, and resp-text-code.
  196. Except as noted otherwise, all alphabetic characters are case-
  197. insensitive. The use of upper or lower case characters to define
  198. token strings is for editorial clarity only. Implementations MUST
  199. accept these strings in a case-insensitive fashion.
  200. command-auth =/ compress
  201. compress = "COMPRESS" SP algorithm
  202. capability =/ "COMPRESS=" algorithm
  203. ;; multiple COMPRESS capabilities allowed
  204. algorithm = "DEFLATE"
  205. resp-text-code =/ "COMPRESSIONACTIVE"
  206. Note that due to the syntax of capability names, future algorithm names
  207. must be atoms.
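For illustration, the compress rule can be checked with a case-insensitive pattern. The sketch assumes the tag and the trailing CRLF have already been stripped from the command; the names COMPRESS_COMMAND and parse_compress are hypothetical.

```python
import re

# compress  = "COMPRESS" SP algorithm
# algorithm = "DEFLATE"
COMPRESS_COMMAND = re.compile(r"COMPRESS (DEFLATE)", re.IGNORECASE)


def parse_compress(command):
    """Return the upper-cased algorithm name, or None if this is not a valid COMPRESS command."""
    m = COMPRESS_COMMAND.fullmatch(command)
    return m.group(1).upper() if m else None


print(parse_compress("compress deflate"))
```

A server would answer BAD when this returns None for a command that starts with COMPRESS, since the argument is then invalid or unknown.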
  208. 6. Security Considerations
  209. The security considerations are the same as for TLS compression [RFC3749].
  210. 7. IANA Considerations
  211. The IANA has added COMPRESS=DEFLATE to the list of IMAP capabilities.
  214. 8. Acknowledgements
  215. Eric Burger, Dave Cridland, Tony Finch, Ned Freed, Philip Guenther,
  216. Randall Gellens, Tony Hansen, Cullen Jennings, Stephane Maes, Alexey
  217. Melnikov, Lyndon Nerenberg, and Zoltan Ordogh have all helped with
  218. this document.
  219. The author would also like to thank various people in the rooms at
  220. meetings, whose help is real, but not reflected in the author's
  221. mailbox.
  222. 9. References
  223. 9.1. Normative References
  224. [RFC1951] Deutsch, P., "DEFLATE Compressed Data Format Specification
  225. version 1.3", RFC 1951, May 1996.
  226. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
  227. Requirement Levels", BCP 14, RFC 2119, March 1997.
  228. [RFC3501] Crispin, M., "INTERNET MESSAGE ACCESS PROTOCOL - VERSION
  229. 4rev1", RFC 3501, March 2003.
  230. [RFC4234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
  231. Specifications: ABNF", RFC 4234, October 2005.
  232. 9.2. Informative References
  233. [RFC1962] Rand, D., "The PPP Compression Control Protocol (CCP)",
  234. RFC 1962, June 1996.
  235. [RFC3516] Nerenberg, L., "IMAP4 Binary Content Extension", RFC 3516,
  236. April 2003.
  237. [RFC3749] Hollenbeck, S., "Transport Layer Security Protocol
  238. Compression Methods", RFC 3749, May 2004.
  239. [RFC4346] Dierks, T. and E. Rescorla, "The Transport Layer Security
  240. (TLS) Protocol Version 1.1", RFC 4346, April 2006.
  241. [RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and
  242. Security Layer (SASL)", RFC 4422, June 2006.
  243. [V42BIS] ITU, "V.42bis: Data compression procedures for data
  244. circuit-terminating equipment (DCE) using error correction
  245. procedures", http://www.itu.int/rec/T-REC-V.42bis, January
  246. 1990.
  249. [MNP] Gilbert Held, "The Complete Modem Reference", Second
  250. Edition, Wiley Professional Computing, ISBN 0-471-00852-4,
  251. May 1994.
  252. Author's Address
  253. Arnt Gulbrandsen
  254. Oryx Mail Systems GmbH
  255. Schweppermannstr. 8
  256. D-81671 Muenchen
  257. Germany
  258. Fax: +49 89 4502 9758
  259. EMail: arnt@oryx.com
  262. Full Copyright Statement
  263. Copyright (C) The IETF Trust (2007).
  264. This document is subject to the rights, licenses and restrictions
  265. contained in BCP 78, and except as set forth therein, the authors
  266. retain all their rights.
  267. This document and the information contained herein are provided on an
  268. "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
  269. OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
  270. THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
  271. OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
  272. THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
  273. WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
  274. Intellectual Property
  275. The IETF takes no position regarding the validity or scope of any
  276. Intellectual Property Rights or other rights that might be claimed to
  277. pertain to the implementation or use of the technology described in
  278. this document or the extent to which any license under such rights
  279. might or might not be available; nor does it represent that it has
  280. made any independent effort to identify any such rights. Information
  281. on the procedures with respect to rights in RFC documents can be
  282. found in BCP 78 and BCP 79.
  283. Copies of IPR disclosures made to the IETF Secretariat and any
  284. assurances of licenses to be made available, or the result of an
  285. attempt made to obtain a general license or permission for the use of
  286. such proprietary rights by implementers or users of this
  287. specification can be obtained from the IETF on-line IPR repository at
  288. http://www.ietf.org/ipr.
  289. The IETF invites any interested party to bring to its attention any
  290. copyrights, patents or patent applications, or other proprietary
  291. rights that may cover technology that may be required to implement
  292. this standard. Please address the information to the IETF at
  293. ietf-ipr@ietf.org.