Metadata-Version: 2.1
Name: idna
Version: 3.4
Summary: Internationalized Domain Names in Applications (IDNA)
Author-email: Kim Davies <kim@cynosure.com.au>
Requires-Python: >=3.5
Description-Content-Type: text/x-rst
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Internet :: Name Service (DNS)
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Project-URL: Changelog, https://github.com/kjd/idna/blob/master/HISTORY.rst
Project-URL: Issue tracker, https://github.com/kjd/idna/issues
Project-URL: Source, https://github.com/kjd/idna


Internationalized Domain Names in Applications (IDNA)
=====================================================

Support for the Internationalized Domain Names in
Applications (IDNA) protocol as specified in `RFC 5891
<https://tools.ietf.org/html/rfc5891>`_. This is the latest version of
the protocol and is sometimes referred to as “IDNA 2008”.

This library also provides support for Unicode Technical
Standard 46, `Unicode IDNA Compatibility Processing
<https://unicode.org/reports/tr46/>`_.

This acts as a suitable replacement for the “encodings.idna”
module that comes with the Python standard library, which
only supports the older, superseded IDNA specification (`RFC 3490
<https://tools.ietf.org/html/rfc3490>`_).

The basic functions are simple to call:

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト


Installation
------------

This package is available for installation from PyPI:

.. code-block:: bash

    $ python3 -m pip install idna


Usage
-----

For typical usage, the ``encode`` and ``decode`` functions will take a
domain name argument and perform a conversion to A-labels or U-labels
respectively.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('ドメイン.テスト')
    b'xn--eckwd4c7c.xn--zckzah'
    >>> print(idna.decode('xn--eckwd4c7c.xn--zckzah'))
    ドメイン.テスト

You may also use the codec encoding and decoding methods by importing
the ``idna.codec`` module:

.. code-block:: pycon

    >>> import idna.codec
    >>> print('домен.испытание'.encode('idna'))
    b'xn--d1acufc.xn--80akhbyknj4f'
    >>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna'))
    домен.испытание

Conversions can be applied on a per-label basis using the ``ulabel`` or
``alabel`` functions if necessary:

.. code-block:: pycon

    >>> idna.alabel('测试')
    b'xn--0zwm56d'

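When each label needs to be inspected individually, the per-label
functions can be wrapped in small helpers. The following is a minimal
sketch rather than part of the library: the names ``to_alabels`` and
``to_ulabels`` are hypothetical, and splitting on ``.`` alone is a
simplifying assumption made for the example.

```python
import idna


def to_alabels(domain: str) -> list:
    """Convert each dot-separated label to its A-label (ASCII) form."""
    # Assumption: labels are separated by a plain full stop only.
    return [idna.alabel(label) for label in domain.split(".")]


def to_ulabels(domain: str) -> list:
    """Convert each dot-separated label to its U-label (Unicode) form."""
    return [idna.ulabel(label) for label in domain.split(".")]


ascii_labels = to_alabels("ドメイン.テスト")
unicode_labels = to_ulabels("xn--eckwd4c7c.xn--zckzah")
print(ascii_labels)
print(unicode_labels)
```

For whole domain names, ``encode`` and ``decode`` remain the simpler
choice; helpers like these are mainly useful when one label at a time
has to be validated or reported on.
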

Compatibility Mapping (UTS #46)
+++++++++++++++++++++++++++++++

As described in `RFC 5895 <https://tools.ietf.org/html/rfc5895>`_, the
IDNA specification does not normalize input from different potential
ways a user may input a domain name. This functionality, known as
a “mapping”, is considered by the specification to be a local
user-interface issue distinct from IDNA conversion functionality.

This library provides one such mapping, developed by the
Unicode Consortium. Known as `Unicode IDNA Compatibility Processing
<https://unicode.org/reports/tr46/>`_, it provides both a regular
mapping for typical applications and a transitional mapping to
help migrate from older IDNA 2003 applications.

For example, “Königsgäßchen” is not a permissible label as *LATIN
CAPITAL LETTER K* is not allowed (nor are capital letters in general).
UTS 46 will convert this into lower case prior to applying the IDNA
conversion.

.. code-block:: pycon

    >>> import idna
    >>> idna.encode('Königsgäßchen')
    ...
    idna.core.InvalidCodepoint: Codepoint U+004B at position 1 of 'Königsgäßchen' not allowed
    >>> idna.encode('Königsgäßchen', uts46=True)
    b'xn--knigsgchen-b4a3dun'
    >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
    königsgäßchen

Transitional processing provides conversions to help transition from
the older 2003 standard to the current standard. For example, in the
original IDNA specification, the *LATIN SMALL LETTER SHARP S* (ß) was
converted into two *LATIN SMALL LETTER S* (ss), whereas in the current
IDNA specification this conversion is not performed.

.. code-block:: pycon

    >>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
    b'xn--knigsgsschen-lcb0w'

Implementors should use transitional processing with caution, and only in
the rare cases where conversion from legacy labels to current labels must
be performed (i.e. for IDNA implementations that pre-date 2008). For
typical applications that just need to convert labels, transitional
processing is unlikely to be beneficial and could produce unexpected,
incompatible results.


``encodings.idna`` Compatibility
++++++++++++++++++++++++++++++++

Function calls from the Python built-in ``encodings.idna`` module are
mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
Simply substitute the ``import`` clause in your code to refer to the new
module name.

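As an illustration of that substitution, the sketch below imports
``ToASCII`` and ``ToUnicode`` from ``idna.compat`` and falls back to the
standard library module when this package is not installed. The fallback
is an assumption made for the example; an application that requires
IDNA 2008 behaviour would normally leave it out.

```python
try:
    # Preferred: IDNA 2008 behaviour from this library.
    from idna.compat import ToASCII, ToUnicode
except ImportError:
    # Assumed fallback for the example: the older IDNA 2003 implementation.
    from encodings.idna import ToASCII, ToUnicode

# Calling code stays the same whichever module supplied the functions.
ascii_label = ToASCII("测试")
unicode_label = ToUnicode(ascii_label)
print(ascii_label, unicode_label)
```
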

Exceptions
----------

All errors raised during conversion according to the specification
raise an exception derived from the ``idna.IDNAError`` base class.

More specific exceptions may also be raised: ``idna.IDNABidiError``
when the error reflects an illegal combination of left-to-right and
right-to-left characters in a label; ``idna.InvalidCodepoint`` when
a specific codepoint is an illegal character in an IDN label (i.e.
INVALID); and ``idna.InvalidCodepointContext`` when the codepoint is
illegal based on its positional context (i.e. it is CONTEXTO or CONTEXTJ
but the contextual requirements are not satisfied).

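The exception classes above can be used to report why a name was
rejected. The following is a minimal sketch; the function name
``describe_failure`` and the returned strings are hypothetical and only
serve the illustration. The specific classes are tested before the
``idna.IDNAError`` base class so that the base class acts as a
catch-all.

```python
import idna


def describe_failure(domain: str) -> str:
    """Return a short tag describing why encoding failed, or "ok"."""
    try:
        idna.encode(domain)
    except idna.IDNABidiError:
        return "bidi"        # illegal mix of LTR and RTL characters
    except idna.InvalidCodepointContext:
        return "context"     # CONTEXTO/CONTEXTJ rule not satisfied
    except idna.InvalidCodepoint:
        return "codepoint"   # character not permitted in a label
    except idna.IDNAError:
        return "other"       # any other conversion error
    return "ok"


print(describe_failure("example.com"))
print(describe_failure("Königsgäßchen"))
```
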

Building and Diagnostics
------------------------

The IDNA and UTS 46 functionality relies upon pre-calculated lookup
tables for performance. These tables are derived by evaluating
codepoints against the eligibility criteria in the respective standards,
and are computed using the command-line script ``tools/idna-data``.

This tool will fetch relevant codepoint data from the Unicode repository
and perform the required calculations to identify eligibility. There are
three main modes:

* ``idna-data make-libdata``. Generates ``idnadata.py`` and
  ``uts46data.py``, the pre-calculated lookup tables used for IDNA and
  UTS 46 conversions. Implementors who wish to track this library against
  a different Unicode version may use this tool to manually generate a
  different version of the ``idnadata.py`` and ``uts46data.py`` files.

* ``idna-data make-table``. Generates a table of the IDNA disposition
  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix
  B.1 of RFC 5892 and the pre-computed tables published by `IANA
  <https://www.iana.org/>`_.

* ``idna-data U+0061``. Prints debugging output on the various
  properties associated with an individual Unicode codepoint (in this
  case, U+0061) that are used to assess the IDNA and UTS 46 status of a
  codepoint. This is helpful in debugging or analysis.

The tool accepts a number of arguments, described using ``idna-data
-h``. Most notably, the ``--version`` argument allows the specification
of the version of Unicode to use in computing the table data. For
example, ``idna-data --version 9.0.0 make-libdata`` will generate
library data against Unicode 9.0.0.


Additional Notes
----------------

* **Packages**. The latest tagged release version is published in the
  `Python Package Index <https://pypi.org/project/idna/>`_.

* **Version support**. This library supports Python 3.5 and higher.
  As this library serves as a low-level toolkit for a variety of
  applications, many of which strive for broad compatibility with older
  Python versions, there is no rush to remove older interpreter support.
  Removing support for an older version should be well justified, in that
  the maintenance burden has become too high.

* **Python 2**. Python 2 is supported by version 2.x of this library.
  While active development of the version 2.x series has ended, fixes for
  notable issues may be backported to 2.x. Use ``idna<3`` in your
  requirements file if you need this library for a Python 2 application.

* **Testing**. The library has a test suite based on each rule of the
  IDNA specification, as well as tests that are provided as part of the
  Unicode Technical Standard 46, `Unicode IDNA Compatibility Processing
  <https://unicode.org/reports/tr46/>`_.

* **Emoji**. Support for emoji domains is occasionally requested for
  this library. Encoding of symbols like emoji is expressly prohibited by
  the technical standard IDNA 2008, and emoji domains are being broadly
  phased out across the domain industry due to associated security risks.
  For now, applications that need to support these non-compliant labels
  may wish to consider trying the encode/decode operation in this library
  first, and then falling back to using ``encodings.idna``. See `the GitHub
  project <https://github.com/kjd/idna/issues/18>`_ for more discussion.

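The fallback described in the **Emoji** note could look roughly like
the sketch below. The function name ``encode_with_fallback`` is
hypothetical, and the example assumes that the standard library
``idna`` codec (the older IDNA 2003 implementation) is an acceptable
last resort. Any label produced by the fallback is not compliant with
IDNA 2008.

```python
import idna


def encode_with_fallback(domain: str) -> bytes:
    """Try IDNA 2008 first, then the standard library's IDNA 2003 codec."""
    try:
        return idna.encode(domain)
    except idna.IDNAError:
        # Assumption: a non-compliant label is preferable to a failure.
        # The "idna" codec here is the one from the standard library.
        return domain.encode("idna")


print(encode_with_fallback("example.com"))
# U+2603 SNOWMAN is a symbol that IDNA 2008 does not permit.
print(encode_with_fallback("\u2603.example"))
```

Keeping the strict conversion as the first attempt means compliant
names are always produced by this library, and the older codec is only
consulted for names that would otherwise be rejected.
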