HACKING 23 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528
  1. Technical Notes about PCRE
  2. --------------------------
  3. These are very rough technical notes that record potentially useful information
  4. about PCRE internals. For information about testing PCRE, see the pcretest
  5. documentation and the comment at the head of the RunTest file.
  6. Historical note 1
  7. -----------------
  8. Many years ago I implemented some regular expression functions to an algorithm
  9. suggested by Martin Richards. These were not Unix-like in form, and were quite
  10. restricted in what they could do by comparison with Perl. The interesting part
  11. about the algorithm was that the amount of space required to hold the compiled
  12. form of an expression was known in advance. The code to apply an expression did
  13. not operate by backtracking, as the original Henry Spencer code and current
  14. Perl code does, but instead checked all possibilities simultaneously by keeping
  15. a list of current states and checking all of them as it advanced through the
  16. subject string. In the terminology of Jeffrey Friedl's book, it was a "DFA
  17. algorithm", though it was not a traditional Finite State Machine (FSM). When
  18. the pattern was all used up, all remaining states were possible matches, and
  19. the one matching the longest subset of the subject string was chosen. This did
  20. not necessarily maximize the individual wild portions of the pattern, as is
  21. expected in Unix and Perl-style regular expressions.
  22. Historical note 2
  23. -----------------
  24. By contrast, the code originally written by Henry Spencer (which was
  25. subsequently heavily modified for Perl) compiles the expression twice: once in
  26. a dummy mode in order to find out how much store will be needed, and then for
  27. real. (The Perl version probably doesn't do this any more; I'm talking about
  28. the original library.) The execution function operates by backtracking and
  29. maximizing (or, optionally, minimizing in Perl) the amount of the subject that
  30. matches individual wild portions of the pattern. This is an "NFA algorithm" in
  31. Friedl's terminology.
  32. OK, here's the real stuff
  33. -------------------------
  34. For the set of functions that form the "basic" PCRE library (which are
  35. unrelated to those mentioned above), I tried at first to invent an algorithm
  36. that used an amount of store bounded by a multiple of the number of characters
  37. in the pattern, to save on compiling time. However, because of the greater
  38. complexity in Perl regular expressions, I couldn't do this. In any case, a
  39. first pass through the pattern is helpful for other reasons.
  40. Support for 16-bit and 32-bit data strings
  41. -------------------------------------------
  42. From release 8.30, PCRE supports 16-bit as well as 8-bit data strings; and from
  43. release 8.32, PCRE supports 32-bit data strings. The library can be compiled
  44. in any combination of 8-bit, 16-bit or 32-bit modes, creating up to three
  45. different libraries. In the description that follows, the word "short" is used
  46. for a 16-bit data quantity, and the word "unit" is used for a quantity that is
  47. a byte in 8-bit mode, a short in 16-bit mode and a 32-bit word in 32-bit mode.
  48. However, so as not to over-complicate the text, the names of PCRE functions are
  49. given in 8-bit form only.
  50. Computing the memory requirement: how it was
  51. --------------------------------------------
  52. Up to and including release 6.7, PCRE worked by running a very degenerate first
  53. pass to calculate a maximum store size, and then a second pass to do the real
  54. compile - which might use a bit less than the predicted amount of memory. The
  55. idea was that this would turn out faster than the Henry Spencer code because
  56. the first pass is degenerate and the second pass can just store stuff straight
  57. into the vector, which it knows is big enough.
  58. Computing the memory requirement: how it is
  59. -------------------------------------------
  60. By the time I was working on a potential 6.8 release, the degenerate first pass
  61. had become very complicated and hard to maintain. Indeed one of the early
  62. things I did for 6.8 was to fix Yet Another Bug in the memory computation. Then
  63. I had a flash of inspiration as to how I could run the real compile function in
  64. a "fake" mode that enables it to compute how much memory it would need, while
  65. actually only ever using a few hundred bytes of working memory, and without too
  66. many tests of the mode that might slow it down. So I refactored the compiling
  67. functions to work this way. This got rid of about 600 lines of source. It
  68. should make future maintenance and development easier. As this was such a major
  69. change, I never released 6.8, instead upping the number to 7.0 (other quite
  70. major changes were also present in the 7.0 release).
  71. A side effect of this work was that the previous limit of 200 on the nesting
  72. depth of parentheses was removed. However, there is a downside: pcre_compile()
  73. runs more slowly than before (30% or more, depending on the pattern) because it
  74. is doing a full analysis of the pattern. My hope was that this would not be a
  75. big issue, and in the event, nobody has commented on it.
  76. At release 8.34, a limit on the nesting depth of parentheses was re-introduced
  77. (default 250, settable at build time) so as to put a limit on the amount of
  78. system stack used by pcre_compile(). This is a safety feature for environments
  79. with small stacks where the patterns are provided by users.
  80. Traditional matching function
  81. -----------------------------
  82. The "traditional", and original, matching function is called pcre_exec(), and
  83. it implements an NFA algorithm, similar to the original Henry Spencer algorithm
  84. and the way that Perl works. This is not surprising, since it is intended to be
  85. as compatible with Perl as possible. This is the function most users of PCRE
  86. will use most of the time. From release 8.20, if PCRE is compiled with
  87. just-in-time (JIT) support, and studying a compiled pattern with JIT is
  88. successful, the JIT code is run instead of the normal pcre_exec() code, but the
  89. result is the same.
  90. Supplementary matching function
  91. -------------------------------
  92. From PCRE 6.0, there is also a supplementary matching function called
  93. pcre_dfa_exec(). This implements a DFA matching algorithm that searches
  94. simultaneously for all possible matches that start at one point in the subject
  95. string. (Going back to my roots: see Historical Note 1 above.) This function
  96. intreprets the same compiled pattern data as pcre_exec(); however, not all the
  97. facilities are available, and those that are do not always work in quite the
  98. same way. See the user documentation for details.
  99. The algorithm that is used for pcre_dfa_exec() is not a traditional FSM,
  100. because it may have a number of states active at one time. More work would be
  101. needed at compile time to produce a traditional FSM where only one state is
  102. ever active at once. I believe some other regex matchers work this way. JIT
  103. support is not available for this kind of matching.
  104. Changeable options
  105. ------------------
  106. The /i, /m, or /s options (PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and some
  107. others) may change in the middle of patterns. From PCRE 8.13, their processing
  108. is handled entirely at compile time by generating different opcodes for the
  109. different settings. The runtime functions do not need to keep track of an
  110. options state any more.
  111. Format of compiled patterns
  112. ---------------------------
  113. The compiled form of a pattern is a vector of unsigned units (bytes in 8-bit
  114. mode, shorts in 16-bit mode, 32-bit words in 32-bit mode), containing items of
  115. variable length. The first unit in an item contains an opcode, and the length
  116. of the item is either implicit in the opcode or contained in the data that
  117. follows it.
  118. In many cases listed below, LINK_SIZE data values are specified for offsets
  119. within the compiled pattern. LINK_SIZE always specifies a number of bytes. The
  120. default value for LINK_SIZE is 2, but PCRE can be compiled to use 3-byte or
  121. 4-byte values for these offsets, although this impairs the performance. (3-byte
  122. LINK_SIZE values are available only in 8-bit mode.) Specifing a LINK_SIZE
  123. larger than 2 is necessary only when patterns whose compiled length is greater
  124. than 64K are going to be processed. In this description, we assume the "normal"
  125. compilation options. Data values that are counts (e.g. quantifiers) are two
  126. bytes long in 8-bit mode (most significant byte first), or one unit in 16-bit
  127. and 32-bit modes.
  128. Opcodes with no following data
  129. ------------------------------
  130. These items are all just one unit long
  131. OP_END end of pattern
  132. OP_ANY match any one character other than newline
  133. OP_ALLANY match any one character, including newline
  134. OP_ANYBYTE match any single unit, even in UTF-8/16 mode
  135. OP_SOD match start of data: \A
  136. OP_SOM, start of match (subject + offset): \G
  137. OP_SET_SOM, set start of match (\K)
  138. OP_CIRC ^ (start of data)
  139. OP_CIRCM ^ multiline mode (start of data or after newline)
  140. OP_NOT_WORD_BOUNDARY \W
  141. OP_WORD_BOUNDARY \w
  142. OP_NOT_DIGIT \D
  143. OP_DIGIT \d
  144. OP_NOT_HSPACE \H
  145. OP_HSPACE \h
  146. OP_NOT_WHITESPACE \S
  147. OP_WHITESPACE \s
  148. OP_NOT_VSPACE \V
  149. OP_VSPACE \v
  150. OP_NOT_WORDCHAR \W
  151. OP_WORDCHAR \w
  152. OP_EODN match end of data or newline at end: \Z
  153. OP_EOD match end of data: \z
  154. OP_DOLL $ (end of data, or before final newline)
  155. OP_DOLLM $ multiline mode (end of data or before newline)
  156. OP_EXTUNI match an extended Unicode grapheme cluster
  157. OP_ANYNL match any Unicode newline sequence
  158. OP_ASSERT_ACCEPT )
  159. OP_ACCEPT ) These are Perl 5.10's "backtracking control
  160. OP_COMMIT ) verbs". If OP_ACCEPT is inside capturing
  161. OP_FAIL ) parentheses, it may be preceded by one or more
  162. OP_PRUNE ) OP_CLOSE, each followed by a count that
  163. OP_SKIP ) indicates which parentheses must be closed.
  164. OP_THEN )
  165. OP_ASSERT_ACCEPT is used when (*ACCEPT) is encountered within an assertion.
  166. This ends the assertion, not the entire pattern match.
  167. Backtracking control verbs with optional data
  168. ---------------------------------------------
  169. (*THEN) without an argument generates the opcode OP_THEN and no following data.
  170. OP_MARK is followed by the mark name, preceded by a one-unit length, and
  171. followed by a binary zero. For (*PRUNE), (*SKIP), and (*THEN) with arguments,
  172. the opcodes OP_PRUNE_ARG, OP_SKIP_ARG, and OP_THEN_ARG are used, with the name
  173. following in the same format as OP_MARK.
  174. Matching literal characters
  175. ---------------------------
  176. The OP_CHAR opcode is followed by a single character that is to be matched
  177. casefully. For caseless matching, OP_CHARI is used. In UTF-8 or UTF-16 modes,
  178. the character may be more than one unit long. In UTF-32 mode, characters
  179. are always exactly one unit long.
  180. If there is only one character in a character class, OP_CHAR or OP_CHARI is
  181. used for a positive class, and OP_NOT or OP_NOTI for a negative one (that is,
  182. for something like [^a]).
  183. Repeating single characters
  184. ---------------------------
  185. The common repeats (*, +, ?), when applied to a single character, use the
  186. following opcodes, which come in caseful and caseless versions:
  187. Caseful Caseless
  188. OP_STAR OP_STARI
  189. OP_MINSTAR OP_MINSTARI
  190. OP_POSSTAR OP_POSSTARI
  191. OP_PLUS OP_PLUSI
  192. OP_MINPLUS OP_MINPLUSI
  193. OP_POSPLUS OP_POSPLUSI
  194. OP_QUERY OP_QUERYI
  195. OP_MINQUERY OP_MINQUERYI
  196. OP_POSQUERY OP_POSQUERYI
  197. Each opcode is followed by the character that is to be repeated. In ASCII mode,
  198. these are two-unit items; in UTF-8 or UTF-16 modes, the length is variable; in
  199. UTF-32 mode these are one-unit items. Those with "MIN" in their names are the
  200. minimizing versions. Those with "POS" in their names are possessive versions.
  201. Other repeats make use of these opcodes:
  202. Caseful Caseless
  203. OP_UPTO OP_UPTOI
  204. OP_MINUPTO OP_MINUPTOI
  205. OP_POSUPTO OP_POSUPTOI
  206. OP_EXACT OP_EXACTI
  207. Each of these is followed by a count and then the repeated character. OP_UPTO
  208. matches from 0 to the given number. A repeat with a non-zero minimum and a
  209. fixed maximum is coded as an OP_EXACT followed by an OP_UPTO (or OP_MINUPTO or
  210. OPT_POSUPTO).
  211. Another set of matching repeating opcodes (called OP_NOTSTAR, OP_NOTSTARI,
  212. etc.) are used for repeated, negated, single-character classes such as [^a]*.
  213. The normal single-character opcodes (OP_STAR, etc.) are used for repeated
  214. positive single-character classes.
  215. Repeating character types
  216. -------------------------
  217. Repeats of things like \d are done exactly as for single characters, except
  218. that instead of a character, the opcode for the type is stored in the data
  219. unit. The opcodes are:
  220. OP_TYPESTAR
  221. OP_TYPEMINSTAR
  222. OP_TYPEPOSSTAR
  223. OP_TYPEPLUS
  224. OP_TYPEMINPLUS
  225. OP_TYPEPOSPLUS
  226. OP_TYPEQUERY
  227. OP_TYPEMINQUERY
  228. OP_TYPEPOSQUERY
  229. OP_TYPEUPTO
  230. OP_TYPEMINUPTO
  231. OP_TYPEPOSUPTO
  232. OP_TYPEEXACT
  233. Match by Unicode property
  234. -------------------------
  235. OP_PROP and OP_NOTPROP are used for positive and negative matches of a
  236. character by testing its Unicode property (the \p and \P escape sequences).
  237. Each is followed by two units that encode the desired property as a type and a
  238. value. The types are a set of #defines of the form PT_xxx, and the values are
  239. enumerations of the form ucp_xx, defined in the ucp.h source file. The value is
  240. relevant only for PT_GC (General Category), PT_PC (Particular Category), and
  241. PT_SC (Script).
  242. Repeats of these items use the OP_TYPESTAR etc. set of opcodes, followed by
  243. three units: OP_PROP or OP_NOTPROP, and then the desired property type and
  244. value.
  245. Character classes
  246. -----------------
  247. If there is only one character in a class, OP_CHAR or OP_CHARI is used for a
  248. positive class, and OP_NOT or OP_NOTI for a negative one (that is, for
  249. something like [^a]).
  250. A set of repeating opcodes (called OP_NOTSTAR etc.) are used for repeated,
  251. negated, single-character classes. The normal single-character opcodes
  252. (OP_STAR, etc.) are used for repeated positive single-character classes.
  253. When there is more than one character in a class, and all the code points are
  254. less than 256, OP_CLASS is used for a positive class, and OP_NCLASS for a
  255. negative one. In either case, the opcode is followed by a 32-byte (16-short,
  256. 8-word) bit map containing a 1 bit for every character that is acceptable. The
  257. bits are counted from the least significant end of each unit. In caseless mode,
  258. bits for both cases are set.
  259. The reason for having both OP_CLASS and OP_NCLASS is so that, in UTF-8/16/32
  260. mode, subject characters with values greater than 255 can be handled correctly.
  261. For OP_CLASS they do not match, whereas for OP_NCLASS they do.
  262. For classes containing characters with values greater than 255 or that contain
  263. \p or \P, OP_XCLASS is used. It optionally uses a bit map if any code points
  264. are less than 256, followed by a list of pairs (for a range) and single
  265. characters. In caseless mode, both cases are explicitly listed.
  266. OP_XCLASS is followed by a unit containing flag bits: XCL_NOT indicates that
  267. this is a negative class, and XCL_MAP indicates that a bit map is present.
  268. There follows the bit map, if XCL_MAP is set, and then a sequence of items
  269. coded as follows:
  270. XCL_END marks the end of the list
  271. XCL_SINGLE one character follows
  272. XCL_RANGE two characters follow
  273. XCL_PROP a Unicode property (type, value) follows
  274. XCL_NOTPROP a Unicode property (type, value) follows
  275. If a range starts with a code point less than 256 and ends with one greater
  276. than 256, an XCL_RANGE item is used, without setting any bits in the bit map.
  277. This means that if no other items in the class set bits in the map, a map is
  278. not needed.
  279. Back references
  280. ---------------
  281. OP_REF (caseful) or OP_REFI (caseless) is followed by a count containing the
  282. reference number if the reference is to a unique capturing group (either by
  283. number or by name). When named groups are used, there may be more than one
  284. group with the same name. In this case, a reference by name generates OP_DNREF
  285. or OP_DNREFI. These are followed by two counts: the index (not the byte offset)
  286. in the group name table of the first entry for the requred name, followed by
  287. the number of groups with the same name.
  288. Repeating character classes and back references
  289. -----------------------------------------------
  290. Single-character classes are handled specially (see above). This section
  291. applies to other classes and also to back references. In both cases, the repeat
  292. information follows the base item. The matching code looks at the following
  293. opcode to see if it is one of
  294. OP_CRSTAR
  295. OP_CRMINSTAR
  296. OP_CRPOSSTAR
  297. OP_CRPLUS
  298. OP_CRMINPLUS
  299. OP_CRPOSPLUS
  300. OP_CRQUERY
  301. OP_CRMINQUERY
  302. OP_CRPOSQUERY
  303. OP_CRRANGE
  304. OP_CRMINRANGE
  305. OP_CRPOSRANGE
  306. All but the last three are single-unit items, with no data. The others are
  307. followed by the minimum and maximum repeat counts.
  308. Brackets and alternation
  309. ------------------------
  310. A pair of non-capturing round brackets is wrapped round each expression at
  311. compile time, so alternation always happens in the context of brackets.
  312. [Note for North Americans: "bracket" to some English speakers, including
  313. myself, can be round, square, curly, or pointy. Hence this usage rather than
  314. "parentheses".]
  315. Non-capturing brackets use the opcode OP_BRA. Originally PCRE was limited to 99
  316. capturing brackets and it used a different opcode for each one. From release
  317. 3.5, the limit was removed by putting the bracket number into the data for
  318. higher-numbered brackets. From release 7.0 all capturing brackets are handled
  319. this way, using the single opcode OP_CBRA.
  320. A bracket opcode is followed by LINK_SIZE bytes which give the offset to the
  321. next alternative OP_ALT or, if there aren't any branches, to the matching
  322. OP_KET opcode. Each OP_ALT is followed by LINK_SIZE bytes giving the offset to
  323. the next one, or to the OP_KET opcode. For capturing brackets, the bracket
  324. number is a count that immediately follows the offset.
  325. OP_KET is used for subpatterns that do not repeat indefinitely, and OP_KETRMIN
  326. and OP_KETRMAX are used for indefinite repetitions, minimally or maximally
  327. respectively (see below for possessive repetitions). All three are followed by
  328. LINK_SIZE bytes giving (as a positive number) the offset back to the matching
  329. bracket opcode.
  330. If a subpattern is quantified such that it is permitted to match zero times, it
  331. is preceded by one of OP_BRAZERO, OP_BRAMINZERO, or OP_SKIPZERO. These are
  332. single-unit opcodes that tell the matcher that skipping the following
  333. subpattern entirely is a valid branch. In the case of the first two, not
  334. skipping the pattern is also valid (greedy and non-greedy). The third is used
  335. when a pattern has the quantifier {0,0}. It cannot be entirely discarded,
  336. because it may be called as a subroutine from elsewhere in the regex.
  337. A subpattern with an indefinite maximum repetition is replicated in the
  338. compiled data its minimum number of times (or once with OP_BRAZERO if the
  339. minimum is zero), with the final copy terminating with OP_KETRMIN or OP_KETRMAX
  340. as appropriate.
  341. A subpattern with a bounded maximum repetition is replicated in a nested
  342. fashion up to the maximum number of times, with OP_BRAZERO or OP_BRAMINZERO
  343. before each replication after the minimum, so that, for example, (abc){2,5} is
  344. compiled as (abc)(abc)((abc)((abc)(abc)?)?)?, except that each bracketed group
  345. has the same number.
  346. When a repeated subpattern has an unbounded upper limit, it is checked to see
  347. whether it could match an empty string. If this is the case, the opcode in the
  348. final replication is changed to OP_SBRA or OP_SCBRA. This tells the matcher
  349. that it needs to check for matching an empty string when it hits OP_KETRMIN or
  350. OP_KETRMAX, and if so, to break the loop.
  351. Possessive brackets
  352. -------------------
  353. When a repeated group (capturing or non-capturing) is marked as possessive by
  354. the "+" notation, e.g. (abc)++, different opcodes are used. Their names all
  355. have POS on the end, e.g. OP_BRAPOS instead of OP_BRA and OP_SCPBRPOS instead
  356. of OP_SCBRA. The end of such a group is marked by OP_KETRPOS. If the minimum
  357. repetition is zero, the group is preceded by OP_BRAPOSZERO.
  358. Once-only (atomic) groups
  359. -------------------------
  360. These are just like other subpatterns, but they start with the opcode
  361. OP_ONCE or OP_ONCE_NC. The former is used when there are no capturing brackets
  362. within the atomic group; the latter when there are. The distinction is needed
  363. for when there is a backtrack to before the group - any captures within the
  364. group must be reset, so it is necessary to retain backtracking points inside
  365. the group even after it is complete in order to do this. When there are no
  366. captures in an atomic group, all the backtracking can be discarded when it is
  367. complete. This is more efficient, and also uses less stack.
  368. The check for matching an empty string in an unbounded repeat is handled
  369. entirely at runtime, so there are just these two opcodes for atomic groups.
  370. Assertions
  371. ----------
  372. Forward assertions are also just like other subpatterns, but starting with one
  373. of the opcodes OP_ASSERT or OP_ASSERT_NOT. Backward assertions use the opcodes
  374. OP_ASSERTBACK and OP_ASSERTBACK_NOT, and the first opcode inside the assertion
  375. is OP_REVERSE, followed by a count of the number of characters to move back the
  376. pointer in the subject string. In ASCII mode, the count is a number of units,
  377. but in UTF-8/16 mode each character may occupy more than one unit; in UTF-32
  378. mode each character occupies exactly one unit. A separate count is present in
  379. each alternative of a lookbehind assertion, allowing them to have different
  380. fixed lengths.
  381. Conditional subpatterns
  382. -----------------------
  383. These are like other subpatterns, but they start with the opcode OP_COND, or
  384. OP_SCOND for one that might match an empty string in an unbounded repeat. If
  385. the condition is a back reference, this is stored at the start of the
  386. subpattern using the opcode OP_CREF followed by a count containing the
  387. reference number, provided that the reference is to a unique capturing group.
  388. If the reference was by name and there is more than one group with that name,
  389. OP_DNCREF is used instead. It is followed by two counts: the index in the group
  390. names table, and the number of groups with the same name.
  391. If the condition is "in recursion" (coded as "(?(R)"), or "in recursion of
  392. group x" (coded as "(?(Rx)"), the group number is stored at the start of the
  393. subpattern using the opcode OP_RREF (with a value of zero for "the whole
  394. pattern") or OP_DNRREF (with data as for OP_DNCREF). For a DEFINE condition,
  395. just the single unit OP_DEF is used (it has no associated data). Otherwise, a
  396. conditional subpattern always starts with one of the assertions.
  397. Recursion
  398. ---------
  399. Recursion either matches the current regex, or some subexpression. The opcode
  400. OP_RECURSE is followed by aLINK_SIZE value that is the offset to the starting
  401. bracket from the start of the whole pattern. From release 6.5, OP_RECURSE is
  402. automatically wrapped inside OP_ONCE brackets, because otherwise some patterns
  403. broke it. OP_RECURSE is also used for "subroutine" calls, even though they are
  404. not strictly a recursion.
  405. Callout
  406. -------
  407. OP_CALLOUT is followed by one unit of data that holds a callout number in the
  408. range 0 to 254 for manual callouts, or 255 for an automatic callout. In both
  409. cases there follows a count giving the offset in the pattern string to the
  410. start of the following item, and another count giving the length of this item.
  411. These values make is possible for pcretest to output useful tracing information
  412. using automatic callouts.
  413. Philip Hazel
  414. November 2013