parsing.doc 13 KB

  1. Chris Hertel, Samba Team
  2. November 1997
  3. This is a quick overview of the lexical analysis, syntax, and semantics
  4. of the smb.conf file.
  5. Lexical Analysis:
  6. Basically, the file is processed on a line-by-line basis. There are
  7. four types of lines that are recognized by the lexical analyzer
  8. (params.c):
  9. Blank lines - Lines containing only whitespace.
  10. Comment lines - Lines beginning with either a semi-colon or a
  11. pound sign (';' or '#').
  12. Section header lines - Lines beginning with an open square bracket
  13. ('[').
  14. Parameter lines - Lines beginning with any other character.
  15. (The default line type.)
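As a rough illustration of the four line types, the check on the first significant character might look like the sketch below. The enum and the function name classify_line() are hypothetical and are not taken from params.c; skipping leading whitespace before the test is an assumption based on the whitespace rules in this document.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names; params.c does not necessarily use these. */
enum line_type { LINE_BLANK, LINE_COMMENT, LINE_SECTION, LINE_PARAM };

/* Classify one physical line by its first non-whitespace character. */
static enum line_type classify_line(const char *line)
{
    assert(line != NULL);

    /* Scan past whitespace at the beginning of the line. */
    while (*line != '\0' && isspace((unsigned char)*line))
        line++;

    switch (*line) {
    case '\0':
        return LINE_BLANK;      /* nothing but whitespace */
    case ';':
    case '#':
        return LINE_COMMENT;
    case '[':
        return LINE_SECTION;
    default:
        return LINE_PARAM;      /* the default line type */
    }
}
```

With this sketch, classify_line("[global]") yields LINE_SECTION and classify_line("  # note") yields LINE_COMMENT.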
  16. The first two are handled exclusively by the lexical analyzer, which
  17. ignores them. The latter two line types are scanned for
  18. - Section names
  19. - Parameter names
  20. - Parameter values
  21. These are the only tokens passed to the parameter loader
  22. (loadparm.c). Parameter names and values are divided from one
  23. another by an equal sign: '='.
  24. Handling of Whitespace:
  25. Whitespace is defined as all characters recognized by the isspace()
  26. function (see ctype(3C)) except for the newline character ('\n').
  27. The newline is excluded because it identifies the end of the line.
  28. - The lexical analyzer scans past whitespace at the beginning of a
  29. line.
  30. - Section and parameter names may contain internal whitespace. All
  31. whitespace within a name is compressed to a single space character.
  32. - Internal whitespace within a parameter value is kept verbatim with
  33. the exception of carriage return characters ('\r'), all of which
  34. are removed.
  35. - Leading and trailing whitespace is removed from names and values.
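The whitespace rules above can be sketched as two small helpers. The names is_ws(), normalize_name() and normalize_value() are made up for illustration and do not come from params.c; the input is assumed to be a single token with the terminating newline already removed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Whitespace as defined above: isspace() minus the newline. */
static int is_ws(int c)
{
    return c != '\n' && isspace((unsigned char)c);
}

/* Copy a name: trim both ends, compress internal runs to one space. */
static void normalize_name(const char *in, char *out, size_t outsz)
{
    size_t n = 0;
    int pending_space = 0;

    assert(in != NULL && out != NULL && outsz > 0);
    for (; *in != '\0'; in++) {
        if (is_ws(*in)) {
            pending_space = (n > 0);    /* ignore leading whitespace */
            continue;
        }
        if (pending_space && n + 1 < outsz)
            out[n++] = ' ';
        pending_space = 0;
        if (n + 1 < outsz)
            out[n++] = *in;
    }
    out[n] = '\0';                      /* trailing whitespace never copied */
}

/* Copy a value: trim both ends, drop every '\r', keep the rest verbatim. */
static void normalize_value(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && outsz > 0);
    while (is_ws(*in))
        in++;
    for (; *in != '\0'; in++)
        if (*in != '\r' && n + 1 < outsz)
            out[n++] = *in;
    while (n > 0 && is_ws(out[n - 1]))
        n--;
    out[n] = '\0';
}
```

Under these assumptions, the name "  param \t  name  " becomes "param name", while the value "  a  b\r \t" becomes "a  b" with its two internal spaces intact.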
  36. Handling of Line Continuation:
  37. Long section header and parameter lines may be extended across
  38. multiple lines by use of the backslash character ('\\'). Line
  39. continuation is ignored for blank and comment lines.
  40. If the last (non-whitespace) character within a section header or on
  41. a parameter line is a backslash, then the next line will be
  42. (logically) concatenated with the current line by the lexical
  43. analyzer. For example:
  44. param name = parameter value string \
  45.     with line continuation.
  46. Would be read as
  47. param name = parameter value string     with line continuation.
  48. Note that there are five spaces following the word 'string',
  49. representing the one space between 'string' and '\\' in the top
  50. line, plus the four preceding the word 'with' in the second line.
  51. (Yes, I'm counting the indentation.)
  52. Line continuation characters are ignored on blank lines and at the end
  53. of comments. They are *only* recognized within section and parameter
  54. lines.
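A minimal sketch of the joining rule follows. It assumes the caller has already decided that the first line is a section header or parameter line, and that each physical line is passed without its trailing newline. The names strip_continuation() and join_continued() are hypothetical and are not how params.c is organized.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * If the last non-whitespace character of buf is a backslash, cut the
 * text at the backslash and return 1 (another line is wanted).
 * Otherwise leave buf alone and return 0.
 */
static int strip_continuation(char *buf)
{
    size_t len = strlen(buf);

    while (len > 0 && isspace((unsigned char)buf[len - 1]))
        len--;
    if (len > 0 && buf[len - 1] == '\\') {
        buf[len - 1] = '\0';
        return 1;
    }
    return 0;
}

/*
 * Build one logical line in out from lines[0..count-1].
 * Returns the number of physical lines consumed.
 */
static size_t join_continued(const char *const *lines, size_t count,
                             char *out, size_t outsz)
{
    size_t used = 0;

    assert(lines != NULL && out != NULL && outsz > 0);
    out[0] = '\0';
    while (used < count) {
        /* Leading whitespace of the appended line is kept as-is. */
        strncat(out, lines[used++], outsz - strlen(out) - 1);
        if (!strip_continuation(out))
            break;
    }
    return used;
}
```

Feeding it the two-line example (with the second line indented by four spaces) consumes both lines and leaves five spaces between 'string' and 'with', as described above.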
  55. Line Continuation Quirks:
  56. Note the following example:
  57. param name = parameter value string \
  58. \
  59. with line continuation.
  60. The middle line is *not* parsed as a blank line because it is first
  61. concatenated with the top line. The result is
  62. param name = parameter value string with line continuation.
  63. The same is true for comment lines.
  64. param name = parameter value string \
  65. ; comment \
  66. with a comment.
  67. This becomes:
  68. param name = parameter value string ; comment with a comment.
  69. On a section header line, the closing bracket (']') is considered a
  70. terminating character, and the rest of the line is ignored. The lines
  71. [ section name ] garbage \
  72. param name = value
  73. are read as
  74. [section name]
  75. param name = value
  76. Syntax:
  77. The syntax of the smb.conf file is as follows:
  78. <file> :== { <section> } EOF
  79. <section> :== <section header> { <parameter line> }
  80. <section header> :== '[' NAME ']'
  81. <parameter line> :== NAME '=' VALUE NL
  82. Basically, this means that
  83. - A file is made up of zero or more sections, and is terminated by
  84. an EOF (we knew that).
  85. - A section is made up of a section header followed by zero or more
  86. parameter lines.
  87. - A section header is identified by an opening bracket and
  88. terminated by the closing bracket. The enclosed NAME identifies
  89. the section.
  90. - A parameter line is divided into a NAME and a VALUE. The *first*
  91. equal sign on the line separates the NAME from the VALUE. The
  92. VALUE is terminated by a newline character (NL = '\n').
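The "first equal sign" rule for parameter lines can be illustrated with a short sketch. The function name split_param() and its return convention are assumptions made for this example; whitespace trimming is deliberately left out, and what the real parser does with a line that has no '=' is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "NAME = VALUE" at the first '='.
 * Returns 0 on success, -1 if the line contains no '='.
 * Neither part is trimmed; the value stops at the newline, if any.
 */
static int split_param(const char *line, char *name, size_t namesz,
                       char *value, size_t valuesz)
{
    const char *eq;
    size_t len;

    assert(line != NULL && name != NULL && value != NULL);
    assert(namesz > 0 && valuesz > 0);

    eq = strchr(line, '=');             /* only the first '=' matters */
    if (eq == NULL)
        return -1;

    len = (size_t)(eq - line);
    if (len >= namesz)
        len = namesz - 1;
    memcpy(name, line, len);
    name[len] = '\0';

    len = strcspn(eq + 1, "\n");        /* VALUE is terminated by NL */
    if (len >= valuesz)
        len = valuesz - 1;
    memcpy(value, eq + 1, len);
    value[len] = '\0';
    return 0;
}
```

For the line "path = /tmp/a=b\n" this gives the name "path " and the value " /tmp/a=b"; the second '=' stays part of the value.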
  93. About params.c:
  94. The parsing of the config file is a bit unusual if you are used to
  95. lex, yacc, bison, etc. Both lexical analysis (scanning) and parsing
  96. are performed by params.c. Values are loaded via callbacks to
  97. loadparm.c.
  98. --------------------------------------------------------------------------
  99. Samba DEBUG
  100. Chris Hertel, Samba Team
  101. July, 1998
  102. Here's the scoop on the update to the DEBUG() system.
  103. First, my goals are:
  104. * Backward compatibility (i.e., I don't want to break any Samba code
  105. that already works).
  106. * Debug output should be timestamped and easy to read (format-wise).
  107. * Debug output should be parsable by software.
  108. * There should be convenient tools for composing debug messages.
  109. NOTE: the Debug functionality has been moved from util.c to the new
  110. debug.c module.
  111. New Output Syntax
  112. The syntax of a debugging log file is represented as:
  113. <debugfile> :== { <debugmsg> }
  114. <debugmsg> :== <debughdr> '\n' <debugtext>
  115. <debughdr> :== '[' TIME ',' LEVEL ']' FILE ':' [FUNCTION] '(' LINE ')'
  116. <debugtext> :== { <debugline> }
  117. <debugline> :== TEXT '\n'
  118. TEXT is a string of characters excluding the newline character.
  119. LEVEL is the DEBUG level of the message (an integer in the range
  120. 0..10).
  121. TIME is a timestamp.
  122. FILE is the name of the file from which the debug message was
  123. generated.
  124. FUNCTION is the function from which the debug message was generated.
  125. LINE is the line number of the debug statement that generated the
  126. message.
  127. Basically, what that all means is:
  128. * A debugging log file is made up of debug messages.
  129. * Each debug message is made up of a header and text. The header is
  130. separated from the text by a newline.
  131. * The header begins with the timestamp and debug level of the
  132. message enclosed in brackets. The filename, function, and line
  133. number at which the message was generated follow. The filename is
  134. terminated by a colon, and the function name is terminated by the
  135. parentheses that contain the line number. Depending upon the
  136. compiler, the function name may be missing (it is generated by the
  137. __FUNCTION__ macro, which is not universally implemented, dangit).
  138. * The message text is made up of zero or more lines, each terminated
  139. by a newline.
  140. Here's some example output:
  141. [1998/08/03 12:55:25, 1] nmbd.c:(659)
  142. Netbios nameserver version 1.9.19-prealpha started.
  143. Copyright Andrew Tridgell 1994-1997
  144. [1998/08/03 12:55:25, 3] loadparm.c:(763)
  145. Initializing global parameters
  146. Note that in the above example the function names are not listed on
  147. the header line. That's because the example above was generated on an
  148. SGI Indy, and the SGI compiler doesn't support the __FUNCTION__ macro.
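The header layout described in this section can be reproduced with a single snprintf() call. This is only a sketch of the format; format_debug_header() is a hypothetical helper and not the code in debug.c, and passing NULL for the function name is this example's way of modelling a compiler without __FUNCTION__.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write "[TIME, LEVEL] FILE:FUNCTION(LINE)" into buf.
 * A NULL function name leaves the FUNCTION field empty.
 * Returns whatever snprintf() returns.
 */
static int format_debug_header(char *buf, size_t bufsz,
                               const char *timestamp, int level,
                               const char *file, const char *function,
                               int line)
{
    assert(buf != NULL && timestamp != NULL && file != NULL);
    return snprintf(buf, bufsz, "[%s, %d] %s:%s(%d)",
                    timestamp, level, file,
                    function != NULL ? function : "", line);
}
```

Called with "1998/08/03 12:55:25", level 1, "nmbd.c", a NULL function name and line 659, it produces the first header line of the example output above.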
  149. The DEBUG() Macro
  150. Use of the DEBUG() macro is unchanged. DEBUG() takes two parameters.
  151. The first is the message level, the second is the body of a function
  152. call to the Debug1() function.
  153. That's confusing.
  154. Here's an example which may help a bit. If you would write
  155. printf( "This is a %s message.\n", "debug" );
  156. to send the output to stdout, then you would write
  157. DEBUG( 0, ( "This is a %s message.\n", "debug" ) );
  158. to send the output to the debug file. All of the normal printf()
  159. formatting escapes work.
  160. Note that in the above example the DEBUG message level is set to 0.
  161. Messages at level 0 always print. Basically, if the message level is
  162. less than or equal to the global value DEBUGLEVEL, then the DEBUG
  163. statement is processed.
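The double parentheses make more sense once the macro is seen as pasting its second argument after a function name. The sketch below is not Samba's definition of DEBUG(); DEMO_DEBUG, demo_debug1(), demo_debuglevel and demo_log are stand-ins invented for this example, and the output goes to a memory buffer instead of a debug file.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static int  demo_debuglevel = 1;    /* stands in for DEBUGLEVEL */
static char demo_log[256];          /* stands in for the debug file */

/* printf-style writer that appends to demo_log. */
static int demo_debug1(const char *fmt, ...)
{
    size_t used = strlen(demo_log);
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(demo_log + used, sizeof(demo_log) - used, fmt, ap);
    va_end(ap);
    return n;
}

/*
 * body is a parenthesized argument list, so "demo_debug1 body" expands
 * into an ordinary function call. The call is only evaluated when the
 * message level is less than or equal to demo_debuglevel.
 */
#define DEMO_DEBUG(level, body) \
    ((void)(((level) <= demo_debuglevel) && (demo_debug1 body)))
```

With demo_debuglevel at 1, DEMO_DEBUG( 0, ( "This is a %s message.\n", "debug" ) ) appends the formatted text to demo_log, while a call at level 3 is skipped without evaluating its arguments.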
  164. The output of the above example would be something like:
  165. [1998/07/30 16:00:51, 0] file.c:function(128)
  166. This is a debug message.
  167. Each call to DEBUG() creates a new header *unless* the output produced
  168. by the previous call to DEBUG() did not end with a '\n'. Output to the
  169. debug file is passed through a formatting buffer which is flushed
  170. every time a newline is encountered. If the buffer is not empty when
  171. DEBUG() is called, the new input is simply appended.
  172. ...but that's really just a kludge. It was put in place because
  173. DEBUG() has been used to write partial lines. Here's a simple (dumb)
  174. example of the kind of thing I'm talking about:
  175. DEBUG( 0, ("The test returned " ) );
  176. if( test() )
  177. DEBUG(0, ("True") );
  178. else
  179. DEBUG(0, ("False") );
  180. DEBUG(0, (".\n") );
  181. Without the format buffer, the output (assuming test() returned true)
  182. would look like this:
  183. [1998/07/30 16:00:51, 0] file.c:function(256)
  184. The test returned
  185. [1998/07/30 16:00:51, 0] file.c:function(258)
  186. True
  187. [1998/07/30 16:00:51, 0] file.c:function(261)
  188. .
  189. Which isn't much use. The format buffer kludge fixes this problem.
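To make the buffering behaviour concrete, here is a small model of it. Everything in it is an assumption for illustration: demo_debug() takes a ready-made header string, output is collected in demo_out rather than written to a file, and the two-space indent on body lines is a guess at the layout, not a value taken from debug.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char demo_out[512];      /* stands in for the debug file */
static char demo_pending[128];  /* the format buffer: one partial line */

static void demo_emit(const char *s)
{
    strncat(demo_out, s, sizeof(demo_out) - strlen(demo_out) - 1);
}

/* One DEBUG()-style call: a header (maybe), then buffered text. */
static void demo_debug(const char *header, const char *text)
{
    size_t n = strlen(demo_pending);

    assert(header != NULL && text != NULL);
    if (n == 0) {                   /* nothing pending: start a new message */
        demo_emit(header);
        demo_emit("\n");
    }
    for (; *text != '\0'; text++) {
        if (*text == '\n') {        /* flush one complete, indented line */
            demo_emit("  ");
            demo_emit(demo_pending);
            demo_emit("\n");
            n = 0;
            demo_pending[0] = '\0';
        } else if (n + 1 < sizeof(demo_pending)) {
            demo_pending[n++] = *text;
            demo_pending[n] = '\0';
        }
    }
}
```

Calling demo_debug() three times with the texts "The test returned ", "True" and ".\n" writes the first header only, followed by the single body line "The test returned True."; the second and third headers are suppressed because the buffer is not empty when those calls arrive.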
  190. The DEBUGADD() Macro
  191. In addition to the kludgey solution to the broken line problem
  192. described above, there is a clean solution. The DEBUGADD() macro never
  193. generates a header. It will append new text to the current debug
  194. message even if the format buffer is empty. The syntax of the
  195. DEBUGADD() macro is the same as that of the DEBUG() macro.
  196. DEBUG( 0, ("This is the first line.\n" ) );
  197. DEBUGADD( 0, ("This is the second line.\nThis is the third line.\n" ) );
  198. Produces
  199. [1998/07/30 16:00:51, 0] file.c:function(512)
  200. This is the first line.
  201. This is the second line.
  202. This is the third line.
  203. The DEBUGLVL() Macro
  204. One of the problems with the DEBUG() macro was that DEBUG() lines
  205. tended to get a bit long. Consider this example from
  206. nmbd_sendannounce.c:
  207. DEBUG(3,("send_local_master_announcement: type %x for name %s on subnet %s for workgroup %s\n",
  208. type, global_myname, subrec->subnet_name, work->work_group));
  209. One solution to this is to break it down using DEBUG() and DEBUGADD(),
  210. as follows:
  211. DEBUG( 3, ( "send_local_master_announcement: " ) );
  212. DEBUGADD( 3, ( "type %x for name %s ", type, global_myname ) );
  213. DEBUGADD( 3, ( "on subnet %s ", subrec->subnet_name ) );
  214. DEBUGADD( 3, ( "for workgroup %s\n", work->work_group ) );
  215. A similar, but arguably nicer approach is to use the DEBUGLVL() macro.
  216. This macro returns True if the message level is less than or equal to
  217. the global DEBUGLEVEL value, so:
  218. if( DEBUGLVL( 3 ) )
  219. {
  220. dbgtext( "send_local_master_announcement: " );
  221. dbgtext( "type %x for name %s ", type, global_myname );
  222. dbgtext( "on subnet %s ", subrec->subnet_name );
  223. dbgtext( "for workgroup %s\n", work->work_group );
  224. }
  225. (The dbgtext() function is explained below.)
  226. There are a few advantages to this scheme:
  227. * The test is performed only once.
  228. * You can allocate variables off of the stack that will only be used
  229. within the DEBUGLVL() block.
  230. * Processing that is only relevant to debug output can be contained
  231. within the DEBUGLVL() block.
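The advantages listed above can be seen in a toy version of the pattern. DEMO_DEBUGLVL, demo_debuglevel, demo_summarize() and demo_report() are all hypothetical names for this sketch, and snprintf() into a caller-supplied buffer stands in for writing debug text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static int demo_debuglevel = 2;     /* stands in for DEBUGLEVEL */
static int demo_summaries  = 0;     /* counts calls to demo_summarize() */

/* True if a message at this level should be produced. */
#define DEMO_DEBUGLVL(level) ((level) <= demo_debuglevel)

/* Pretend this is costly work that is needed only for debug output. */
static int demo_summarize(const int *v, int n)
{
    int i, sum = 0;

    demo_summaries++;
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

static void demo_report(const int *v, int n, char *out, size_t outsz)
{
    assert(v != NULL && out != NULL && outsz > 0);
    out[0] = '\0';
    if (DEMO_DEBUGLVL(3)) {
        /* Variable and computation exist only inside the debug block. */
        int sum = demo_summarize(v, n);

        snprintf(out, outsz, "sum of %d values is %d\n", n, sum);
    }
}
```

While demo_debuglevel is 2, demo_report() produces no text and never calls demo_summarize(); raising the level to 3 enables both with a single test.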
  232. New Functions
  233. dbgtext()
  234. This function prints debug message text to the debug file (and
  235. possibly to syslog) via the format buffer. The function uses a
  236. variable argument list just like printf() or Debug1(). The
  237. input is printed into a buffer using the vslprintf() function,
  238. and then passed to format_debug_text().
  239. If you use DEBUGLVL() you will probably print the body of the
  240. message using dbgtext().
  241. dbghdr()
  242. This is the function that writes a debug message header.
  243. Headers are not processed via the format buffer. Also note that
  244. if the format buffer is not empty, a call to dbghdr() will not
  245. produce any output. See the comments in dbghdr() for more info.
  246. It is not likely that this function will be called directly. It
  247. is used by DEBUG() and DEBUGADD().
  248. format_debug_text()
  249. This is a static function in debug.c. It stores the output text
  250. for the body of the message in a buffer until it encounters a
  251. newline. When the newline character is found, the buffer is
  252. written to the debug file via the Debug1() function, and the
  253. buffer is reset. This allows us to add the indentation at the
  254. beginning of each line of the message body, and also ensures
  255. that the output is written a line at a time (which cleans up
  256. syslog output).