root/svn/vendor/ircservices-5.1.24/docs/tech/8.html
Revision: 3389
Committed: Fri Apr 25 14:12:15 2014 UTC (11 years, 4 months ago) by michael
Content type: text/html
File size: 77916 byte(s)
Log Message:
- Imported ircservices-5.1.24

1 <?xml version="1.0" encoding="ISO-8859-1"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11-strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
4 <head>
5 <meta http-equiv="Content-Style-Type" content="text/css"/>
6 <style type="text/css">@import "style.css";</style>
7 <title>IRC Services Technical Reference Manual - 8. Other modules</title>
8 </head>
9
10 <body>
11 <h1 class="title" id="top">IRC Services Technical Reference Manual</h1>
12
13 <h2 class="section-title">8. Other modules</h2>
14
15 <p class="section-toc">
16 8-1. <a href="#s1">Encryption modules</a>
17 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-1-1. <a href="#s1-1"><tt>encryption/md5</tt>: MD5 hashing</a>
18 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-1-2. <a href="#s1-2"><tt>encryption/unix-crypt</tt>: Encryption with the <tt>crypt()</tt> system function</a>
19 <br/>8-2. <a href="#s2">HTTP server modules</a>
20 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-1. <a href="#s2-1">Client data structure and related constants</a>
21 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-2. <a href="#s2-2">HTTP server utility routines</a>
22 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-3. <a href="#s2-3"><tt>httpd/main</tt>: Main server module</a>
23 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-4. <a href="#s2-4"><tt>httpd/auth-ip</tt>: Authorization by IP address</a>
24 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-5. <a href="#s2-5"><tt>httpd/auth-password</tt>: Authorization by password</a>
25 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-6. <a href="#s2-6"><tt>httpd/top-page</tt>: Static page for server root</a>
26 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-7. <a href="#s2-7"><tt>httpd/redirect</tt>: Redirects to nickname/channel URLs</a>
27 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-8. <a href="#s2-8"><tt>httpd/dbaccess</tt>: Provides database access via HTTP</a>
28 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-2-9. <a href="#s2-9"><tt>httpd/debug</tt>: Debugging module</a>
29 <br/>8-3. <a href="#s3">Mail-sending modules</a>
30 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-3-1. <a href="#s3-1"><tt>mail/main</tt>: Main mail module</a>
31 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-3-2. <a href="#s3-2"><tt>mail/sendmail</tt>: Sends mail using the <tt>sendmail</tt> program</a>
32 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-3-3. <a href="#s3-3"><tt>mail/smtp</tt>: Sends mail using SMTP</a>
33 <br/>8-4. <a href="#s4">Miscellaneous modules</a>
34 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-4-1. <a href="#s4-1"><tt>misc/xml-export</tt>: Data export using XML</a>
35 <br/>&nbsp;&nbsp;&nbsp;&nbsp;8-4-2. <a href="#s4-2"><tt>misc/xml-import</tt>: Data import using XML</a>
36 </p>
37
38 <p class="backlink"><a href="7.html">Previous section: Services pseudoclients</a> |
39 <a href="index.html">Table of Contents</a> |
40 <a href="9.html">Next section: The database conversion tool</a></p>
41
42 <!------------------------------------------------------------------------>
43 <hr/>
44
45 <h3 class="subsection-title" id="s1">8-1. Encryption modules</h3>
46
47 <p>As discussed in <a href="2.html#s9-1">section 2-9-1</a>, Services
48 includes facilities for encrypting passwords. While the Services core
49 provides an interface for encryption, the actual encryption processing is
50 handled by encryption modules, located in the <tt>modules/encryption</tt>
51 directory. Two encryption modules are included with Services:
52 <tt>encryption/md5</tt>, using the MD5 hash function to encrypt passwords,
53 and <tt>encryption/unix-crypt</tt>, using the system library's
54 <tt>crypt()</tt> function.</p>
55
56 <p>Encryption modules generally have three parts:</p>
57
58 <ul>
59 <li class="spaced">implementations of the <tt>CipherInfo</tt> functions
60 <tt><i>encrypt</i>()</tt>, <tt><i>decrypt</i>()</tt>, and
61 <tt><i>check_password</i>()</tt>;</li>
62
63 <li class="spaced">a <tt>CipherInfo</tt> data structure, containing the
64 cipher's identifying name and pointers to the three functions;
65 and</li>
66
67 <li class="spaced">calls to <tt>register_cipher()</tt> and
68 <tt>unregister_cipher()</tt> in the module initialization and
69 cleanup routines.</li>
70 </ul>
71
72 <p>The three <tt>CipherInfo</tt> functions mentioned above provide
73 encryption, decryption, and encrypt-and-compare functionality for the
74 particular cipher implemented by the module. They are defined as follows
75 (the actual function names are of course up to the particular module):</p>
76
77 <dl>
78 <dt><tt>int <b>encrypt</b>(const char *<i>src</i>, int <i>len</i>, char *<i>dest</i>, int <i>size</i>)</tt></dt>
79 <dd>Encrypts the plaintext stored in <tt><i>src</i></tt>, which is
80 <tt><i>len</i></tt> bytes long, and stores the result in the buffer
81 pointed to by <tt><i>dest</i></tt> (of size <tt><i>size</i></tt>
82 bytes). The source plaintext is <i>not</i> (necessarily)
83 null-terminated, and should be treated as a block of binary data
84 rather than a textual string. Returns:
85 <ul>
86 <li>0 on success</li>
87 <li>+<i>N</i> (a positive integer) if the destination buffer is too
88 small; <i>N</i> is the minimum size buffer (in bytes)
89 required to hold the encrypted data</li>
90 <li>-1 on other error</li>
91 </ul></dd>
92
93 <dt><tt>int <b>decrypt</b>(const char *<i>src</i>, char *<i>dest</i>, int <i>size</i>)</tt></dt>
94 <dd>Decrypts the ciphertext stored in <tt><i>src</i></tt>, storing the
95 result in the buffer pointed to by <tt><i>dest</i></tt> (of size
96 <tt><i>size</i></tt> bytes). Returns:
97 <ul>
98 <li>0 on success</li>
99 <li>+<i>N</i> (a positive integer) if the destination buffer is too
100 small; <i>N</i> is the minimum size buffer (in bytes)
101 required to hold the decrypted data</li>
102 <li>-2 if the encryption algorithm does not allow decryption</li>
103 <li>-1 on other error</li>
104 </ul></dd>
105
106 <dt><tt>int <b>check_password</b>(const char *<i>plaintext</i>, const char *<i>password</i>)</tt></dt>
107 <dd>Compares the null-terminated string <tt><i>plaintext</i></tt>
108 against the encrypted data <tt><i>password</i></tt>. Returns:
109 <ul>
110 <li>1 if the password matches</li>
111 <li>0 if the password does not match</li>
112 <li>-1 if an error occurred while checking</li>
113 </ul></dd>
114 </dl>
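<p>As a rough illustration of the return-value conventions above, the
following sketch implements the three functions for a toy one-way digest.
The <tt>CipherInfo</tt> layout shown is an assumption (the text states only
that the structure holds a name and three function pointers), and all
<tt>toy_*</tt> names and <tt>TOY_HASH_SIZE</tt> are hypothetical. The calls
to <tt>register_cipher()</tt> and <tt>unregister_cipher()</tt> are omitted
because their signatures are not given here.</p>

```c
#include <string.h>

/* Assumed layout: a name plus pointers to the three functions. */
typedef struct {
    const char *name;
    int (*encrypt)(const char *src, int len, char *dest, int size);
    int (*decrypt)(const char *src, char *dest, int size);
    int (*check_password)(const char *plaintext, const char *password);
} CipherInfo;

#define TOY_HASH_SIZE 8   /* illustrative output size, in bytes */

/* Toy one-way digest (NOT secure); it only demonstrates the contract. */
static int toy_encrypt(const char *src, int len, char *dest, int size)
{
    unsigned char h[TOY_HASH_SIZE] = {0};
    int i;

    if (!src || !dest || len < 0)
        return -1;                  /* other error */
    if (size < TOY_HASH_SIZE)
        return TOY_HASH_SIZE;       /* +N: minimum buffer size required */
    for (i = 0; i < len; i++) {     /* src is binary data, not a C string */
        int k = i % TOY_HASH_SIZE;
        h[k] = (unsigned char)(h[k] * 31 + (unsigned char)src[i]);
    }
    memcpy(dest, h, TOY_HASH_SIZE);
    return 0;
}

static int toy_decrypt(const char *src, char *dest, int size)
{
    (void)src; (void)dest; (void)size;
    return -2;                      /* the algorithm cannot be reversed */
}

/* Encrypt-and-compare: hash the plaintext, then compare the raw bytes. */
static int toy_check_password(const char *plaintext, const char *password)
{
    char buf[TOY_HASH_SIZE];

    if (!plaintext || !password)
        return -1;
    if (toy_encrypt(plaintext, (int)strlen(plaintext), buf, TOY_HASH_SIZE) != 0)
        return -1;
    return memcmp(buf, password, TOY_HASH_SIZE) == 0 ? 1 : 0;
}

static CipherInfo toy_cipher = {
    "toy", toy_encrypt, toy_decrypt, toy_check_password
};
```

<p>A caller that receives a positive return value from the encryption
function can retry with a buffer of at least that many bytes.</p>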
115
116 <p>The core encryption source file, <tt>encrypt.c</tt> in the top source
117 directory, contains definitions of these three functions for use when no
118 encryption module is loaded; the functions simply copy the plaintext string
119 into or out of the provided encryption buffer, truncating as necessary.
120 (As a result, only the first <tt>PASSMAX</tt> bytes of longer passwords are
121 valid; any password beginning with those same bytes will be treated as
122 equivalent, similar to the way old Unix-like systems ignored any characters
123 in passwords after the first 8.)</p>
124
125 <p class="backlink"><a href="#top">Back to top</a></p>
126
127
128 <h4 class="subsubsection-title" id="s1-1">8-1-1. <tt>encryption/md5</tt>: MD5 hashing</h4>
129
130 <p>The <tt>encryption/md5</tt> module, defined in <tt>md5.c</tt>, uses the
131 MD5 message-digest algorithm to encrypt passwords. The bulk of the file
132 consists of a literal copy of the <tt>md5c.c</tt> implementation published
133 by RSA Data Security, Inc.; the <tt>CipherInfo</tt> implementation function
134 <tt>md5_encrypt()</tt> simply calls these functions to obtain a 16-byte
135 hash of its input and returns that hash (as binary data, not a hexadecimal
136 string).</p>
137
138 <p>Of the remaining two <tt>CipherInfo</tt> functions, <tt>md5_decrypt()</tt>
139 simply returns the special value -2, indicating that MD5 passwords cannot
140 be decrypted; <tt>md5_check_password()</tt> calls <tt>md5_encrypt()</tt> on
141 the plaintext string it is passed, comparing the resulting hash against the
142 given password buffer to determine whether the password is correct.</p>
143
144 <p>The module includes one configuration option,
145 <tt>EnableAnopeWorkaround</tt>. This is intended to be used with databases
146 that have been imported from the Epona or Anope programs, some versions of
147 which have a bug (which, to be fair, was inherited from an earlier version
148 of Services) causing MD5-encrypted passwords to be stored incorrectly. The
149 bug is in assuming that the <tt>MD5Final()</tt> routine returns an ASCII
150 string of hexadecimal characters&mdash;in fact, it returns the raw 128-bit
151 hash value&mdash;and attempting to convert that value into binary,
152 resulting in 8 bytes of garbled hash data and 8 bytes that are essentially
153 random. The workaround implemented by <tt>EnableAnopeWorkaround</tt>
154 performs this same procedure when checking passwords if the hash itself
155 does not match; since it only compares the 8 valid bytes of the corrupted
156 hash, there is naturally a greater possibility of a hash collision, which
157 would result in an incorrect password mistakenly being signaled as correct.
158 See also the relevant part of <a href="../5.html#3-2">section 5-3-2 of the
159 user's manual</a>.</p>
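<p>The garbling described above can be illustrated with a short sketch.
The hex-digit conversion used here is a hypothetical stand-in, since the
text does not give the exact conversion performed by the affected programs;
<tt>garble_digest()</tt> and <tt>garbled_match()</tt> are likewise
illustrative names and do not come from <tt>md5.c</tt>.</p>

```c
#include <string.h>

/* Hypothetical unvalidated hex-digit conversion: applied to raw hash
 * bytes, it yields garbage instead of reporting an error. */
static unsigned char hexval(unsigned char c)
{
    return (unsigned char)(c > '9' ? c - 'A' + 10 : c - '0');
}

/* Treat the 16 raw digest bytes as if they were 16 hex characters and
 * pack them into 8 bytes.  (In the buggy code, a further 8 bytes would be
 * produced from whatever memory follows the digest.) */
static void garble_digest(const unsigned char digest[16], unsigned char out[8])
{
    int i;

    for (i = 0; i < 8; i++) {
        out[i] = (unsigned char)((hexval(digest[2 * i]) << 4)
                                 | (hexval(digest[2 * i + 1]) & 0x0F));
    }
}

/* Workaround-style comparison: only the first 8 stored bytes are checked,
 * since the remaining 8 carry no information about the password. */
static int garbled_match(const unsigned char digest[16],
                         const unsigned char *stored)
{
    unsigned char g[8];

    garble_digest(digest, g);
    return memcmp(g, stored, 8) == 0;
}
```

<p>Because only 64 bits take part in the comparison, a match is weaker
evidence of a correct password than a full 128-bit comparison would be.</p>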
160
161 <p class="backlink"><a href="#top">Back to top</a></p>
162
163
164 <h4 class="subsubsection-title" id="s1-2">8-1-2. <tt>encryption/unix-crypt</tt>: Encryption with the <tt>crypt()</tt> system function</h4>
165
166 <p>The <tt>encryption/unix-crypt</tt> module, defined in
167 <tt>unix-crypt.c</tt>, makes use of the <tt>crypt()</tt> function defined
168 in the system libraries to encrypt passwords. Due to this, it may not be a
169 desirable choice where portability of data is concerned, since differing
170 systems may have incompatible implementations of <tt>crypt()</tt>; on the
171 other hand, it allows Services to take advantage of more secure encryption
172 algorithms as the operating system comes to support them, without having to
173 write new Services modules as well. The impetus for the development of
174 this module was the use of <tt>crypt()</tt> as one encryption method in the
175 PTlink Services program (coincidentally, it was also this program's use of
176 a "cipher type" field stored with passwords that provided the inspiration
177 for the redesign of encryption functionality in Services 5.0).</p>
178
179 <p>The only noteworthy aspect of the <tt>encryption/unix-crypt</tt> module
180 is the encryption routine, <tt>unixcrypt_encrypt()</tt>. Since the
181 <tt>crypt()</tt> function requires a null-terminated password string (the
182 input is not guaranteed to be null-terminated) and a "salt" parameter,
183 these have to be prepared beforehand; the password is copied into a buffer
184 of size <tt>PASSMAX</tt> and a trailing null attached, and the "salt" string is
185 generated using the <tt>random()</tt> function. These are then passed to
186 <tt>crypt()</tt>, and the result copied into the output buffer, assuming
187 it is large enough. (Some modern systems implement <tt>crypt()</tt> using
188 an MD5-based algorithm whose result (a distinguishing prefix, the salt, and
189 the encoded hash) can be 34 characters long; for such cases, <tt>PASSMAX</tt> must be raised from
190 the default of 32, or passwords will not fit.)</p>
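<p>The preparation steps can be sketched as follows. This is not the code
from <tt>unix-crypt.c</tt>: <tt>prepare_password()</tt> and
<tt>make_salt()</tt> are hypothetical helpers, the two-character salt
follows the traditional <tt>crypt()</tt> convention, and rejecting an
over-long password (rather than truncating it) is an assumption. The call
to <tt>crypt()</tt> itself is left out so that the sketch has no library
dependencies.</p>

```c
#include <string.h>

#define PASSMAX 32   /* default value mentioned in the text */

/* Characters conventionally accepted in a traditional crypt() salt. */
static const char salt_chars[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789./";

/* Copy a length-counted password into a null-terminated buffer.
 * Returns 0 on success, -1 if it does not fit (assumed behavior). */
static int prepare_password(const char *src, int len, char buf[PASSMAX])
{
    if (!src || len < 0 || len > PASSMAX - 1)
        return -1;
    memcpy(buf, src, (size_t)len);
    buf[len] = '\0';
    return 0;
}

/* Build a two-character salt from two random values, for example two
 * results of random(). */
static void make_salt(long r1, long r2, char salt[3])
{
    salt[0] = salt_chars[(unsigned long)r1 % 64];
    salt[1] = salt_chars[(unsigned long)r2 % 64];
    salt[2] = '\0';
}
```

<p>The two prepared strings would then be handed to <tt>crypt()</tt>, and
the length of its result checked against the output buffer size before
copying.</p>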
191
192 <p class="backlink"><a href="#top">Back to top</a></p>
193
194 <!------------------------------------------------------------------------>
195 <hr/>
196
197 <h3 class="subsection-title" id="s2">8-2. HTTP server modules</h3>
198
199 <p>Services includes a simple HTTP server that can be used to access
200 Services data from outside IRC. The server is implemented by several
201 modules in the <tt>modules/httpd</tt> directory: a core server module
202 (<a href="#s2-3">section 8-2-3</a>), authorization modules (sections
203 <a href="#s2-4">8-2-4</a> and <a href="#s2-5">8-2-5</a>), and resource
204 modules (sections <a href="#s2-6">8-2-6</a> through
205 <a href="#s2-9">8-2-9</a>). All modules make use of a common header file
206 containing data structure and constant definitions, described in
207 <a href="#s2-1">section 8-2-1</a>; there are also several utility functions
208 shared by all modules (and compiled into the core server module), discussed
209 in <a href="#s2-2">section 8-2-2</a>.</p>
210
211 <p class="backlink"><a href="#top">Back to top</a></p>
212
213
214 <h4 class="subsubsection-title" id="s2-1">8-2-1. Client data structure and related constants</h4>
215
216 <p>All modules make use of the header file <tt>http.h</tt>. This header
217 file contains a definition of the <tt>Client</tt> structure, used by the
218 modules to store information about a single client, along with various
219 HTTP-server-related constants and declarations of the utility routines
220 listed in <a href="#s2-2">section 8-2-2</a>.</p>
221
222 <p>The <tt>Client</tt> structure contains the following fields:</p>
223
224 <dl>
225 <dt><tt>Socket *<b>socket</b></tt></dt>
226 <dd>Contains the <tt>Socket</tt> structure used for communicating with
227 the client (see <a href="3.html">section 3</a>).</dd>
228
229 <dt><tt>Timeout *<b>timeout</b></tt></dt>
230 <dd>A timeout (see <a href="2.html#s7">section 2-7</a>) used to
231 disconnect clients after a certain period of idle time.</dd>
232
233 <dt><tt>char <b>address</b>[22]</tt></dt>
234 <dd>The client's IP address and port number, as a string. (22 bytes is
235 exactly long enough to hold a string of the form
236 "<tt>123.123.123.123:12345</tt>".)</dd>
237
238 <dt><tt>uint32 <b>ip</b></tt></dt>
239 <dd>The client's IP address, in network byte order.</dd>
240
241 <dt><tt>uint16 <b>port</b></tt></dt>
242 <dd>The client's (remote) port number, in network byte order.</dd>
243
244 <dt><tt>int <b>request_count</b></tt></dt>
245 <dd>The number of requests that the client has made over the course of
246 the connection, used to disconnect clients that make more than a
247 certain number of requests.</dd>
248
249 <dt><tt>int <b>in_request</b></tt></dt>
250 <dd>A flag indicating whether a request is currently being processed
251 for the client.</dd>
252
253 <dt><tt>char *<b>request_buf</b></tt></dt>
254 <dd>The buffer used to hold request data received from the client.</dd>
255
256 <dt><tt>int32 <b>request_len</b></tt></dt>
257 <dd>The number of bytes of request data received from the client for
258 this request (<i>i.e.,</i> the number of bytes stored in
259 <tt>request_buf</tt>).</dd>
260
261 <dt><tt>int <b>version_major</b></tt></dt>
262 <dd>The major version of HTTP in use (the "<tt><i>x</i></tt>" in
263 <tt>HTTP/<i>x</i>.<i>y</i></tt>).</dd>
264
265 <dt><tt>int <b>version_minor</b></tt></dt>
266 <dd>The minor version of HTTP in use (the "<tt><i>y</i></tt>" in
267 <tt>HTTP/<i>x</i>.<i>y</i></tt>).</dd>
268
269 <dt><tt>int <b>method</b></tt></dt>
270 <dd>The request method (one of the <tt>METHOD_*</tt> constants; see
271 below).</dd>
272
273 <dt><tt>char *<b>url</b></tt></dt>
274 <dd>The URL given by the client. Points into <tt>request_buf</tt>.</dd>
275
276 <dt><tt>char *<b>data</b></tt></dt>
277 <dd><tt>POST</tt> data for the request, or the query string for a
278 <tt>GET</tt> or <tt>HEAD</tt> request. Points into
279 <tt>request_buf</tt>.</dd>
280
281 <dt><tt>int32 <b>data_len</b></tt></dt>
282 <dd><tt>POST</tt> data length, in bytes.</dd>
283
284 <dt><tt>char **<b>headers</b></tt>
285 <br/><tt>int32 <b>headers_count</b></tt></dt>
286 <dd>A variable-length array containing the request headers. Each
287 element of the array consists of the header name and its value
288 separated by a null byte; the entries point into
289 <tt>request_buf</tt>.</dd>
290
291 <dt><tt>char **<b>variables</b></tt>
292 <br/><tt>int32 <b>variables_count</b></tt></dt>
293 <dd>A variable-length array containing any variables found in
294 <tt>POST</tt> data or a <tt>GET</tt> or <tt>HEAD</tt> request.
295 Each element of the array consists of the variable's name and value
296 separated by a null byte, with URL escapes converted to their
297 respective characters.</dd>
298 </dl>
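<p>The "name, null byte, value" layout used by the <tt>headers</tt> and
<tt>variables</tt> arrays can be read as in the sketch below.
<tt>entry_value()</tt> and <tt>find_entry()</tt> are hypothetical helpers
written for this illustration; the case-sensitive comparison is a
simplification and may not match what the real server does.</p>

```c
#include <string.h>

/* Given an entry laid out as "name\0value", return the value part. */
static const char *entry_value(const char *entry)
{
    return entry + strlen(entry) + 1;
}

/* Linear search over an array of such entries; returns the value of the
 * first entry whose name matches, or NULL if there is none. */
static const char *find_entry(char **entries, int count, const char *name)
{
    int i;

    for (i = 0; i < count; i++) {
        if (strcmp(entries[i], name) == 0)
            return entry_value(entries[i]);
    }
    return NULL;
}
```

<p>Since ordinary string functions stop at the first null byte, the name
can be compared directly, and the value is found one byte past the end of
the name.</p>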
299
300 <p>There are also several constants defined by the header file:</p>
301
302 <dl>
303 <dt><tt>HTTP_LINEMAX</tt> (4096)</dt>
304 <dd>Defines the maximum length (including the trailing null byte) of a
305 request line that the server will handle. Lines longer than this
306 will cause the request to be aborted with an HTTP error.</dd>
307
308 <dt><tt>HTTP_AUTH_*</tt></dt>
309 <dd>Constants used as return values from authorization functions (see
310 <a href="#s2-3">section 8-2-3</a>).</dd>
311
312 <dt><tt>HTTP_METHOD_*</tt></dt>
313 <dd>Constants used to indicate the request method in the <tt>method</tt>
314 field of the <tt>Client</tt> structure.</dd>
315 </dl>
316
317 <p>These are followed by constants for the various HTTP return codes, as
318 defined by the relevant RFC documents. Not all (or even most) of these
319 are used by Services modules, but all are included for completeness. The
320 name of each constant includes a character indicating the type of response
321 (much like the first digit of the numeric code): "<tt>I</tt>" for
322 Informational, "<tt>S</tt>" for Successful, and so on.</p>
323
324 <p class="backlink"><a href="#top">Back to top</a></p>
325
326
327 <h4 class="subsubsection-title" id="s2-2">8-2-2. HTTP server utility routines</h4>
328
329 <p>The <tt>util.c</tt> source file contains several common functions used
330 by HTTP server modules, listed below. <tt>util.c</tt> is linked into the
331 main HTTP server module, <tt>httpd/main</tt>, so all submodules can make
332 use of them without the necessity of explicitly importing each function.</p>
333
334 <dl>
335 <dt><tt>char *<b>http_get_header</b>(Client *<i>c</i>, const char *<i>header</i>)</tt></dt>
336 <dd>Returns the contents of the header <tt><i>header</i></tt> in the
337 given client's currently active request, or <tt>NULL</tt> if the
338 request did not include such a header. If <tt><i>header</i></tt>
339 is <tt>NULL</tt>, returns the next instance of the header last
340 searched for; this usage allows the caller to cycle through
341 multiple headers of the same name, much like <tt>strtok()</tt>
342 iterates through tokens in a string.</dd>
343
344 <dt><tt>char *<b>http_get_variable</b>(Client *<i>c</i>, const char *<i>variable</i>)</tt></dt>
345 <dd>Returns the contents of the variable <tt><i>variable</i></tt> in
346 the given client's currently active request, or <tt>NULL</tt> if
347 the request did not include such a variable. Like
348 <tt>http_get_header()</tt>, a <tt>NULL</tt> value for the
349 <tt><i>variable</i></tt> parameter allows iterating through
350 multiple instances of a variable.</dd>
351
352 <dt><tt>char *<b>http_quote_html</b>(const char *<i>str</i>, char *<i>outbuf</i>, int32 <i>outsize</i>)</tt></dt>
353 <dd>Applies HTML-style quoting to <tt><i>str</i></tt>, replacing the
354 characters <tt>&lt; &gt; &amp;</tt> with "<tt>&amp;lt;</tt>",
355 "<tt>&amp;gt;</tt>", and "<tt>&amp;amp;</tt>" respectively.
356 <!-- It sure is messy trying to talk about HTML in HTML... -->
357 The result is placed in <tt><i>outbuf</i></tt>, and is truncated if
358 necessary to fit within <tt><i>outsize</i></tt> bytes, including
359 the trailing null byte; however, HTML entities inserted by this
360 routine will never be partially truncated (if an entity would cause
361 a buffer overflow, the output string will be terminated at the
362 location where the entity would have been inserted). The routine
363 returns <tt><i>outbuf</i></tt>, except when a parameter is invalid,
364 in which case <tt>NULL</tt> is returned.</dd>
365
366 <dt><tt>char *<b>http_quote_url</b>(const char *<i>str</i>, char *<i>outbuf</i>, int32 <i>outsize</i>, int <i>slash_question</i>)</tt></dt>
367 <dd>Applies URL escaping to <tt><i>str</i></tt>, replacing with their
368 equivalent <tt>%<i>nn</i></tt> escapes any characters not in the
369 set:
370 <br/><tt>&nbsp;&nbsp;&nbsp;&nbsp;A-Z a-z 0-9 - . _</tt>
371 <br/>As with <tt>http_quote_html()</tt>, stores the (possibly
372 truncated, but without partial escapes) result in
373 <tt><i>outbuf</i></tt>, and returns <tt><i>outbuf</i></tt>, or
374 <tt>NULL</tt> on invalid parameters.</dd>
375
376 <dt><tt>char *<b>http_unquote_url</b>(char *<i>buf</i>)</tt></dt>
377 <dd>Converts any URL escapes in the string <tt><i>buf</i></tt> to their
378 corresponding characters, overwriting the buffer. A truncated
379 escape at the end of the string is discarded, as is any malformed
380 escape (a <tt>%</tt> followed by two characters, one or both of
381 which are not hexadecimal digits). Returns <tt><i>buf</i></tt>.
382 (Note that Unicode escapes of the form <tt>%U<i>nnnn</i></tt> are
383 <i>not</i> handled by this routine, and will be interpreted as a
384 malformed escape followed by three ordinary characters.)</dd>
385
386 <dt><tt>void <b>http_send_response</b>(Client *<i>c</i>, int <i>code</i>)</tt></dt>
387 <dd>Sends an HTTP response line with the response code
388 <tt><i>code</i></tt>, followed by a <tt>Date:</tt> header. The
389 header portion of the response is not terminated, so the caller can
390 send additional headers as necessary.</dd>
391
392 <dt><tt>void <b>http_error</b>(Client *<i>c</i>, int <i>code</i>, const char *<i>format</i>, ...)</tt></dt>
393 <dd>Sends an error message (response headers and body) to the given
394 client, then closes the client's connection. The HTTP response
395 code for the error message is given by <tt><i>code</i></tt>.
396 <tt><i>format</i></tt> gives an optional <tt>printf()</tt>-style
397 format string to use for generating the body of the error message;
398 if it is <tt>NULL</tt>, then default body text is chosen based on
399 the response code.</dd>
400 </dl>
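<p>The truncation and discard rules described for
<tt>http_quote_html()</tt> and <tt>http_unquote_url()</tt> can be sketched
as below. These are independent re-implementations written from the
descriptions above, not the code in <tt>util.c</tt>; the <tt>_sketch</tt>
names are hypothetical, and details not stated in the text (such as the
treatment of "<tt>+</tt>" in URLs) are left out.</p>

```c
#include <ctype.h>
#include <string.h>

/* HTML-quote str into outbuf without ever emitting part of an entity. */
static char *quote_html_sketch(const char *str, char *outbuf, int outsize)
{
    char *out = outbuf;

    if (!str || !outbuf || outsize <= 0)
        return NULL;
    while (*str) {
        const char *rep;
        size_t replen;

        switch (*str) {
            case '<': rep = "&lt;";  break;
            case '>': rep = "&gt;";  break;
            case '&': rep = "&amp;"; break;
            default:  rep = NULL;    break;
        }
        replen = rep ? strlen(rep) : 1;
        /* Stop if this item plus the trailing null would not fit. */
        if ((size_t)(out - outbuf) + replen + 1 > (size_t)outsize)
            break;
        if (rep)
            memcpy(out, rep, replen);
        else
            *out = *str;
        out += replen;
        str++;
    }
    *out = '\0';
    return outbuf;
}

static int hexval(int c)
{
    return (c >= '0' && c <= '9') ? c - '0' : tolower(c) - 'a' + 10;
}

/* Decode %nn escapes in place, dropping truncated or malformed ones. */
static char *unquote_url_sketch(char *buf)
{
    char *in = buf, *out = buf;

    while (*in) {
        if (*in != '%') {
            *out++ = *in++;
            continue;
        }
        if (!in[1] || !in[2])
            break;                      /* truncated escape at the end */
        if (isxdigit((unsigned char)in[1]) && isxdigit((unsigned char)in[2]))
            *out++ = (char)(hexval(in[1]) * 16 + hexval(in[2]));
        in += 3;                        /* malformed escapes are skipped */
    }
    *out = '\0';
    return buf;
}
```

<p>Decoding in place is safe because the output never grows: each
three-character escape yields at most one byte.</p>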
401
402 <p class="backlink"><a href="#top">Back to top</a></p>
403
404
405 <h4 class="subsubsection-title" id="s2-3">8-2-3. <tt>httpd/main</tt>: Main server module</h4>
406
407 <p>The core of the HTTP server is implemented by the <tt>httpd/main</tt>
408 module, defined in the source file <tt>main.c</tt> (along with
409 <tt>util.c</tt>, mentioned above). This module takes care of establishing
410 a listener socket with which to accept client connections, receiving and
411 parsing requests from clients, and passing those requests off to handlers
412 which generate data to send back to the client. (The core module does not
413 respond to any requests by itself, except for generating errors for
414 requests that cannot be successfully processed.)</p>
415
416 <p>Unlike most other modules, which take actions in response to messages
417 received from the IRC network, the HTTP server operates independently,
418 relying on the socket framework (see <a href="3.html">section 3</a>) to
419 inform it of activity. The module initialization routine,
420 <tt>init_module()</tt>, opens the port or ports specified by the
421 <tt>ListenTo</tt> configuration directive, creating listener sockets which
422 call back to the <tt>do_accept()</tt> function when a connection is
423 received. The initialization routine also creates two callbacks,
424 "<tt>auth</tt>" and "<tt>request</tt>", into which submodules can hook to
425 provide authorization or request handling services; these are covered in
426 the discussion of request handling below.</p>
427
428 <p>When a connection has been accepted on a socket, the <tt>do_accept()</tt>
429 routine first ensures that the client address is available (as it may be
430 necessary for authorization purposes), then creates and initializes a
431 <tt>Client</tt> structure in which to store information about the client.
432 This is done before checking the number of active connections so that, if
433 the client is to be disconnected due to load, an appropriate error response
434 can be sent with <tt>http_error()</tt> (which requires a valid
435 <tt>Client</tt> structure). If all goes well, read-line and disconnect
436 callbacks are set on the new socket, along with a timeout (as given by the
437 <tt>IdleTimeout</tt> configuration directive), and <tt>do_accept()</tt>
438 returns.</p>
439
440 <p>The actual request processing takes place in two stages: first the
441 full request is received from the client (unless the connection is aborted
442 with an error), and then the request is passed to the relevant handlers.
443 These stages are handled by the <tt>do_readline()</tt> socket callback
444 function and the <tt>handle_request()</tt> routine.</p>
445
446 <p><tt>do_readline()</tt> is called for each line of the request received
447 from the client, and parses each line into appropriate parts of the
448 <tt>Client</tt> structure. The routine tells the first (request) line from
449 subsequent (header) lines by whether or not the <tt>url</tt> field of the
450 <tt>Client</tt> structure is set; if the first line has been successfully
451 processed, this field will always have a non-<tt>NULL</tt> value. Header
452 lines are handled by the subroutine <tt>parse_header()</tt>, which checks
453 whether the line is a new header or a continuation line of a previous
454 header and processes it accordingly.</p>
455
456 <p>Once the blank line signaling the end of headers has been received,
457 <tt>do_readline()</tt> checks whether the request has a body part (a
458 <tt>POST</tt> request with a nonzero <tt>Content-Length</tt> header). If
459 so, the read-line callback on the socket is removed, and
460 <tt>do_readdata()</tt> is instead added as a read-data callback;
461 <tt>do_readdata()</tt> reads in the requisite number of body data bytes and
462 calls <tt>handle_request()</tt>. Otherwise, <tt>do_readline()</tt> calls
463 <tt>handle_request()</tt> itself, after first truncating any query portion
464 of the URL of a <tt>GET</tt> or <tt>HEAD</tt> request and putting the query
465 data in the <tt>Client</tt> structure's <tt>data</tt> field.</p>
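<p>Separating the query portion from the URL can be done in place, which
is consistent with the <tt>url</tt> and <tt>data</tt> fields both pointing
into the request buffer. The helper below is a minimal sketch with a
hypothetical name, not the code from <tt>main.c</tt>.</p>

```c
#include <string.h>

/* Split "path?query" in place: the '?' is overwritten with a null byte so
 * that url holds only the path.  Returns a pointer to the query string,
 * or NULL if the URL has no query portion. */
static char *split_query(char *url)
{
    char *q = strchr(url, '?');

    if (!q)
        return NULL;
    *q = '\0';
    return q + 1;
}
```

<p>No copying is needed: after the call, both strings share the storage of
the original URL.</p>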
466
467 <p><tt>handle_request()</tt> first takes any <tt>GET</tt> query or
468 <tt>POST</tt> data and splits it up into variables and values, by calling
469 either <tt>parse_data()</tt> or <tt>parse_data_multipart()</tt> depending
470 on the request type. After this, it increments the client's request count,
471 sets the <tt>in_request</tt> flag, and then sets a local variable
472 <tt>close</tt> which is used to indicate whether the client connection
473 should be closed when the request processing is finished. After this setup
474 is complete, <tt>handle_request()</tt> calls the two callbacks
475 "<tt>auth</tt>" and "<tt>request</tt>" to perform the actual request
476 handling; callback functions for both callbacks take the <tt>Client</tt>
477 structure and a pointer to the <tt>close</tt> variable (which may be
478 modified) as parameters.</p>
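<p>Splitting query or form data into variables might look roughly like
the following. This is a sketch of the general technique only:
<tt>split_variables()</tt> is a hypothetical name, items without an
"<tt>=</tt>" are simply skipped (an assumption), and the URL unescaping
that the real parser applies to names and values is omitted, as is
multipart handling.</p>

```c
#include <string.h>

/* Split "a=1&b=2" in place.  Each stored entry points at a name that is
 * followed by a null byte and then its value, matching the layout of the
 * Client structure's variables array.  Returns the number of entries. */
static int split_variables(char *data, char *vars[], int max)
{
    int n = 0;
    char *p = data;

    while (p && *p && n < max) {
        char *next = strchr(p, '&');
        char *eq;

        if (next)
            *next++ = '\0';     /* terminate this item */
        eq = strchr(p, '=');
        if (eq) {
            *eq = '\0';         /* name and value now separated by a null */
            vars[n++] = p;
        }
        p = next;
    }
    return n;
}
```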
479
480 <p>The "<tt>auth</tt>" callback is used for request authorization. Each
481 callback function must return one of the <tt>HTTP_AUTH_*</tt> values
482 defined in <tt>http.h</tt>. A value of <tt>HTTP_AUTH_ALLOW</tt> causes the
483 request to be allowed at that point, skipping any subsequent callback
484 functions; likewise, a value of <tt>HTTP_AUTH_DENY</tt> causes the request
485 to be immediately denied. <tt>HTTP_AUTH_UNDECIDED</tt> can be used when
486 the callback function has nothing to say about the instant request, and
487 allows the next callback function to handle authorization. If all callback
488 functions return <tt>HTTP_AUTH_UNDECIDED</tt> (or no callback functions are
489 registered), the request is allowed.</p>
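<p>The decision rule for the "<tt>auth</tt>" callback can be summarized in
a few lines. The sketch below models the callback list as a plain array of
function pointers, which is a simplification of the Services callback
mechanism; <tt>run_auth()</tt> and the sample callbacks are hypothetical,
and the numeric values given to the <tt>HTTP_AUTH_*</tt> names are
illustrative rather than taken from <tt>http.h</tt>.</p>

```c
#include <stddef.h>

/* Illustrative values; the real constants are defined in http.h. */
enum { HTTP_AUTH_UNDECIDED, HTTP_AUTH_ALLOW, HTTP_AUTH_DENY };

typedef struct Client Client;    /* opaque for this sketch */
typedef int (*auth_func)(Client *c, int *close_ptr);

/* Returns 1 if the request is allowed, 0 if it is denied. */
static int run_auth(auth_func *funcs, size_t count, Client *c, int *close_ptr)
{
    size_t i;

    for (i = 0; i < count; i++) {
        int res = funcs[i](c, close_ptr);

        if (res == HTTP_AUTH_ALLOW)
            return 1;            /* allowed; later callbacks are skipped */
        if (res == HTTP_AUTH_DENY)
            return 0;            /* denied immediately */
    }
    return 1;                    /* all undecided, or none registered */
}

/* Sample callbacks used only to exercise run_auth(). */
static int auth_undecided(Client *c, int *close_ptr)
{
    (void)c; (void)close_ptr;
    return HTTP_AUTH_UNDECIDED;
}

static int auth_allow(Client *c, int *close_ptr)
{
    (void)c; (void)close_ptr;
    return HTTP_AUTH_ALLOW;
}

static int auth_deny(Client *c, int *close_ptr)
{
    (void)c; (void)close_ptr;
    return HTTP_AUTH_DENY;
}
```

<p>Note that the order of registration matters: whichever callback first
returns a definite answer decides the outcome.</p>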
490
491 <p>The "<tt>request</tt>" callback is used for actual request processing.
492 Each callback function should check the URL to determine whether it is one
493 to be processed by that function or not; if so, then the routine should
494 take appropriate action and return a nonzero value, causing any subsequent
495 callback functions to be skipped. If all callback functions return zero,
496 the core server module will send a "not found" (404) error to the client.</p>
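<p>A "<tt>request</tt>" handler and the fallback to a 404 error might be
sketched as follows. As before, the callback list is modeled as a plain
array, the <tt>Client</tt> structure is reduced to the one field used, and
<tt>dispatch_request()</tt> and <tt>handle_root()</tt> are hypothetical
names.</p>

```c
#include <stddef.h>
#include <string.h>

typedef struct { const char *url; } Client;   /* only the field used here */
typedef int (*request_func)(Client *c, int *close_ptr);

/* Returns 0 if some handler accepted the request, or 404 if none did. */
static int dispatch_request(request_func *funcs, size_t count,
                            Client *c, int *close_ptr)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (funcs[i](c, close_ptr))
            return 0;            /* handled; skip the remaining handlers */
    }
    return 404;                  /* the core module reports "not found" */
}

/* Sample handler: claims only the server's top page. */
static int handle_root(Client *c, int *close_ptr)
{
    (void)close_ptr;
    if (strcmp(c->url, "/") != 0)
        return 0;                /* not our URL; let other handlers try */
    /* A real handler would send its response here. */
    return 1;
}
```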
497
498 <p>Once the request has been processed, <tt>handle_request()</tt> either
499 closes the socket or clears out the <tt>Client</tt> structure, depending on
500 whether the <tt>close</tt> flag is set (nonzero) or clear (zero). In the
501 latter case, request processing for the connection then starts over with
502 parsing of request lines by <tt>do_readline()</tt>. As an adjunct to the
503 <tt>close</tt> flag, if the <tt>Client</tt> structure's <tt>in_request</tt>
504 field has a negative value, the connection is closed as well; this is to
505 allow <tt>http_error()</tt>, which does not receive a pointer to the
506 <tt>close</tt> flag, to signal that the client should be disconnected.</p>
507
508 <p>It should be noted that client sockets are set to blocking mode (see
509 the description of <tt>sock_set_blocking()</tt> in
510 <a href="3.html#s2-1">section 3-2-1</a>), to simplify implementation of
511 request handlers. Depending on the modules and settings used, this can
512 allow a malicious user to cause Services to freeze by requesting a large
513 amount of data from Services (enough to increase the socket buffer to its
514 maximum size) and deliberately not receiving any of that data.</p>
515
516 <p class="backlink"><a href="#top">Back to top</a></p>
517
518
519 <h4 class="subsubsection-title" id="s2-4">8-2-4. <tt>httpd/auth-ip</tt>: Authorization by IP address</h4>
520
521 <p>The <tt>httpd/auth-ip</tt> module, defined in <tt>auth-ip.c</tt>, is one
522 of two authorization modules included with the Services HTTP server, and
523 controls whether requests are allowed or denied based on the IP address of the
524 client. The module maintains a list of allow/deny rules, each with an
525 associated URL path prefix, IP address, and network mask; when a request is
526 found that matches a rule's prefix/address/mask triplet, the request is
527 either allowed or denied based on the type of rule. (If the request
528 matches more than one rule, only the first in the table&mdash;also the
529 first in the file&mdash;is applied.)</p>
530
531 <p>The callback function for the server core's "<tt>auth</tt>" callback,
532 <tt>do_auth()</tt>, is very simple, needing only to iterate through the
533 rule table to find a matching rule for the request. The hard work of
534 converting the list of <tt>AllowHost</tt> and <tt>DenyHost</tt> rules into
535 a table that can be easily processed is handled at module configuration
536 time via custom handler functions for the two directives,
537 <tt>do_AllowHost()</tt> and <tt>do_DenyHost()</tt>. In fact, these are
538 both stubs which call a common routine, <tt>do_AllowDenyHost()</tt>, with
539 an extra parameter to indicate the rule type (allow or deny).</p>
540
541 <p>Note that this module interprets "allow" rules to mean "allow unless
542 denied by another authorization method", and not "allow regardless of any
543 other circumstances". Thus, if a request matches an "allow" rule, the
544 callback function returns <tt>HTTP_AUTH_UNDECIDED</tt> rather than
545 <tt>HTTP_AUTH_ALLOW</tt>.</p>
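<p>Rule matching of this kind can be sketched as below. The <tt>Rule</tt>
structure and <tt>check_rules()</tt> are hypothetical, the
<tt>HTTP_AUTH_*</tt> values are illustrative, and returning
<tt>HTTP_AUTH_UNDECIDED</tt> when no rule matches is an assumption.
Addresses are written in host byte order for readability; the comparison
works in any byte order as long as address and mask agree.</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative values; the real constants are defined in http.h. */
enum { HTTP_AUTH_UNDECIDED, HTTP_AUTH_ALLOW, HTTP_AUTH_DENY };

typedef struct {
    const char *prefix;   /* URL path prefix the rule applies to */
    uint32_t ip;          /* address to compare against */
    uint32_t mask;        /* network mask applied to both addresses */
    int allow;            /* nonzero for an allow rule, zero for deny */
} Rule;

/* Apply the first rule matching both the URL prefix and the address. */
static int check_rules(const Rule *rules, size_t count,
                       const char *url, uint32_t client_ip)
{
    size_t i;

    for (i = 0; i < count; i++) {
        const Rule *r = &rules[i];

        if (strncmp(url, r->prefix, strlen(r->prefix)) != 0)
            continue;
        if ((client_ip & r->mask) != (r->ip & r->mask))
            continue;
        /* "Allow" means "allow unless another method denies". */
        return r->allow ? HTTP_AUTH_UNDECIDED : HTTP_AUTH_DENY;
    }
    return HTTP_AUTH_UNDECIDED;
}
```

<p>With an allow rule for one network followed by a deny rule with a zero
mask, clients inside the network fall through to other authorization
checks while all others are refused.</p>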
546
547 <p class="backlink"><a href="#top">Back to top</a></p>
548
549
550 <h4 class="subsubsection-title" id="s2-5">8-2-5. <tt>httpd/auth-password</tt>: Authorization by password</h4>
551
552 <p>The <tt>httpd/auth-password</tt> module, defined in
553 <tt>auth-password.c</tt>, performs authorization based on a username and
554 password provided by a client (using the HTTP Basic authentication
555 method). If a request is denied, the authorization handler sends an
556 HTTP "401 Unauthorized" message to the client, giving the realm name
557 specified in the rule to provide a user prompt. Other than this, and the
558 comparative simplicity of the configuration directive handler functions,
559 this module is more or less identical to <tt>auth-ip.c</tt>.</p>
560
561 <p>As with the <tt>httpd/auth-ip</tt> module (and also mentioned in
562 comments in the source code for this module), "allow" rules are treated as
563 "allow subject to other permission checks" rather than "allow
564 unconditionally", and the callback function <tt>do_auth()</tt> returns
565 <tt>HTTP_AUTH_UNDECIDED</tt> rather than <tt>HTTP_AUTH_ALLOW</tt> for such
566 rules.</p>
567
568 <p class="backlink"><a href="#top">Back to top</a></p>
569
570
571 <h4 class="subsubsection-title" id="s2-6">8-2-6. <tt>httpd/top-page</tt>: Static page for server root</h4>
572
573 <p>The <tt>httpd/top-page</tt> module, defined in <tt>top-page.c</tt>, is a
574 very simple request handler which (depending on configuration settings)
575 sends either the contents of a local file or an HTTP redirect in response
576 to a request for the server's top page ("<tt>/</tt>").</p>
577
578 <p class="backlink"><a href="#top">Back to top</a></p>
579
580
581 <h4 class="subsubsection-title" id="s2-7">8-2-7. <tt>httpd/redirect</tt>: Redirects to nickname/channel URLs</h4>
582
583 <p>The <tt>httpd/redirect</tt> module, defined in <tt>redirect.c</tt>,
584 allows URLs stored with registered nicknames and channels to be accessed
585 through the HTTP server. Two URL prefixes, one each for nicknames and
586 channels, are defined via configuration directives (<tt>NicknamePrefix</tt>
587 and <tt>ChannelPrefix</tt> respectively); when a request is received that
588 matches one of the prefixes, the remainder of the URL is used as a nickname
589 or channel name, and a redirect is sent for the URL associated with the
590 nickname or channel (if the nickname or channel is not registered, or has
591 no URL stored, an error is returned).</p>
592
593 <p>Since the "<tt>#</tt>" character is treated specially by web browsers,
594 channel names are specified without the "<tt>#</tt>", which is added back
595 internally when accessing the channel's data. For example, if
596 <tt>ChannelPrefix</tt> is "<tt>/channel/</tt>", then a URL of
597 "<tt>/channel/SomeChannel</tt>" will redirect to the URL record for the
598 channel <tt>#SomeChannel</tt>.</p>
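
<p>The prefix handling described above might look something like the
following sketch. The function name <tt>url_to_channel()</tt> and its
parameters are hypothetical; the actual code in <tt>redirect.c</tt> may be
organized quite differently.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* If `url` begins with `prefix` and has something after it, store "#"
 * followed by the remainder of the URL in `buf`.
 * Returns 1 on success, 0 if the prefix does not match, nothing follows
 * the prefix, or the result does not fit in `buf`. */
int url_to_channel(const char *url, const char *prefix,
                   char *buf, size_t bufsize)
{
    assert(url != NULL && prefix != NULL && buf != NULL);
    size_t plen = strlen(prefix);
    if (strncmp(url, prefix, plen) != 0 || url[plen] == '\0')
        return 0;
    /* The "#" omitted from the URL is added back here. */
    int n = snprintf(buf, bufsize, "#%s", url + plen);
    return n > 0 && (size_t)n < bufsize;
}
```

<p>With a prefix of "<tt>/channel/</tt>", a URL of
"<tt>/channel/SomeChannel</tt>" produces the name <tt>#SomeChannel</tt>,
which would then be used to look up the channel record.</p>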
599
600 <p>Naturally, in order to access nickname and channel data, the module
601 must interface with the NickServ and ChanServ modules. This is done via
602 the "<tt>load module</tt>" and "<tt>unload module</tt>" callbacks, which
603 watch for the <tt>nickserv/main</tt> and <tt>chanserv/main</tt> modules to
604 be loaded and save pointers to necessary functions. To avoid problems
605 arising from the order in which the module is loaded, the
606 <tt>init_module()</tt> routine also checks for the presence of these
607 modules, and calls the "<tt>load module</tt>" callback function
608 <tt>do_load_module()</tt> manually if they are already loaded.</p>
609
610 <p class="backlink"><a href="#top">Back to top</a></p>
611
612
613 <h4 class="subsubsection-title" id="s2-8">8-2-8. <tt>httpd/dbaccess</tt>: Provides database access via HTTP</h4>
614
615 <p>The <tt>httpd/dbaccess</tt> module, defined in <tt>dbaccess.c</tt>,
616 provides access to the data stored in the Services pseudoclient databases.
617 It is easily the most complex of the HTTP server modules, as it must
618 interface with each of the pseudoclient modules to obtain the data it
619 provides to the client, and it must remain up-to-date with any changes to
620 the internal data storage format used by the various modules.</p>
621
622 <p>At the top of the file are several definitions used to simplify access
623 to imported functions and variables. As noted in the source code, these
624 are only referenced when the corresponding module has been loaded and
625 the symbols successfully resolved, so there is no need to check the
626 pointers for <tt>NULL</tt> values. <i>(Implementation note: Nonetheless,
627 it would be a good idea to do so anyway, just in case.)</i> These are
628 followed by the <tt>PRINT_SELOPT()</tt> macro, used to generate HTML for
629 selecting among one of several display options, and the
630 <tt>my_strftime()</tt> function, which converts a <tt>time_t</tt> timestamp
631 value to a standard-format string and HTML-quotes the result.</p>
632
633 <p>The main request handler routine, <tt>do_request()</tt>, is located
634 following these initial definitions. The only actual work performed by
635 this routine, however, is checking the URL against the prefix defined for
636 use by the module (in the <tt>Prefix</tt> variable, set by the same-named
637 configuration directive), and generating a root page under <tt>Prefix</tt>
638 linking to each of the available sets of data, one per pseudoclient
639 (and one for XML export, as noted below). All requests for subpages are
640 delivered to the appropriate subpath handler.</p>
641
642 <p>This routine is followed by the subpath handlers themselves, each with a
643 name of the form <tt>handle_<i>XXX</i>()</tt> indicating the subpath
644 handled by the routine (with a few exceptions, noted below). Each handler
645 takes the <tt>Client *<i>c</i></tt> and <tt>int *<i>close_ptr</i></tt>
646 parameters from the original request, along with a <tt>char *<i>path</i></tt>
647 parameter indicating the remainder of the URL path below the handler's own
648 subpath.</p>
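
<p>One way to picture the hand-off from <tt>do_request()</tt> to the
subpath handlers is a small dispatch table, as in the sketch below. The
table type, the <tt>dispatch_subpath()</tt> and <tt>demo_handler()</tt>
names, and the <tt>int</tt> return type are all assumptions made for
illustration; only the three handler parameters come from the description
above.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Client_ Client;   /* opaque in this sketch */

/* Handler signature as described in the text; the return type is assumed. */
typedef int (*SubpathHandler)(Client *c, int *close_ptr, char *path);

typedef struct {
    const char *name;            /* first path element below Prefix */
    SubpathHandler handler;
} SubpathEntry;

/* Stand-in handler: reports how much path it was given. */
int demo_handler(Client *c, int *close_ptr, char *path)
{
    (void)c;
    (void)close_ptr;
    return (int)strlen(path);
}

/* `path` is the URL with Prefix already removed, e.g. "operserv/akill/".
 * The first element selects the handler, which receives the remainder.
 * Returns the handler's result, or -1 if no handler matches. */
int dispatch_subpath(const SubpathEntry *table, size_t count,
                     Client *c, int *close_ptr, char *path)
{
    assert(table != NULL && path != NULL);
    size_t len = strcspn(path, "/");
    for (size_t i = 0; i < count; i++) {
        if (strlen(table[i].name) == len
         && strncmp(path, table[i].name, len) == 0) {
            char *rest = path + len;
            if (*rest == '/')
                rest++;          /* skip the separator */
            return table[i].handler(c, close_ptr, rest);
        }
    }
    return -1;
}
```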
649
650 <p>The first of these handlers is the OperServ data handler,
651 <tt>handle_operserv()</tt>. In addition to the current number of users
652 and operators along with basic data recorded by OperServ (the maximum user
653 count and time), the page includes links to further subhandlers for
654 autokills and exclusions, news items, session exceptions, and S-lines.
655 Each of these has its own handler function; with the exception of news
656 items (handled by <tt>handle_operserv_news()</tt>), the subhandlers make
657 use of a common routine, <tt>handle_operserv_maskdata()</tt>, to output
658 the appropriate data. (However, there is no support for an explicit path
659 <tt>/operserv/maskdata</tt>.)</p>
660
661 <p>The <tt>handle_operserv_maskdata()</tt> routine has two modes of
662 operation, as do many of the lowest-level data handlers. When called with
663 no further subpath (<i>e.g.</i> <tt>/operserv/akill/</tt>), a list of
664 mask-data records of the appropriate type is sent to the client as a list
665 of links. Selecting one of these will go to a path with that string as the
666 final path element, and will cause the routine to display detailed
667 information about the selected entry, much like using the <tt>VIEW</tt>
668 subcommand of OperServ's various mask-data commands.</p>
669
670 <p>Unlike the other OperServ data sets, news items have no detailed
671 information to show. Therefore, the <tt>handle_operserv_news()</tt>
672 routine simply outputs a list of news items (both logon news and operator
673 news), like the <tt>LOGONNEWS LIST</tt> and <tt>OPERNEWS LIST</tt>
674 commands.</p>
675
676 <p>The OperServ data handlers are followed by <tt>handle_nickserv()</tt>,
677 for displaying nickname data. Unlike <tt>handle_operserv()</tt>, this
678 routine does not call on any subroutines, as there are only two modes of
679 operation: listing registered nicknames (handled at the top of the routine)
680 and displaying detailed information on a specific nickname (handled by the
681 long remainder of the routine). The length of the routine is mainly the
682 result of the need to quote all special characters in nickname data, to
683 prevent malicious users from corrupting the output by setting particular
684 strings in their nickname data.</p>
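
<p>The quoting of special characters mentioned above could be done along
the following lines. This is a minimal sketch under assumed names
(<tt>html_quote()</tt> is hypothetical); the module's own quoting routine
and the exact set of characters it replaces may differ.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `in` to `out`, replacing characters that are special in HTML
 * with character entities.  Returns 1 on success, 0 if the quoted
 * string (including the terminating null) does not fit in `outsize`. */
int html_quote(const char *in, char *out, size_t outsize)
{
    assert(in != NULL && out != NULL);
    size_t pos = 0;
    for (; *in; in++) {
        char single[2] = { *in, '\0' };
        const char *rep;
        switch (*in) {
            case '<': rep = "&lt;";   break;
            case '>': rep = "&gt;";   break;
            case '&': rep = "&amp;";  break;
            case '"': rep = "&quot;"; break;
            default:  rep = single;   break;
        }
        size_t len = strlen(rep);
        if (pos + len >= outsize)   /* leave room for the null */
            return 0;
        memcpy(out + pos, rep, len);
        pos += len;
    }
    if (pos >= outsize)
        return 0;
    out[pos] = '\0';
    return 1;
}
```

<p>Without such quoting, a user could store markup in a field such as the
nickname's URL or E-mail address and have it interpreted by the browser of
whoever views the page.</p>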
685
686 <p>This is followed by <tt>handle_chanserv()</tt>, which functions
687 similarly to <tt>handle_nickserv()</tt> except that it works on channels
688 rather than nicknames. However, to reduce the amount of data sent in
689 response to a single request, the privilege level, channel access, and
690 autokick lists are split off into separate pages, accessed by appending
691 "<tt>/levels</tt>", "<tt>/access</tt>", or "<tt>/autokick</tt>"
692 respectively to the URL. The local variable <tt>mode</tt> keeps track of
693 what type of data the routine is to display.</p>
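
<p>The suffix check that selects the display mode might be sketched as
follows. The enumeration, its constant names, and
<tt>split_chan_mode()</tt> are hypothetical; the text states only that a
local variable <tt>mode</tt> records which kind of page was requested.</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical display modes for a channel page. */
enum ChanMode { MODE_INFO, MODE_LEVELS, MODE_ACCESS, MODE_AUTOKICK };

/* `path` holds a channel name optionally followed by one of the three
 * suffixes.  A recognized suffix is removed in place and the matching
 * mode returned; otherwise the path is left alone. */
enum ChanMode split_chan_mode(char *path)
{
    static const struct {
        const char *suffix;
        enum ChanMode mode;
    } suffixes[] = {
        { "/levels",   MODE_LEVELS   },
        { "/access",   MODE_ACCESS   },
        { "/autokick", MODE_AUTOKICK },
    };
    assert(path != NULL);
    size_t plen = strlen(path);
    for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++) {
        size_t slen = strlen(suffixes[i].suffix);
        if (plen > slen
         && strcmp(path + plen - slen, suffixes[i].suffix) == 0) {
            path[plen - slen] = '\0';   /* keep only the channel name */
            return suffixes[i].mode;
        }
    }
    return MODE_INFO;
}
```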
694
695 <p>Next is <tt>handle_statserv()</tt>, which predictably displays
696 information from the StatServ pseudoclient's database. As StatServ
697 currently only tracks a minimal amount of data, the implementation is
698 comparatively simple, either listing the servers recorded with StatServ or
699 displaying information for a selected server.</p>
700
701 <p>Finally, <tt>handle_xml_export()</tt> is used to generate an XML data
702 set containing all data registered with Services pseudoclients, using the
703 <tt>misc/xml-export</tt> module described in <a href="#s4-1">section
704 8-4-1</a>. As browsers may attempt to parse the data rather than
705 displaying or saving it if a content type of <tt>text/xml</tt> is used,
706 the module instead sends the type <tt>text/plain</tt>. (The acerbic
707 comment in the source code has to do with a misfeature in at least some
708 versions of the Microsoft Internet Explorer web browser; such versions
709 ignore a <tt>Content-Type: text/plain</tt> header and attempt to interpret
710 the data using internal heuristics, resulting in users being unable to view
711 the XML data.)</p>
712
713 <p class="backlink"><a href="#top">Back to top</a></p>
714
715
716 <h4 class="subsubsection-title" id="s2-9">8-2-9. <tt>httpd/debug</tt>: Debugging module</h4>
717
718 <p>The <tt>httpd/debug</tt> module, defined in <tt>debug.c</tt>, is intended
719 to be used for debugging the HTTP server, and dumps several fields of the
720 <tt>Client</tt> structure in response to requests to a particular URL (set
721 by the <tt>DebugURL</tt> configuration directive). While the module does
722 not return any sensitive information to the client, only information about
723 the client itself, it is still bad practice to leave any unnecessary
724 functionality such as this enabled, so this module should not be (and is
725 not intended to be) loaded except when debugging.</p>
726
727 <p>The <tt>do_request()</tt> function in the source code, which does the
728 actual request handling, also includes a number of comments explaining the
729 request-handling process in more detail.</p>
730
731 <p class="backlink"><a href="#top">Back to top</a></p>
732
733 <!------------------------------------------------------------------------>
734 <hr/>
735
736 <h3 class="subsection-title" id="s3">8-3. Mail-sending modules</h3>
737
738 <p>In order to facilitate features such as mail authentication and memo
739 forwarding, Services includes a set of modules allowing mail to be sent to
740 remote systems. As with the built-in HTTP server described in
741 <a href="#s2">section 8-2</a>, this functionality operates independently
742 of the primary pseudoclients and IRC network connection (except to the
743 extent that the sending of mail is typically initiated in response to a
744 pseudoclient command).</p>
745
746 <p>The mail-sending subsystem is composed of a core module implementing the
747 mail interface, <tt>mail/main</tt>, and submodules for specific methods of
748 sending mail. All relevant source files are located in the
749 <tt>modules/mail</tt> directory.</p>
750
751 <p class="backlink"><a href="#top">Back to top</a></p>
752
753
754 <h4 class="subsubsection-title" id="s3-1">8-3-1. <tt>mail/main</tt>: Main mail module</h4>
755
756 <p>The core mail-sending functionality is located in the <tt>mail/main</tt>
757 module, defined in <tt>main.c</tt>. The module consists of two interfaces:
758 an external interface, declared in the <tt>mail.h</tt> header file, for use
759 by other modules to send mail, and an internal interface, declared in the
760 <tt>mail-local.h</tt> header file, used for communicating with the
761 low-level modules that perform the actual send operation.</p>
762
763 <p>The external interface consists of a single function, <tt>sendmail()</tt>,
764 declared as follows:</p>
765
766 <div class="code">void <b>sendmail</b>(const char *<i>to</i>, const char *<i>subject</i>,
767 const char *<i>body</i>, const char *<i>charset</i>,
768 MailCallback <i>completion_callback</i>, void *<i>callback_data</i>)</div>
769
770 <ul>
771 <li class="spaced"><tt>const char *<i>to</i></tt>: The address to which the
772 message is to be sent.</li>
773 <li class="spaced"><tt>const char *<i>subject</i></tt>: The subject line to
774 use with the message.</li>
775 <li class="spaced"><tt>const char *<i>body</i></tt>: The body of the
776 message (newlines are permitted within the message body).</li>
777 <li class="spaced"><tt>const char *<i>charset</i></tt>: <i>Optional.</i>
778 The MIME character set (<i>e.g.</i>, "<tt>iso-8859-1</tt>") in
779 which the message text is written. If not specified, no character
780 set is assumed.</li>
781 <li class="spaced"><tt>MailCallback <i>completion_callback</i></tt>:
782 <i>Optional.</i> The function to be called when mail sending
783 completes (see below).</li>
784 <li class="spaced"><tt>void *<i>callback_data</i></tt>: <i>Optional.</i>
785 Arbitrary data passed unchanged to the completion callback.</li>
786 </ul>
787
788 <p>The first thing to note about this function is that it does not return a
789 value. Mail sending is performed asynchronously (subject to limitations of
790 the particular low-level module in use), so that when the function returns,
791 the requested message has been queued but not necessarily sent. In order
792 to signal the result of a mail-sending operation, <tt>sendmail()</tt> takes
793 a callback function parameter (<tt><i>completion_callback</i></tt>); this
794 function is called when the sending operation has completed, successfully
795 or otherwise. The function type is defined as <tt><i>MailCallback</i></tt>
796 in <tt>mail.h</tt>:</p>
797
798 <div class="code">typedef void (*<b>MailCallback</b>)(int <i>status</i>, void *<i>data</i>)</div>
799
800 <p>where <tt><i>data</i></tt> is the <tt><i>callback_data</i></tt> value
801 passed to <tt>sendmail()</tt>, and <tt><i>status</i></tt> is one of the
802 following values:</p>
803
804 <ul>
805 <li><tt>MAIL_STATUS_SENT</tt>: The message was successfully sent.</li>
806 <li><tt>MAIL_STATUS_ERROR</tt>: An unspecified error occurred while sending
807 the message.</li>
808 <li><tt>MAIL_STATUS_NORSRC</tt>: Insufficient resources were available to
809 perform the send operation.</li>
810 <li><tt>MAIL_STATUS_REFUSED</tt>: Delivery of the message was refused by
811 the remote system.</li>
812 <li><tt>MAIL_STATUS_TIMEOUT</tt>: A timeout occurred while trying to send
813 the message.</li>
814 <li><tt>MAIL_STATUS_ABORTED</tt>: The operation was aborted (because the
815 low-level mail module was removed before the message was sent, for
816 example).</li>
817 </ul>
818
819 <p>It is important to note that, while <tt>sendmail()</tt> does not wait
820 for the message to be sent before returning, there is nothing preventing
821 the low-level module from delivering the message immediately if possible,
822 and in cases such as sending to a user on the local system, the callback
823 function may be called even before <tt>sendmail()</tt> itself returns! For
824 this reason, the caller must ensure that all setup required by the callback
825 function is performed <i>before</i> calling <tt>sendmail()</tt>.</p>
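
<p>The ordering requirement can be illustrated with the sketch below. The
<tt>sendmail()</tt> defined here is only a stand-in with the same
signature that completes immediately (the worst case described above); the
<tt>AuthRequest</tt> structure, the two helper functions, and the numeric
value of <tt>MAIL_STATUS_SENT</tt> are likewise assumptions.</p>

```c
#include <assert.h>
#include <stddef.h>

typedef void (*MailCallback)(int status, void *data);
enum { MAIL_STATUS_SENT = 0 };   /* name from the text; value assumed */

/* Stand-in for the real sendmail(): delivers "immediately", so the
 * callback runs before this function returns. */
void sendmail(const char *to, const char *subject, const char *body,
              const char *charset, MailCallback completion_callback,
              void *callback_data)
{
    (void)to; (void)subject; (void)body; (void)charset;
    if (completion_callback)
        completion_callback(MAIL_STATUS_SENT, callback_data);
}

/* Hypothetical per-request state used by the caller. */
typedef struct {
    int pending;       /* nonzero while a message is in transit */
    int last_status;   /* result of the most recent send, or -1 */
} AuthRequest;

void auth_mail_done(int status, void *data)
{
    AuthRequest *req = data;
    assert(req->pending);   /* would fail if set only after sendmail() */
    req->pending = 0;
    req->last_status = status;
}

void send_auth_mail(AuthRequest *req, const char *address)
{
    /* All state the callback depends on is set up first... */
    req->pending = 1;
    req->last_status = -1;
    /* ...because the callback may already have run when this returns. */
    sendmail(address, "Authentication code", "Your code is ...", NULL,
             auth_mail_done, req);
}
```

<p>Had <tt>req->pending</tt> been set after the call instead, the callback
would have seen stale state, and the flag would then have been left set
with no message actually in transit.</p>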
826
827 <p><tt>sendmail()</tt>, in turn, does its work by calling out to functions
828 implemented in a low-level module. The interface consists of two functions
829 which the low-level module must provide, along with a function provided by
830 the core module for signaling the completion of a mail operation:</p>
831
832 <dl>
833 <dt><tt>void (*<b>low_send</b>)(MailMessage *<i>msg</i>)</tt></dt>
834 <dd>Provided by the low-level module, this function performs the actual
835 work of starting the send operation, and is called by
836 <tt>sendmail()</tt> once parameter and other checks have been
837 performed. As with <tt>sendmail()</tt>, the routine does not
838 return a value, but instead calls <tt>send_finished()</tt> (see
839 below) to signal the message's status. Typically, this routine
840 will perform any necessary module-specific checks, then start the
841 asynchronous send operation and return without calling
842 <tt>send_finished()</tt>.
843
844 <p>The parameter passed to this routine is a structure (see below)
845 describing the message to be sent. On entry, the structure's
846 <tt>from</tt>, <tt>to</tt>, <tt>subject</tt>, and <tt>body</tt> are
847 guaranteed to be non-<tt>NULL</tt>. The strings in these fields
848 and the <tt>fromname</tt> field (which may be <tt>NULL</tt>) can be
849 changed freely, but the pointer values should be left
850 unmodified.</p></dd>
851
852 <dt><tt>void (*<b>low_abort</b>)(MailMessage *<i>msg</i>)</tt></dt>
853 <dd>Provided by the low-level module, this function takes any actions
854 needed to abort the sending of a message currently in progress;
855 the message to abort is indicated by the <tt><i>msg</i></tt>
856 parameter, which will be the same as passed to a previous call to
857 <tt>low_send()</tt>. The given message <i>must</i> be aborted, as
858 there is no way for the routine to signal a failure to abort. The
859 routine should not call <tt>send_finished()</tt>, as the core
860 module will take care of setting the message completion status.</dd>
861
862 <dt><tt>void <b>send_finished</b>(MailMessage *<i>msg</i>, int <i>status</i>)</tt></dt>
863 <dd>Provided by the core module, this function is called by low-level
864 modules to signal that a message has been successfully sent or an
865 error has occurred that prevents the message from being sent. The
866 <tt><i>msg</i></tt> parameter is the same one passed to
867 <tt>low_send()</tt>, and <tt><i>status</i></tt> is one of the
868 status codes listed above (<tt>MAIL_STATUS_*</tt>).</dd>
869 </dl>
870
871 <p>As can be seen from the above, both <tt>low_send</tt> and
872 <tt>low_abort</tt> are declared as function pointers in the core module;
873 low-level modules must set these to point to their own implementations of
874 the functions. <i>Implementation note: It would be better to use a
875 <tt>register()</tt>/<tt>unregister()</tt> pair of functions, as with the
876 encryption and database code.</i></p>
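
<p>As a rough sketch of this arrangement, a low-level module might
install and remove its functions as shown below. The <tt>example_*</tt>
names and the check for an already-installed module are hypothetical; only
the two function pointers and their signatures come from the text.</p>

```c
#include <assert.h>
#include <stddef.h>

typedef struct MailMessage_ MailMessage;   /* opaque in this sketch */

/* Declared by the core module (mail/main). */
void (*low_send)(MailMessage *msg) = NULL;
void (*low_abort)(MailMessage *msg) = NULL;

/* Hypothetical low-level module. */
int sends_started = 0;

void example_send(MailMessage *msg)
{
    (void)msg;
    sends_started++;    /* a real module would begin sending here */
}

void example_abort(MailMessage *msg)
{
    (void)msg;          /* a real module would cancel the send here */
}

/* Module initialization: point the core's function pointers at this
 * module's implementations.  Returns 1 on success, 0 if some other
 * low-level module is already installed (an assumed policy). */
int example_init(void)
{
    if (low_send != NULL || low_abort != NULL)
        return 0;
    low_send = example_send;
    low_abort = example_abort;
    return 1;
}

/* Module cleanup: only valid after a successful example_init(). */
void example_cleanup(void)
{
    assert(low_send == example_send && low_abort == example_abort);
    low_send = NULL;
    low_abort = NULL;
}
```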
877
878 <p>The <tt>MailMessage</tt> structure used as a parameter in the above
879 functions is used to collect the various parameters of a message into a
880 single group for passing to the low-level modules. The pointer itself also
881 serves as a unique ID value for each message in transit. The structure
882 contains the following fields:</p>
883
884 <ul>
885 <li><tt>MailMessage *<b>next</b>, *<b>prev</b></tt>: Used by the core
886 module to manage the list of in-transit messages.</li>
887 <li><tt>char *<b>from</b></tt>: Copied from the value given in the
888 <tt>FromAddress</tt> configuration directive.</li>
889 <li><tt>char *<b>fromname</b></tt>: Copied from the value given in the
890 <tt>FromName</tt> configuration directive, or <tt>NULL</tt> if no
891 <tt>FromName</tt> directive was given.</li>
892 <li><tt>char *<b>to</b></tt>: Copied from the <tt><i>to</i></tt> parameter
893 to <tt>sendmail()</tt>.</li>
894 <li><tt>char *<b>subject</b></tt>: Copied from the <tt><i>subject</i></tt>
895 parameter to <tt>sendmail()</tt>.</li>
896 <li><tt>char *<b>body</b></tt>: Copied from the <tt><i>body</i></tt>
897 parameter to <tt>sendmail()</tt>.</li>
898 <li><tt>char *<b>charset</b></tt>: Copied from the <tt><i>charset</i></tt>
899 parameter to <tt>sendmail()</tt>, or <tt>NULL</tt> if the
900 <tt><i>charset</i></tt> parameter was <tt>NULL</tt>.</li>
901 <li><tt>MailCallback <b>completion_callback</b></tt>: Set to the
902 <tt><i>completion_callback</i></tt> parameter to
903 <tt>sendmail()</tt>.</li>
904 <li><tt>void *<b>callback_data</b></tt>: Set to the
905 <tt><i>callback_data</i></tt> parameter to <tt>sendmail()</tt>.</li>
906 <li><tt>Timeout *<b>timeout</b></tt>: Used by the core module to manage
907 send timeouts.</li>
908 </ul>
909
910 <p>The core module itself, defined in <tt>main.c</tt>, simply serves as a
911 kind of "glue" between external callers and the low-level modules; it
912 consists of the implementations of <tt>sendmail()</tt> and
913 <tt>send_finished()</tt>, along with a timeout callback function
914 (<tt>send_timeout()</tt>) for messages which remain in transit longer than
915 the time specified by the <tt>SendTimeout</tt> configuration directive.
916 When <tt>sendmail()</tt> is called, it performs checks on its parameters
917 (calling the callback function with an error code if a problem is found),
918 then sets up a <tt>MailMessage</tt> structure for the message, activates a
919 timeout if <tt>SendTimeout</tt> is enabled, and calls <tt>low_send()</tt>
920 to begin the actual sending process. When the low-level module calls
921 <tt>send_finished()</tt>, it likewise calls the completion callback
922 function with the specified status, then unlinks and frees the
923 <tt>MailMessage</tt> structure for the message. Messages can be aborted
924 if they time out, or if the core module is removed with any messages
925 still in transit.</p>
926
927 <p class="backlink"><a href="#top">Back to top</a></p>
928
929
930 <h4 class="subsubsection-title" id="s3-2">8-3-2. <tt>mail/sendmail</tt>: Sends mail using the <tt>sendmail</tt> program</h4>
931
932 <p>The <tt>mail/sendmail</tt> module, defined in <tt>sendmail.c</tt>, makes
933 use of an external "sendmail" program to send mail. The module was
934 designed primarily as a test module to ensure that the core mail processing
935 code worked correctly, to help isolate problems before development of the
936 more complex SMTP module started; it has been retained to support systems
937 which cannot use SMTP to send mail directly, but such systems are presumed
938 to be rare, and little effort has been put into improving this module. In
939 particular, the module (and thus Services itself) blocks while interacting
940 with the external program, potentially causing Services to lag and even
941 opening up the possibility of denial-of-service attacks on Services (by
942 repeatedly sending messages to addresses which take a long time to
943 process).</p>
944
945 <p>The entire logic of the module, outside of the module initialization and
946 cleanup code (which actually comprises about half of the source file), is
947 contained in <tt>send_sendmail()</tt>, the implementation of the
948 <tt>low_send()</tt> routine called by the core module's <tt>sendmail()</tt>
949 function. <tt>send_sendmail()</tt> opens a pipe to the program specified
950 by the <tt>SendmailPath</tt> directive, which is assumed to take a
951 "<tt>-t</tt>" option to read the recipient address from the message
952 headers, as the standard Unix <tt>sendmail</tt> program does. The message
953 is then written over the pipe, and <tt>pclose()</tt> is called to wait for
954 the message sending operation to complete. This latter step, which is
955 required to free the pipe resources as well, places Services at the mercy
956 of the external program, as <tt>pclose()</tt> will not return until the
957 process exits. <i>Implementation note: One improvement would be to make
958 the pipe non-blocking, but as Services has no facilities for monitoring
959 arbitrary file descriptors, this would require a periodic check via a
960 timeout routine to see whether the child process had exited.</i> Finally,
961 the message status is reported based on the exit code of the child
962 process.</p>
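
<p>The pipe-based approach described above can be sketched as follows,
assuming a POSIX system. The function name and parameters are
hypothetical, the header set is reduced to the bare minimum, and no
attempt is made to sanitize the header values; in the module itself the
command would be built from <tt>SendmailPath</tt> and the "<tt>-t</tt>"
option.</p>

```c
#define _POSIX_C_SOURCE 200809L   /* for popen() and pclose() */
#include <assert.h>
#include <stdio.h>

/* Write a message to the standard input of `command` and wait for the
 * command to finish.  Returns 0 if the child exited with status 0,
 * -1 otherwise. */
int pipe_message(const char *command, const char *from, const char *to,
                 const char *subject, const char *body)
{
    assert(command && from && to && subject && body);
    FILE *pipe = popen(command, "w");
    if (!pipe)
        return -1;
    /* Headers, a blank line, then the body; with "-t" the program reads
     * the recipient from the To: header. */
    fprintf(pipe, "From: %s\nTo: %s\nSubject: %s\n\n%s\n",
            from, to, subject, body);
    /* pclose() blocks until the child process exits, which is the
     * source of the lag discussed above. */
    int status = pclose(pipe);
    return status == 0 ? 0 : -1;
}
```

<p>A call might look like <tt>pipe_message("/usr/sbin/sendmail -t",
...)</tt>, where the path shown is only an example.</p>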
963
964 <p class="backlink"><a href="#top">Back to top</a></p>
965
966
967 <h4 class="subsubsection-title" id="s3-3">8-3-3. <tt>mail/smtp</tt>: Sends mail using SMTP</h4>
968
969 <p>The <tt>mail/smtp</tt> module, defined in <tt>smtp.c</tt>, sends mail
970 via the SMTP protocol. While the module makes some simplifying
971 assumptions, notably that a relay server is available that will accept and
972 distribute mail on behalf of Services, it is more robustly designed than
973 the <tt>mail/sendmail</tt> module, and is the recommended module for use in
974 Services.</p>
975
976 <p>As mentioned above, the <tt>mail/smtp</tt> module relies on the presence
977 of an external relay server, which can be as simple as an SMTP daemon
978 running on the same machine, that will accept messages from Services via
979 SMTP and relay them to the appropriate destinations. By doing this, the
980 module is freed from the necessity of performing DNS lookups for each
981 message sent, significantly reducing the complexity of the module.
982 However, this also means that invalid addresses cannot be detected, except
983 to the extent that the relay server checks for them during the SMTP
984 connection from Services.</p>
985
986 <p>For each message to be sent, the module creates a new connection to the
987 relay server, taking advantage of the socket callbacks described in
988 <a href="3.html">section 3</a> to process SMTP communications
989 asynchronously. The socket used for each message, along with the
990 <tt>MailMessage</tt> structure itself and other per-message data, is stored
991 in a <tt>SocketInfo</tt> structure; the module maintains a list of these
992 structures, one for each message in transit. The <tt>SocketInfo</tt>
993 structure contains the following fields:</p>
994
995 <dl>
996 <dt><tt>struct SocketInfo_ *<b>next</b>, *<b>prev</b></tt></dt>
997 <dd>Used to maintain the linked list of structures. (<tt>struct
998 SocketInfo_</tt> is the same type as <tt>SocketInfo</tt>, and is
999 used here only because the structure is defined as part of the
1000 <tt>typedef</tt>.)</dd>
1001
1002 <dt><tt>Socket *<b>sock</b></tt></dt>
1003 <dd>The socket being used to send the message.</dd>
1004
1005 <dt><tt>MailMessage *<b>msg</b></tt></dt>
1006 <dd>The message data structure passed in from the core module.</dd>
1007
1008 <dt><tt>int <b>msg_status</b></tt></dt>
1009 <dd>The message status code to be passed to <tt>send_finished()</tt>.</dd>
1010
1011 <dt><tt>int <b>relaynum</b></tt></dt>
1012 <dd>The index (into the <tt>RelayHosts[]</tt> array) of the relay
1013 server currently in use. If a connection to the first server
1014 fails, the code will increment this field and retry the connection
1015 until the list of relay hosts is exhausted.</dd>
1016
1017 <dt><tt>enum {...} <b>state</b></tt></dt>
1018 <dd>The current state of the connection:
1019 <ul>
1020 <li><b><tt>ST_GREETING</tt>:</b> Waiting for the remote server's
1021 greeting.</li>
1022 <li><b><tt>ST_HELO</tt>:</b> Waiting for a response to the
1023 <tt>HELO</tt> command.</li>
1024 <li><b><tt>ST_MAIL</tt>:</b> Waiting for a response to the
1025 <tt>MAIL</tt> command.</li>
1026 <li><b><tt>ST_RCPT</tt>:</b> Waiting for a response to the
1027 <tt>RCPT</tt> command.</li>
1028 <li><b><tt>ST_DATA</tt>:</b> Waiting for a response to the
1029 <tt>DATA</tt> command.</li>
1030 <li><b><tt>ST_FINISH</tt>:</b> Waiting for the server to confirm
1031 that it has accepted the message.</li>
1032 </ul></dd>
1033
1034 <dt><tt>int <b>replycode</b></tt></dt>
1035 <dd>The reply code associated with the line currently being received
1036 from the server. A value of zero indicates that the next character
1037 received will be the beginning of a new line.</dd>
1038
1039 <dt><tt>char <b>replychar</b></tt></dt>
1040 <dd>The fourth character of the line currently being received (normally
1041 either a space or a hyphen, indicating the absence or presence of
1042 continuation lines respectively).</dd>
1043
1044 <dt><tt>int <b>garbage</b></tt></dt>
1045 <dd>The number of garbage (non-reply) lines received from the server,
1046 used to check for an erroneous connection to a non-SMTP server.</dd>
1047 </dl>
1048
1049 <p>When the <tt>low_send()</tt> implementation routine, <tt>send_smtp()</tt>,
1050 is called, it first cleans any double quotes out of the "From" name (since
1051 that name will later be enclosed in double quotes), then sets up a
1052 <tt>SocketInfo</tt> structure for the message and creates a socket for SMTP
1053 communication. On success, the socket's callbacks are set, and
1054 <tt>try_next_relay()</tt> is called to attempt a connection to the first
1055 SMTP relay specified in the configuration file. (The <tt>msg_status</tt>
1056 field of <tt>SocketInfo</tt> is set to <tt>MAIL_STATUS_ERROR</tt> to
1057 provide a fallback value in case an error in the module results in
1058 <tt>send_finished()</tt> being called without an explicit status being set;
1059 the "don't depend on this" comment in the source code is simply a reminder
1060 to set the status explicitly rather than relying on that default value, since
1061 the default could potentially change.)</p>
1062
1063 <p><tt>try_next_relay()</tt>, in turn, increments the <tt>relaynum</tt>
1064 field, then checks whether it has exceeded the number of configured relay
1065 servers. If so, sending is terminated with an error code based on the
1066 value of <tt>errno</tt> as returned from the last system call (the routine
1067 is assumed to be called immediately after a socket-related system call);
1068 otherwise, a connection is initiated to the next relay server, looping back
1069 to the top of the function if the <tt>conn()</tt> call fails.</p>
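
<p>The retry logic can be sketched as a simple loop. In this
illustration the relay list and connection function are passed in as
parameters, and <tt>relaynum</tt> is assumed to start at -1 so that the
first increment selects relay 0; <tt>ConnectFunc</tt> and
<tt>demo_connect()</tt> are hypothetical stand-ins for the socket
subsystem's <tt>conn()</tt> call.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical connection function: returns 0 if a connection attempt
 * was started, nonzero on immediate failure. */
typedef int (*ConnectFunc)(const char *host);

/* Stand-in that fails for any host whose name begins with "bad". */
int demo_connect(const char *host)
{
    return strncmp(host, "bad", 3) == 0 ? -1 : 0;
}

/* Advance *relaynum to the next relay that accepts a connection
 * attempt.  Returns 1 if an attempt was started, or 0 if the list is
 * exhausted (at which point the real module reports an error). */
int try_next_relay(int *relaynum, const char *const *relays, int nrelays,
                   ConnectFunc try_connect)
{
    assert(relaynum != NULL && relays != NULL && try_connect != NULL);
    for (;;) {
        (*relaynum)++;
        if (*relaynum >= nrelays)
            return 0;
        if (try_connect(relays[*relaynum]) == 0)
            return 1;
        /* Immediate failure: loop back and try the following relay. */
    }
}
```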
1070
1071 <p>Actual socket processing is handled by the <tt>smtp_readline()</tt> and
1072 <tt>smtp_disconnect()</tt> functions. The latter, <tt>smtp_disconnect()</tt>,
1073 simply calls <tt>send_finished()</tt>, passing either the value of
1074 <tt>msg_status</tt> (if the connection was closed locally) or an
1075 appropriate error status (if the connection was broken remotely or failed),
1076 then frees the <tt>SocketInfo</tt> structure with <tt>free_socketinfo()</tt>,
1077 which also closes the socket itself. (If the routine is called as the
1078 result of a failed connection, however, it calls <tt>try_next_relay()</tt>
1079 instead.)</p>
1080
1081 <p><tt>smtp_readline()</tt> is the workhorse of the <tt>mail/smtp</tt>
1082 module, processing data read from the server and sending the SMTP commands
1083 necessary to relay the message. The routine first reads a line of data
1084 from the socket, ensuring that it ends with a newline and removing that
1085 newline. (While the socket subsystem ensures that a full line is
1086 available when the read-line callback is called, <tt>smtp_readline()</tt>
1087 is also able to handle partial lines, except in the pathological case of a
1088 truncated reply code.) If the text received is at the beginning of a line,
1089 the 3-digit reply code and continuation character are parsed and stored in
1090 the <tt>SocketInfo</tt> structure corresponding to the socket. When a
1091 complete, non-continued response line has been received,
1092 <tt>smtp_readline()</tt> then either generates an error (for 4xx or 5xx
1093 error responses from the SMTP server) or sends the next command or message
1094 data to the server, depending on the connection state, and the state is
1095 incremented. (After sending the final <tt>QUIT</tt> command, the socket is
1096 closed, causing <tt>send_finished()</tt> to be called from the socket
1097 disconnection callback.)</p>
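
<p>The reply-line parsing step might look roughly like the sketch below.
The function names are hypothetical, and treating a bare three-digit code
with nothing after it as a final line is an assumption of this sketch
rather than something stated above.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Parse the start of an SMTP reply line such as "250-mail.example.com"
 * or "550 No such user".  On success, stores the 3-digit code and the
 * fourth character (space or hyphen) and returns 1; returns 0 if the
 * line does not begin with three digits. */
int parse_reply(const char *line, int *replycode, char *replychar)
{
    assert(line != NULL && replycode != NULL && replychar != NULL);
    for (int i = 0; i < 3; i++) {
        /* Stops at the first non-digit, including the terminating
         * null, so short lines are never read past their end. */
        if (!isdigit((unsigned char)line[i]))
            return 0;
    }
    *replycode = (line[0] - '0') * 100
               + (line[1] - '0') * 10
               + (line[2] - '0');
    *replychar = line[3] ? line[3] : ' ';
    return 1;
}

/* A hyphen after the code means more lines of the same reply follow. */
int reply_is_final(char replychar)
{
    return replychar != '-';
}

/* 4xx and 5xx codes indicate temporary and permanent failures. */
int reply_is_error(int replycode)
{
    return replycode >= 400;
}
```

<p>A line for which <tt>parse_reply()</tt> fails would correspond to the
"garbage" lines counted in the <tt>garbage</tt> field of
<tt>SocketInfo</tt>.</p>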
1098
1099 <p>The module's implementation of the <tt>low_abort()</tt> function can be
1100 found in <tt>smtp_abort()</tt>. The routine simply looks up the
1101 <tt>SocketInfo</tt> corresponding to the message, then frees it,
1102 disconnecting the socket in the process.</p>
1103
1104 <p class="backlink"><a href="#top">Back to top</a></p>
1105
1106 <!------------------------------------------------------------------------>
1107 <hr/>
1108
1109 <h3 class="subsection-title" id="s4">8-4. Miscellaneous modules</h3>
1110
1111 <p>This section documents the two remaining modules which do not fit
1112 neatly into any other category: the <tt>misc/xml-export</tt> and
1113 <tt>misc/xml-import</tt> modules, used for exporting Services pseudoclient
1114 data to an XML file and vice versa. Both of these modules are located in
1115 the <tt>modules/misc</tt> directory.</p>
1116
1117 <p class="backlink"><a href="#top">Back to top</a></p>
1118
1119
1120 <h4 class="subsubsection-title" id="s4-1">8-4-1. <tt>misc/xml-export</tt>: Data export using XML</h4>
1121
1122 <p>The <tt>misc/xml-export</tt> module, defined in <tt>xml-export.c</tt>
1123 along with declarations in <tt>xml.h</tt>, provides a method through which
1124 Services pseudoclient data can be exported into an XML file suitable for
1125 use with external programs. It should be noted that this module does not
1126 make use of the standard database interface, relying instead on direct
1127 calls to the appropriate modules' database access functions and direct
1128 access to the corresponding data structures, and thus cannot export data
1129 added by third-party modules. This limitation is a result of the module's
1130 implementation in version 5.0, before the current database system was
1131 developed; one possible solution would be to reimplement this module and
1132 <tt>misc/xml-import</tt> as database modules
1133 (see <a href="11.html#s1">section 11-1</a>).</p>
1134
1135 <p>One thing worth noting about the structure of the module is that, since
1136 it is also compiled into the <tt>convert-db</tt> tool, there are a number
1137 of code segments (mainly logging calls) that need to be compiled
1138 differently. These are protected by preprocessor conditionals on the
1139 <tt>CONVERT_DB</tt> symbol, defined by <tt>tools/Makefile</tt> (see
1140 <a href="10.html#s3-4">section 10-3-4</a>).</p>
1141
1142 <p>Exporting is handled by the <tt>xml_export()</tt> routine defined near
1143 the bottom of the file. This routine takes two parameters: a function
1144 pointer of type <tt>xml_writefunc_t</tt>, specifying the function to be
1145 called to output data, and an arbitrary pointer value which is passed
1146 unchanged to the function. The <tt>xml_writefunc_t</tt> type is defined in
1147 <tt>xml.h</tt> as:</p>
1148
1149 <div class="code">int (*<b>xml_writefunc_t</b>)(void *<i>data</i>, const char *<i>fmt</i>, ...)</div>
1150
1151 <p>where <tt><i>data</i></tt> is the pointer parameter passed to
1152 <tt>xml_export()</tt> and <tt><i>fmt</i></tt> is a <tt>printf()</tt>-style
1153 format string. (This prototype was chosen so that <tt>fprintf()</tt> could
1154 be used as a callback function. <tt>sprintf()</tt> also fits the
1155 prototype, but should be avoided due to the likelihood of buffer
1156 overflows.)</p>
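<p>As a minimal sketch of a callback matching this prototype, the following
bounded writer uses <tt>vsnprintf()</tt> rather than <tt>sprintf()</tt>, so
that overlong output is rejected instead of overflowing the buffer. The
<tt>BufWriter</tt> type and <tt>buf_write()</tt> function are hypothetical
names chosen for illustration and are not part of Services; the sketch
assumes the buffer is at least one byte long.</p>

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded output buffer; not part of Services. */
typedef struct {
    char *buf;      /* destination buffer (assumed to hold >= 1 byte) */
    size_t size;    /* total size of buf in bytes */
    size_t used;    /* bytes written so far, excluding the terminator */
} BufWriter;

/* Matches the xml_writefunc_t prototype: appends formatted text to the
 * BufWriter passed as `data`.  Returns the number of bytes appended, or
 * -1 if the text does not fit (in which case nothing is appended). */
int buf_write(void *data, const char *fmt, ...)
{
    BufWriter *w = data;
    size_t avail = w->size - w->used;
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(w->buf + w->used, avail, fmt, args);
    va_end(args);
    if (n < 0 || (size_t)n >= avail) {
        w->buf[w->used] = '\0';  /* discard any partial output */
        return -1;
    }
    w->used += (size_t)n;
    return n;
}
```

<p>Under these assumptions, a caller would initialize a <tt>BufWriter</tt>
with the buffer, its size, and a <tt>used</tt> count of zero, then pass
<tt>buf_write</tt> and a pointer to the structure as the two parameters of
<tt>xml_export()</tt>.</p>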
1157
1158 <p><tt>xml_export()</tt> does not actually export any data itself, other
1159 than writing the <tt>&lt;?xml?&gt;</tt> header tag and top-level
1160 <tt>&lt;ircservices-db&gt;</tt> enclosing tags. Rather, it calls helper
1161 routines to export each class of data, passing the write function pointer
1162 and data pointer along to each routine.</p>
1163
1164 <p>The first of these helper routines is <tt>export_constants()</tt>.
1165 This routine does not export any data <i>per se</i>, but instead writes
1166 out the values of various constants used by Services; this allows other
1167 programs which read in the data to interpret numerical data such as
1168 channel access levels and special values of limits properly, rather than
1169 relying on the definitions used in any particular version of Services (or
1170 whatever other program may have generated the data).</p>
1171
1172 <p>Following this is <tt>export_operserv_data()</tt>, the first of the
1173 actual data export routines. This routine writes out the maximum user
1174 count and timestamp, along with the super-user password if present. The
1175 password is written in encrypted format, and is first passed through the
<tt>xml_quotebuf()</tt> function to prevent special characters
like <tt>&lt;</tt>, <tt>&gt;</tt>, or the null character from causing
problems when the data is read in. This latter function, defined near the
1179 top of the file, converts all non-ASCII bytes in the passed-in buffer to
1180 their equivalent character codes, and converts the three characters
1181 <tt>&lt;</tt> <tt>&gt;</tt> <tt>&amp;</tt> to "<tt>&amp;lt;</tt>",
1182 "<tt>&amp;gt;</tt>", and "<tt>&amp;amp;</tt>" respectively. The size of
the static return buffer, <tt>BUFSIZE*6+1</tt>, is chosen so that an input
buffer of up to <tt>BUFSIZE</tt> bytes can be encoded with no truncation (the
longest possible encoding for a single byte is 6 characters:
"<tt>&amp;#<i>nnn</i>;</tt>").</p>
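<p>The quoting rules described above can be sketched as follows. This is a
hypothetical re-creation, not the actual <tt>xml_quotebuf()</tt> code: the
name <tt>quote_buf()</tt> is invented, the output buffer is supplied by the
caller rather than being static, and the sketch assumes that every byte
outside the printable ASCII range 32 to 126 is written as a numeric
character reference.</p>

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copy the string s to out and return the position just past it. */
static char *append(char *out, const char *s)
{
    size_t n = strlen(s);
    memcpy(out, s, n);
    return out + n;
}

/* Hypothetical sketch of the quoting step.  Encodes `len` bytes from
 * `in` into `out`, which must have room for at least len*6+1 bytes
 * (6 characters for the longest encoding, "&#nnn;", plus a terminator).
 * Assumption: bytes outside 32..126 become numeric references. */
void quote_buf(const char *in, size_t len, char *out)
{
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c == '<')
            out = append(out, "&lt;");
        else if (c == '>')
            out = append(out, "&gt;");
        else if (c == '&')
            out = append(out, "&amp;");
        else if (c < 32 || c > 126)
            out += snprintf(out, 7, "&#%u;", (unsigned int)c);
        else
            *out++ = (char)c;
    }
    *out = '\0';
}
```

<p>Because the input length is passed explicitly, embedded null bytes (as
can occur in an encrypted password buffer) are encoded like any other
byte instead of ending the string early.</p>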
1187
1188 <p>The next routine, <tt>export_nick_db()</tt>, is the first of the true
1189 database export routines, iterating through all nickname groups and then
1190 all nicknames to dump the data for each record to the XML output stream.
The routine takes advantage of the <tt>XML_PUT_*</tt> macros defined at
1192 the top of the source file to simplify the writing of the various structure
1193 fields and substructures. These macros are:</p>
1194
1195 <ul>
1196 <li><b><tt>XML_PUT_STRING()</tt>:</b> Writes out a string field.</li>
1197 <li><b><tt>XML_PUT_PASS()</tt>:</b> Writes out a password field.</li>
1198 <li><b><tt>XML_PUT_LONG()</tt>:</b> Writes out a signed integer field of
1199 size no greater than <tt>long</tt> (but possibly smaller).</li>
1200 <li><b><tt>XML_PUT_ULONG()</tt>:</b> Writes out an unsigned integer field
1201 of size no greater than <tt>unsigned long</tt> (but possibly
1202 smaller).</li>
1203 <li><b><tt>XML_PUT_STRARR()</tt>:</b> Writes out a variable-length string
1204 array field.</li>
1205 </ul>
1206
1207 <p>Each macro takes three parameters: <tt><i>indent</i></tt>, a string
1208 prefixed to the output line for indenting; <tt><i>structure</i></tt>, the
1209 structure (not structure pointer) in which the field to write resides; and
1210 <tt><i>field</i></tt>, the name of the field to write. The value written
1211 is enclosed in tags named the same as the field name.</p>
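<p>One way such macros could be written is with the preprocessor's
stringification operator, which turns the field name into the tag name. The
sketch below is an assumption about the general technique, not a copy of the
real definitions: <tt>PUT_LONG</tt>, <tt>PUT_ULONG</tt>,
<tt>struct demo_record</tt>, <tt>str_write()</tt>, and
<tt>export_demo()</tt> are all hypothetical names, and the macros assume
that variables named <tt>writefunc</tt> and <tt>data</tt> are in scope where
they are used.</p>

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef int (*writefunc_t)(void *data, const char *fmt, ...);

/* Hypothetical macros: #field becomes the tag name, and the cast lets
 * any integer field no larger than (unsigned) long be written. */
#define PUT_LONG(indent, structure, field) \
    writefunc(data, "%s<" #field ">%ld</" #field ">\n", \
              (indent), (long)(structure).field)
#define PUT_ULONG(indent, structure, field) \
    writefunc(data, "%s<" #field ">%lu</" #field ">\n", \
              (indent), (unsigned long)(structure).field)

/* Hypothetical record type used only for this example. */
struct demo_record {
    int level;
    unsigned short flags;
};

#define DEMO_BUFSIZE 128

/* Simple write function: appends to a string buffer of DEMO_BUFSIZE
 * bytes passed as `data`. */
int str_write(void *data, const char *fmt, ...)
{
    char *buf = data;
    size_t used = strlen(buf);
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf + used, DEMO_BUFSIZE - used, fmt, args);
    va_end(args);
    return n;
}

/* Writes one record; note that the structure itself (not a pointer)
 * is passed to the macros, hence the dereference. */
void export_demo(writefunc_t writefunc, void *data,
                 const struct demo_record *rec)
{
    writefunc(data, "<demo>\n");
    PUT_LONG("\t", *rec, level);
    PUT_ULONG("\t", *rec, flags);
    writefunc(data, "</demo>\n");
}
```

<p>With a record whose <tt>level</tt> is -5 and <tt>flags</tt> is 7, this
sketch writes a <tt>level</tt> tag containing -5 and a <tt>flags</tt> tag
containing 7, each indented by one tab inside the enclosing <tt>demo</tt>
tags.</p>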
1212
1213 <p>The subsequent database export routines&mdash;<tt>export_channel_db()</tt>,
<tt>export_news_db()</tt>, <tt>export_maskdata()</tt>, and
1215 <tt>export_statserv_db()</tt>&mdash;export the corresponding databases in a
1216 similar manner. One point of note is the writing of mode locks in
1217 <tt>export_channel_db()</tt>: since the <tt>on</tt> and <tt>off</tt> fields
1218 of the <tt>ModeLock</tt> structure are strings rather than bitmasks in the
1219 <tt>convert-db</tt> tool, as noted in <a href="7.html#s4-1-1">section
1220 7-4-1-1</a>, they are handled differently depending on whether the
1221 preprocessor symbol <tt>CONVERT_DB</tt> is defined.</p>
1222
1223 <p>The <tt>misc/xml-export</tt> module also includes a callback function
1224 for the core's "<tt>command line</tt>" callback, allowing the pseudoclient
1225 databases to be exported without connecting to the network. The callback
1226 function, <tt>do_command_line()</tt>, checks for the <tt>-export</tt>
1227 option; if present, the XML database dump is written to the named file, or
1228 to standard output if no filename is given, and the function returns 3 (on
1229 success) or 2 (on error) to signal the core code to terminate immediately.</p>
1230
1231 <p class="backlink"><a href="#top">Back to top</a></p>
1232
1233
1234 <h4 class="subsubsection-title" id="s4-2">8-4-2. <tt>misc/xml-import</tt>: Data import using XML</h4>
1235
1236 <p>The <tt>misc/xml-import</tt> module, defined in <tt>xml-import.c</tt>,
1237 performs the opposite function of the <tt>misc/xml-export</tt> module,
1238 reading data from an XML file and adding it to the various pseudoclient
1239 databases. As with the <tt>misc/xml-export</tt> module, this module is
1240 heavily intertwined with the pseudoclient modules and is unable to handle
1241 data used by third-party modules. Note that the <tt>xml.h</tt> header file
1242 is included by <tt>xml-import.c</tt>, as it is considered a common XML
1243 header file for both import and export, but there are no declarations in
1244 <tt>xml.h</tt> that are actually used in this module.</p>
1245
1246 <p>Since the import of data will typically create new records, the
1247 <tt>xml-import</tt> module requires a way to allocate and initialize a
1248 record of each of the various structure types. This is done for nickname
1249 and channel records by defining the <tt>STANDALONE_NICKSERV</tt> and
1250 <tt>STANDALONE_CHANSERV</tt> preprocessor symbols and including
1251 <tt>modules/nickserv/util.c</tt> and <tt>modules/chanserv/util.c</tt> (see
1252 also <a href="7.html#s3-1-4">section 7-3-1-4</a>), and for other record
1253 types by allocating with <tt>calloc()</tt> and freeing with custom free
1254 routines. This is admittedly a very kludgey way of doing things, but again
1255 is a carryover from previous versions, before the current database system
1256 was developed.</p>
1257
1258 <p>When importing data, there is the possibility that data in the imported
1259 XML file will conflict with data already stored in Services' databases. In
1260 the case of OperServ mask-data (autokill, etc.) records and StatServ server
1261 entries, the record in the imported data is always dropped; however, for
1262 nicknames and channels, one of several methods of handling collisions can
1263 be chosen. The various methods, along with the corresponding configuration
1264 options and the flags used to represent them internally, are:</p>
1265
1266 <ul>
1267 <li class="spaced"><b><tt>XMLI_NICKCOLL_SKIPGROUP</tt>:</b> When a nickname
1268 in the imported data conflicts with a nickname in the database, the
1269 entire nickname group in the imported data containing the
1270 conflicting nickname is discarded. This is the default behavior.</li>
1271
1272 <li class="spaced"><b><tt>XMLI_NICKCOLL_SKIPNICK</tt>:</b> When a nickname
1273 in the imported data conflicts with a nickname in the database,
1274 only that nickname is discarded; if any other (non-colliding)
1275 nicknames remain in the same nickname group, they are imported
1276 normally, otherwise the resulting empty group is discarded. This
1277 behavior is selected by <tt>OnNicknameCollision skipnick</tt>.</li>
1278
1279 <li class="spaced"><b><tt>XMLI_NICKCOLL_OVERWRITE</tt>:</b> When a nickname
1280 in the imported data conflicts with a nickname in the database, the
1281 nickname in the database is dropped, along with its nickname group
1282 if there are no other nicknames in the group. This behavior is
1283 selected by <tt>OnNicknameCollision overwrite</tt>.</li>
1284
1285 <li class="spaced"><b><tt>XMLI_NICKCOLL_ABORT</tt>:</b> When a nickname in
1286 the imported data conflicts with a nickname in the database, the
1287 import procedure is aborted after the XML data has been read in.
1288 This behavior is selected by <tt>OnNicknameCollision abort</tt>.</li>
1289 </ul>
1290
1291 <ul>
1292 <li class="spaced"><b><tt>XMLI_CHANCOLL_SKIP</tt>:</b> When a channel in
1293 the imported data conflicts with a channel in the database, the
1294 channel in the imported data is discarded. This is the default
1295 behavior.</li>
1296
1297 <li class="spaced"><b><tt>XMLI_CHANCOLL_OVERWRITE</tt>:</b> When a channel
1298 in the imported data conflicts with a channel in the database, the
1299 channel in the database is dropped. This behavior is selected by
1300 <tt>OnChannelCollision overwrite</tt>.</li>
1301
1302 <li class="spaced"><b><tt>XMLI_CHANCOLL_ABORT</tt>:</b> When a channel in
1303 the imported data conflicts with a channel in the database, the
1304 import procedure is aborted after the XML data has been read in.
        This behavior is selected by <tt>OnChannelCollision abort</tt>.</li>
1306 </ul>
1307
1308 <p>One flag from each set is stored in the file-local variable <tt>flags</tt>
1309 at module initialization or reconfiguration time, based on the configuration
1310 file settings.</p>
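<p>A minimal sketch of how one flag from each set might be combined into a
single variable is shown below. The numeric flag values, the two
<tt>*_MASK</tt> constants, and the <tt>set_collision_flags()</tt> function
are assumptions made for illustration; the real definitions and
configuration handling in <tt>xml-import.c</tt> may differ. A <tt>NULL</tt>
setting stands for a directive that is absent from the configuration file,
selecting the default behavior.</p>

```c
#include <stddef.h>
#include <string.h>

/* Flag values and masks are assumptions for illustration only. */
#define XMLI_NICKCOLL_SKIPGROUP 0x0001
#define XMLI_NICKCOLL_SKIPNICK  0x0002
#define XMLI_NICKCOLL_OVERWRITE 0x0003
#define XMLI_NICKCOLL_ABORT     0x0004
#define XMLI_NICKCOLL_MASK      0x000F
#define XMLI_CHANCOLL_SKIP      0x0010
#define XMLI_CHANCOLL_OVERWRITE 0x0020
#define XMLI_CHANCOLL_ABORT     0x0030
#define XMLI_CHANCOLL_MASK      0x00F0

static int flags;  /* one nickname flag ORed with one channel flag */

/* Sets `flags` from the OnNicknameCollision and OnChannelCollision
 * settings (NULL means "not configured").  Returns 1 on success, or 0
 * if a setting is unrecognized, leaving `flags` unchanged. */
int set_collision_flags(const char *nick_setting, const char *chan_setting)
{
    int nick_flag, chan_flag;

    if (nick_setting == NULL)
        nick_flag = XMLI_NICKCOLL_SKIPGROUP;      /* default */
    else if (strcmp(nick_setting, "skipnick") == 0)
        nick_flag = XMLI_NICKCOLL_SKIPNICK;
    else if (strcmp(nick_setting, "overwrite") == 0)
        nick_flag = XMLI_NICKCOLL_OVERWRITE;
    else if (strcmp(nick_setting, "abort") == 0)
        nick_flag = XMLI_NICKCOLL_ABORT;
    else
        return 0;

    if (chan_setting == NULL)
        chan_flag = XMLI_CHANCOLL_SKIP;           /* default */
    else if (strcmp(chan_setting, "overwrite") == 0)
        chan_flag = XMLI_CHANCOLL_OVERWRITE;
    else if (strcmp(chan_setting, "abort") == 0)
        chan_flag = XMLI_CHANCOLL_ABORT;
    else
        return 0;

    flags = nick_flag | chan_flag;
    return 1;
}
```

<p>Code that later needs to know which nickname policy is active would, in
this sketch, compare <tt>flags &amp; XMLI_NICKCOLL_MASK</tt> against the
individual nickname flag values, and likewise for channels.</p>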
1311
1312 <p>XML input is assumed to be from a file, whose file pointer is stored in
1313 the file-local variable <tt>import_file</tt>. The local function
1314 <tt>get_byte()</tt> reads in a byte from this file, returning the value of
1315 that byte or -1 on error, as well as performing buffering (which is
1316 probably redundant with the buffering performed by the stdio functions) and
1317 updating byte and line counters for use in error messages. The macro
1318 <tt>NEXT_BYTE</tt> encapsulates this call, assigning the return value of
1319 <tt>get_byte()</tt> to a variable <tt>c</tt> and returning -1 when
1320 end-of-file is reached.</p>
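<p>The relationship between <tt>get_byte()</tt> and <tt>NEXT_BYTE</tt> can
be sketched as follows. This is a simplified illustration rather than the
module's code: it omits the extra buffering layer, the counter names
<tt>bytes_read</tt> and <tt>lines_read</tt> are invented, and
<tt>skip_to()</tt> is a hypothetical caller added only to show how the
macro's early return behaves.</p>

```c
#include <stdio.h>

static FILE *import_file;   /* file being imported */
static long bytes_read;     /* hypothetical counter: bytes consumed */
static int lines_read;      /* hypothetical counter: newlines seen */

/* Reads one byte from import_file, updating the counters.  Returns the
 * byte value (0..255), or -1 on end-of-file or read error. */
int get_byte(void)
{
    int c = getc(import_file);

    if (c == EOF)
        return -1;
    bytes_read++;
    if (c == '\n')
        lines_read++;
    return c;
}

/* Stores the next byte in the local variable `c`; makes the enclosing
 * function return -1 if no byte is available. */
#define NEXT_BYTE do {          \
    c = get_byte();             \
    if (c < 0)                  \
        return -1;              \
} while (0)

/* Hypothetical caller: consumes input up to and including the first
 * occurrence of `target`.  Returns 0 if found, -1 if input ran out. */
int skip_to(int target)
{
    int c;

    do {
        NEXT_BYTE;
    } while (c != target);
    return 0;
}
```

<p>Hiding the end-of-file check inside the macro keeps parsing loops short,
at the cost of a hidden <tt>return</tt> statement that a reader of the
calling function has to keep in mind.</p>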
1321
1322 <p>The XML data is processed by a simple XML parser, implemented by the
1323 <tt>parse_tag()</tt> routine. This routine calls <tt>read_tag()</tt> to
1324 parse a single tag, then looks up the tag in the <tt>tags[]</tt> table and
1325 calls the associated handler to read and process the tag's contents, and
1326 returns a pointer to those contents (whose type can vary depending on the
1327 tag). The function has three special return values: <tt>CONTINUE</tt> for
1328 tags that were processed successfully but contain no data, <tt>NULL</tt> to
1329 indicate an error processing a tag, or <tt>PARSETAG_END</tt> when the
1330 closing tag corresponding to the tag given in the <tt><i>caller_tag</i></tt>
1331 parameter has been found (or end-of-file is reached). The parser does not
1332 handle empty tags (of the "<tt>&lt;tag/&gt;</tt>" syntax), as they are not
1333 used in well-formed Services data dumps; every tag has some sort of data
1334 associated with it.</p>
1335
1336 <p><tt>read_tag()</tt>, in turn, reads bytes from the file until it locates
1337 the beginning of a tag, then parses the tag name and any attribute (only
1338 the first attribute is processed). The function itself returns 1 for an
1339 opening tag, 0 for a closing tag, or a negative value on error; the tag
1340 name, attribute name, attribute value, pre-tag text, and text length are
1341 stored in the variables pointed to by the parameters <tt><i>tag_ret</i></tt>,
1342 <tt><i>attr_ret</i></tt>, <tt><i>attrval_ret</i></tt>,
1343 <tt><i>text_ret</i></tt>, and <tt><i>textlen_ret</i></tt>, respectively.
1344 The strings returned point into a dynamically-allocated buffer local to the
1345 function, which can be freed by calling it with <tt><i>tag_ret</i></tt> set
1346 to <tt>NULL</tt>.</p>
1347
1348 <p>Each tag handler takes as parameters the tag name, attribute name
1349 (<tt>NULL</tt> if no attribute is present), and attribute value string
1350 (also <tt>NULL</tt> if no attribute is present). Since many tags consist
1351 of simple integer or string values, they make use of the common handlers
1352 <tt>th_text()</tt>, <tt>th_int32()</tt>, <tt>th_uint32()</tt>,
1353 <tt>th_time()</tt>, and <tt>th_strarray()</tt>. Of these, <tt>th_text()</tt>
1354 returns a <tt>TextInfo</tt> structure containing the <tt>malloc()</tt>'d
1355 text buffer, null-terminated, along with the length in bytes of the string
1356 (not including the null terminator); <tt>th_strarray()</tt> returns an
1357 <tt>ArrayInfo</tt> structure containing the <tt>malloc()</tt>'d,
1358 null-terminated string elements and element count; the other handlers
1359 return a pointer to the relevant type. The returned variables themselves
1360 are stored in static buffers local to each handler.</p>
1361
1362 <p>For simple tag handlers like the standard handlers mentioned above,
1363 handling a tag consists of simply parsing the text between the start and
1364 end tags for that tag. This is done by repeatedly calling
1365 <tt>parse_tag()</tt>, passing the handler's <tt><i>tag</i></tt> parameter
1366 as <tt><i>caller_tag</i></tt>, until the function returns
1367 <tt>PARSETAG_END</tt>, and converting the inter-tag text from the final
1368 <tt>parse_tag()</tt> call (the code assumes no intervening tags) to the
1369 proper format. For the case of <tt>th_strarray()</tt>, the
1370 <tt>parse_tag()</tt> loop checks for <tt>&lt;array-element&gt;</tt> tags,
1371 converting their contents to an <tt>ArrayInfo</tt> structure.</p>
1372
1373 <p>The handlers for specific types, like <tt>NickInfo</tt> and
1374 <tt>ChannelInfo</tt>, are more complex, having to deal with multiple
1375 subtags, but follow the same general structure. These handlers return
1376 dynamically allocated structures which are added directly into the import
1377 data list upon being returned from the tag handler.</p>
1378
1379 <p>The overall import process consists of reading the contents of the
<tt>&lt;ircservices-db&gt;</tt> tag into data structures in memory, then
1381 merging those data structures into the appropriate databases. The reading
1382 and parsing is handled by the <tt>read_data()</tt> routine; if it succeeds,
1383 the data is then merged into the databases with <tt>merge_data()</tt>, and
1384 the loaded data is freed with <tt>free_data()</tt>. These routines are
1385 called by the top-level <tt>xml_import()</tt> function.</p>
1386
1387 <p><tt>read_data()</tt> takes the place of the tag handler for the
1388 <tt>&lt;ircservices-db&gt;</tt> tag, which is read in manually by
1389 <tt>xml_import()</tt> (by calling <tt>read_tag()</tt>). Like other tag
1390 handlers, it loops calling <tt>parse_tag()</tt> to read in subtag contents,
1391 adding each returned structure into the temporary databases used for
1392 storing the data to import. <tt>read_data()</tt> also takes care of
1393 checking for collisions with data already existing in the pseudoclient
1394 databases, and taking proper action in such cases. The routine returns
1395 nonzero if all data was successfully read in and no collisions caused an
1396 abort, else zero.</p>
1397
1398 <p>If <tt>read_data()</tt> succeeds, <tt>merge_data()</tt> is then called
1399 to store the read-in records in the main Services databases. An extra
1400 check is performed here for nicknames and channels, ensuring that no
1401 collisions occur unless the collision flags specified overwriting current
1402 records; deletion of such colliding records is also performed at this stage
1403 (rather than when the data is read in, to avoid the case of a nickname or
1404 channel getting deleted and an error then being found later in the imported
1405 data). In the case of colliding nickname group IDs, the imported group is
1406 renumbered to use a free ID value, and all relevant channel entries
1407 (founders, successors, and access list entries) are adjusted accordingly.</p>
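<p>The renumbering step can be illustrated with the simplified sketch
below. The <tt>DemoGroup</tt> and <tt>DemoChannel</tt> types and the
<tt>renumber_group()</tt> function are hypothetical stand-ins for the real
structures; the sketch assumes that ID 0 means "no group", searches
upward from 1 for a free value, and updates only founder and successor
references (access list entries would need the same treatment).</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified records. */
typedef struct { uint32_t id; } DemoGroup;
typedef struct { uint32_t founder; uint32_t successor; } DemoChannel;

/* Returns nonzero if any of the n groups uses the given ID. */
static int id_used(const DemoGroup *groups, size_t n, uint32_t id)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (groups[i].id == id)
            return 1;
    }
    return 0;
}

/* If imported[index] has an ID already used in `existing`, assigns it
 * the lowest ID (starting from 1) used by neither existing nor imported
 * groups, and rewrites imported channel references to the old ID.
 * Returns the group's final ID. */
uint32_t renumber_group(DemoGroup *imported, size_t n_imported, size_t index,
                        const DemoGroup *existing, size_t n_existing,
                        DemoChannel *chans, size_t n_chans)
{
    uint32_t old_id = imported[index].id;
    uint32_t new_id = 1;
    size_t i;

    if (!id_used(existing, n_existing, old_id))
        return old_id;               /* no collision: keep the ID */
    while (id_used(existing, n_existing, new_id)
           || id_used(imported, n_imported, new_id))
        new_id++;
    imported[index].id = new_id;
    for (i = 0; i < n_chans; i++) {
        if (chans[i].founder == old_id)
            chans[i].founder = new_id;
        if (chans[i].successor == old_id)
            chans[i].successor = new_id;
    }
    return new_id;
}
```

<p>Checking the candidate ID against the other imported groups as well as
the existing database keeps a renumbered group from taking an ID that a
later imported group still expects to use.</p>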
1408
1409 <p>The top-level <tt>xml_import()</tt> function is in turn called by the
1410 <tt>do_command_line()</tt> callback function, hooked into the core's
1411 "<tt>command line</tt>" callback. Like the <tt>misc/xml-export</tt>
1412 module, this module checks for a specific command-line option (in this
case, "<tt>-import</tt>"); if found, <tt>xml_import()</tt> is called with
1414 the file given as a parameter to the option (an error is generated if the
1415 parameter is missing or the file cannot be opened), and the function's
1416 return value (2 or 3) signals Services to exit with an exit code indicating
1417 the success or failure of the import.</p>
1418
1419 <p>Formerly, the <tt>httpd/dbaccess</tt> module (see <a href="#s2-8">section
1420 8-2-8</a>) also provided the ability to import XML data via this module, by
1421 uploading a file via HTTP. This functionality was removed, however, mainly
1422 to avoid the security and stability issues raised by deleting data records
1423 (nicknames and channels) already in use on the network.</p>
1424
1425 <p class="backlink"><a href="#top">Back to top</a></p>
1426
1427 <!------------------------------------------------------------------------>
1428 <hr/>
1429
1430 <p class="backlink"><a href="7.html">Previous section: Services pseudoclients</a> |
1431 <a href="index.html">Table of Contents</a> |
1432 <a href="9.html">Next section: The database conversion tool</a></p>
1433
1434 </body>
1435 </html>