<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="Content-Style-Type" content="text/css"/>
<style type="text/css">@import "style.css";</style>
<title>IRC Services Technical Reference Manual - 8. Other modules</title>
</head>

<body>
11 |
<h1 class="title" id="top">IRC Services Technical Reference Manual</h1> |
12 |
|
13 |
<h2 class="section-title">8. Other modules</h2> |
14 |
|
15 |
<p class="section-toc"> |
16 |
8-1. <a href="#s1">Encryption modules</a> |
17 |
<br/> 8-1-1. <a href="#s1-1"><tt>encryption/md5</tt>: MD5 hashing</a> |
18 |
<br/> 8-1-2. <a href="#s1-2"><tt>encryption/unix-crypt</tt>: Encryption with the <tt>crypt()</tt> system function</a> |
19 |
<br/>8-2. <a href="#s2">HTTP server modules</a> |
20 |
<br/> 8-2-1. <a href="#s2-1">Client data structure and related constants</a> |
21 |
<br/> 8-2-2. <a href="#s2-2">HTTP server utility routines</a> |
22 |
<br/> 8-2-3. <a href="#s2-3"><tt>httpd/main</tt>: Main server module</a> |
23 |
<br/> 8-2-4. <a href="#s2-4"><tt>httpd/auth-ip</tt>: Authorization by IP address</a> |
24 |
<br/> 8-2-5. <a href="#s2-5"><tt>httpd/auth-password</tt>: Authorization by password</a> |
25 |
<br/> 8-2-6. <a href="#s2-6"><tt>httpd/top-page</tt>: Static page for server root</a> |
26 |
<br/> 8-2-7. <a href="#s2-7"><tt>httpd/redirect</tt>: Redirects to nickname/channel URLs</a> |
27 |
<br/> 8-2-8. <a href="#s2-8"><tt>httpd/dbaccess</tt>: Provides database access via HTTP</a> |
28 |
<br/> 8-2-9. <a href="#s2-9"><tt>httpd/debug</tt>: Debugging module</a> |
29 |
<br/>8-3. <a href="#s3">Mail-sending modules</a> |
30 |
<br/> 8-3-1. <a href="#s3-1"><tt>mail/main</tt>: Main mail module</a> |
31 |
<br/> 8-3-2. <a href="#s3-2"><tt>mail/sendmail</tt>: Sends mail using the <tt>sendmail</tt> program</a> |
32 |
<br/> 8-3-3. <a href="#s3-3"><tt>mail/smtp</tt>: Sends mail using SMTP</a> |
33 |
<br/>8-4. <a href="#s4">Miscellaneous modules</a> |
34 |
<br/> 8-4-1. <a href="#s4-1"><tt>misc/xml-export</tt>: Data export using XML</a> |
35 |
<br/> 8-4-2. <a href="#s4-2"><tt>misc/xml-import</tt>: Data import using XML</a> |
36 |
</p> |
37 |
|
38 |
<p class="backlink"><a href="7.html">Previous section: Services pseudoclients</a> | |
39 |
<a href="index.html">Table of Contents</a> | |
40 |
<a href="9.html">Next section: The database conversion tool</a></p> |
41 |
|
42 |
<!------------------------------------------------------------------------> |
43 |
<hr/> |
44 |
|
45 |
<h3 class="subsection-title" id="s1">8-1. Encryption modules</h3> |
46 |
|
47 |
<p>As discussed in <a href="2.html#s9-1">section 2-9-1</a>, Services |
48 |
includes facilities for encrypting passwords. While the Services core |
49 |
provides an interface for encryption, the actual encryption processing is |
50 |
handled by encryption modules, located in the <tt>modules/encryption</tt> |
51 |
directory. Two encryption modules are included with Services: |
52 |
<tt>encryption/md5</tt>, using the MD5 hash function to encrypt passwords, |
53 |
and <tt>encryption/unix-crypt</tt>, using the system library's |
54 |
<tt>crypt()</tt> function.</p> |
55 |
|
56 |
<p>Encryption modules generally have three parts:</p> |
57 |
|
58 |
<ul> |
59 |
<li class="spaced">implementations of the <tt>CipherInfo</tt> functions |
60 |
<tt><i>encrypt</i>()</tt>, <tt><i>decrypt</i>()</tt>, and |
61 |
<tt><i>check_password</i>()</tt>;</li> |
62 |
|
63 |
<li class="spaced">a <tt>CipherInfo</tt> data structure, containing the |
64 |
cipher's identifying name and pointers to the three functions; |
65 |
and</li> |
66 |
|
67 |
<li class="spaced">calls to <tt>register_cipher()</tt> and |
68 |
<tt>unregister_cipher()</tt> in the module initialization and |
69 |
cleanup routines.</li> |
70 |
</ul> |
71 |
|
72 |
<p>The three <tt>CipherInfo</tt> functions mentioned above provide |
73 |
encryption, decryption, and encrypt-and-compare functionality for the |
74 |
particular cipher implemented by the module. They are defined as follows |
75 |
(the actual function names are of course up to the particular module):</p> |
76 |
|
77 |
<dl> |
78 |
<dt><tt>int <b>encrypt</b>(const char *<i>src</i>, int <i>len</i>, char *<i>dest</i>, int <i>size</i>)</tt></dt> |
79 |
<dd>Encrypts the plaintext stored in <tt><i>src</i></tt>, which is |
80 |
<tt><i>len</i></tt> bytes long, and stores the result in the buffer |
81 |
pointed to by <tt><i>dest</i></tt> (of size <tt><i>size</i></tt> |
82 |
bytes). The source plaintext is <i>not</i> (necessarily) |
83 |
null-terminated, and should be treated as a block of binary data |
84 |
rather than a textual string. Returns: |
85 |
<ul> |
86 |
<li>0 on success</li> |
87 |
<li>+<i>N</i> (a positive integer) if the destination buffer is too |
88 |
small; <i>N</i> is the minimum size buffer (in bytes) |
89 |
required to hold the encrypted data</li> |
90 |
<li>-1 on other error</li> |
91 |
</ul></dd> |
92 |
|
93 |
<dt><tt>int <b>decrypt</b>(const char *<i>src</i>, char *<i>dest</i>, int <i>size</i>)</tt></dt> |
94 |
<dd>Decrypts the ciphertext stored in <tt><i>src</i></tt>, storing the |
95 |
result in the buffer pointed to by <tt><i>dest</i></tt> (of size |
96 |
<tt><i>size</i></tt> bytes). Returns: |
97 |
<ul> |
98 |
<li>0 on success</li> |
99 |
<li>+<i>N</i> (a positive integer) if the destination buffer is too
small; <i>N</i> is the minimum size buffer (in bytes)
required to hold the decrypted data</li>
<li>-2 if the encryption algorithm does not allow decryption</li>
103 |
<li>-1 on other error</li> |
104 |
</ul></dd> |
105 |
|
106 |
<dt><tt>int <b>check_password</b>(const char *<i>plaintext</i>, const char *<i>password</i>)</tt></dt> |
107 |
<dd>Compares the null-terminated string <tt><i>plaintext</i></tt> |
108 |
against the encrypted data <tt><i>password</i></tt>. Returns: |
109 |
<ul> |
110 |
<li>1 if the password matches</li> |
111 |
<li>0 if the password does not match</li> |
112 |
<li>-1 if an error occurred while checking</li> |
113 |
</ul></dd> |
114 |
</dl> |
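<p>The return-value conventions above can be illustrated with a minimal
sketch of a one-way (hash-style) cipher. Everything named
<tt>demo_*</tt> and <tt>DEMO_HASH_SIZE</tt> is hypothetical and not part of
Services; the digest is a toy 32-bit FNV-1a hash, chosen only because it is
short, and is <i>not</i> suitable for real passwords. The
<tt>CipherInfo</tt> structure and the <tt>register_cipher()</tt> call are
omitted because their exact layout is not shown in this section.</p>

```c
#include <string.h>

#define DEMO_HASH_SIZE 4   /* hypothetical: size of the toy digest in bytes */

/* Hash len bytes of src into dest, following the documented conventions:
 * 0 on success, +N if dest is too small, -1 on other error. */
int demo_encrypt(const char *src, int len, char *dest, int size)
{
    unsigned long h = 2166136261UL;          /* FNV-1a offset basis */
    unsigned char out[DEMO_HASH_SIZE];
    int i;

    if (!src || !dest || len < 0)
        return -1;                           /* other error */
    if (size < DEMO_HASH_SIZE)
        return DEMO_HASH_SIZE;               /* +N: minimum buffer size */
    for (i = 0; i < len; i++) {              /* src is binary data, not a string */
        h ^= (unsigned char)src[i];
        h = (h * 16777619UL) & 0xFFFFFFFFUL; /* FNV prime, kept to 32 bits */
    }
    for (i = 0; i < DEMO_HASH_SIZE; i++)     /* store big-endian */
        out[i] = (unsigned char)((h >> (24 - 8 * i)) & 0xFF);
    memcpy(dest, out, DEMO_HASH_SIZE);
    return 0;
}

/* A hash cannot be reversed, so report "decryption not allowed". */
int demo_decrypt(const char *src, char *dest, int size)
{
    (void)src; (void)dest; (void)size;
    return -2;
}

/* Hash the plaintext and compare: 1 = match, 0 = no match, -1 = error. */
int demo_check_password(const char *plaintext, const char *password)
{
    char buf[DEMO_HASH_SIZE];

    if (!plaintext || !password)
        return -1;
    if (demo_encrypt(plaintext, (int)strlen(plaintext), buf, (int)sizeof(buf)) != 0)
        return -1;
    return memcmp(buf, password, DEMO_HASH_SIZE) == 0 ? 1 : 0;
}
```

<p>A caller that receives a positive value from the encryption function can
retry with a buffer of at least that many bytes.</p>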
115 |
|
116 |
<p>The core encryption source file, <tt>encrypt.c</tt> in the top source |
117 |
directory, contains definitions of these three functions for use when no |
118 |
encryption module is loaded; the functions simply copy the plaintext string |
119 |
into or out of the provided encryption buffer, truncating as necessary. |
120 |
(As a result, only the first <tt>PASSMAX</tt> bytes of longer passwords are |
121 |
valid; any password beginning with those same bytes will be treated as |
122 |
equivalent, similar to the way old Unix-like systems ignored any characters |
123 |
in passwords after the first 8.)</p> |
124 |
|
125 |
<p class="backlink"><a href="#top">Back to top</a></p> |
126 |
|
127 |
|
128 |
<h4 class="subsubsection-title" id="s1-1">8-1-1. <tt>encryption/md5</tt>: MD5 hashing</h4> |
129 |
|
130 |
<p>The <tt>encryption/md5</tt> module, defined in <tt>md5.c</tt>, uses the |
131 |
MD5 message-digest algorithm to encrypt passwords. The bulk of the file |
132 |
consists of a literal copy of the <tt>md5c.c</tt> implementation published |
133 |
by RSA Data Security, Inc.; the <tt>CipherInfo</tt> implementation function |
134 |
<tt>md5_encrypt()</tt> simply calls these functions to obtain a 16-byte |
135 |
hash of its input and returns that hash (as binary data, not a hexadecimal |
136 |
string).</p> |
137 |
|
138 |
<p>Of the remaining two <tt>CipherInfo</tt> functions, <tt>md5_decrypt()</tt> |
139 |
simply returns the special value -2, indicating that MD5 passwords cannot |
140 |
be decrypted; <tt>md5_check_password()</tt> calls <tt>md5_encrypt()</tt> on |
141 |
the plaintext string it is passed, comparing the resulting hash against the |
142 |
given password buffer to determine whether the password is correct.</p> |
143 |
|
144 |
<p>The module includes one configuration option, |
145 |
<tt>EnableAnopeWorkaround</tt>. This is intended to be used with databases |
146 |
that have been imported from the Epona or Anope programs, some versions of |
147 |
which have a bug (which, to be fair, was inherited from an earlier version |
148 |
of Services) causing MD5-encrypted passwords to be stored incorrectly. The |
149 |
bug is in assuming that the <tt>MD5Final()</tt> routine returns an ASCII |
150 |
string of hexadecimal characters—in fact, it returns the raw 128-bit |
151 |
hash value—and attempting to convert that value into binary, |
152 |
resulting in 8 bytes of garbled hash data and 8 bytes that are essentially |
153 |
random. The workaround implemented by <tt>EnableAnopeWorkaround</tt> |
154 |
performs this same procedure when checking passwords if the hash itself |
155 |
does not match; since it only compares the 8 valid bytes of the corrupted |
156 |
hash, there is naturally a greater possibility of a hash collision, which |
157 |
would result in an incorrect password mistakenly being signaled as correct. |
158 |
See also the relevant part of <a href="../5.html#3-2">section 5-3-2 of the |
159 |
user's manual</a>.</p> |
160 |
|
161 |
<p class="backlink"><a href="#top">Back to top</a></p> |
162 |
|
163 |
|
164 |
<h4 class="subsubsection-title" id="s1-2">8-1-2. <tt>encryption/unix-crypt</tt>: Encryption with the <tt>crypt()</tt> system function</h4> |
165 |
|
166 |
<p>The <tt>encryption/unix-crypt</tt> module, defined in |
167 |
<tt>unix-crypt.c</tt>, makes use of the <tt>crypt()</tt> function defined |
168 |
in the system libraries to encrypt passwords. Due to this, it may not be a |
169 |
desirable choice where portability of data is concerned, since differing |
170 |
systems may have incompatible implementations of <tt>crypt()</tt>; on the |
171 |
other hand, it allows Services to take advantage of more secure encryption |
172 |
algorithms as the operating system comes to support them, without having to |
173 |
write new Services modules as well. The impetus for the development of |
174 |
this module was the use of <tt>crypt()</tt> as one encryption method in the |
175 |
PTlink Services program (coincidentally, it was also this program's use of |
176 |
a "cipher type" field stored with passwords that provided the inspiration |
177 |
for the redesign of encryption functionality in Services 5.0).</p> |
178 |
|
179 |
<p>The only noteworthy aspect of the <tt>encryption/unix-crypt</tt> module |
180 |
is the encryption routine, <tt>unixcrypt_encrypt()</tt>. Since the |
181 |
<tt>crypt()</tt> function requires a null-terminated password string (the |
182 |
input is not guaranteed to be null-terminated) and a "salt" parameter, |
183 |
these have to be prepared beforehand; the password is copied into a buffer |
184 |
of size <tt>PASSMAX</tt> and a trailing null attached, and the "salt" string is
185 |
generated using the <tt>random()</tt> function. These are then passed to |
186 |
<tt>crypt()</tt>, and the result copied into the output buffer, assuming |
187 |
it is large enough. (Some modern systems implement <tt>crypt()</tt> using |
188 |
an MD5 hash, returned as a 32-character hexadecimal string with a |
189 |
distinguishing prefix; for such cases, <tt>PASSMAX</tt> must be raised from |
190 |
the default of 32, or passwords will not fit.)</p> |
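<p>The preparation steps described above might look roughly like the
following sketch. The function name <tt>prepare_crypt_args()</tt> is
hypothetical, the two-character salt drawn from the traditional
<tt>./0-9A-Za-z</tt> alphabet is an assumption, as is the truncation of
overlong input, and standard <tt>rand()</tt> stands in for the
<tt>random()</tt> call used by the module. The call to <tt>crypt()</tt>
itself is left out so that the sketch does not depend on the system's crypt
library.</p>

```c
#include <stdlib.h>
#include <string.h>

#define PASSMAX 32   /* default value mentioned in the text */

/* Traditional crypt() salt alphabet (64 characters). */
static const char salt_chars[] =
    "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

/* Copy len bytes of (possibly unterminated) input into pwbuf as a
 * null-terminated string and fill salt with two random characters.
 * Returns 0 on success, -1 on invalid parameters. */
int prepare_crypt_args(const char *src, int len, char *pwbuf, char *salt)
{
    if (!src || !pwbuf || !salt || len < 0)
        return -1;
    if (len > PASSMAX - 1)
        len = PASSMAX - 1;          /* assumption: truncate overlong input */
    memcpy(pwbuf, src, (size_t)len);
    pwbuf[len] = '\0';              /* crypt() needs a terminated string */
    salt[0] = salt_chars[rand() % 64];
    salt[1] = salt_chars[rand() % 64];
    salt[2] = '\0';
    /* The real routine would now call crypt(pwbuf, salt) and copy the
     * result into the output buffer if it is large enough. */
    return 0;
}
```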
191 |
|
192 |
<p class="backlink"><a href="#top">Back to top</a></p> |
193 |
|
194 |
<!------------------------------------------------------------------------> |
195 |
<hr/> |
196 |
|
197 |
<h3 class="subsection-title" id="s2">8-2. HTTP server modules</h3> |
198 |
|
199 |
<p>Services includes a simple HTTP server that can be used to access |
200 |
Services data from outside IRC. The server is implemented by several |
201 |
modules in the <tt>modules/httpd</tt> directory: a core server module |
202 |
(<a href="#s2-3">section 8-2-3</a>), authorization modules (sections
<a href="#s2-4">8-2-4</a> and <a href="#s2-5">8-2-5</a>), and resource
modules (sections <a href="#s2-6">8-2-6</a> through
<a href="#s2-9">8-2-9</a>). All modules make use of a common header file
containing data structure and constant definitions, described in
<a href="#s2-1">section 8-2-1</a>; there are also several utility functions
208 |
shared by all modules (and compiled into the core server module), discussed |
209 |
in <a href="#s2-2">section 8-2-2</a>.</p> |
210 |
|
211 |
<p class="backlink"><a href="#top">Back to top</a></p> |
212 |
|
213 |
|
214 |
<h4 class="subsubsection-title" id="s2-1">8-2-1. Client data structure and related constants</h4> |
215 |
|
216 |
<p>All modules make use of the header file <tt>http.h</tt>. This header |
217 |
file contains a definition of the <tt>Client</tt> structure, used by the |
218 |
modules to store information about a single client, along with various |
219 |
HTTP-server-related constants and declarations of the utility routines |
220 |
listed in <a href="#s2-2">section 8-2-2</a>.</p> |
221 |
|
222 |
<p>The <tt>Client</tt> structure contains the following fields:</p> |
223 |
|
224 |
<dl> |
225 |
<dt><tt>Socket *<b>socket</b></tt></dt> |
226 |
<dd>Contains the <tt>Socket</tt> structure used for communicating with |
227 |
the client (see <a href="3.html">section 3</a>).</dd> |
228 |
|
229 |
<dt><tt>Timeout *<b>timeout</b></tt></dt> |
230 |
<dd>A timeout (see <a href="2.html#s7">section 2-7</a>) used to |
231 |
disconnect clients after a certain period of idle time.</dd> |
232 |
|
233 |
<dt><tt>char <b>address</b>[22]</tt></dt> |
234 |
<dd>The client's IP address and port number, as a string. (22 bytes is |
235 |
exactly long enough to hold a string of the form |
236 |
"<tt>123.123.123.123:12345</tt>".)</dd> |
237 |
|
238 |
<dt><tt>uint32 <b>ip</b></tt></dt> |
239 |
<dd>The client's IP address, in network byte order.</dd> |
240 |
|
241 |
<dt><tt>uint16 <b>port</b></tt></dt> |
242 |
<dd>The client's (remote) port number, in network byte order.</dd> |
243 |
|
244 |
<dt><tt>int <b>request_count</b></tt></dt> |
245 |
<dd>The number of requests that the client has made over the course of |
246 |
the connection, used to disconnect clients that make more than a |
247 |
certain number of requests.</dd> |
248 |
|
249 |
<dt><tt>int <b>in_request</b></tt></dt> |
250 |
<dd>A flag indicating whether a request is currently being processed |
251 |
for the client.</dd> |
252 |
|
253 |
<dt><tt>char *<b>request_buf</b></tt></dt> |
254 |
<dd>The buffer used to hold request data received from the client.</dd> |
255 |
|
256 |
<dt><tt>int32 <b>request_len</b></tt></dt> |
257 |
<dd>The number of bytes of request data received from the client for |
258 |
this request (<i>i.e.,</i> the number of bytes stored in |
259 |
<tt>request_buf</tt>).</dd> |
260 |
|
261 |
<dt><tt>int <b>version_major</b></tt></dt> |
262 |
<dd>The major version of HTTP in use (the "<tt><i>x</i></tt>" in |
263 |
<tt>HTTP/<i>x</i>.<i>y</i></tt>).</dd> |
264 |
|
265 |
<dt><tt>int <b>version_minor</b></tt></dt> |
266 |
<dd>The minor version of HTTP in use (the "<tt><i>y</i></tt>" in |
267 |
<tt>HTTP/<i>x</i>.<i>y</i></tt>).</dd> |
268 |
|
269 |
<dt><tt>int <b>method</b></tt></dt> |
270 |
<dd>The request method (one of the <tt>HTTP_METHOD_*</tt> constants; see
271 |
below).</dd> |
272 |
|
273 |
<dt><tt>char *<b>url</b></tt></dt> |
274 |
<dd>The URL given by the client. Points into <tt>request_buf</tt>.</dd>
275 |
|
276 |
<dt><tt>char *<b>data</b></tt></dt> |
277 |
<dd><tt>POST</tt> data for the request, or the query string for a |
278 |
<tt>GET</tt> or <tt>HEAD</tt> request. Points into |
279 |
<tt>request_buf</tt>.</dd>
280 |
|
281 |
<dt><tt>int32 <b>data_len</b></tt></dt> |
282 |
<dd><tt>POST</tt> data length, in bytes.</dd> |
283 |
|
284 |
<dt><tt>char **<b>headers</b></tt> |
285 |
<br/><tt>int32 <b>headers_count</b></tt></dt> |
286 |
<dd>A variable-length array containing the request headers. Each |
287 |
element of the array consists of the header name and its value |
288 |
separated by a null byte; the entries point into |
289 |
<tt>request_buf</tt>.</dd>
290 |
|
291 |
<dt><tt>char **<b>variables</b></tt> |
292 |
<br/><tt>int32 <b>variables_count</b></tt></dt> |
293 |
<dd>A variable-length array containing any variables found in |
294 |
<tt>POST</tt> data or a <tt>GET</tt> or <tt>HEAD</tt> request. |
295 |
Each element of the array consists of the variable's name and value |
296 |
separated by a null byte, with URL escapes converted to their |
297 |
respective characters.</dd> |
298 |
</dl> |
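<p>Put together, the fields listed above correspond to a structure along
the following lines. This is a reconstruction from the field list, not a
copy of <tt>http.h</tt>: the integer typedefs and the opaque
<tt>Socket</tt> and <tt>Timeout</tt> declarations are stand-ins, and the
helpers <tt>entry_value()</tt> and <tt>find_variable()</tt> are
hypothetical, included only to show how a "name, null byte, value" array
entry can be read.</p>

```c
#include <stdint.h>
#include <string.h>

typedef int32_t  int32;              /* stand-ins for Services' own typedefs */
typedef uint16_t uint16;
typedef uint32_t uint32;
typedef struct Socket_  Socket;      /* opaque in this sketch */
typedef struct Timeout_ Timeout;

typedef struct {
    Socket *socket;
    Timeout *timeout;
    char address[22];                /* "123.123.123.123:12345" plus null */
    uint32 ip;                       /* network byte order */
    uint16 port;                     /* network byte order */
    int request_count;
    int in_request;
    char *request_buf;
    int32 request_len;
    int version_major, version_minor;
    int method;
    char *url;                       /* points into request_buf */
    char *data;                      /* points into request_buf */
    int32 data_len;
    char **headers;
    int32 headers_count;
    char **variables;
    int32 variables_count;
} Client;

/* Each array entry is "name\0value": the value starts after the first null. */
static const char *entry_value(const char *entry)
{
    return entry + strlen(entry) + 1;
}

/* Hypothetical lookup: first variable with the given name, or NULL. */
const char *find_variable(const Client *c, const char *name)
{
    int32 i;

    for (i = 0; i < c->variables_count; i++) {
        if (strcmp(c->variables[i], name) == 0)
            return entry_value(c->variables[i]);
    }
    return NULL;
}
```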
299 |
|
300 |
<p>There are also several constants defined by the header file:</p> |
301 |
|
302 |
<dl> |
303 |
<dt><tt>HTTP_LINEMAX</tt> (4096)</dt> |
304 |
<dd>Defines the maximum length (including the trailing null byte) of a |
305 |
request line that the server will handle. Lines longer than this |
306 |
will cause the request to be aborted with an HTTP error.</dd> |
307 |
|
308 |
<dt><tt>HTTP_AUTH_*</tt></dt> |
309 |
<dd>Constants used as return values from authorization functions (see |
310 |
<a href="#s2-3">section 8-2-3</a>).</dd> |
311 |
|
312 |
<dt><tt>HTTP_METHOD_*</tt></dt> |
313 |
<dd>Constants used to indicate the request method in the <tt>method</tt> |
314 |
field of the <tt>Client</tt> structure.</dd> |
315 |
</dl> |
316 |
|
317 |
<p>These are followed by constants for the various HTTP return codes, as |
318 |
defined by the relevant RFC documents. Not all (or even most) of these |
319 |
are used by Services modules, but all are included for completeness. The |
320 |
name of each constant includes a character indicating the type of response |
321 |
(much like the first digit of the numeric code): "<tt>I</tt>" for
322 |
Informational, "<tt>S</tt>" for Successful, and so on.</p> |
323 |
|
324 |
<p class="backlink"><a href="#top">Back to top</a></p> |
325 |
|
326 |
|
327 |
<h4 class="subsubsection-title" id="s2-2">8-2-2. HTTP server utility routines</h4> |
328 |
|
329 |
<p>The <tt>util.c</tt> source file contains several common functions used |
330 |
by HTTP server modules, listed below. <tt>util.c</tt> is linked into the |
331 |
main HTTP server module, <tt>httpd/main</tt>, so all submodules can make |
332 |
use of them without the necessity of explicitly importing each function.</p> |
333 |
|
334 |
<dl> |
335 |
<dt><tt>char *<b>http_get_header</b>(Client *<i>c</i>, const char *<i>header</i>)</tt></dt> |
336 |
<dd>Returns the contents of the header <tt><i>header</i></tt> in the |
337 |
given client's currently active request, or <tt>NULL</tt> if the |
338 |
request did not include such a header. If <tt><i>header</i></tt> |
339 |
is <tt>NULL</tt>, returns the next instance of the header last |
340 |
searched for; this usage allows the caller to cycle through |
341 |
multiple headers of the same name, much like <tt>strtok()</tt> |
342 |
iterates through tokens in a string.</dd> |
343 |
|
344 |
<dt><tt>char *<b>http_get_variable</b>(Client *<i>c</i>, const char *<i>variable</i>)</tt></dt> |
345 |
<dd>Returns the contents of the variable <tt><i>variable</i></tt> in |
346 |
the given client's currently active request, or <tt>NULL</tt> if |
347 |
the request did not include such a variable. Like |
348 |
<tt>http_get_header()</tt>, a <tt>NULL</tt> value for the |
349 |
<tt><i>variable</i></tt> parameter allows iterating through |
350 |
multiple instances of a variable.</dd> |
351 |
|
352 |
<dt><tt>char *<b>http_quote_html</b>(const char *<i>str</i>, char *<i>outbuf</i>, int32 <i>outsize</i>)</tt></dt> |
353 |
<dd>Applies HTML-style quoting to <tt><i>str</i></tt>, replacing the
characters <tt>&lt; &gt; &amp;</tt> with "<tt>&amp;lt;</tt>",
"<tt>&amp;gt;</tt>", and "<tt>&amp;amp;</tt>" respectively.
356 |
<!-- It sure is messy trying to talk about HTML in HTML... --> |
357 |
The result is placed in <tt><i>outbuf</i></tt>, and is truncated if |
358 |
necessary to fit within <tt><i>outsize</i></tt> bytes, including |
359 |
the trailing null byte; however, HTML entities inserted by this |
360 |
routine will never be partially truncated (if an entity would cause |
361 |
a buffer overflow, the output string will be terminated at the |
362 |
location where the entity would have been inserted). The routine |
363 |
returns <tt><i>outbuf</i></tt>, except when a parameter is invalid, |
364 |
in which case <tt>NULL</tt> is returned.</dd> |
365 |
|
366 |
<dt><tt>char *<b>http_quote_url</b>(const char *<i>str</i>, char *<i>outbuf</i>, int32 <i>outsize</i>, int <i>slash_question</i>)</tt></dt> |
367 |
<dd>Applies URL escaping to <tt><i>str</i></tt>, replacing with their |
368 |
equivalent <tt>%<i>nn</i></tt> escapes any characters not in the |
369 |
set: |
370 |
<br/><tt> A-Z a-z 0-9 - . _</tt> |
371 |
<br/>As with <tt>http_quote_html()</tt>, stores the (possibly |
372 |
truncated, but without partial escapes) result in |
373 |
<tt><i>outbuf</i></tt>, and returns <tt><i>outbuf</i></tt>, or |
374 |
<tt>NULL</tt> on invalid parameters.</dd> |
375 |
|
376 |
<dt><tt>char *<b>http_unquote_url</b>(char *<i>buf</i>)</tt></dt> |
377 |
<dd>Converts any URL escapes in the string <tt><i>buf</i></tt> to their |
378 |
corresponding characters, overwriting the buffer. A truncated |
379 |
escape at the end of the string is discarded, as is any malformed |
380 |
escape (a <tt>%</tt> followed by two characters, one or both of |
381 |
which are not hexadecimal digits). Returns <tt><i>buf</i></tt>. |
382 |
(Note that Unicode escapes of the form <tt>%U<i>nnnn</i></tt> are |
383 |
<i>not</i> handled by this routine, and will be interpreted as a |
384 |
malformed escape followed by three ordinary characters.)</dd> |
385 |
|
386 |
<dt><tt>void <b>http_send_response</b>(Client *<i>c</i>, int <i>code</i>)</tt></dt> |
387 |
<dd>Sends an HTTP response line with the response code |
388 |
<tt><i>code</i></tt>, followed by a <tt>Date:</tt> header. The |
389 |
header portion of the response is not terminated, so the caller can |
390 |
send additional headers as necessary.</dd> |
391 |
|
392 |
<dt><tt>void <b>http_error</b>(Client *<i>c</i>, int <i>code</i>, const char *<i>format</i>, ...)</tt></dt> |
393 |
<dd>Sends an error message (response headers and body) to the given |
394 |
client, then closes the client's connection. The HTTP response |
395 |
code for the error message is given by <tt><i>code</i></tt>. |
396 |
<tt><i>format</i></tt> gives an optional <tt>printf()</tt>-style |
397 |
format string to use for generating the body of the error message; |
398 |
if it is <tt>NULL</tt>, then default body text is chosen based on |
399 |
the response code.</dd> |
400 |
</dl> |
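<p>The truncation and escape-handling rules described for
<tt>http_quote_html()</tt> and <tt>http_unquote_url()</tt> can be sketched
as follows. These are independent reimplementations written from the
descriptions above, under the hypothetical names
<tt>sketch_quote_html()</tt> and <tt>sketch_unquote_url()</tt>; they are
not the code in <tt>util.c</tt>, and the use of <tt>long</tt> in place of
<tt>int32</tt> is a simplification.</p>

```c
#include <ctype.h>
#include <string.h>

/* Quote < > & as entities; never emit a partial entity when truncating. */
char *sketch_quote_html(const char *str, char *outbuf, long outsize)
{
    char *out = outbuf;

    if (!str || !outbuf || outsize <= 0)
        return NULL;
    while (*str) {
        char one[2];
        const char *rep;
        size_t n;

        switch (*str) {
            case '<': rep = "&lt;";  break;
            case '>': rep = "&gt;";  break;
            case '&': rep = "&amp;"; break;
            default:  one[0] = *str; one[1] = '\0'; rep = one; break;
        }
        n = strlen(rep);
        /* Stop here if the whole replacement plus the null will not fit. */
        if ((size_t)(out - outbuf) + n + 1 > (size_t)outsize)
            break;
        memcpy(out, rep, n);
        out += n;
        str++;
    }
    *out = '\0';
    return outbuf;
}

/* Value of a hexadecimal digit (caller has already checked isxdigit()). */
static int hexval(int ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return ch - 'A' + 10;
}

/* Decode %nn escapes in place; drop malformed and truncated escapes. */
char *sketch_unquote_url(char *buf)
{
    char *in = buf, *out = buf;

    if (!buf)
        return NULL;
    while (*in) {
        if (*in != '%') {
            *out++ = *in++;
            continue;
        }
        if (!in[1] || !in[2])
            break;                  /* truncated escape at end: discard */
        if (isxdigit((unsigned char)in[1]) && isxdigit((unsigned char)in[2]))
            *out++ = (char)(hexval(in[1]) * 16 + hexval(in[2]));
        in += 3;                    /* skip the escape, valid or malformed */
    }
    *out = '\0';
    return buf;
}
```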
401 |
|
402 |
<p class="backlink"><a href="#top">Back to top</a></p> |
403 |
|
404 |
|
405 |
<h4 class="subsubsection-title" id="s2-3">8-2-3. <tt>httpd/main</tt>: Main server module</h4> |
406 |
|
407 |
<p>The core of the HTTP server is implemented by the <tt>httpd/main</tt> |
408 |
module, defined in the source file <tt>main.c</tt> (along with |
409 |
<tt>util.c</tt>, mentioned above). This module takes care of establishing |
410 |
a listener socket with which to accept client connections, receiving and |
411 |
parsing requests from clients, and passing those requests off to handlers |
412 |
which generate data to send back to the client. (The core module does not |
413 |
respond to any requests by itself, except for generating errors for |
414 |
requests that cannot be successfully processed.)</p>
415 |
|
416 |
<p>Unlike most other modules, which take actions in response to messages |
417 |
received from the IRC network, the HTTP server operates independently, |
418 |
relying on the socket framework (see <a href="3.html">section 3</a>) to |
419 |
inform it of activity. The module initialization routine, |
420 |
<tt>init_module()</tt>, opens the port or ports specified by the |
421 |
<tt>ListenTo</tt> configuration directive, creating listener sockets which |
422 |
call back to the <tt>do_accept()</tt> function when a connection is |
423 |
received. The initialization routine also creates two callbacks, |
424 |
"<tt>auth</tt>" and "<tt>request</tt>", into which submodules can hook to |
425 |
provide authorization or request handling services; these are covered in |
426 |
the discussion of request handling below.</p> |
427 |
|
428 |
<p>When a connection has been accepted on a socket, the <tt>do_accept()</tt> |
429 |
routine first ensures that the client address is available (as it may be |
430 |
necessary for authorization purposes), then creates and initializes a |
431 |
<tt>Client</tt> structure in which to store information about the client. |
432 |
This is done before checking the number of active connections so that, if |
433 |
the client is to be disconnected due to load, an appropriate error response |
434 |
can be sent with <tt>http_error()</tt> (which requires a valid |
435 |
<tt>Client</tt> structure). If all goes well, read-line and disconnect |
436 |
callbacks are set on the new socket, along with a timeout (as given by the |
437 |
<tt>IdleTimeout</tt> configuration directive), and <tt>do_accept()</tt> |
438 |
returns.</p> |
439 |
|
440 |
<p>The actual request processing takes place in two stages: first the |
441 |
full request is received from the client (unless the connection is aborted |
442 |
with an error), and then the request is passed to the relevant handlers. |
443 |
These stages are handled by the <tt>do_readline()</tt> socket callback |
444 |
function and the <tt>handle_request()</tt> routine.</p> |
445 |
|
446 |
<p><tt>do_readline()</tt> is called for each line of the request received |
447 |
from the client, and parses each line into appropriate parts of the |
448 |
<tt>Client</tt> structure. The routine tells the first (request) line from |
449 |
subsequent (header) lines by whether or not the <tt>url</tt> field of the |
450 |
<tt>Client</tt> structure is set; if the first line has been successfully |
451 |
processed, this field will always have a non-<tt>NULL</tt> value. Header |
452 |
lines are handled by the subroutine <tt>parse_header()</tt>, which checks |
453 |
whether the line is a new header or a continuation line of a previous |
454 |
header and processes it accordingly.</p> |
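<p>As a rough illustration of first-line parsing, the sketch below splits a
request line in place so that the URL pointer refers into the original
buffer, as the <tt>url</tt> field does. The function name
<tt>parse_request_line()</tt> and the <tt>SK_METHOD_*</tt> constants are
hypothetical; the actual parsing code in <tt>main.c</tt> may differ in both
structure and error handling.</p>

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the HTTP_METHOD_* constants. */
enum { SK_METHOD_GET, SK_METHOD_HEAD, SK_METHOD_POST, SK_METHOD_UNKNOWN };

/* Parse "METHOD URL HTTP/x.y" in place.  On success, returns 1 and sets
 * *url to point into line; returns 0 if the line is malformed. */
int parse_request_line(char *line, int *method, char **url,
                       int *major, int *minor)
{
    char *sp1, *sp2;

    sp1 = strchr(line, ' ');
    if (!sp1)
        return 0;
    *sp1++ = '\0';                       /* terminate the method name */
    sp2 = strchr(sp1, ' ');
    if (!sp2)
        return 0;
    *sp2++ = '\0';                       /* terminate the URL */
    if (sscanf(sp2, "HTTP/%d.%d", major, minor) != 2)
        return 0;

    if (strcmp(line, "GET") == 0)
        *method = SK_METHOD_GET;
    else if (strcmp(line, "HEAD") == 0)
        *method = SK_METHOD_HEAD;
    else if (strcmp(line, "POST") == 0)
        *method = SK_METHOD_POST;
    else
        *method = SK_METHOD_UNKNOWN;
    *url = sp1;                          /* non-NULL once line 1 is parsed */
    return 1;
}
```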
455 |
|
456 |
<p>Once the blank line signaling the end of headers has been received, |
457 |
<tt>do_readline()</tt> checks whether the request has a body part (a |
458 |
<tt>POST</tt> request with a nonzero <tt>Content-Length</tt> header). If |
459 |
so, the read-line callback on the socket is removed, and |
460 |
<tt>do_readdata()</tt> is instead added as a read-data callback; |
461 |
<tt>do_readdata()</tt> reads in the requisite number of body data bytes and |
462 |
calls <tt>handle_request()</tt>. Otherwise, <tt>do_readline()</tt> calls |
463 |
<tt>handle_request()</tt> itself, after first truncating any query portion |
464 |
of the URL of a <tt>GET</tt> or <tt>HEAD</tt> request and putting the query |
465 |
data in the <tt>Client</tt> structure's <tt>data</tt> field.</p> |
466 |
|
467 |
<p><tt>handle_request()</tt> first takes any <tt>GET</tt> query or |
468 |
<tt>POST</tt> data and splits it up into variables and values, by calling |
469 |
either <tt>parse_data()</tt> or <tt>parse_data_multipart()</tt> depending |
470 |
on the request type. After this, it increments the client's request count,
471 |
sets the <tt>in_request</tt> flag, and then sets a local variable |
472 |
<tt>close</tt> which is used to indicate whether the client connection |
473 |
should be closed when the request processing is finished. After this setup |
474 |
is complete, <tt>handle_request()</tt> calls the two callbacks |
475 |
"<tt>auth</tt>" and "<tt>request</tt>" to perform the actual request |
476 |
handling; callback functions for both callbacks take the <tt>Client</tt> |
477 |
structure and a pointer to the <tt>close</tt> variable (which may be |
478 |
modified) as parameters.</p> |
479 |
|
480 |
<p>The "<tt>auth</tt>" callback is used for request authorization. Each |
481 |
callback function must return one of the <tt>HTTP_AUTH_*</tt> values |
482 |
defined in <tt>http.h</tt>. A value of <tt>HTTP_AUTH_ALLOW</tt> causes the |
483 |
request to be allowed at that point, skipping any subsequent callback |
484 |
functions; likewise, a value of <tt>HTTP_AUTH_DENY</tt> causes the request |
485 |
to be immediately denied. <tt>HTTP_AUTH_UNDECIDED</tt> can be used when |
486 |
the callback function has nothing to say about the instant request, and |
487 |
allows the next callback function to handle authorization. If all callback |
488 |
functions return <tt>HTTP_AUTH_UNDECIDED</tt> (or no callback functions are |
489 |
registered), the request is allowed.</p> |
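<p>The decision rule for the "<tt>auth</tt>" callback can be modeled with a
small stand-alone loop. The constant names match those in <tt>http.h</tt>,
but their numeric values here are assumptions, and
<tt>run_auth_chain()</tt> together with the three sample callbacks is
hypothetical; Services itself dispatches through its general callback
mechanism rather than a plain array.</p>

```c
#include <stddef.h>

/* Names as in http.h; the numeric values are assumed for this sketch. */
enum { HTTP_AUTH_UNDECIDED = 0, HTTP_AUTH_ALLOW = 1, HTTP_AUTH_DENY = 2 };

typedef struct Client_ Client;           /* opaque in this sketch */
typedef int (*auth_func)(Client *c, int *close_ptr);

/* Returns 1 if the request may proceed, 0 if it was denied. */
int run_auth_chain(const auth_func *funcs, int count,
                   Client *c, int *close_ptr)
{
    int i;

    for (i = 0; i < count; i++) {
        int res = funcs[i](c, close_ptr);
        if (res == HTTP_AUTH_ALLOW)
            return 1;                    /* allowed now; skip the rest */
        if (res == HTTP_AUTH_DENY)
            return 0;                    /* denied immediately */
        /* HTTP_AUTH_UNDECIDED: let the next function decide */
    }
    return 1;                            /* nobody objected: allow */
}

/* Sample callbacks for experimentation. */
int auth_pass(Client *c, int *close_ptr)
{
    (void)c; (void)close_ptr;
    return HTTP_AUTH_UNDECIDED;
}

int auth_block(Client *c, int *close_ptr)
{
    (void)c; (void)close_ptr;
    return HTTP_AUTH_DENY;
}

int auth_allow(Client *c, int *close_ptr)
{
    (void)c; (void)close_ptr;
    return HTTP_AUTH_ALLOW;
}
```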
490 |
|
491 |
<p>The "<tt>request</tt>" callback is used for actual request processing. |
492 |
Each callback function should check the URL to determine whether it is one |
493 |
to be processed by that function or not; if so, then the routine should |
494 |
take appropriate action and return a nonzero value, causing any subsequent |
495 |
callback functions to be skipped. If all callback functions return zero, |
496 |
the core server module will send a "not found" (404) error to the client.</p> |
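<p>The "first handler to claim the URL wins" behavior of the
"<tt>request</tt>" callback can be sketched in the same way. The
<tt>DemoClient</tt> type, <tt>dispatch_request()</tt>, and both sample
handlers are hypothetical; a real handler would receive the full
<tt>Client</tt> structure and send a response before returning nonzero.</p>

```c
#include <string.h>

typedef struct { const char *url; } DemoClient;   /* reduced stand-in */
typedef int (*request_func)(DemoClient *c, int *close_ptr);

/* Returns 1 if some handler processed the request, 0 if none did
 * (in which case the core module would send a 404 error). */
int dispatch_request(const request_func *funcs, int count,
                     DemoClient *c, int *close_ptr)
{
    int i;

    for (i = 0; i < count; i++) {
        if (funcs[i](c, close_ptr) != 0)
            return 1;                    /* handled; skip the rest */
    }
    return 0;
}

/* Sample handler: claims only the server root. */
int handle_root(DemoClient *c, int *close_ptr)
{
    (void)close_ptr;
    return strcmp(c->url, "/") == 0;
}

/* Sample handler: claims anything under /debug. */
int handle_debug(DemoClient *c, int *close_ptr)
{
    (void)close_ptr;
    return strncmp(c->url, "/debug", 6) == 0;
}
```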
497 |
|
498 |
<p>Once the request has been processed, <tt>handle_request()</tt> either
499 |
closes the socket or clears out the <tt>Client</tt> structure, depending on |
500 |
whether the <tt>close</tt> flag is set (nonzero) or clear (zero). In the |
501 |
latter case, request processing for the connection then starts over with |
502 |
parsing of request lines by <tt>do_readline()</tt>. As an adjunct to the |
503 |
<tt>close</tt> flag, if the <tt>Client</tt> structure's <tt>in_request</tt>
504 |
field has a negative value, the connection is closed as well; this is to |
505 |
allow <tt>http_error()</tt>, which does not receive a pointer to the |
506 |
<tt>close</tt> flag, to signal that the client should be disconnected.</p> |
507 |
|
508 |
<p>It should be noted that client sockets are set to blocking mode (see |
509 |
the description of <tt>sock_set_blocking()</tt> in |
510 |
<a href="3.html#s2-1">section 3-2-1</a>), to simplify implementation of |
511 |
request handlers. Depending on the modules and settings used, this can
512 |
allow a malicious user to cause Services to freeze by requesting a large |
513 |
amount of data from Services (enough to increase the socket buffer to its |
514 |
maximum size) and deliberately not receiving any of that data.</p> |
515 |
|
516 |
<p class="backlink"><a href="#top">Back to top</a></p> |
517 |
|
518 |
|
519 |
<h4 class="subsubsection-title" id="s2-4">8-2-4. <tt>httpd/auth-ip</tt>: Authorization by IP address</h4> |
520 |
|
521 |
<p>The <tt>httpd/auth-ip</tt> module, defined in <tt>auth-ip.c</tt>, is one |
522 |
of two authorization modules included with the Services HTTP server, and
lets requests be allowed or denied based on the IP address of the
524 |
client. The module maintains a list of allow/deny rules, each with an |
525 |
associated URL path prefix, IP address, and network mask; when a request is |
526 |
found that matches a rule's prefix/address/mask triplet, the request is |
527 |
either allowed or denied based on the type of rule. (If the request |
528 |
matches more than one rule, only the first in the table—also the |
529 |
first in the file—is applied.)</p> |
530 |
|
531 |
<p>The callback function for the server core's "<tt>auth</tt>" callback, |
532 |
<tt>do_auth()</tt>, is very simple, needing only to iterate through the |
533 |
rule table to find a matching rule for the request. The hard work of |
534 |
converting the list of <tt>AllowHost</tt> and <tt>DenyHost</tt> rules into |
535 |
a table that can be easily processed is handled at module configuration |
536 |
time via custom handler functions for the two directives, |
537 |
<tt>do_AllowHost()</tt> and <tt>do_DenyHost()</tt>. In fact, these are |
538 |
both stubs which call a common routine, <tt>do_AllowDenyHost()</tt>, with |
539 |
an extra parameter to indicate the rule type (allow or deny).</p> |
540 |
|
541 |
<p>Note that this module interprets "allow" rules to mean "allow unless |
542 |
denied by another authorization method", and not "allow regardless of any |
543 |
other circumstances". Thus, if a request matches an "allow" rule, the |
544 |
callback function returns <tt>HTTP_AUTH_UNDECIDED</tt> rather than |
545 |
<tt>HTTP_AUTH_ALLOW</tt>.</p> |
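<p>The first-match rule lookup and the "allow means undecided" behavior described above can be sketched roughly as follows. This is an illustration only, not the code in <tt>auth-ip.c</tt>: the <tt>AuthRule</tt> layout, the <tt>match_rules()</tt> name, the numeric values of the result codes, the <tt>HTTP_AUTH_DENY</tt> constant, and the result returned when no rule matches are all assumptions made for the example.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder result codes; the real values come from the HTTP server
 * core, and HTTP_AUTH_DENY is an assumed name. */
enum { HTTP_AUTH_UNDECIDED = 0, HTTP_AUTH_ALLOW = 1, HTTP_AUTH_DENY = 2 };

/* Hypothetical rule layout: a prefix/address/mask triplet plus rule type. */
typedef struct {
    const char *prefix;  /* URL path prefix the rule applies to */
    uint32_t addr;       /* IPv4 address, host byte order */
    uint32_t mask;       /* network mask, host byte order */
    int allow;           /* nonzero for AllowHost, zero for DenyHost */
} AuthRule;

/* Return the authorization result for a request.  Rules are checked in
 * table order and only the first matching rule is applied. */
int match_rules(const AuthRule *rules, size_t count,
                const char *url, uint32_t client_ip)
{
    assert(rules != NULL || count == 0);
    assert(url != NULL);
    for (size_t i = 0; i < count; i++) {
        const AuthRule *r = &rules[i];
        if (strncmp(url, r->prefix, strlen(r->prefix)) != 0)
            continue;  /* URL is not under this rule's prefix */
        if ((client_ip & r->mask) != (r->addr & r->mask))
            continue;  /* client is outside this rule's network */
        /* An "allow" match leaves the decision to other auth modules. */
        return r->allow ? HTTP_AUTH_UNDECIDED : HTTP_AUTH_DENY;
    }
    return HTTP_AUTH_UNDECIDED;  /* assumption: no matching rule, no opinion */
}
```

<p>With an "allow" rule for 10.0.0.0/8 followed by a catch-all "deny" rule on the same prefix, a client at 10.1.2.3 would get <tt>HTTP_AUTH_UNDECIDED</tt> and any other client would be denied, since the first matching rule decides.</p>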
546 |
|
547 |
<p class="backlink"><a href="#top">Back to top</a></p> |
548 |
|
549 |
|
550 |
<h4 class="subsubsection-title" id="s2-5">8-2-5. <tt>httpd/auth-password</tt>: Authorization by password</h4> |
551 |
|
552 |
<p>The <tt>httpd/auth-password</tt> module, defined in |
553 |
<tt>auth-password.c</tt>, performs authorization based on a username and |
554 |
password provided by a client (using the WWW-Basic HTTP authorization |
555 |
method). If a request is denied, the authorization handler sends an |
556 |
HTTP "401 Unauthorized" message to the client, giving the realm name |
557 |
specified in the rule to provide a user prompt. Other than this, and the |
558 |
comparative simplicity of the configuration directive handler functions, |
559 |
this module is more or less identical to <tt>auth-ip.c</tt>.</p> |
560 |
|
561 |
<p>As with the <tt>httpd/auth-ip</tt> module (and also mentioned in |
562 |
comments in the source code for this module), "allow" rules are treated as |
563 |
"allow subject to other permission checks" rather than "allow |
564 |
unconditionally", and the callback function <tt>do_auth()</tt> returns |
565 |
<tt>HTTP_AUTH_UNDECIDED</tt> rather than <tt>HTTP_AUTH_ALLOW</tt> for such |
566 |
rules.</p> |
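<p>As a rough illustration of the "401 Unauthorized" response mentioned above, the following sketch formats the status line and the <tt>WWW-Authenticate</tt> header for Basic authentication. The function name <tt>format_401()</tt>, the use of <tt>HTTP/1.1</tt> in the status line, and the choice to reject realm names containing a double quote are assumptions for the example; the module itself may build its response differently.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a minimal 401 response header for Basic authentication into
 * buf.  Returns the number of characters written, or -1 if the realm
 * contains a double quote or buf is too small. */
int format_401(char *buf, size_t size, const char *realm)
{
    assert(buf != NULL && realm != NULL);
    /* A quote in the realm would need escaping; this sketch rejects it. */
    if (strchr(realm, '"') != NULL)
        return -1;
    int n = snprintf(buf, size,
                     "HTTP/1.1 401 Unauthorized\r\n"
                     "WWW-Authenticate: Basic realm=\"%s\"\r\n"
                     "\r\n", realm);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

<p>The realm string is what the browser shows in its username/password prompt, which is why each rule carries its own realm name.</p>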
567 |
|
568 |
<p class="backlink"><a href="#top">Back to top</a></p> |
569 |
|
570 |
|
571 |
<h4 class="subsubsection-title" id="s2-6">8-2-6. <tt>httpd/top-page</tt>: Static page for server root</h4> |
572 |
|
573 |
<p>The <tt>httpd/top-page</tt> module, defined in <tt>top-page.c</tt>, is a |
574 |
very simple request handler which (depending on configuration settings) |
575 |
sends either the contents of a local file or an HTTP redirect in response |
576 |
to a request for the server's top page ("<tt>/</tt>").</p> |
577 |
|
578 |
<p class="backlink"><a href="#top">Back to top</a></p> |
579 |
|
580 |
|
581 |
<h4 class="subsubsection-title" id="s2-7">8-2-7. <tt>httpd/redirect</tt>: Redirects to nickname/channel URLs</h4> |
582 |
|
583 |
<p>The <tt>httpd/redirect</tt> module, defined in <tt>redirect.c</tt>, |
584 |
allows URLs stored with registered nicknames and channels to be accessed |
585 |
through the HTTP server. Two URL prefixes, one each for nicknames and |
586 |
channels, are defined via configuration directives (<tt>NicknamePrefix</tt> |
587 |
and <tt>ChannelPrefix</tt> respectively); when a request is received that |
588 |
matches one of the prefixes, the remainder of the URL is used as a nickname |
589 |
or channel name, and a redirect is sent for the URL associated with the |
590 |
nickname or channel (if the nickname or channel is not registered, or no
URL is stored for it, an error is returned).</p>
592 |
|
593 |
<p>Since the "<tt>#</tt>" character is treated specially by web browsers, |
594 |
channel names are specified without the "<tt>#</tt>", which is added back |
595 |
internally when accessing the channel's data. For example, if |
596 |
<tt>ChannelPrefix</tt> is "<tt>/channel/</tt>", then a URL of |
597 |
"<tt>/channel/SomeChannel</tt>" will redirect to the URL record for the |
598 |
channel <tt>#SomeChannel</tt>.</p> |
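<p>The prefix handling in the example above might look something like the following. The helper name <tt>url_to_channel()</tt> and its buffer-based interface are hypothetical; the sketch only shows the prefix being stripped and the "<tt>#</tt>" being added back.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* If url begins with prefix and names a channel, write "#" plus the
 * remainder of the URL into out and return 1.  Return 0 if the prefix
 * does not match, no name follows it, or the result does not fit. */
int url_to_channel(const char *url, const char *prefix,
                   char *out, size_t outsize)
{
    assert(url != NULL && prefix != NULL && out != NULL);
    size_t plen = strlen(prefix);
    if (strncmp(url, prefix, plen) != 0 || url[plen] == '\0')
        return 0;
    int n = snprintf(out, outsize, "#%s", url + plen);
    return n > 0 && (size_t)n < outsize;
}
```

<p>Called with a URL of "<tt>/channel/SomeChannel</tt>" and a prefix of "<tt>/channel/</tt>", this produces the channel name <tt>#SomeChannel</tt>, which would then be used to look up the stored URL.</p>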
599 |
|
600 |
<p>Naturally, in order to access nickname and channel data, the module |
601 |
must interface with the NickServ and ChanServ modules. This is done via |
602 |
the "<tt>load module</tt>" and "<tt>unload module</tt>" callbacks, which |
603 |
watch for the <tt>nickserv/main</tt> and <tt>chanserv/main</tt> modules to |
604 |
be loaded and save pointers to necessary functions. To avoid problems |
605 |
arising from the order in which the module is loaded, the |
606 |
<tt>init_module()</tt> routine also checks for the presence of these |
607 |
modules, and calls the "<tt>load module</tt>" callback function |
608 |
<tt>do_load_module()</tt> manually if they are already loaded.</p> |
609 |
|
610 |
<p class="backlink"><a href="#top">Back to top</a></p> |
611 |
|
612 |
|
613 |
<h4 class="subsubsection-title" id="s2-8">8-2-8. <tt>httpd/dbaccess</tt>: Provides database access via HTTP</h4> |
614 |
|
615 |
<p>The <tt>httpd/dbaccess</tt> module, defined in <tt>dbaccess.c</tt>,
616 |
provides access to the data stored in the Services pseudoclient databases. |
617 |
It is easily the most complex of the HTTP server modules, as it must |
618 |
interface with each of the pseudoclient modules to obtain the data it |
619 |
provides to the client, and it must remain up-to-date with any changes to |
620 |
the internal data storage format used by the various modules.</p> |
621 |
|
622 |
<p>At the top of the file are several definitions used to simplify access |
623 |
to imported functions and variables. As noted in the source code, these |
624 |
are only referenced when the corresponding module has been loaded and
625 |
the symbols successfully dereferenced, so there is no need to check the |
626 |
pointers for <tt>NULL</tt> values. <i>(Implementation note: Nonetheless, |
627 |
it would be a good idea to do so anyway, just in case.)</i> These are |
628 |
followed by the <tt>PRINT_SELOPT()</tt> macro, used to generate HTML for |
629 |
selecting among one of several display options, and the |
630 |
<tt>my_strftime()</tt> function, which converts a <tt>time_t</tt> timestamp |
631 |
value to a standard-format string and HTML-quotes the result.</p> |
632 |
|
633 |
<p>The main request handler routine, <tt>do_request()</tt>, is located |
634 |
following these initial definitions. The only actual work performed by |
635 |
this routine, however, is checking the URL against the prefix defined for |
636 |
use by the module (in the <tt>Prefix</tt> variable, set by the same-named |
637 |
configuration directive), and generating a root page under <tt>Prefix</tt> |
638 |
linking to each of the available sets of data, one per pseudoclient
639 |
(and one for XML export, as noted below). All requests for subpages are |
640 |
delivered to the appropriate subpath handler.</p> |
641 |
|
642 |
<p>This routine is followed by the subpath handlers themselves, each with a |
643 |
name of the form <tt>handle_<i>XXX</i>()</tt> indicating the subpath |
644 |
handled by the routine (with a few exceptions, noted below). Each handler |
645 |
takes the <tt>Client *<i>c</i></tt> and <tt>int *<i>close_ptr</i></tt> |
646 |
parameters from the original request, along with a <tt>char *<i>path</i></tt> |
647 |
parameter indicating the remainder of the URL path below the handler's own |
648 |
subpath.</p> |
649 |
|
650 |
<p>The first of these handlers is the OperServ data handler, |
651 |
<tt>handle_operserv()</tt>. In addition to the current number of users |
652 |
and operators along with basic data recorded by OperServ (the maximum user |
653 |
count and time), the page includes links to further subhandlers for |
654 |
autokills and exclusions, news items, session exceptions, and S-lines. |
655 |
Each of these has its own handler function; with the exception of news |
656 |
items (handled by <tt>handle_operserv_news()</tt>), the subhandlers make |
657 |
use of a common routine, <tt>handle_operserv_maskdata()</tt>, to output |
658 |
the appropriate data. (However, there is no support for an explicit path |
659 |
<tt>/operserv/maskdata</tt>.)</p> |
660 |
|
661 |
<p>The <tt>handle_operserv_maskdata()</tt> routine has two modes of |
662 |
operation, as do many of the lowest-level data handlers. When called with |
663 |
no further subpath (<i>e.g.</i> <tt>/operserv/akill/</tt>), a list of |
664 |
mask-data records of the appropriate type is sent to the client as a list |
665 |
of links. Selecting one of these will go to a path with that string as the |
666 |
final path element, and will cause the routine to display detailed |
667 |
information about the selected entry, much like using the <tt>VIEW</tt> |
668 |
subcommand of OperServ's various mask-data commands.</p> |
669 |
|
670 |
<p>Unlike the other OperServ data sets, there is no detailed information to |
671 |
show about news items. Therefore, the <tt>handle_operserv_news()</tt> |
672 |
routine simply outputs a list of news items (both logon news and operator |
673 |
news), like the <tt>LOGONNEWS LIST</tt> and <tt>OPERNEWS LIST</tt> |
674 |
commands.</p> |
675 |
|
676 |
<p>The OperServ data handlers are followed by <tt>handle_nickserv()</tt>, |
677 |
for displaying nickname data. Unlike <tt>handle_operserv()</tt>, this |
678 |
routine does not call on any subroutines, as there are only two modes of |
679 |
operation: listing registered nicknames (handled at the top of the routine) |
680 |
and displaying detailed information on a specific nickname (handled by the |
681 |
long remainder of the routine). The length of the routine is mainly the |
682 |
result of the need to quote all special characters in nickname data, to |
683 |
prevent malicious users from corrupting the output by setting particular |
684 |
strings in their nickname data.</p> |
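<p>The kind of quoting referred to here can be sketched as below. This is a generic HTML-escaping routine written for illustration; the name <tt>html_quote()</tt>, its signature, and the exact set of characters replaced are assumptions and may not match the helper actually used by the HTTP server.</p>

```c
#include <assert.h>
#include <string.h>

/* Copy src to dest, replacing HTML-special characters with entities.
 * Returns 1 on success, or 0 if dest (destsize bytes) is too small. */
int html_quote(const char *src, char *dest, size_t destsize)
{
    assert(src != NULL && dest != NULL);
    size_t used = 0;
    for (; *src; src++) {
        char single[2] = { *src, '\0' };
        const char *rep;
        switch (*src) {
            case '<': rep = "&lt;";   break;
            case '>': rep = "&gt;";   break;
            case '&': rep = "&amp;";  break;
            case '"': rep = "&quot;"; break;
            default:  rep = single;   break;
        }
        size_t len = strlen(rep);
        if (used + len + 1 > destsize)
            return 0;  /* no room for this piece plus the terminator */
        memcpy(dest + used, rep, len);
        used += len;
    }
    if (used + 1 > destsize)
        return 0;
    dest[used] = '\0';
    return 1;
}
```

<p>Without this step, a user who stored a string such as <tt>&lt;script&gt;</tt> in a nickname field could have it interpreted as markup by the browser of anyone viewing the page.</p>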
685 |
|
686 |
<p>This is followed by <tt>handle_chanserv()</tt>, which functions |
687 |
similarly to <tt>handle_nickserv()</tt> except that it works on channels |
688 |
rather than nicknames. However, to reduce the amount of data sent in |
689 |
response to a single request, the privilege level, channel access, and |
690 |
autokick lists are split off into separate pages, accessed by appending |
691 |
"<tt>/levels</tt>", "<tt>/access</tt>", or "<tt>/autokick</tt>" |
692 |
respectively to the URL. The local variable <tt>mode</tt> keeps track of |
693 |
what type of data the routine is to display.</p> |
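<p>One way to picture the role of the <tt>mode</tt> variable is the following sketch, which splits the remaining path into a channel name and a display mode. The <tt>ChanMode</tt> enumeration, the <tt>split_channel_path()</tt> name, and the treatment of unknown suffixes are hypothetical; the real routine may parse the path differently.</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical display modes for the channel page. */
enum ChanMode {
    MODE_INFO, MODE_LEVELS, MODE_ACCESS, MODE_AUTOKICK, MODE_INVALID
};

/* Split "name" or "name/suffix" in place: on return, path holds only
 * the channel name, and the result says which page was requested. */
enum ChanMode split_channel_path(char *path)
{
    assert(path != NULL);
    char *slash = strchr(path, '/');
    if (slash == NULL)
        return MODE_INFO;          /* no suffix: main channel page */
    *slash = '\0';                 /* cut the suffix off the name */
    const char *suffix = slash + 1;
    if (*suffix == '\0')
        return MODE_INFO;          /* trailing slash only */
    if (strcmp(suffix, "levels") == 0)
        return MODE_LEVELS;
    if (strcmp(suffix, "access") == 0)
        return MODE_ACCESS;
    if (strcmp(suffix, "autokick") == 0)
        return MODE_AUTOKICK;
    return MODE_INVALID;           /* unknown subpage */
}
```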
694 |
|
695 |
<p>Next is <tt>handle_statserv()</tt>, which predictably displays |
696 |
information from the StatServ pseudoclient's database. As StatServ |
697 |
currently only tracks a minimal amount of data, the implementation is |
698 |
comparatively simple, either listing the servers recorded with StatServ or |
699 |
displaying information for a selected server.</p> |
700 |
|
701 |
<p>Finally, <tt>handle_xml_export()</tt> is used to generate an XML data |
702 |
set containing all data registered with Services pseudoclients, using the |
703 |
<tt>misc/xml-export</tt> module described in <a href="#s4-1">section |
704 |
8-4-1</a>. As browsers may attempt to parse the data rather than |
705 |
displaying or saving it if a content type of <tt>text/xml</tt> is used, |
706 |
the module instead sends the type <tt>text/plain</tt>. (The acerbic |
707 |
comment in the source code has to do with a misfeature in at least some |
708 |
versions of the Microsoft Internet Explorer web browser; such versions |
709 |
ignore a <tt>Content-Type: text/plain</tt> header and attempt to interpret |
710 |
the data using internal heuristics, resulting in users being unable to view |
711 |
the XML data.)</p> |
712 |
|
713 |
<p class="backlink"><a href="#top">Back to top</a></p> |
714 |
|
715 |
|
716 |
<h4 class="subsubsection-title" id="s2-9">8-2-9. <tt>httpd/debug</tt>: Debugging module</h4> |
717 |
|
718 |
<p>The <tt>httpd/debug</tt> module, defined in <tt>debug.c</tt>, is intended
719 |
to be used for debugging the HTTP server, and dumps several fields of the |
720 |
<tt>Client</tt> structure in response to requests to a particular URL (set |
721 |
by the <tt>DebugURL</tt> configuration directive). While the module does |
722 |
not return any sensitive information to the client, only information about |
723 |
the client itself, it is still bad practice to leave any unnecessary |
724 |
functionality such as this enabled, so this module should not be (and is |
725 |
not intended to be) loaded except when debugging.</p> |
726 |
|
727 |
<p>The <tt>do_request()</tt> function in the source code, which does the |
728 |
actual request handling, also includes a number of comments explaining the |
729 |
request-handling process in more detail.</p> |
730 |
|
731 |
<p class="backlink"><a href="#top">Back to top</a></p> |
732 |
|
733 |
<!------------------------------------------------------------------------> |
734 |
<hr/> |
735 |
|
736 |
<h3 class="subsection-title" id="s3">8-3. Mail-sending modules</h3> |
737 |
|
738 |
<p>In order to facilitate features such as mail authentication and memo |
739 |
forwarding, Services includes a set of modules allowing mail to be sent to |
740 |
remote systems. As with the built-in HTTP server described in |
741 |
<a href="#s2">section 8-2</a>, this functionality operates independently |
742 |
of the primary pseudoclients and IRC network connection (except to the |
743 |
extent that the sending of mail is typically initiated in response to a |
744 |
pseudoclient command).</p> |
745 |
|
746 |
<p>The mail-sending subsystem is composed of a core module implementing the |
747 |
mail interface, <tt>mail/main</tt>, and submodules for specific methods of |
748 |
sending mail. All relevant source files are located in the |
749 |
<tt>modules/mail</tt> directory.</p> |
750 |
|
751 |
<p class="backlink"><a href="#top">Back to top</a></p> |
752 |
|
753 |
|
754 |
<h4 class="subsubsection-title" id="s3-1">8-3-1. <tt>mail/main</tt>: Main mail module</h4> |
755 |
|
756 |
<p>The core mail-sending functionality is located in the <tt>mail/main</tt> |
757 |
module, defined in <tt>main.c</tt>. The module consists of two interfaces: |
758 |
an external interface, declared in the <tt>mail.h</tt> header file, for use |
759 |
by other modules to send mail, and an internal interface, declared in the |
760 |
<tt>mail-local.h</tt> header file, used for communicating with the |
761 |
low-level modules that perform the actual send operation.</p> |
762 |
|
763 |
<p>The external interface consists of a single function, <tt>sendmail()</tt>, |
764 |
declared as follows:</p> |
765 |
|
766 |
<div class="code">void <b>sendmail</b>(const char *<i>to</i>, const char *<i>subject</i>, |
767 |
const char *<i>body</i>, const char *<i>charset</i>, |
768 |
MailCallback <i>completion_callback</i>, void *<i>callback_data</i>)</div> |
769 |
|
770 |
<ul> |
771 |
<li class="spaced"><tt>const char *<i>to</i></tt>: The address to which the |
772 |
message is to be sent.</li> |
773 |
<li class="spaced"><tt>const char *<i>subject</i></tt>: The subject line to |
774 |
use with the message.</li> |
775 |
<li class="spaced"><tt>const char *<i>body</i></tt>: The body of the |
776 |
message (newlines are permitted within the message body).</li> |
777 |
<li class="spaced"><tt>const char *<i>charset</i></tt>: <i>Optional.</i> |
778 |
The MIME character set (<i>e.g.</i>, "<tt>iso-8859-1</tt>") in |
779 |
which the message text is written. If not specified, no character |
780 |
set is assumed.</li> |
781 |
<li class="spaced"><tt>MailCallback <i>completion_callback</i></tt>: |
782 |
<i>Optional.</i> The function to be called when mail sending |
783 |
completes (see below).</li> |
784 |
<li class="spaced"><tt>void *<i>callback_data</i></tt>: <i>Optional.</i> |
785 |
Arbitrary data passed unchanged to the completion callback.</li> |
786 |
</ul> |
787 |
|
788 |
<p>The first thing to note about this function is that it does not return a |
789 |
value. Mail sending is performed asynchronously (subject to limitations of |
790 |
the particular low-level module in use), so that when the function returns, |
791 |
the requested message has been queued but not necessarily sent. In order |
792 |
to signal the result of a mail-sending operation, <tt>sendmail()</tt> takes |
793 |
a callback function parameter (<tt><i>completion_callback</i></tt>); this |
794 |
function is called when the sending operation has completed, successfully |
795 |
or otherwise. The function type is defined as <tt><i>MailCallback</i></tt> |
796 |
in <tt>mail.h</tt>:</p> |
797 |
|
798 |
<div class="code">typedef void (*<b>MailCallback</b>)(int <i>status</i>, void *<i>data</i>)</div> |
799 |
|
800 |
<p>where <tt><i>data</i></tt> is the <tt><i>callback_data</i></tt> value |
801 |
passed to <tt>sendmail()</tt>, and <tt><i>status</i></tt> is one of the |
802 |
following values:</p> |
803 |
|
804 |
<ul> |
805 |
<li><tt>MAIL_STATUS_SENT</tt>: The message was successfully sent.</li> |
806 |
<li><tt>MAIL_STATUS_ERROR</tt>: An unspecified error occurred while sending |
807 |
the message.</li> |
808 |
<li><tt>MAIL_STATUS_NORSRC</tt>: Insufficient resources were available to |
809 |
perform the send operation.</li> |
810 |
<li><tt>MAIL_STATUS_REFUSED</tt>: Delivery of the message was refused by |
811 |
the remote system.</li> |
812 |
<li><tt>MAIL_STATUS_TIMEOUT</tt>: A timeout occurred while trying to send |
813 |
the message.</li> |
814 |
<li><tt>MAIL_STATUS_ABORTED</tt>: The operation was aborted (because the |
815 |
low-level mail module was removed before the message was sent, for |
816 |
example).</li> |
817 |
</ul> |
818 |
|
819 |
<p>It is important to note that, while <tt>sendmail()</tt> does not wait |
820 |
for the message to be sent before returning, there is nothing preventing |
821 |
the low-level module from delivering the message immediately if possible, |
822 |
and in cases such as sending to a user on the local system, the callback |
823 |
function may be called even before <tt>sendmail()</tt> itself returns! For |
824 |
this reason, the caller must ensure that all setup required by the callback |
825 |
function is performed <i>before</i> calling <tt>sendmail()</tt>.</p> |
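<p>The ordering requirement can be illustrated with a small self-contained sketch. The <tt>sendmail()</tt> defined here is only a stand-in that completes immediately, to mimic the local-delivery case; the <tt>AuthRequest</tt> structure, the <tt>auth_mail_done()</tt> and <tt>send_auth_mail()</tt> functions, and the value given to <tt>MAIL_STATUS_SENT</tt> are all made up for the example.</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAIL_STATUS_SENT 0  /* placeholder value for this sketch */

typedef void (*MailCallback)(int status, void *data);

/* Stand-in for the real sendmail(): it "delivers" at once, so the
 * callback runs before this function returns. */
void sendmail(const char *to, const char *subject, const char *body,
              const char *charset, MailCallback completion_callback,
              void *callback_data)
{
    (void)to; (void)subject; (void)body; (void)charset;
    if (completion_callback != NULL)
        completion_callback(MAIL_STATUS_SENT, callback_data);
}

/* Hypothetical caller-side state tracked across the send operation. */
typedef struct {
    int pending;      /* nonzero while a message is in transit */
    int last_status;  /* status reported by the completion callback */
} AuthRequest;

void auth_mail_done(int status, void *data)
{
    AuthRequest *req = data;
    assert(req != NULL);
    req->pending = 0;
    req->last_status = status;
}

void send_auth_mail(AuthRequest *req, const char *address)
{
    assert(req != NULL && address != NULL);
    /* Set up everything the callback touches BEFORE calling sendmail(). */
    req->pending = 1;
    req->last_status = -1;
    sendmail(address, "Authentication code", "Your code is 12345.\n",
             NULL, auth_mail_done, req);
    /* Setting req->pending = 1 here instead would be a bug: the
     * callback may already have run and cleared it. */
}
```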
826 |
|
827 |
<p><tt>sendmail()</tt>, in turn, does its work by calling out to functions |
828 |
implemented in a low-level module. The interface consists of two functions |
829 |
which the low-level module must provide, along with a function provided by |
830 |
the core module for signaling the completion of a mail operation:</p> |
831 |
|
832 |
<dl> |
833 |
<dt><tt>void (*<b>low_send</b>)(MailMessage *<i>msg</i>)</tt></dt> |
834 |
<dd>Provided by the low-level module, this function performs the actual |
835 |
work of starting the send operation, and is called by |
836 |
<tt>sendmail()</tt> once parameter and other checks have been |
837 |
performed. As with <tt>sendmail()</tt>, the routine does not |
838 |
return a value, but instead calls <tt>send_finished()</tt> (see |
839 |
below) to signal the message's status. Typically, this routine |
840 |
will perform any necessary module-specific checks, then start the |
841 |
asynchronous send operation and return without calling |
842 |
<tt>send_finished()</tt>. |
843 |
|
844 |
<p>The parameter passed to this routine is a structure (see below) |
845 |
describing the message to be sent. On entry, the structure's |
846 |
<tt>from</tt>, <tt>to</tt>, <tt>subject</tt>, and <tt>body</tt> fields
are guaranteed to be non-<tt>NULL</tt>. The strings in these fields
848 |
and the <tt>fromname</tt> field (which may be <tt>NULL</tt>) can be |
849 |
changed freely, but the pointer values should be left |
850 |
unmodified.</p></dd> |
851 |
|
852 |
<dt><tt>void (*<b>low_abort</b>)(MailMessage *<i>msg</i>)</tt></dt> |
853 |
<dd>Provided by the low-level module, this function takes any actions |
854 |
needed to abort the sending of a message currently in progress; |
855 |
the message to abort is indicated by the <tt><i>msg</i></tt> |
856 |
parameter, which will be the same as passed to a previous call to |
857 |
<tt>low_send()</tt>. The given message <i>must</i> be aborted, as |
858 |
there is no way for the routine to signal a failure to abort. The |
859 |
routine should not call <tt>send_finished()</tt>, as the core |
860 |
module will take care of setting the message completion status.</dd> |
861 |
|
862 |
<dt><tt>void <b>send_finished</b>(MailMessage *<i>msg</i>, int <i>status</i>)</tt></dt> |
863 |
<dd>Provided by the core module, this function is called by low-level |
864 |
modules to signal that a message has been successfully sent or an |
865 |
error has occurred that prevents the message from being sent. The |
866 |
<tt><i>msg</i></tt> parameter is the same one passed to |
867 |
<tt>low_send()</tt>, and <tt><i>status</i></tt> is one of the |
868 |
status codes listed above (<tt>MAIL_STATUS_*</tt>).</dd> |
869 |
</dl> |
870 |
|
871 |
<p>As can be seen from the above, both <tt>low_send</tt> and |
872 |
<tt>low_abort</tt> are declared as function pointers in the core module; |
873 |
low-level modules must set these to point to their own implementations of |
874 |
the functions. <i>Implementation note: It would be better to use a |
875 |
<tt>register()</tt>/<tt>unregister()</tt> pair of functions, as with the |
876 |
encryption and database code.</i></p> |
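<p>The function-pointer arrangement can be pictured with the sketch below, which puts a toy "core" and a toy low-level module in one file. The cut-down <tt>MailMessage</tt> structure (including its <tt>finished_status</tt> field), the <tt>dummy_*</tt> functions, the check for an already-installed module, and the value of <tt>MAIL_STATUS_SENT</tt> are assumptions for illustration; only the names <tt>low_send</tt>, <tt>low_abort</tt>, and <tt>send_finished()</tt> come from the interface described above.</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAIL_STATUS_SENT 0  /* placeholder value for this sketch */

/* Cut-down message structure; the real one has more fields. */
typedef struct MailMessage_ {
    const char *to;
    int finished_status;  /* hypothetical: records the reported status */
} MailMessage;

/* --- Core module side --- */
void (*low_send)(MailMessage *msg) = NULL;
void (*low_abort)(MailMessage *msg) = NULL;

void send_finished(MailMessage *msg, int status)
{
    assert(msg != NULL);
    msg->finished_status = status;  /* the real core also runs the callback */
}

/* --- Low-level module side (hypothetical "dummy" module) --- */
void dummy_send(MailMessage *msg)
{
    /* Nothing to deliver in this toy module: report success at once. */
    send_finished(msg, MAIL_STATUS_SENT);
}

void dummy_abort(MailMessage *msg)
{
    (void)msg;  /* nothing in progress; must not call send_finished() */
}

/* Install this module's implementations; fail if one is already set. */
int dummy_init(void)
{
    if (low_send != NULL || low_abort != NULL)
        return 0;
    low_send = dummy_send;
    low_abort = dummy_abort;
    return 1;
}

void dummy_cleanup(void)
{
    low_send = NULL;
    low_abort = NULL;
}
```

<p>A <tt>register()</tt>/<tt>unregister()</tt> pair, as suggested in the implementation note, would move the two assignments and the "already installed" check into the core module instead of leaving them to each low-level module.</p>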
877 |
|
878 |
<p>The <tt>MailMessage</tt> structure used as a parameter in the above |
879 |
functions is used to collect the various parameters of a message into a |
880 |
single group for passing to the low-level modules. The pointer itself also |
881 |
serves as a unique ID value for each message in transit. The structure |
882 |
contains the following fields:</p> |
883 |
|
884 |
<ul> |
885 |
<li><tt>MailMessage *<b>next</b>, *<b>prev</b></tt>: Used by the core |
886 |
module to manage the list of in-transit messages.</li> |
887 |
<li><tt>char *<b>from</b></tt>: Copied from the value given in the |
888 |
<tt>FromAddress</tt> configuration directive.</li> |
889 |
<li><tt>char *<b>fromname</b></tt>: Copied from the value given in the |
890 |
<tt>FromName</tt> configuration directive, or <tt>NULL</tt> if no |
891 |
<tt>FromName</tt> directive was given.</li> |
892 |
<li><tt>char *<b>to</b></tt>: Copied from the <tt><i>to</i></tt> parameter |
893 |
to <tt>sendmail()</tt>.</li> |
894 |
<li><tt>char *<b>subject</b></tt>: Copied from the <tt><i>subject</i></tt> |
895 |
parameter to <tt>sendmail()</tt>.</li> |
896 |
<li><tt>char *<b>body</b></tt>: Copied from the <tt><i>body</i></tt> |
897 |
parameter to <tt>sendmail()</tt>.</li> |
898 |
<li><tt>char *<b>charset</b></tt>: Copied from the <tt><i>charset</i></tt> |
899 |
parameter to <tt>sendmail()</tt>, or <tt>NULL</tt> if the |
900 |
<tt><i>charset</i></tt> parameter was <tt>NULL</tt>.</li> |
901 |
<li><tt>MailCallback <b>completion_callback</b></tt>: Set to the |
902 |
<tt><i>completion_callback</i></tt> parameter to |
903 |
<tt>sendmail()</tt>.</li> |
904 |
<li><tt>void *<b>callback_data</b></tt>: Set to the |
905 |
<tt><i>callback_data</i></tt> parameter to <tt>sendmail()</tt>.</li> |
906 |
<li><tt>Timeout *<b>timeout</b></tt>: Used by the core module to manage |
907 |
send timeouts.</li> |
908 |
</ul> |
909 |
|
910 |
<p>The core module itself, defined in <tt>main.c</tt>, simply serves as a |
911 |
kind of "glue" between external callers and the low-level modules; it |
912 |
consists of the implementations of <tt>sendmail()</tt> and |
913 |
<tt>send_finished()</tt>, along with a timeout callback function |
914 |
(<tt>send_timeout()</tt>) for messages which remain in transit longer than |
915 |
the time specified by the <tt>SendTimeout</tt> configuration directive. |
916 |
When <tt>sendmail()</tt> is called, it performs checks on its parameters |
917 |
(calling the callback function with an error code if a problem is found), |
918 |
then sets up a <tt>MailMessage</tt> structure for the message, activates a |
919 |
timeout if <tt>SendTimeout</tt> is enabled, and calls <tt>low_send()</tt> |
920 |
to begin the actual sending process. When the low-level module calls |
921 |
<tt>send_finished()</tt>, it likewise calls the completion callback |
922 |
function with the specified status, then unlinks and frees the |
923 |
<tt>MailMessage</tt> structure for the message. Messages can be aborted |
924 |
if they time out, or if the core module is removed with any messages |
925 |
still in transit.</p> |
926 |
|
927 |
<p class="backlink"><a href="#top">Back to top</a></p> |
928 |
|
929 |
|
930 |
<h4 class="subsubsection-title" id="s3-2">8-3-2. <tt>mail/sendmail</tt>: Sends mail using the <tt>sendmail</tt> program</h4> |
931 |
|
932 |
<p>The <tt>mail/sendmail</tt> module, defined in <tt>sendmail.c</tt>, makes |
933 |
use of an external "sendmail" program to send mail. The module was |
934 |
designed primarily as a test module to ensure that the core mail processing |
935 |
code worked correctly, to help isolate problems before development of the |
936 |
more complex SMTP module started; it has been retained to support systems |
937 |
which cannot use SMTP to send mail directly, but such systems are presumed |
938 |
to be rare, and little effort has been put into improving this module. In |
939 |
particular, the module (and thus Services itself) blocks while interacting |
940 |
with the external program, potentially causing Services to lag and even |
941 |
opening up the possibility of denial-of-service attacks on Services (by |
942 |
repeatedly sending messages to addresses which take a long time to |
943 |
process).</p> |
944 |
|
945 |
<p>The entire logic of the module, outside of the module initialization and |
946 |
cleanup code (which actually comprises about half of the source file), is |
947 |
contained in <tt>send_sendmail()</tt>, the implementation of the |
948 |
<tt>low_send()</tt> routine called by the core module's <tt>sendmail()</tt> |
949 |
function. <tt>send_sendmail()</tt> opens a pipe to the program specified |
950 |
by the <tt>SendmailPath</tt> directive, which is assumed to take a |
951 |
"<tt>-t</tt>" option to read the recipient address from the message |
952 |
headers, as the standard Unix <tt>sendmail</tt> program does. The message |
953 |
is then written over the pipe, and <tt>pclose()</tt> is called to wait for |
954 |
the message sending operation to complete. This latter step, which is |
955 |
required to free the pipe resources as well, places Services at the mercy |
956 |
of the external program, as <tt>pclose()</tt> will not return until the |
957 |
process exits. <i>Implementation note: One improvement would be to make |
958 |
the pipe non-blocking, but as Services has no facilities for monitoring |
959 |
arbitrary file descriptors, this would require a periodic check via a |
960 |
timeout routine to see whether the child process had exited.</i> Finally, |
961 |
the message status is reported based on the exit code of the child |
962 |
process.</p> |
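<p>The pipe-based approach described above can be sketched with the standard <tt>popen()</tt> and <tt>pclose()</tt> calls. The function name <tt>pipe_message()</tt>, its parameters, and the exact headers written are assumptions for the example; in practice the command would be something like the <tt>SendmailPath</tt> value followed by "<tt>-t</tt>". The sketch does not validate its inputs, so it should not be fed untrusted header values as written.</p>

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/* Write a message to the given command over a pipe and wait for the
 * command to exit.  Returns 1 if it exited with status 0, else 0. */
int pipe_message(const char *command, const char *from, const char *to,
                 const char *subject, const char *body)
{
    assert(command != NULL && from != NULL && to != NULL);
    assert(subject != NULL && body != NULL);
    FILE *pipe = popen(command, "w");
    if (pipe == NULL)
        return 0;
    /* Headers first, then a blank line, then the body. */
    fprintf(pipe, "From: %s\nTo: %s\nSubject: %s\n\n%s\n",
            from, to, subject, body);
    /* pclose() blocks until the child process exits; this is the step
     * that leaves the caller waiting on the external program. */
    int status = pclose(pipe);
    return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

<p>Because <tt>pclose()</tt> does not return until the child exits, a slow external program stalls the caller for the whole delivery attempt, which is the weakness noted above.</p>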
963 |
|
964 |
<p class="backlink"><a href="#top">Back to top</a></p> |
965 |
|
966 |
|
967 |
<h4 class="subsubsection-title" id="s3-3">8-3-3. <tt>mail/smtp</tt>: Sends mail using SMTP</h4> |
968 |
|
969 |
<p>The <tt>mail/smtp</tt> module, defined in <tt>smtp.c</tt>, sends mail |
970 |
via the SMTP protocol. While the module makes some simplifying |
971 |
assumptions, notably that a relay server is available that will accept and |
972 |
distribute mail on behalf of Services, it is more robustly designed than |
973 |
the <tt>mail/sendmail</tt> module, and is the recommended module for use in |
974 |
Services.</p> |
975 |
|
976 |
<p>As mentioned above, the <tt>mail/smtp</tt> module relies on the presence |
977 |
of an external relay server, which can be as simple as an SMTP daemon |
978 |
running on the same machine, that will accept messages from Services via
979 |
SMTP and relay them to the appropriate destinations. By doing this, the |
980 |
module is freed from the necessity of performing DNS lookups for each |
981 |
message sent, significantly reducing the complexity of the module. |
982 |
However, this also means that invalid addresses cannot be detected, except |
983 |
to the extent that the relay server checks for them during the SMTP |
984 |
connection from Services.</p> |
985 |
|
986 |
<p>For each message to be sent, the module creates a new connection to the |
987 |
relay server, taking advantage of the socket callbacks described in |
988 |
<a href="3.html">section 3</a> to process SMTP communications |
989 |
asynchronously. The socket used for each message, along with the |
990 |
<tt>MailMessage</tt> structure itself and other per-message data, is stored |
991 |
in a <tt>SocketInfo</tt> structure; the module maintains a list of these |
992 |
structures, one for each message in transit. The <tt>SocketInfo</tt> |
993 |
structure contains the following fields:</p> |
994 |
|
995 |
<dl> |
996 |
<dt><tt>struct SocketInfo_ *<b>next</b>, *<b>prev</b></tt></dt> |
997 |
<dd>Used to maintain the linked list of structures. (<tt>struct |
998 |
SocketInfo_</tt> is the same type as <tt>SocketInfo</tt>, and is |
999 |
used here only because the structure is defined as part of the |
1000 |
<tt>typedef</tt>.)</dd> |
1001 |
|
1002 |
<dt><tt>Socket *<b>sock</b></tt></dt> |
1003 |
<dd>The socket being used to send the message.</dd> |
1004 |
|
1005 |
<dt><tt>MailMessage *<b>msg</b></tt></dt> |
1006 |
<dd>The message data structure passed in from the core module.</dd> |
1007 |
|
1008 |
<dt><tt>int <b>msg_status</b></tt></dt> |
1009 |
<dd>The message status code to be passed to <tt>send_finished()</tt>.</dd> |
1010 |
|
1011 |
<dt><tt>int <b>relaynum</b></tt></dt> |
1012 |
<dd>The index (into the <tt>RelayHosts[]</tt> array) of the relay |
1013 |
server currently in use. If a connection to the first server |
1014 |
fails, the code will increment this field and retry the connection |
1015 |
until the list of relay hosts is exhausted.</dd> |
1016 |
|
1017 |
<dt><tt>enum {...} <b>state</b></tt></dt> |
1018 |
<dd>The current state of the connection: |
1019 |
<ul> |
1020 |
<li><b><tt>ST_GREETING</tt>:</b> Waiting for the remote server's |
1021 |
greeting.</li> |
1022 |
<li><b><tt>ST_HELO</tt>:</b> Waiting for a response to the |
1023 |
<tt>HELO</tt> command.</li> |
1024 |
<li><b><tt>ST_MAIL</tt>:</b> Waiting for a response to the |
1025 |
<tt>MAIL</tt> command.</li> |
1026 |
<li><b><tt>ST_RCPT</tt>:</b> Waiting for a response to the |
1027 |
<tt>RCPT</tt> command.</li> |
1028 |
<li><b><tt>ST_DATA</tt>:</b> Waiting for a response to the |
1029 |
<tt>DATA</tt> command.</li> |
1030 |
<li><b><tt>ST_FINISH</tt>:</b> Waiting for the server to confirm |
1031 |
that it has accepted the message.</li> |
1032 |
</ul></dd> |
1033 |
|
1034 |
<dt><tt>int <b>replycode</b></tt></dt> |
1035 |
<dd>The reply code associated with the line currently being received |
1036 |
from the server. A value of zero indicates that the next character |
1037 |
received will be the beginning of a new line.</dd> |
1038 |
|
1039 |
<dt><tt>char <b>replychar</b></tt></dt> |
1040 |
<dd>The fourth character of the line currently being received (normally |
1041 |
either a space or a hyphen, indicating the absence or presence of |
1042 |
continuation lines respectively).</dd> |
1043 |
|
1044 |
<dt><tt>int <b>garbage</b></tt></dt> |
1045 |
<dd>The number of garbage (non-reply) lines received from the server, |
1046 |
used to check for an erroneous connection to a non-SMTP server.</dd> |
1047 |
</dl> |
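<p>To make the <tt>replycode</tt> and <tt>replychar</tt> fields more concrete, the following sketch parses the start of an SMTP reply line into a three-digit code and the character that follows it. The function name <tt>parse_reply_line()</tt> and its interface are hypothetical, as is the choice to treat a bare three-digit line as a final reply; the module's own parsing works on data as it arrives from the socket.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Parse the start of an SMTP reply line.  On success, store the
 * 3-digit code and the fourth character (' ' for a final line, '-'
 * for a continued one) and return 1.  Return 0 for a non-reply line. */
int parse_reply_line(const char *line, int *replycode, char *replychar)
{
    assert(line != NULL && replycode != NULL && replychar != NULL);
    if (strlen(line) < 3
     || !isdigit((unsigned char)line[0])
     || !isdigit((unsigned char)line[1])
     || !isdigit((unsigned char)line[2]))
        return 0;  /* would count toward the "garbage" line total */
    *replycode = (line[0] - '0') * 100
               + (line[1] - '0') * 10
               + (line[2] - '0');
    /* A bare "250" with nothing after it is treated as a final line. */
    *replychar = line[3] != '\0' ? line[3] : ' ';
    return 1;
}
```

<p>A line such as "<tt>250-mail.example.com</tt>" yields code 250 with a hyphen, meaning more lines follow, while "<tt>354 Go ahead</tt>" yields code 354 with a space, meaning the response is complete.</p>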
1048 |
|
1049 |
<p>When the <tt>low_send()</tt> implementation routine, <tt>send_smtp()</tt>, |
1050 |
is called, it first cleans any double quotes out of the "From" name (since |
1051 |
that name will later be enclosed in double quotes), then sets up a |
1052 |
<tt>SocketInfo</tt> structure for the message and creates a socket for SMTP |
1053 |
communication. On success, the socket's callbacks are set, and |
1054 |
<tt>try_next_relay()</tt> is called to attempt a connection to the first |
1055 |
SMTP relay specified in the configuration file. (The <tt>msg_status</tt> |
1056 |
field of <tt>SocketInfo</tt> is set to <tt>MAIL_STATUS_ERROR</tt> to |
1057 |
provide a fallback value in case an error in the module results in |
1058 |
<tt>send_finished()</tt> being called without an explicit status being set; |
1059 |
the "don't depend on this" comment in the source code is simply a reminder to ensure that the status
1060 |
is in fact set correctly, rather than relying on that default value, since |
1061 |
the default could potentially change.)</p> |
1062 |
|
1063 |
<p><tt>try_next_relay()</tt>, in turn, increments the <tt>relaynum</tt>
field, then checks whether it has exceeded the number of configured relay
servers. If so, sending is terminated with an error code based on the
value of <tt>errno</tt> as returned from the last system call (the routine
is assumed to be called immediately after a socket-related system call);
otherwise, a connection is initiated to the next relay server, looping back
to the top of the function if the <tt>conn()</tt> call fails.</p>

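<p>The relay fallback loop described above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the names
<tt>next_relay()</tt>, <tt>connect_func_t</tt>, and <tt>fake_connect()</tt>
are hypothetical, and the real routine works on the <tt>SocketInfo</tt>
structure and the socket subsystem rather than on a plain host list.</p>

```c
#include <string.h>

/* Hypothetical connection hook: returns 0 if a connection attempt to
 * `host` could be started, -1 otherwise. */
typedef int (*connect_func_t)(const char *host);

/* Advance *relaynum to the next relay for which a connection attempt can
 * be started.  Returns the relay index, or -1 once every configured relay
 * has been tried (the caller would then report an error). */
static int next_relay(int *relaynum, const char *const *relays, int count,
                      connect_func_t try_connect)
{
    for (;;) {
        (*relaynum)++;
        if (*relaynum >= count)
            return -1;
        if (try_connect(relays[*relaynum]) == 0)
            return *relaynum;
        /* Connection could not be started: loop to try the next relay. */
    }
}

/* Stand-in for a real connect call: hosts starting with "bad" fail. */
static int fake_connect(const char *host)
{
    return strncmp(host, "bad", 3) == 0 ? -1 : 0;
}
```

<p>In this sketch the relay index starts at -1, so that the first call
selects relay 0; a later call (for example, after a failed connection)
continues from the relay after the one last tried.</p>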
<p>Actual socket processing is handled by the <tt>smtp_readline()</tt> and
<tt>smtp_disconnect()</tt> functions. The latter, <tt>smtp_disconnect()</tt>,
simply calls <tt>send_finished()</tt>, passing either the value of
<tt>msg_status</tt> (if the connection was closed locally) or an
appropriate error status (if the connection was broken remotely or failed),
then frees the <tt>SocketInfo</tt> structure with <tt>free_socketinfo()</tt>,
which also closes the socket itself. (If the routine is called as the
result of a failed connection, however, it calls <tt>try_next_relay()</tt>
instead.)</p>

<p><tt>smtp_readline()</tt> is the workhorse of the <tt>mail/smtp</tt>
module, processing data read from the server and sending the SMTP commands
necessary to relay the message. The routine first reads a line of data
from the socket, ensuring that it ends with a newline and removing that
newline. (While the socket subsystem ensures that a full line is
available when the read-line callback is called, <tt>smtp_readline()</tt>
is also able to handle partial lines, except in the pathological case of a
truncated reply code.) If the text received is at the beginning of a line,
the 3-digit reply code and continuation character are parsed and stored in
the <tt>SocketInfo</tt> structure corresponding to the socket. When a
complete, non-continued response line has been received,
<tt>smtp_readline()</tt> then either generates an error (for 4xx or 5xx
error responses from the SMTP server) or sends the next command or message
data to the server, depending on the connection state, and the state is
incremented. (After sending the final <tt>QUIT</tt> command, the socket is
closed, causing <tt>send_finished()</tt> to be called from the socket
disconnection callback.)</p>

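<p>As a rough illustration of the reply-line parsing step, the following
sketch extracts the 3-digit reply code and the continuation character from
a line whose trailing newline has already been removed. The
<tt>ReplyInfo</tt> type and the function names are hypothetical; the module
itself stores these values in the <tt>replycode</tt> and <tt>replychar</tt>
fields of <tt>SocketInfo</tt>, and its handling of partial lines is not
reproduced here.</p>

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical holder for the two values the text says are parsed. */
typedef struct {
    int replycode;   /* 3-digit SMTP reply code */
    char replychar;  /* 4th character: ' ' = final line, '-' = continued */
} ReplyInfo;

/* Parse the start of an SMTP reply line (newline already stripped).
 * Returns 1 if the line is a complete, final reply, 0 if it is a
 * continuation line, or -1 if it does not look like an SMTP reply
 * (a "garbage" line). */
static int parse_reply_line(const char *line, ReplyInfo *info)
{
    size_t len = strlen(line);

    if (len < 3
     || !isdigit((unsigned char)line[0])
     || !isdigit((unsigned char)line[1])
     || !isdigit((unsigned char)line[2]))
        return -1;
    info->replycode = (line[0]-'0')*100 + (line[1]-'0')*10 + (line[2]-'0');
    /* A bare "250" with nothing after it is treated as a final line. */
    info->replychar = (len > 3) ? line[3] : ' ';
    if (info->replychar == '-')
        return 0;
    if (info->replychar != ' ')
        return -1;
    return 1;
}

/* Error check as described in the text: 4xx and 5xx replies are errors. */
static int reply_is_error(int replycode)
{
    return replycode >= 400;
}
```

<p>A caller following the description above would count lines for which
the function returns -1 as garbage, wait for further lines while it returns
0, and act on the reply code only once it returns 1.</p>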
<p>The module's implementation of the <tt>low_abort()</tt> function can be
found in <tt>smtp_abort()</tt>. The routine simply looks up the
<tt>SocketInfo</tt> corresponding to the message, then frees it,
disconnecting the socket in the process.</p>

<p class="backlink"><a href="#top">Back to top</a></p>

<!------------------------------------------------------------------------>
<hr/>

<h3 class="subsection-title" id="s4">8-4. Miscellaneous modules</h3>

<p>This section documents the two remaining modules which do not fit
neatly into any other category: the <tt>misc/xml-export</tt> and
<tt>misc/xml-import</tt> modules, used for exporting Services pseudoclient
data to an XML file and vice versa. Both of these modules are located in
the <tt>modules/misc</tt> directory.</p>

<p class="backlink"><a href="#top">Back to top</a></p>


<h4 class="subsubsection-title" id="s4-1">8-4-1. <tt>misc/xml-export</tt>: Data export using XML</h4>

<p>The <tt>misc/xml-export</tt> module, defined in <tt>xml-export.c</tt>
along with declarations in <tt>xml.h</tt>, provides a method through which
Services pseudoclient data can be exported into an XML file suitable for
use with external programs. It should be noted that this module does not
make use of the standard database interface, relying instead on direct
calls to the appropriate modules' database access functions and direct
access to the corresponding data structures, and thus cannot export data
added by third-party modules. This limitation is a result of the module's
implementation in version 5.0, before the current database system was
developed; one possible solution would be to reimplement this module and
<tt>misc/xml-import</tt> as database modules
(see <a href="11.html#s1">section 11-1</a>).</p>

<p>One thing worth noting about the structure of the module is that, since
it is also compiled into the <tt>convert-db</tt> tool, there are a number
of code segments (mainly logging calls) that need to be compiled
differently. These are protected by preprocessor conditionals on the
<tt>CONVERT_DB</tt> symbol, defined by <tt>tools/Makefile</tt> (see
<a href="10.html#s3-4">section 10-3-4</a>).</p>

<p>Exporting is handled by the <tt>xml_export()</tt> routine defined near
the bottom of the file. This routine takes two parameters: a function
pointer of type <tt>xml_writefunc_t</tt>, specifying the function to be
called to output data, and an arbitrary pointer value which is passed
unchanged to the function. The <tt>xml_writefunc_t</tt> type is defined in
<tt>xml.h</tt> as:</p>

<div class="code">int (*<b>xml_writefunc_t</b>)(void *<i>data</i>, const char *<i>fmt</i>, ...)</div>

<p>where <tt><i>data</i></tt> is the pointer parameter passed to
<tt>xml_export()</tt> and <tt><i>fmt</i></tt> is a <tt>printf()</tt>-style
format string. (This prototype was chosen so that <tt>fprintf()</tt> could
be used as a callback function. <tt>sprintf()</tt> also fits the
prototype, but should be avoided due to the likelihood of buffer
overflows.)</p>

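<p>A callback matching this prototype might look like the sketch below.
Rather than casting <tt>fprintf()</tt> itself, it forwards its arguments to
<tt>vfprintf()</tt>, which keeps the function pointer types exactly
matched. The names <tt>write_to_file()</tt> and <tt>export_example()</tt>,
and the tag written, are illustrative assumptions and do not appear in the
module.</p>

```c
#include <stdarg.h>
#include <stdio.h>

/* Same shape as the type quoted above from xml.h. */
typedef int (*xml_writefunc_t)(void *data, const char *fmt, ...);

/* Hypothetical callback: treats `data` as a FILE * and forwards the
 * format string and arguments to vfprintf(). */
static int write_to_file(void *data, const char *fmt, ...)
{
    va_list args;
    int res;

    va_start(args, fmt);
    res = vfprintf((FILE *)data, fmt, args);
    va_end(args);
    return res;
}

/* Hypothetical stand-in for an export routine: it knows only about the
 * callback and the opaque pointer, never about FILE. */
static int export_example(xml_writefunc_t writefunc, void *data)
{
    return writefunc(data, "<example>%d</example>\n", 42);
}
```

<p>With these definitions, <tt>export_example(write_to_file, stdout)</tt>
writes one line to standard output; passing a different callback and data
pointer would redirect the same output elsewhere without changing the
export code.</p>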
<p><tt>xml_export()</tt> does not actually export any data itself, other
than writing the <tt>&lt;?xml?&gt;</tt> header tag and top-level
<tt>&lt;ircservices-db&gt;</tt> enclosing tags. Rather, it calls helper
routines to export each class of data, passing the write function pointer
and data pointer along to each routine.</p>

<p>The first of these helper routines is <tt>export_constants()</tt>.
This routine does not export any data <i>per se</i>, but instead writes
out the values of various constants used by Services; this allows other
programs which read in the data to interpret numerical data such as
channel access levels and special values of limits properly, rather than
relying on the definitions used in any particular version of Services (or
whatever other program may have generated the data).</p>

<p>Following this is <tt>export_operserv_data()</tt>, the first of the
actual data export routines. This routine writes out the maximum user
count and timestamp, along with the super-user password if present. The
password is written in encrypted format, and is first passed through the
<tt>xml_quotebuf()</tt> function to prevent special characters like
<tt>&lt;</tt>, <tt>&gt;</tt>, or the null character from causing
problems when the data is read in. This latter function, defined near the
top of the file, converts all non-ASCII bytes in the passed-in buffer to
their equivalent character codes, and converts the three characters
<tt>&lt;</tt> <tt>&gt;</tt> <tt>&amp;</tt> to "<tt>&amp;lt;</tt>",
"<tt>&amp;gt;</tt>", and "<tt>&amp;amp;</tt>" respectively. The size of
the static return buffer, <tt>BUFSIZE*6+1</tt>, is chosen so that an input
buffer of up to <tt>BUFSIZE</tt> bytes can be encoded with no truncation
(the longest possible encoding for a single byte is 6 characters:
"<tt>&amp;#<i>nnn</i>;</tt>").</p>

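<p>The quoting scheme can be sketched as follows. This is not the module's
code: the function name <tt>quote_for_xml()</tt> and the value of
<tt>BUFSIZE</tt> are assumptions, as is the exact set of bytes escaped
numerically (here, everything outside printable ASCII, which covers the
null character mentioned above).</p>

```c
#include <stdio.h>
#include <string.h>

#define BUFSIZE 1024  /* assumed value, for illustration only */

/* Encode `len` bytes from `buf` (which may contain null bytes) into a
 * static, null-terminated buffer.  Inputs longer than BUFSIZE bytes are
 * cut off so that the worst case (6 output characters per input byte)
 * always fits. */
static const char *quote_for_xml(const char *buf, size_t len)
{
    static char out[BUFSIZE*6+1];
    char *d = out;
    size_t i;

    if (len > BUFSIZE)
        len = BUFSIZE;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (c == '<') {
            memcpy(d, "&lt;", 4);
            d += 4;
        } else if (c == '>') {
            memcpy(d, "&gt;", 4);
            d += 4;
        } else if (c == '&') {
            memcpy(d, "&amp;", 5);
            d += 5;
        } else if (c < 32 || c > 126) {
            /* Numeric form; the longest case is "&#255;" (6 characters). */
            d += snprintf(d, 7, "&#%u;", (unsigned)c);
        } else {
            *d++ = (char)c;
        }
    }
    *d = '\0';
    return out;
}
```

<p>Because the result lives in a static buffer, each call overwrites the
previous result; a caller needing two quoted strings at once would have to
copy the first before requesting the second.</p>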
<p>The next routine, <tt>export_nick_db()</tt>, is the first of the true
database export routines, iterating through all nickname groups and then
all nicknames to dump the data for each record to the XML output stream.
The routine takes advantage of the <tt>XML_PUT_*</tt> macros defined at
the top of the source file to simplify the writing of the various structure
fields and substructures. These macros are:</p>

<ul>
<li><b><tt>XML_PUT_STRING()</tt>:</b> Writes out a string field.</li>
<li><b><tt>XML_PUT_PASS()</tt>:</b> Writes out a password field.</li>
<li><b><tt>XML_PUT_LONG()</tt>:</b> Writes out a signed integer field of
size no greater than <tt>long</tt> (but possibly smaller).</li>
<li><b><tt>XML_PUT_ULONG()</tt>:</b> Writes out an unsigned integer field
of size no greater than <tt>unsigned long</tt> (but possibly
smaller).</li>
<li><b><tt>XML_PUT_STRARR()</tt>:</b> Writes out a variable-length string
array field.</li>
</ul>

<p>Each macro takes three parameters: <tt><i>indent</i></tt>, a string
prefixed to the output line for indenting; <tt><i>structure</i></tt>, the
structure (not structure pointer) in which the field to write resides; and
<tt><i>field</i></tt>, the name of the field to write. The value written
is enclosed in tags named the same as the field name.</p>

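<p>One way macros of this kind can be built is with the preprocessor's
stringizing operator, as in the sketch below. The macros
<tt>PUT_LONG()</tt> and <tt>PUT_STRING()</tt>, the <tt>ExampleRecord</tt>
type, and the buffer-writing callback are all hypothetical; in particular,
this simplified string macro does no XML quoting, and the sketch assumes
variables named <tt>writefunc</tt> and <tt>data</tt> are in scope where the
macros are used.</p>

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef int (*xml_writefunc_t)(void *data, const char *fmt, ...);

/* #field turns the field name into the tag name at compile time. */
#define PUT_LONG(indent, structure, field) \
    writefunc(data, "%s<" #field ">%ld</" #field ">\n", \
              (indent), (long)(structure).field)
#define PUT_STRING(indent, structure, field) do {              \
    if ((structure).field)                                     \
        writefunc(data, "%s<" #field ">%s</" #field ">\n",     \
                  (indent), (structure).field);                \
} while (0)

/* Illustrative record type; not a real Services structure. */
typedef struct {
    const char *nick;
    int flags;
} ExampleRecord;

static void export_record(xml_writefunc_t writefunc, void *data,
                          const ExampleRecord *rec)
{
    writefunc(data, "<record>\n");
    PUT_STRING("\t", *rec, nick);   /* structure, not pointer, is passed */
    PUT_LONG("\t", *rec, flags);
    writefunc(data, "</record>\n");
}

/* Demonstration callback that appends to a fixed-size text buffer. */
typedef struct {
    char text[256];
    size_t len;
} TextBuf;

static int write_to_buf(void *data, const char *fmt, ...)
{
    TextBuf *tb = data;
    size_t room = sizeof(tb->text) - tb->len;
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(tb->text + tb->len, room, fmt, args);
    va_end(args);
    if (n < 0)
        return n;
    if ((size_t)n >= room)
        n = (int)(room - 1);  /* output was truncated to fit */
    tb->len += (size_t)n;
    return n;
}
```

<p>Passing the structure itself rather than a pointer, as the text notes,
lets one macro serve both for top-level records (<tt>*rec</tt>) and for
substructures embedded in them.</p>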
<p>The subsequent database export routines (<tt>export_channel_db()</tt>,
<tt>export_news_db()</tt>, <tt>export_maskdata()</tt>, and
<tt>export_statserv_db()</tt>) export the corresponding databases in a
similar manner. One point of note is the writing of mode locks in
<tt>export_channel_db()</tt>: since the <tt>on</tt> and <tt>off</tt> fields
of the <tt>ModeLock</tt> structure are strings rather than bitmasks in the
<tt>convert-db</tt> tool, as noted in <a href="7.html#s4-1-1">section
7-4-1-1</a>, they are handled differently depending on whether the
preprocessor symbol <tt>CONVERT_DB</tt> is defined.</p>

<p>The <tt>misc/xml-export</tt> module also includes a callback function
for the core's "<tt>command line</tt>" callback, allowing the pseudoclient
databases to be exported without connecting to the network. The callback
function, <tt>do_command_line()</tt>, checks for the <tt>-export</tt>
option; if present, the XML database dump is written to the named file, or
to standard output if no filename is given, and the function returns 3 (on
success) or 2 (on error) to signal the core code to terminate immediately.</p>

<p class="backlink"><a href="#top">Back to top</a></p>


<h4 class="subsubsection-title" id="s4-2">8-4-2. <tt>misc/xml-import</tt>: Data import using XML</h4>

<p>The <tt>misc/xml-import</tt> module, defined in <tt>xml-import.c</tt>,
performs the opposite function of the <tt>misc/xml-export</tt> module,
reading data from an XML file and adding it to the various pseudoclient
databases. As with the <tt>misc/xml-export</tt> module, this module is
heavily intertwined with the pseudoclient modules and is unable to handle
data used by third-party modules. Note that the <tt>xml.h</tt> header file
is included by <tt>xml-import.c</tt>, as it is considered a common XML
header file for both import and export, but there are no declarations in
<tt>xml.h</tt> that are actually used in this module.</p>

<p>Since the import of data will typically create new records, the
<tt>xml-import</tt> module requires a way to allocate and initialize a
record of each of the various structure types. This is done for nickname
and channel records by defining the <tt>STANDALONE_NICKSERV</tt> and
<tt>STANDALONE_CHANSERV</tt> preprocessor symbols and including
<tt>modules/nickserv/util.c</tt> and <tt>modules/chanserv/util.c</tt> (see
also <a href="7.html#s3-1-4">section 7-3-1-4</a>), and for other record
types by allocating with <tt>calloc()</tt> and freeing with custom free
routines. This is admittedly a very kludgey way of doing things, but again
is a carryover from previous versions, before the current database system
was developed.</p>

<p>When importing data, there is the possibility that data in the imported
XML file will conflict with data already stored in Services' databases. In
the case of OperServ mask-data (autokill, etc.) records and StatServ server
entries, the record in the imported data is always dropped; however, for
nicknames and channels, one of several methods of handling collisions can
be chosen. The various methods, along with the corresponding configuration
options and the flags used to represent them internally, are:</p>

<ul>
<li class="spaced"><b><tt>XMLI_NICKCOLL_SKIPGROUP</tt>:</b> When a nickname
in the imported data conflicts with a nickname in the database, the
entire nickname group in the imported data containing the
conflicting nickname is discarded. This is the default behavior.</li>

<li class="spaced"><b><tt>XMLI_NICKCOLL_SKIPNICK</tt>:</b> When a nickname
in the imported data conflicts with a nickname in the database,
only that nickname is discarded; if any other (non-colliding)
nicknames remain in the same nickname group, they are imported
normally, otherwise the resulting empty group is discarded. This
behavior is selected by <tt>OnNicknameCollision skipnick</tt>.</li>

<li class="spaced"><b><tt>XMLI_NICKCOLL_OVERWRITE</tt>:</b> When a nickname
in the imported data conflicts with a nickname in the database, the
nickname in the database is dropped, along with its nickname group
if there are no other nicknames in the group. This behavior is
selected by <tt>OnNicknameCollision overwrite</tt>.</li>

<li class="spaced"><b><tt>XMLI_NICKCOLL_ABORT</tt>:</b> When a nickname in
the imported data conflicts with a nickname in the database, the
import procedure is aborted after the XML data has been read in.
This behavior is selected by <tt>OnNicknameCollision abort</tt>.</li>
</ul>

<ul>
<li class="spaced"><b><tt>XMLI_CHANCOLL_SKIP</tt>:</b> When a channel in
the imported data conflicts with a channel in the database, the
channel in the imported data is discarded. This is the default
behavior.</li>

<li class="spaced"><b><tt>XMLI_CHANCOLL_OVERWRITE</tt>:</b> When a channel
in the imported data conflicts with a channel in the database, the
channel in the database is dropped. This behavior is selected by
<tt>OnChannelCollision overwrite</tt>.</li>

<li class="spaced"><b><tt>XMLI_CHANCOLL_ABORT</tt>:</b> When a channel in
the imported data conflicts with a channel in the database, the
import procedure is aborted after the XML data has been read in.
This behavior is selected by <tt>OnChannelCollision abort</tt>.</li>
</ul>

<p>One flag from each set is stored in the file-local variable <tt>flags</tt>
at module initialization or reconfiguration time, based on the configuration
file settings.</p>

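<p>Storing one flag from each set in a single variable can be sketched as
below. Only the flag names and the option values (<tt>skipnick</tt>,
<tt>overwrite</tt>, <tt>abort</tt>) come from the text; the numeric values,
the two mask constants, and the setter functions are assumptions made for
illustration, and the module's actual configuration handling may be
organized differently.</p>

```c
#include <string.h>

/* Numeric values and masks are assumed; only the names are documented. */
#define XMLI_NICKCOLL_SKIPGROUP 0x0000
#define XMLI_NICKCOLL_SKIPNICK  0x0001
#define XMLI_NICKCOLL_OVERWRITE 0x0002
#define XMLI_NICKCOLL_ABORT     0x0003
#define NICKCOLL_MASK           0x0003  /* hypothetical */
#define XMLI_CHANCOLL_SKIP      0x0000
#define XMLI_CHANCOLL_OVERWRITE 0x0004
#define XMLI_CHANCOLL_ABORT     0x0008
#define CHANCOLL_MASK           0x000C  /* hypothetical */

/* Defaults: skip the whole group / skip the channel. */
static int flags = XMLI_NICKCOLL_SKIPGROUP | XMLI_CHANCOLL_SKIP;

/* Apply an OnNicknameCollision value; returns 0 if it is not recognized. */
static int set_nick_collision(const char *value)
{
    int flag;

    if (strcmp(value, "skipnick") == 0)
        flag = XMLI_NICKCOLL_SKIPNICK;
    else if (strcmp(value, "overwrite") == 0)
        flag = XMLI_NICKCOLL_OVERWRITE;
    else if (strcmp(value, "abort") == 0)
        flag = XMLI_NICKCOLL_ABORT;
    else
        return 0;
    /* Replace only the nickname part, leaving the channel flag alone. */
    flags = (flags & ~NICKCOLL_MASK) | flag;
    return 1;
}

/* Apply an OnChannelCollision value; returns 0 if it is not recognized. */
static int set_chan_collision(const char *value)
{
    int flag;

    if (strcmp(value, "overwrite") == 0)
        flag = XMLI_CHANCOLL_OVERWRITE;
    else if (strcmp(value, "abort") == 0)
        flag = XMLI_CHANCOLL_ABORT;
    else
        return 0;
    flags = (flags & ~CHANCOLL_MASK) | flag;
    return 1;
}
```

<p>Keeping the two sets in disjoint bit ranges means that changing one
setting on reconfiguration cannot disturb the other.</p>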
<p>XML input is assumed to be from a file, whose file pointer is stored in
the file-local variable <tt>import_file</tt>. The local function
<tt>get_byte()</tt> reads in a byte from this file, returning the value of
that byte or -1 on error, as well as performing buffering (which is
probably redundant with the buffering performed by the stdio functions) and
updating byte and line counters for use in error messages. The macro
<tt>NEXT_BYTE</tt> encapsulates this call, assigning the return value of
<tt>get_byte()</tt> to a variable <tt>c</tt> and causing the calling
function to return -1 when end-of-file is reached.</p>

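<p>A stripped-down sketch of this input layer is shown below. The names
<tt>import_file</tt>, <tt>get_byte()</tt>, and <tt>NEXT_BYTE</tt> are taken
from the text, but the bodies are guesses at the general shape rather than
the module's code: the extra buffering is omitted, and the counter names
and the <tt>skip_to_tag()</tt> helper are hypothetical.</p>

```c
#include <stdio.h>

static FILE *import_file;      /* input stream, as described above */
static long bytes_read;        /* counters for use in error messages */
static long lines_read;

/* Return the next byte from import_file, or -1 at end-of-file or on
 * error.  (No extra buffering here; stdio already buffers.) */
static int get_byte(void)
{
    int c = fgetc(import_file);

    if (c == EOF)
        return -1;
    bytes_read++;
    if (c == '\n')
        lines_read++;
    return c;
}

/* Assign the next byte to a local variable `c`, making the enclosing
 * function return -1 when no more data is available. */
#define NEXT_BYTE do {      \
    c = get_byte();         \
    if (c < 0)              \
        return -1;          \
} while (0)

/* Example use: skip input up to and including the next '<'.
 * Returns 0 when a '<' was found, -1 at end-of-file. */
static int skip_to_tag(void)
{
    int c;

    do {
        NEXT_BYTE;
    } while (c != '<');
    return 0;
}
```

<p>Hiding the end-of-file check inside the macro keeps byte-by-byte parsing
loops short, at the cost of a hidden <tt>return</tt> that any function
using the macro must be written to tolerate.</p>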
<p>The XML data is processed by a simple XML parser, implemented by the
<tt>parse_tag()</tt> routine. This routine calls <tt>read_tag()</tt> to
parse a single tag, then looks up the tag in the <tt>tags[]</tt> table and
calls the associated handler to read and process the tag's contents, and
returns a pointer to those contents (whose type can vary depending on the
tag). The function has three special return values: <tt>CONTINUE</tt> for
tags that were processed successfully but contain no data, <tt>NULL</tt> to
indicate an error processing a tag, or <tt>PARSETAG_END</tt> when the
closing tag corresponding to the tag given in the <tt><i>caller_tag</i></tt>
parameter has been found (or end-of-file is reached). The parser does not
handle empty tags (of the "<tt>&lt;tag/&gt;</tt>" syntax), as they are not
used in well-formed Services data dumps; every tag has some sort of data
associated with it.</p>

<p><tt>read_tag()</tt>, in turn, reads bytes from the file until it locates
the beginning of a tag, then parses the tag name and any attribute (only
the first attribute is processed). The function itself returns 1 for an
opening tag, 0 for a closing tag, or a negative value on error; the tag
name, attribute name, attribute value, pre-tag text, and text length are
stored in the variables pointed to by the parameters <tt><i>tag_ret</i></tt>,
<tt><i>attr_ret</i></tt>, <tt><i>attrval_ret</i></tt>,
<tt><i>text_ret</i></tt>, and <tt><i>textlen_ret</i></tt>, respectively.
The strings returned point into a dynamically-allocated buffer local to the
function, which can be freed by calling it with <tt><i>tag_ret</i></tt> set
to <tt>NULL</tt>.</p>

<p>Each tag handler takes as parameters the tag name, attribute name
(<tt>NULL</tt> if no attribute is present), and attribute value string
(also <tt>NULL</tt> if no attribute is present). Since many tags consist
of simple integer or string values, they make use of the common handlers
<tt>th_text()</tt>, <tt>th_int32()</tt>, <tt>th_uint32()</tt>,
<tt>th_time()</tt>, and <tt>th_strarray()</tt>. Of these, <tt>th_text()</tt>
returns a <tt>TextInfo</tt> structure containing the <tt>malloc()</tt>'d
text buffer, null-terminated, along with the length in bytes of the string
(not including the null terminator); <tt>th_strarray()</tt> returns an
<tt>ArrayInfo</tt> structure containing the <tt>malloc()</tt>'d,
null-terminated string elements and element count; the other handlers
return a pointer to the relevant type. The returned variables themselves
are stored in static buffers local to each handler.</p>

<p>For simple tag handlers like the standard handlers mentioned above,
handling a tag consists of simply parsing the text between the start and
end tags for that tag. This is done by repeatedly calling
<tt>parse_tag()</tt>, passing the handler's <tt><i>tag</i></tt> parameter
as <tt><i>caller_tag</i></tt>, until the function returns
<tt>PARSETAG_END</tt>, and converting the inter-tag text from the final
<tt>parse_tag()</tt> call (the code assumes no intervening tags) to the
proper format. For the case of <tt>th_strarray()</tt>, the
<tt>parse_tag()</tt> loop checks for <tt>&lt;array-element&gt;</tt> tags,
converting their contents to an <tt>ArrayInfo</tt> structure.</p>

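<p>The final conversion step of an integer handler might resemble the
sketch below, which converts already-collected inter-tag text to a 32-bit
value and returns a pointer to a static variable, in line with the
static-buffer convention described earlier. The function name
<tt>text_to_int32()</tt> is hypothetical, the <tt>parse_tag()</tt> loop is
left out, and the strictness of the validation is an assumption.</p>

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Convert decimal text to a 32-bit signed value.  Returns a pointer to a
 * static variable holding the result, or NULL if the text is empty, has
 * trailing characters, or is out of range.  The result is overwritten by
 * the next call. */
static int32_t *text_to_int32(const char *text)
{
    static int32_t retval;
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || *end != '\0' || errno == ERANGE
     || v < INT32_MIN || v > INT32_MAX)
        return NULL;
    retval = (int32_t)v;
    return &retval;
}
```

<p>Returning a pointer rather than the value itself lets <tt>NULL</tt>
serve as the error indication, matching the way <tt>parse_tag()</tt>
reports failures.</p>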
<p>The handlers for specific types, like <tt>NickInfo</tt> and
<tt>ChannelInfo</tt>, are more complex, having to deal with multiple
subtags, but follow the same general structure. These handlers return
dynamically allocated structures which are added directly into the import
data list upon being returned from the tag handler.</p>

<p>The overall import process consists of reading the contents of the
<tt>&lt;ircservices-db&gt;</tt> tag into data structures in memory, then
merging those data structures into the appropriate databases. The reading
and parsing is handled by the <tt>read_data()</tt> routine; if it succeeds,
the data is then merged into the databases with <tt>merge_data()</tt>, and
the loaded data is freed with <tt>free_data()</tt>. These routines are
called by the top-level <tt>xml_import()</tt> function.</p>

<p><tt>read_data()</tt> takes the place of the tag handler for the
<tt>&lt;ircservices-db&gt;</tt> tag, which is read in manually by
<tt>xml_import()</tt> (by calling <tt>read_tag()</tt>). Like other tag
handlers, it loops calling <tt>parse_tag()</tt> to read in subtag contents,
adding each returned structure into the temporary databases used for
storing the data to import. <tt>read_data()</tt> also takes care of
checking for collisions with data already existing in the pseudoclient
databases, and taking proper action in such cases. The routine returns
nonzero if all data was successfully read in and no collisions caused an
abort, else zero.</p>

<p>If <tt>read_data()</tt> succeeds, <tt>merge_data()</tt> is then called
to store the read-in records in the main Services databases. An extra
check is performed here for nicknames and channels, ensuring that no
collisions occur unless the collision flags specified overwriting current
records; deletion of such colliding records is also performed at this stage
(rather than when the data is read in, to avoid the case of a nickname or
channel getting deleted and an error then being found later in the imported
data). In the case of colliding nickname group IDs, the imported group is
renumbered to use a free ID value, and all relevant channel entries
(founders, successors, and access list entries) are adjusted accordingly.</p>

<p>The top-level <tt>xml_import()</tt> function is in turn called by the
<tt>do_command_line()</tt> callback function, hooked into the core's
"<tt>command line</tt>" callback. Like the <tt>misc/xml-export</tt>
module, this module checks for a specific command-line option (in this
case, "<tt>-import</tt>"); if found, <tt>xml_import()</tt> is called with
the file given as a parameter to the option (an error is generated if the
parameter is missing or the file cannot be opened), and the function's
return value (2 or 3) signals Services to exit with an exit code indicating
the success or failure of the import.</p>

<p>Formerly, the <tt>httpd/dbaccess</tt> module (see <a href="#s2-8">section
8-2-8</a>) also provided the ability to import XML data via this module, by
uploading a file via HTTP. This functionality was removed, however, mainly
to avoid the security and stability issues raised by deleting data records
(nicknames and channels) already in use on the network.</p>

<p class="backlink"><a href="#top">Back to top</a></p>

<!------------------------------------------------------------------------>
<hr/>

<p class="backlink"><a href="7.html">Previous section: Services pseudoclients</a> |
<a href="index.html">Table of Contents</a> |
<a href="9.html">Next section: The database conversion tool</a></p>

</body>
</html>