1 |
<?xml version="1.0" encoding="ISO-8859-1"?> |
2 |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11-strict.dtd"> |
3 |
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> |
4 |
<head> |
5 |
<meta http-equiv="Content-Style-Type" content="text/css"/> |
6 |
<style type="text/css">@import "style.css";</style> |
7 |
<title>IRC Services Technical Reference Manual - 6. Database handling</title> |
8 |
</head> |
9 |
|
10 |
<body> |
11 |
<h1 class="title" id="top">IRC Services Technical Reference Manual</h1> |
12 |
|
13 |
<h2 class="section-title">6. Database handling</h2> |
14 |
|
15 |
<p class="section-toc"> |
16 |
6-1. <a href="#s1">Databases in Services</a> |
17 |
<br/>6-2. <a href="#s2">The database subsystem interface</a> |
18 |
<br/> 6-2-1. <a href="#s2-1">Tables, records, and fields</a> |
19 |
<br/> 6-2-2. <a href="#s2-2">Registering and unregistering tables</a> |
20 |
<br/> 6-2-3. <a href="#s2-3">Loading and saving data</a> |
21 |
<br/>6-3. <a href="#s3">Database modules</a> |
22 |
<br/>6-4. <a href="#s4">Specific module details</a> |
23 |
<br/> 6-4-1. <a href="#s4-1"><tt>database/standard</tt></a> |
24 |
<br/> 6-4-1-1. <a href="#s4-1-1">Data format</a> |
25 |
<br/> 6-4-1-2. <a href="#s4-1-2">Module structure</a> |
26 |
<br/> 6-4-2. <a href="#s4-2"><tt>database/version4</tt></a> |
27 |
<br/> 6-4-2-1. <a href="#s4-2-1">Data format</a> |
28 |
<br/> 6-4-2-2. <a href="#s4-2-2">Module structure</a> |
29 |
<br/>6-5. <a href="#s5">Auxiliary source files</a> |
30 |
<br/> 6-5-1. <a href="#s5-1"><tt>fileutil.c</tt>, <tt>fileutil.h</tt></a> |
31 |
<br/> 6-5-2. <a href="#s5-2"><tt>extsyms.c</tt>, <tt>extsyms.h</tt></a> |
32 |
</p> |
33 |
|
34 |
<p class="backlink"><a href="5.html">Previous section: IRC server interface</a> | |
35 |
<a href="index.html">Table of Contents</a> | |
36 |
<a href="7.html">Next section: Services pseudoclients</a></p> |
37 |
|
38 |
<!------------------------------------------------------------------------> |
39 |
<hr/> |
40 |
|
41 |
<h3 class="subsection-title" id="s1">6-1. Databases in Services</h3> |
42 |
|
43 |
<p>As with any program that handles large amounts of data, Services needs |
44 |
a place to store nickname, channel, and other data. In Services, the |
45 |
primary data storage method is in-memory lists and tables; however, since |
46 |
these disappear when Services terminates, a more persistent method of |
47 |
recording the data is required. This is implemented through the |
48 |
<i>database subsystem</i>, briefly touched on in |
49 |
<a href="2.html#s9-2">section 2-9-2</a>.</p> |
50 |
|
51 |
<p>This two-layer structure is primarily a product of the history of
Services: as it was originally designed only for use on a
53 |
small network, the effort required to implement Services using a true |
54 |
database management system was seen as excessive compared to the simplicity |
55 |
of accessing data structures already in memory. As a result, little |
56 |
thought was given to the structure or accessibility of the persistent data |
57 |
files, which were seen as only an adjunct to the in-memory structures. |
58 |
While this served well enough for a time, the system's inflexibility proved |
59 |
cumbersome as more data was stored, and the file format's opaqueness caused |
60 |
trouble for other programs attempting to access the data.</p> |
61 |
|
62 |
<p>The latter problem of opaqueness was mostly resolved with the addition |
63 |
of XML-based data import and export modules (<tt>misc/xml-import</tt> and |
64 |
<tt>misc/xml-export</tt>, described in <a href="8.html#s4">section 8-4</a>). |
65 |
The database system itself remained an issue through version 5.0, but has |
66 |
been redesigned for version 5.1 to allow significantly more flexibility in |
67 |
storing data, as described below. (The two-layer style has been retained, |
68 |
however, primarily due to the difficulty of changing it—a complete |
69 |
rewrite of Services would be required.)</p> |
70 |
|
71 |
<p class="backlink"><a href="#top">Back to top</a></p> |
72 |
|
73 |
<!------------------------------------------------------------------------> |
74 |
<hr/> |
75 |
|
76 |
<h3 class="subsection-title" id="s2">6-2. The database subsystem interface</h3> |
77 |
|
78 |
<p>In Services (as in any typical database system), data to be stored in |
79 |
databases is organized into <i>tables</i>, <i>records</i>, and |
80 |
<i>fields</i>. However, this organization is separate from the in-memory |
81 |
representation of the data: rather than storing the actual data itself, |
82 |
the "tables" handled by the database system hold information on <i>how to |
83 |
access the data</i>. The actual operations of reading data from and |
84 |
writing data to persistent storage are then performed using this |
85 |
information, along with utility routines provided by the table's owner.</p> |
86 |
|
87 |
<p>The core part of the database subsystem is in the source files |
88 |
<tt>databases.c</tt> and <tt>databases.h</tt>.</p> |
89 |
|
90 |
<p class="backlink"><a href="#top">Back to top</a></p> |
91 |
|
92 |
|
93 |
<h4 class="subsubsection-title" id="s2-1">6-2-1. Tables, records, and fields</h4> |
94 |
|
95 |
<p>A table, as used by the database subsystem, is defined by a |
96 |
<tt>DBTable</tt> structure, which contains information about the fields |
97 |
used in the table and utility routines used to create, delete, and access |
98 |
records in the table. The structure is defined (in <tt>databases.h</tt>, |
99 |
along with all other database-related structures and declarations) as |
100 |
follows:</p> |
101 |
|
102 |
<dl> |
103 |
<dt><tt>const char *<b>name</b></tt></dt> |
104 |
<dd>The name for the table. This is used to identify the table to |
105 |
the database system, and is generally used as a filename or other |
106 |
identifier for the copy of the table in persistent storage. The |
107 |
name must be unique among all registered tables.</dd> |
108 |
|
109 |
<dt><tt>DBField *<b>fields</b></tt></dt> |
110 |
<dd>A pointer to an array of <tt>DBField</tt> structures describing the |
111 |
fields in the table, terminated by an entry with |
112 |
<tt>DBField.name</tt> set to <tt>NULL</tt>.</dd> |
113 |
|
114 |
<dt><tt>void *(*<b>newrec</b>)()</tt></dt> |
115 |
<dd>Returns a newly allocated record to place data in. This function |
116 |
is guaranteed to never be called for more than one record |
117 |
simultaneously (in other words, a call to <tt>newrec()</tt> is |
118 |
guaranteed to be followed by a call to either <tt>insert()</tt> or |
119 |
<tt>freerec()</tt>), so this routine may return a pointer to a |
120 |
static buffer rather than actually allocating memory if doing so
121 |
is more convenient (see the description of the |
122 |
<tt>nickserv/access</tt> module in <a href="7.html#s3-2">section |
123 |
7-3-2</a> for an example of such usage).</dd> |
124 |
|
125 |
<dt><tt>void (*<b>insert</b>)(void *<i>record</i>)</tt></dt> |
126 |
<dd>Inserts a record into the table. This function is called by the |
127 |
database subsystem to insert a new record into the table after it |
128 |
has been successfully loaded. The record passed in is no longer |
129 |
valid after the function returns.</dd> |
130 |
|
131 |
<dt><tt>void (*<b>freerec</b>)(void *<i>record</i>)</tt></dt> |
132 |
<dd>Frees resources used by a record. This function is called by the |
133 |
database subsystem if an error occurs while loading a record, |
134 |
before the record has been inserted into the table.</dd> |
135 |
|
136 |
<dt><tt>void *(*<b>first</b>)()</tt></dt> |
137 |
<dd>Returns a pointer to the first record in the table.</dd> |
138 |
|
139 |
<dt><tt>void *(*<b>next</b>)()</tt></dt> |
140 |
<dd>Returns a pointer to the next record in the table after the last |
141 |
one returned by <tt>first()</tt> or <tt>next()</tt>.</dd> |
142 |
|
143 |
<dt><tt>int (*<b>postload</b>)()</tt></dt> |
144 |
<dd>Called by the database subsystem after all records have been loaded. |
145 |
This can be used to, <i>e.g.,</i> implement data integrity checks |
146 |
which can only be performed after all data has been loaded. If the |
147 |
routine returns zero, the load operation is treated as a failure. |
148 |
This field may be <tt>NULL</tt> if no post-load routine is |
149 |
required.</dd> |
150 |
</dl> |
151 |
|
152 |
<p>As can be seen from this structure, the actual records themselves are |
153 |
not stored in the <tt>DBTable</tt> structure, but are rather left to the |
154 |
table's owner to store as appropriate. For example, the ChanServ |
155 |
pseudoclient module stores the data for each record in a |
156 |
<tt>ChannelInfo</tt> structure.</p> |
157 |
|
158 |
<p>The field data, stored in <tt>DBField</tt> structures, likewise does |
159 |
not hold actual data, only instructions on how to access it. The |
160 |
<tt>DBField</tt> structure contains:</p> |
161 |
|
162 |
<dl> |
163 |
<dt><tt>const char *<b>name</b></tt></dt> |
164 |
<dd>The name for the field. Typically the same as the field identifier |
165 |
used in the program.</dd> |
166 |
|
167 |
<dt><tt>DBType <b>type</b></tt></dt> |
168 |
<dd>The type of the field. Valid types are defined in |
169 |
<tt>databases.h</tt>: |
170 |
<ul><li><tt><b>DBTYPE_INT<i>n</i></b></tt>, |
171 |
<tt><b>DBTYPE_UINT<i>n</i></b></tt>: |
172 |
Signed and unsigned integer values of different bit lengths |
173 |
(8, 16, or 32).</li> |
174 |
<li><tt><b>DBTYPE_TIME</b></tt>: A <tt>time_t</tt> value.</li> |
175 |
<li><tt><b>DBTYPE_STRING</b></tt>: A string (<tt>char *</tt>) |
176 |
value, which can be <tt>NULL</tt>.</li> |
177 |
<li><tt><b>DBTYPE_BUFFER</b></tt>: A fixed-length buffer. The |
178 |
buffer size is given in <tt>DBField.length</tt>.</li> |
179 |
<li><tt><b>DBTYPE_PASSWORD</b></tt>: A <tt>Password</tt> value, |
180 |
as defined in <tt>encrypt.h</tt>.</li> |
181 |
</ul></dd> |
182 |
|
183 |
<dt><tt>int <b>offset</b></tt></dt> |
184 |
<dd>The offset in bytes from the start of the record to the location |
185 |
where the field's value is stored. For records stored in a |
186 |
<tt>struct</tt>, this can be obtained using the standard |
187 |
<tt>offsetof()</tt> macro; for example, the offset of this member |
188 |
in a <tt>DBField</tt> structure is given by: |
189 |
<div class="code">offsetof(DBField, offset)</div></dd> |
190 |
|
191 |
<dt><tt>int <b>length</b></tt></dt> |
192 |
<dd>For <tt>DBTYPE_BUFFER</tt> fields, this gives the length of the |
193 |
buffer, in bytes. This value is ignored for other field types.</dd> |
194 |
|
195 |
<dt><tt>int <b>load_only</b></tt></dt> |
196 |
<dd>If nonzero, the field is not saved to persistent storage. This is |
197 |
intended to facilitate changes in table format, as described |
198 |
below.</dd> |
199 |
|
200 |
<dt><tt>void (*<b>get</b>)(const void *<i>record</i>, void **<i>value_ret</i>)</tt></dt>
201 |
<dd>If not <tt>NULL</tt>, provides a function to retrieve the field's |
202 |
value; the database subsystem will call this function instead of |
203 |
simply accessing the data stored in the field. |
204 |
<tt><i>record</i></tt> is a pointer to the record structure, and |
205 |
<tt><i>value_ret</i></tt> points to a buffer to receive the value; |
206 |
the buffer will be large enough to hold the data type specified by |
207 |
<tt>DBField.type</tt>. (For strings, store a <tt>char *</tt> |
208 |
value in <tt>*<i>value_ret</i></tt>; the string will <i>not</i> be |
209 |
freed after use, so a static buffer or other method is needed to |
210 |
avoid memory leaks if the string is generated dynamically.)</dd> |
211 |
|
212 |
<dt><tt>void (*<b>put</b>)(void *<i>record</i>, const void *<i>value</i>)</tt></dt>
213 |
<dd>If not <tt>NULL</tt>, provides a function to set the field's value; |
214 |
the database system will call this function instead of simply |
215 |
storing the loaded data into the field. <tt><i>record</i></tt> is |
216 |
a pointer to the record structure which is to receive the data, and |
217 |
<tt><i>value</i></tt> points to the data itself, in the format |
218 |
given by <tt>DBField.type</tt>. (For strings, this is a |
219 |
<tt>char *</tt> value which may be <tt>NULL</tt>; if not |
220 |
<tt>NULL</tt>, the string has been allocated with <tt>malloc()</tt> |
221 |
and will not be freed by the database subsystem, so you will need |
222 |
to free it if you do not store the pointer directly into the |
223 |
record.)</dd> |
224 |
</dl> |
225 |
|
226 |
<p>This structure is designed with the assumption that data will be stored |
227 |
in some structured type in memory; if no <tt>get()</tt> or <tt>put()</tt> |
228 |
routine is provided, the database subsystem will simply access the memory |
229 |
location derived by adding the field's offset to the record pointer |
230 |
returned by the table's record access functions (<tt>newrec()</tt>, |
231 |
<tt>first()</tt>, or <tt>next()</tt>). If this is not sufficient, however, |
232 |
the table owner can define <tt>get()</tt> and/or <tt>put()</tt> functions |
233 |
for accessing the data.</p> |
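<p>As a minimal sketch of the offset-based scheme described above, the
following declares a field array for a hypothetical <tt>ExampleRec</tt>
structure. The <tt>DBType</tt> and <tt>DBField</tt> declarations are
simplified stand-ins written from the member descriptions in this section
(the real ones are in <tt>databases.h</tt> and may differ in detail, such
as the numeric values of the type constants); <tt>ExampleRec</tt>,
<tt>example_fields</tt>, and <tt>find_field()</tt> are names invented for
illustration.</p>

```c
#include <assert.h>
#include <stddef.h>   /* offsetof(), NULL */
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Simplified stand-ins for the declarations in databases.h. */
typedef enum {
    DBTYPE_INT8, DBTYPE_UINT8, DBTYPE_INT16, DBTYPE_UINT16,
    DBTYPE_INT32, DBTYPE_UINT32, DBTYPE_TIME, DBTYPE_STRING,
    DBTYPE_BUFFER, DBTYPE_PASSWORD
} DBType;

typedef struct {
    const char *name;
    DBType type;
    int offset;
    int length;
    int load_only;
    void (*get)(const void *record, void **value_ret);
    void (*put)(void *record, const void *value);
} DBField;

/* Hypothetical record structure owned by some module. */
typedef struct {
    char nick[32];
    char *email;
    time_t registered;
    int32_t flags;
} ExampleRec;

/* One entry per stored member.  No get()/put() routines are given, so the
 * members would be read and written directly at the listed offsets. */
DBField example_fields[] = {
    { "nick",       DBTYPE_BUFFER, offsetof(ExampleRec, nick),      32, 0, NULL, NULL },
    { "email",      DBTYPE_STRING, offsetof(ExampleRec, email),      0, 0, NULL, NULL },
    { "registered", DBTYPE_TIME,   offsetof(ExampleRec, registered), 0, 0, NULL, NULL },
    { "flags",      DBTYPE_INT32,  offsetof(ExampleRec, flags),      0, 0, NULL, NULL },
    { NULL,         DBTYPE_INT8,   0,                                0, 0, NULL, NULL }  /* terminator */
};

/* Walk a NULL-name-terminated field array and return the named entry,
 * or NULL if there is no such field. */
const DBField *find_field(const DBField *fields, const char *name)
{
    assert(fields != NULL && name != NULL);
    for (const DBField *f = fields; f->name != NULL; f++) {
        if (strcmp(f->name, name) == 0)
            return f;
    }
    return NULL;
}
```

<p>Only the <tt>DBTYPE_BUFFER</tt> entry sets <tt>length</tt>; the other
entries leave it at zero, since the value is ignored for other field
types.</p>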
234 |
|
235 |
<p>One member of the <tt>DBField</tt> structure that deserves particular |
236 |
mention is the <tt>load_only</tt> member. If a field's <tt>load_only</tt> |
237 |
value is nonzero, then the field will be ignored when the database is saved |
238 |
to persistent storage. This can be used to handle changes in the format of |
239 |
a table; if the old field is left defined with <tt>load_only</tt> nonzero |
240 |
and a <tt>put()</tt> routine provided, that routine will be called whenever |
241 |
a record with the old field is loaded, allowing the old field's value to be |
242 |
processed as necessary to fit the new table format. Alternatively, the old |
243 |
field can be left in the in-memory structure, and code added to the table's |
244 |
<tt>insert()</tt> routine to handle the data translation.</p> |
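<p>The <tt>load_only</tt> technique might look roughly like the following
sketch, in which a hypothetical 16-bit <tt>old_flags</tt> field from an
earlier table format is folded into a 32-bit <tt>flags</tt> member when
loaded. The structure declarations are simplified stand-ins for those in
<tt>databases.h</tt>, and <tt>ExampleRec</tt>, <tt>put_old_flags()</tt>,
and <tt>count_saved_fields()</tt> are invented for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the declarations in databases.h. */
typedef enum { DBTYPE_INT16, DBTYPE_INT32 } DBType;

typedef struct {
    const char *name;
    DBType type;
    int offset;
    int length;
    int load_only;
    void (*get)(const void *record, void **value_ret);
    void (*put)(void *record, const void *value);
} DBField;

/* Hypothetical record: the new format keeps flags as a 32-bit value. */
typedef struct {
    int32_t flags;
} ExampleRec;

/* put() routine for the old field: value points to the loaded 16-bit
 * value, which is widened and stored in the new member. */
void put_old_flags(void *record, const void *value)
{
    ExampleRec *rec = record;
    int16_t old_value;

    assert(record != NULL && value != NULL);
    memcpy(&old_value, value, sizeof(old_value));
    rec->flags = old_value;
}

DBField example_fields[] = {
    { "flags",     DBTYPE_INT32, offsetof(ExampleRec, flags), 0, 0, NULL, NULL },
    { "old_flags", DBTYPE_INT16, 0, 0, 1, NULL, put_old_flags },  /* load_only */
    { NULL,        DBTYPE_INT16, 0, 0, 0, NULL, NULL }
};

/* Count the fields that would be written on save, i.e. those whose
 * load_only member is zero. */
int count_saved_fields(const DBField *fields)
{
    int n = 0;
    for (const DBField *f = fields; f->name != NULL; f++) {
        if (!f->load_only)
            n++;
    }
    return n;
}
```

<p>Because the old field has a <tt>put()</tt> routine, its <tt>offset</tt>
is never used and no member for it needs to remain in the in-memory
structure.</p>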
245 |
|
246 |
<p class="backlink"><a href="#top">Back to top</a></p> |
247 |
|
248 |
|
249 |
<h4 class="subsubsection-title" id="s2-2">6-2-2. Registering and unregistering tables</h4> |
250 |
|
251 |
<p>In order for a database table to be loaded from and saved to persistent |
252 |
storage, it must first be registered with the database subsystem by calling |
253 |
the <tt>register_dbtable()</tt> routine; the complementary |
254 |
<tt>unregister_dbtable()</tt> routine must be called when the table is no |
255 |
longer needed (for example, when the module owning the table exits). The |
256 |
routines' prototypes are as follows:</p> |
257 |
|
258 |
<div class="code">int <b>register_dbtable</b>(DBTable *<i>table</i>) |
259 |
void <b>unregister_dbtable</b>(DBTable *<i>table</i>)</div> |
260 |
|
261 |
<p>Both routines take a pointer to the <tt>DBTable</tt> structure |
262 |
describing the table to be registered or unregistered. The |
263 |
<tt>register_dbtable()</tt> routine returns nonzero on success, zero on |
264 |
failure.</p> |
265 |
|
266 |
<p>Note that <tt>register_dbtable()</tt> assumes that the in-memory table |
267 |
is empty, and has no facility to signal the database owner to clear the |
268 |
table before data is loaded. The database owner must ensure that the table |
269 |
is empty or take whatever other precautions are appropriate before |
270 |
registering the table.</p> |
271 |
|
272 |
<p class="backlink"><a href="#top">Back to top</a></p> |
273 |
|
274 |
|
275 |
<h4 class="subsubsection-title" id="s2-3">6-2-3. Loading and saving data</h4> |
276 |
|
277 |
<p>Data loading and saving is performed on a per-table basis when the table |
278 |
is registered or unregistered, respectively; thus data is immediately |
279 |
available for use when <tt>register_dbtable()</tt> returns successfully, |
280 |
and any changes made to the data after <tt>unregister_dbtable()</tt> is |
281 |
called will not be reflected in persistent storage. There is also an |
282 |
auxiliary routine, <tt>save_all_dbtables()</tt>, which causes all |
283 |
registered tables to be saved (synced) to persistent storage immediately:</p> |
284 |
|
285 |
<div class="code">int <b>save_all_dbtables</b>()</div> |
286 |
|
287 |
<p>This routine returns one of the following values:</p> |
288 |
|
289 |
<ul> |
290 |
<li><b>1</b> if all database tables were saved with no errors, or if no |
291 |
tables are registered.</li> |
292 |
<li><b>0</b> if some tables were saved successfully, but errors occurred on
293 |
at least one table.</li> |
294 |
<li><b>-1</b> if no tables were saved successfully.</li> |
295 |
</ul> |
296 |
|
297 |
<p>This routine is called by the main loop (via the <tt>save_data_now()</tt> |
298 |
helper routine) at periodic intervals or when explicitly requested, as |
299 |
described in <a href="2.html#s3-3">section 2-3-3</a>.</p> |
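<p>A caller has to distinguish three outcomes rather than a simple
success or failure. The following sketch shows one way the return value
could be interpreted; <tt>SaveStatus</tt>, <tt>classify_save_result()</tt>,
and <tt>save_status_message()</tt> are hypothetical helpers, not part of
Services.</p>

```c
#include <assert.h>

/* Hypothetical classification of the save_all_dbtables() return value. */
typedef enum { SAVE_OK, SAVE_PARTIAL, SAVE_FAILED } SaveStatus;

SaveStatus classify_save_result(int result)
{
    assert(result >= -1 && result <= 1);   /* the three documented values */
    if (result > 0)
        return SAVE_OK;        /* every table saved, or none registered */
    if (result == 0)
        return SAVE_PARTIAL;   /* at least one table failed */
    return SAVE_FAILED;        /* nothing was saved */
}

/* Message text suitable for a log entry or an operator notice. */
const char *save_status_message(SaveStatus status)
{
    switch (status) {
      case SAVE_OK:      return "all database tables saved";
      case SAVE_PARTIAL: return "some database tables could not be saved";
      case SAVE_FAILED:  return "no database tables could be saved";
    }
    return "unknown save status";
}
```

<p>Note that a plain truth test on the return value would treat the
partial-failure case (0) as a failure but the total-failure case (-1) as a
success, so an explicit comparison is needed.</p>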
300 |
|
301 |
<p class="backlink"><a href="#top">Back to top</a></p> |
302 |
|
303 |
<!------------------------------------------------------------------------> |
304 |
<hr/> |
305 |
|
306 |
<h3 class="subsection-title" id="s3">6-3. Database modules</h3> |
307 |
|
308 |
<p>The core portion of the database subsystem only provides the interface |
309 |
for persistent storage of databases; the actual work of transferring data |
310 |
to and from persistent storage is performed by <i>database modules</i>.
311 |
The standard database modules are located in the <tt>modules/database</tt> |
312 |
directory.</p> |
313 |
|
314 |
<p>A database module registers itself with the core part of the subsystem |
315 |
by calling <tt>register_dbmodule()</tt>; as with tables, the module must |
316 |
unregister itself with the complementary <tt>unregister_dbmodule()</tt> |
317 |
when exiting:</p> |
318 |
|
319 |
<div class="code">int <b>register_dbmodule</b>(DBModule *<i>module</i>) |
320 |
void <b>unregister_dbmodule</b>(DBModule *<i>module</i>)</div> |
321 |
|
322 |
<p>Only one database module may be registered; if a second module tries to |
323 |
register itself, <tt>register_dbmodule()</tt> will return an error (zero).</p> |
324 |
|
325 |
<p>The <tt>DBModule</tt> structure passed to these functions contains two |
326 |
function pointers:</p> |
327 |
|
328 |
<div class="code">int (*<b>load_table</b>)(DBTable *<i>table</i>) |
329 |
int (*<b>save_table</b>)(DBTable *<i>table</i>)</div> |
330 |
|
331 |
<p>As the names suggest, <tt>load_table()</tt> is called to load a table |
332 |
from persistent storage, and <tt>save_table()</tt> is called to save a |
333 |
table to persistent storage. Both routines should return nonzero on |
334 |
success, zero on failure.</p> |
335 |
|
336 |
<p>Since the <tt>DBTable</tt> structures representing registered database |
337 |
tables are passed directly to these two routines, the module must take care |
338 |
to observe the restrictions and requirements on calling the table's |
339 |
function pointers documented in <a href="#s2">section 6-2</a> above, such |
340 |
as not calling <tt>newrec()</tt> twice without an intervening |
341 |
<tt>insert()</tt> or <tt>freerec()</tt> and ensuring that <tt>postload()</tt> |
342 |
is called when a table has been loaded. <i>Implementation note: A better |
343 |
implementation might hide the DBTable structure from database modules, |
344 |
providing an interface that ensures the rules are followed.</i></p> |
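<p>The calling rules above can be sketched as a load loop. This is an
illustration only, not the code of any actual database module: the
<tt>DBTable</tt> declaration is a simplified stand-in,
<tt>example_load_records()</tt> and its <tt>fill</tt> callback are
hypothetical, and skipping a bad record rather than aborting the load is a
choice made for this sketch.</p>

```c
#include <assert.h>
#include <stddef.h>

typedef struct DBField_ DBField;   /* field details are not needed here */

/* Simplified stand-in for the DBTable structure in databases.h. */
typedef struct {
    const char *name;
    DBField *fields;
    void *(*newrec)(void);
    void (*insert)(void *record);
    void (*freerec)(void *record);
    void *(*first)(void);
    void *(*next)(void);
    int (*postload)(void);
} DBTable;

/* Load count records.  fill() stands in for the module's own code that
 * reads one record from storage; it returns nonzero on success. */
int example_load_records(const DBTable *table, int count,
                         int (*fill)(void *record, int index))
{
    assert(table != NULL && fill != NULL);
    for (int i = 0; i < count; i++) {
        void *rec = table->newrec();
        if (rec == NULL)
            return 0;
        /* Every newrec() is followed by exactly one insert() or freerec()
         * before newrec() is called again. */
        if (fill(rec, i))
            table->insert(rec);
        else
            table->freerec(rec);
    }
    /* postload() is optional; a zero return means the load failed. */
    if (table->postload != NULL && !table->postload())
        return 0;
    return 1;
}

/* A tiny demonstration table whose newrec() returns a static buffer. */
static int demo_scratch;
int demo_inserted, demo_freed, demo_sum;

void *demo_newrec(void)          { demo_scratch = 0; return &demo_scratch; }
void demo_insert(void *record)   { demo_sum += *(int *)record; demo_inserted++; }
void demo_freerec(void *record)  { (void)record; demo_freed++; }
int demo_postload(void)          { return demo_inserted > 0; }

/* Pretend that record 1 is corrupt and every other record loads. */
int demo_fill(void *record, int index)
{
    if (index == 1)
        return 0;
    *(int *)record = index * 10;
    return 1;
}

DBTable demo_table = {
    "demo", NULL, demo_newrec, demo_insert, demo_freerec,
    NULL, NULL, demo_postload
};
```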
345 |
|
346 |
<p>To simplify data access logic and avoid bugs caused by misuse of data |
347 |
fields, database modules should use the <tt>get_dbfield()</tt> and |
348 |
<tt>put_dbfield()</tt> routines to read and write fields in database |
349 |
records. These routines are declared as:</p> |
350 |
|
351 |
<div class="code">void <b>get_dbfield</b>(const void *<i>record</i>, const DBField *<i>field</i>, void *<i>buffer</i>) |
352 |
void <b>put_dbfield</b>(void *<i>record</i>, const DBField *<i>field</i>, const void *<i>value</i>)</div> |
353 |
|
354 |
<p>The routines will automatically call the field's <tt>get()</tt> or |
355 |
<tt>put()</tt> routine if one is supplied, or else copy the field's value |
356 |
to or from the supplied buffer.</p> |
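<p>The behavior just described might be implemented along the following
lines. This is a sketch written from the description, not the code in
<tt>databases.c</tt>: the declarations are simplified stand-ins
(<tt>DBTYPE_PASSWORD</tt> is omitted), and <tt>field_mem_size()</tt>,
<tt>example_get_dbfield()</tt>, <tt>example_put_dbfield()</tt>, and the
<tt>DemoRec</tt> record are names invented for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Simplified stand-ins for the declarations in databases.h. */
typedef enum {
    DBTYPE_INT8, DBTYPE_UINT8, DBTYPE_INT16, DBTYPE_UINT16,
    DBTYPE_INT32, DBTYPE_UINT32, DBTYPE_TIME, DBTYPE_STRING,
    DBTYPE_BUFFER
} DBType;

typedef struct {
    const char *name;
    DBType type;
    int offset;
    int length;
    int load_only;
    void (*get)(const void *record, void **value_ret);
    void (*put)(void *record, const void *value);
} DBField;

/* Size of a field's value as held in memory. */
size_t field_mem_size(const DBField *field)
{
    switch (field->type) {
      case DBTYPE_INT8:  case DBTYPE_UINT8:  return 1;
      case DBTYPE_INT16: case DBTYPE_UINT16: return 2;
      case DBTYPE_INT32: case DBTYPE_UINT32: return 4;
      case DBTYPE_TIME:   return sizeof(time_t);
      case DBTYPE_STRING: return sizeof(char *);
      case DBTYPE_BUFFER: return (size_t)field->length;
    }
    return 0;
}

/* Use the field's get() routine if there is one; otherwise copy the value
 * from record + offset into the caller's buffer. */
void example_get_dbfield(const void *record, const DBField *field, void *buffer)
{
    assert(record != NULL && field != NULL && buffer != NULL);
    if (field->get != NULL) {
        field->get(record, buffer);
        return;
    }
    memcpy(buffer, (const unsigned char *)record + field->offset,
           field_mem_size(field));
}

/* Use the field's put() routine if there is one; otherwise copy the value
 * into the record at the field's offset. */
void example_put_dbfield(void *record, const DBField *field, const void *value)
{
    assert(record != NULL && field != NULL && value != NULL);
    if (field->put != NULL) {
        field->put(record, value);
        return;
    }
    memcpy((unsigned char *)record + field->offset, value,
           field_mem_size(field));
}

/* Demonstration record; its get() routine never reports a negative limit. */
typedef struct { int32_t level; int16_t limit; } DemoRec;

void demo_get_limit(const void *record, void **value_ret)
{
    const DemoRec *rec = record;
    int16_t limit = rec->limit < 0 ? 0 : rec->limit;
    memcpy(value_ret, &limit, sizeof(limit));
}

DBField demo_fields[] = {
    { "level", DBTYPE_INT32, offsetof(DemoRec, level), 0, 0, NULL, NULL },
    { "limit", DBTYPE_INT16, offsetof(DemoRec, limit), 0, 0, demo_get_limit, NULL },
    { NULL,    DBTYPE_INT8,  0, 0, 0, NULL, NULL }
};
```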
357 |
|
358 |
<p class="backlink"><a href="#top">Back to top</a></p> |
359 |
|
360 |
<!------------------------------------------------------------------------> |
361 |
<hr/> |
362 |
|
363 |
<h3 class="subsection-title" id="s4">6-4. Specific module details</h3> |
364 |
|
365 |
<p>Services includes two standard database modules. The first, |
366 |
<tt>database/standard</tt>, is (as the name implies) intended to be the |
367 |
standard module for use with version 5.1; it stores each table in a binary |
368 |
data file. The second module, <tt>database/version4</tt>, uses the same |
369 |
file format as was used in Services versions 4.x and 5.0, and is intended |
370 |
for compatibility when testing Services 5.1 or converting databases to the |
371 |
new format. (It is not possible to have one module handle loading and a |
372 |
different module handle saving, so data must first be exported to XML using |
373 |
the <tt>version4</tt> module and then imported using the <tt>standard</tt> |
374 |
module in the latter case.)</p> |
375 |
|
376 |
<p class="backlink"><a href="#top">Back to top</a></p> |
377 |
|
378 |
|
379 |
<h4 class="subsubsection-title" id="s4-1">6-4-1. <tt>database/standard</tt></h4> |
380 |
|
381 |
<h5 class="subsubsubsection-title" id="s4-1-1">6-4-1-1. Data format</h5> |
382 |
|
383 |
<p>This module stores each table in a file whose name is constructed by |
384 |
replacing all non-alphanumeric characters (except hyphens and underscores) |
385 |
by underscores and appending "<tt>.sdb</tt>". The file format consists of |
386 |
three main sections, described below. In all cases, numeric data is |
387 |
written in big-endian format (with the most-significant byte first); |
388 |
strings are stored as a 16-bit length in bytes followed by the specified |
389 |
number of bytes of string data (including a terminating null byte), with a |
390 |
<tt>NULL</tt> string indicated by a length value of zero. |
391 |
<i>Implementation note: This obviously limits the length of a string to |
392 |
65,534 bytes. This is the result of reusing the string reading and writing |
393 |
routines used for the old file format; while it has not proved to be a |
394 |
problem to date, it is nonetheless an unnecessary artificial |
395 |
limitation.</i></p> |
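<p>The string encoding described above can be sketched as follows.
<tt>encode_sdb_string()</tt> is a hypothetical helper written for this
example (the module itself uses the shared string routines mentioned in
the note above); it writes into a caller-supplied buffer instead of a
file.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode s as a 16-bit big-endian length followed by the string bytes,
 * including the terminating null.  A NULL string is encoded as a length
 * of zero.  Returns the number of bytes written, or 0 if the string is
 * too long for a 16-bit length or the output buffer is too small. */
size_t encode_sdb_string(const char *s, unsigned char *out, size_t outsize)
{
    size_t len;

    assert(out != NULL);
    if (s == NULL) {
        if (outsize < 2)
            return 0;
        out[0] = 0;
        out[1] = 0;
        return 2;
    }
    len = strlen(s) + 1;                 /* length includes the null byte */
    if (len > 0xFFFF || outsize < 2 + len)
        return 0;
    out[0] = (unsigned char)(len >> 8);  /* most-significant byte first */
    out[1] = (unsigned char)(len & 0xFF);
    memcpy(out + 2, s, len);
    return 2 + len;
}
```

<p>Since the stored length counts the terminating null byte, an empty
string is stored with a length of 1 and so remains distinguishable from a
<tt>NULL</tt> string, which has a length of 0.</p>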
396 |
|
397 |
<dl> |
398 |
|
399 |
<dt><b>The file header</b></dt> |
400 |
|
401 |
<dd><p>The file header contains basic information about the file, in the |
402 |
following four fields:</p> |
403 |
<ul> |
404 |
<li class="spaced"><b>File format version:</b> <i>(32-bit integer)</i> |
405 |
A value which identifies the file format in use. This is |
406 |
always the constant <tt>NEWDB_VERSION</tt>; the upper 24 |
407 |
bits of this value contain the ASCII string "<tt>ISD</tt>", |
408 |
identifying the file as an <b>I</b>RC <b>S</b>ervices |
409 |
<b>D</b>atabase, and the lower 8 bits contain a format |
410 |
version number, currently 1.</li> |
411 |
<li class="spaced"><b>Header size:</b> <i>(32-bit integer)</i> |
412 |
The total size of the header, in bytes. Currently 16.</li> |
413 |
<li class="spaced"><b>Field list offset:</b> <i>(32-bit integer)</i> |
414 |
The offset in bytes from the start of the file to the field |
415 |
list, described below.</li> |
416 |
<li class="spaced"><b>Record data offset:</b> <i>(32-bit integer)</i> |
417 |
The offset in bytes from the start of the file to the |
418 |
record data, described below.</li> |
419 |
</ul> |
420 |
</dd> |
421 |
|
422 |
<dt><b>The field list</b></dt> |
423 |
|
424 |
<dd><p>The field list contains information about the fields in the data |
425 |
table and how they are stored in the file. The field list can be |
426 |
stored anywhere in the file, but the current implementation writes |
427 |
it immediately after the file header. The field list consists of a |
428 |
header followed by a variable number of field entries. The header |
429 |
contains the following three values:</p> |
430 |
<ul> |
431 |
<li class="spaced"><b>Field list size:</b> <i>(32-bit integer)</i> |
432 |
The total size of the field list, in bytes.</li> |
433 |
<li class="spaced"><b>Number of fields:</b> <i>(32-bit integer)</i> |
434 |
The number of fields in the field list.</li> |
435 |
<li class="spaced"><b>Record data size:</b> <i>(32-bit integer)</i> |
436 |
The size in bytes of the fixed part of a single record's |
437 |
data. This is the portion of the record data which is |
438 |
always stored in the same format, excluding variable-length |
439 |
data such as strings.</li> |
440 |
</ul> |
441 |
<p>For each field, the following data is recorded:</p> |
442 |
<ul> |
443 |
<li class="spaced"><b>Field data size:</b> <i>(32-bit integer)</i> |
444 |
The size in bytes of the data as stored in the fixed part |
445 |
of the record data. All fields are stored consecutively in |
446 |
the order they appear in the field list, with no padding; |
447 |
thus the offset of a field's data is equal to the sum of |
448 |
the sizes of all previous fields.</li> |
449 |
<li class="spaced"><b>Field type:</b> <i>(16-bit integer)</i> |
450 |
The type of the field. The value is one of the |
451 |
<tt>DBTYPE_*</tt> constants defined in <tt>databases.h</tt>.
452 |
<i>Implementation note: This is a bad idea; it would be |
453 |
better to explicitly define constants in <tt>standard.c</tt> |
454 |
to avoid problems arising from changes in the values of the |
455 |
constants.</i></li> |
456 |
<li class="spaced"><b>Field name:</b> <i>(string)</i> |
457 |
The name of the field.</li> |
458 |
</ul> |
459 |
</dd> |
460 |
|
461 |
<dt><b>The record data</b></dt> |
462 |
|
463 |
<dd><p>The last section of the file contains the actual data for each |
464 |
record in the table. To avoid the potential for a corrupt record |
465 |
to render all following records unreadable (if, for example, the |
466 |
length of a string is incorrect), the actual record data is |
467 |
preceded by a <i>record descriptor table</i>, which contains a |
468 |
file offset pointer and total length for each record's data.</p> |
469 |
|
470 |
<p>In order to simplify the writing of database files, the record |
471 |
descriptor table is allowed to be fragmented into multiple parts. |
472 |
Each partial table consists of an 8-byte header containing:</p> |
473 |
<ul> |
474 |
<li class="spaced"><b>Next table pointer:</b> <i>(32-bit integer)</i> |
475 |
The absolute file offset (in bytes) of the next record |
476 |
descriptor table. Set to zero for the last table in the |
477 |
file.</li> |
478 |
<li class="spaced"><b>Table length:</b> <i>(32-bit integer)</i> |
479 |
The length of this record descriptor table in bytes, |
480 |
including the header.</li> |
481 |
</ul> |
482 |
<p>The remainder of the table is filled with 8-byte record |
483 |
descriptors, each containing:</p> |
484 |
<ul> |
485 |
<li class="spaced"><b>Record data pointer:</b> <i>(32-bit integer)</i> |
486 |
The absolute file offset of the record's data.</li> |
487 |
<li class="spaced"><b>Record data length:</b> <i>(32-bit integer)</i> |
488 |
The length of the record's data in bytes.</li> |
489 |
</ul> |
490 |
<p>Note that the header has the same format as a record descriptor, |
491 |
so the entire descriptor table can be treated as an array of |
492 |
descriptors in which the first entry points to the next table |
493 |
rather than a particular record.</p> |
494 |
|
495 |
<p>The record data pointed to by each descriptor consists, in turn, |
496 |
of a fixed-length part and a variable-length part. The |
497 |
fixed-length part (also referred to in the field list description |
498 |
above) contains all data which is of a fixed length for every |
499 |
record; this includes all numeric data, as well as a 32-bit data |
500 |
offset pointer for strings (see below). Variable-length data is |
501 |
stored immediately after the fixed-length part of the data, in |
502 |
arbitrary order.</p> |
503 |
|
504 |
<p>The various field types are stored as follows (where not |
505 |
explicitly mentioned, the value is stored entirely in the |
506 |
fixed-length part of the record data):</p> |
507 |
<ul> |
508 |
<li class="spaced"><b><tt>DBTYPE_INT<i>n</i></tt>, |
509 |
<tt>DBTYPE_UINT<i>n</i></tt>:</b> The value is stored |
510 |
using the requisite number of bytes (1, 2, or 4, depending |
511 |
on the data type size).</li> |
512 |
<li class="spaced"><b><tt>DBTYPE_TIME</tt>:</b> |
513 |
The value is stored as a 64-bit integer.</li> |
514 |
<li class="spaced"><b><tt>DBTYPE_STRING</tt>:</b> |
515 |
A 32-bit data offset is stored in the fixed-length part; |
516 |
this is a byte offset relative to the start of the record |
517 |
data, and points to the location of the actual string data |
518 |
(a 16-bit length followed by character data), stored in the
519 |
variable-length part of the record.</li> |
520 |
<li class="spaced"><b><tt>DBTYPE_BUFFER</tt>:</b> |
521 |
The value is stored using the number of bytes specified by |
522 |
<tt>DBField.length</tt>.</li> |
523 |
<li class="spaced"><b><tt>DBTYPE_PASSWORD</tt>:</b> |
524 |
The value is stored as a data offset pointing to the string |
525 |
giving the cipher name (<tt>Password.cipher</tt>) followed |
526 |
by a fixed buffer of <tt>PASSMAX</tt> bytes. The cipher |
527 |
name itself is stored in the variable-length part, like |
528 |
other strings.</li> |
529 |
</ul> |
530 |
</dd> |
531 |
|
532 |
</dl> |
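<p>As an illustration of the file header layout, the following sketch
decodes the four header fields from a 16-byte buffer. It assumes that the
letters "ISD" are packed with "I" in the most-significant byte, so that a
file begins with the bytes <tt>'I'</tt>, <tt>'S'</tt>, <tt>'D'</tt>, 1;
<tt>EXAMPLE_SDB_VERSION</tt>, <tt>SdbHeader</tt>, <tt>read_be32()</tt>,
and <tt>parse_sdb_header()</tt> are names invented for this example and do
not appear in <tt>standard.c</tt>.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed packing of the version value: "ISD" in the upper 24 bits,
 * format version 1 in the lower 8 bits. */
#define EXAMPLE_SDB_VERSION \
    ((uint32_t)'I' << 24 | (uint32_t)'S' << 16 | (uint32_t)'D' << 8 | 1u)

#define EXAMPLE_SDB_HEADER_SIZE 16u

typedef struct {
    uint32_t version;
    uint32_t header_size;
    uint32_t field_list_offset;
    uint32_t record_data_offset;
} SdbHeader;

/* Read a 32-bit big-endian value from a byte buffer. */
uint32_t read_be32(const unsigned char *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16
         | (uint32_t)p[2] << 8  | (uint32_t)p[3];
}

/* Decode and check a file header.  Returns 1 on success, 0 if the buffer
 * is too short or the version or header size is not the expected value. */
int parse_sdb_header(const unsigned char *buf, size_t len, SdbHeader *hdr)
{
    assert(buf != NULL && hdr != NULL);
    if (len < EXAMPLE_SDB_HEADER_SIZE)
        return 0;
    hdr->version            = read_be32(buf);
    hdr->header_size        = read_be32(buf + 4);
    hdr->field_list_offset  = read_be32(buf + 8);
    hdr->record_data_offset = read_be32(buf + 12);
    if (hdr->version != EXAMPLE_SDB_VERSION)
        return 0;
    if (hdr->header_size != EXAMPLE_SDB_HEADER_SIZE)
        return 0;
    return 1;
}
```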
533 |
|
534 |
<p class="backlink"><a href="#top">Back to top</a></p> |
535 |
|
536 |
|
537 |
<h5 class="subsubsubsection-title" id="s4-1-2">6-4-1-2. Module structure</h5> |
538 |
|
539 |
<p>Database loading and saving are handled by the routines |
540 |
<tt>standard_load_table()</tt> and <tt>standard_save_table()</tt>, |
541 |
respectively. (The <tt>standard_</tt> prefix comes from the module name, |
542 |
and is included to avoid potential name clashes with other database |
543 |
modules, which would complicate debugging.) Each of these routines calls |
544 |
three subroutines to handle each of the three parts of a database file |
545 |
described in <a href="#s4-1-1">section 6-4-1-1</a> above.</p> |
546 |
|
547 |
<p>The <tt>SAFE()</tt> preprocessor macro defined at the top of the file is |
548 |
used in read and write operations to check for a premature end-of-file (on |
549 |
read) or a write error (on write) and abort the routine in these cases.</p> |
550 |
|
551 |
<p>Three helper functions used in loading and saving are defined first:</p> |
552 |
|
553 |
<dl> |
554 |
<dt><tt>TableInfo *<b>create_tableinfo</b>(const DBTable *<i>table</i>)</tt></dt> |
555 |
<dd>Generates a <tt>TableInfo</tt> structure corresponding to the given |
556 |
database table. The <tt>TableInfo</tt> structure is defined at the |
557 |
top of the file, and includes the size of each field as stored in |
558 |
memory and on disk, as well as the location of each field within |
559 |
the record's data as written to disk. (This latter value, |
560 |
<tt>offset</tt>, is set to -1 by this routine, since it is |
561 |
initialized differently when loading than when saving.)</dd> |
562 |
|
563 |
<dt><tt>void <b>free_tableinfo</b>(TableInfo *<i>ti</i>)</tt></dt> |
564 |
<dd>Frees a <tt>TableInfo</tt> structure created by |
565 |
<tt>create_tableinfo()</tt>.</dd> |
566 |
|
567 |
<dt><tt>const char *<b>make_filename</b>(const DBTable *<i>table</i>)</tt></dt> |
568 |
<dd>Generates the filename corresponding to the table name for the |
569 |
given table. The returned filename string is stored in a static |
570 |
buffer, which will be overwritten by subsequent calls.</dd> |
571 |
</dl> |
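<p>The filename rule from <a href="#s4-1-1">section 6-4-1-1</a> can be
sketched as below. This is not the module's own <tt>make_filename()</tt>:
<tt>example_make_filename()</tt> is a hypothetical variant that takes the
table name directly instead of a <tt>DBTable</tt> pointer, and the buffer
size of 256 bytes is an assumption made for the example.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Build a filename from a table name: every character other than a
 * letter, digit, hyphen, or underscore becomes an underscore, and ".sdb"
 * is appended.  The result is in a static buffer which is overwritten by
 * the next call; overly long names are truncated to fit. */
const char *example_make_filename(const char *table_name)
{
    static char namebuf[256];
    size_t i;

    assert(table_name != NULL);
    /* Leave room for ".sdb" and the terminating null (5 bytes). */
    for (i = 0; table_name[i] != '\0' && i < sizeof(namebuf) - 5; i++) {
        unsigned char c = (unsigned char)table_name[i];
        namebuf[i] = (isalnum(c) || c == '-' || c == '_') ? (char)c : '_';
    }
    memcpy(namebuf + i, ".sdb", 5);
    return namebuf;
}
```

<p>As with the real routine, a caller that needs two filenames at once
must copy the first result before making the second call.</p>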
572 |
|
573 |
<p>Following these routines is <tt>standard_load_table()</tt>, along with |
574 |
its helper routines <tt>read_file_header()</tt>, <tt>read_field_list()</tt>, |
575 |
and <tt>read_records()</tt>. When called, <tt>standard_load_table()</tt> |
576 |
takes the following actions:</p> |
577 |
|
578 |
<ul> |
579 |
<li class="spaced">Generates a <tt>TableInfo</tt> structure for the |
580 |
database table.</li> |
581 |
|
582 |
<li class="spaced">Opens the file corresponding to the table, using |
583 |
<tt>open_db()</tt> from <tt>fileutil.c</tt> (see
584 |
<a href="#s5-1">section 6-5-1</a>).</li> |
585 |
|
586 |
<li class="spaced">Calls <tt>read_file_header()</tt> to read in the file |
587 |
header.</li> |
588 |
|
589 |
<li class="spaced">Seeks to the beginning of the field list, and calls |
590 |
<tt>read_field_list()</tt> to read it in.</li> |
591 |
|
592 |
<li class="spaced">Seeks to the beginning of the record data, and calls |
593 |
<tt>read_records()</tt> to read it in.</li> |
594 |
</ul> |
595 |
|
596 |
<p><tt>read_file_header()</tt> is fairly straightforward; it simply reads |
597 |
in the four header fields, checks the version number and header size to |
598 |
ensure that they have appropriate values, and returns the field list and |
599 |
record data offsets in the variable references passed in.</p> |
600 |
|
601 |
<p><tt>read_field_list()</tt> is slightly more complex; since there is no |
602 |
guarantee that the record structure stored in the file will match that |
603 |
given by the <tt>DBTable</tt> structure, the routine must match fields in |
604 |
the file to those in the structure. <tt>read_field_list()</tt> iterates |
605 |
through the fields in the loaded table, searching the <tt>TableInfo</tt> |
606 |
structure for a matching field (the name, type, and field size must all |
607 |
match); if found, the record data offset is recorded in the |
608 |
<tt>TableInfo</tt> structure, while unknown fields are simply ignored. |
609 |
<i>Implementation note: As a side effect of this handling, fields like |
610 |
nicknames, channel names, and passwords will cease to be recognized if the |
611 |
relevant buffer sizes are changed, thus the note in <tt>defs.h</tt> about |
612 |
backing up the data before changing the constants.</i></p> |
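<p>The matching rule can be sketched as follows. The <tt>SketchField</tt> structure is a simplified, hypothetical stand-in for the per-field data kept in <tt>TableInfo</tt>, and the type codes are arbitrary integers; only the rule itself (name, type, and size must all agree, and unmatched fields are skipped) is taken from the description above.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified field description for illustration only. */
typedef struct {
    const char *name;
    int type;
    int size;     /* size of the field as stored on disk */
    long offset;  /* offset within the on-disk record, or -1 if absent */
} SketchField;

/* For each field listed in the file, find the field in `known` with
 * the same name, type, and size, and record the file's offset for it.
 * File fields with no match are ignored. Returns the number matched. */
int sketch_match_fields(SketchField *known, size_t nknown,
                        const SketchField *infile, size_t nfile)
{
    int matched = 0;

    assert(known != NULL && infile != NULL);
    for (size_t i = 0; i < nfile; i++) {
        for (size_t j = 0; j < nknown; j++) {
            if (strcmp(infile[i].name, known[j].name) == 0
             && infile[i].type == known[j].type
             && infile[i].size == known[j].size) {
                known[j].offset = infile[i].offset;
                matched++;
                break;
            }
        }
    }
    return matched;
}
```

<p>A field whose buffer size has changed since the file was written fails the size comparison and keeps its offset of -1, which is exactly the situation the implementation note warns about.</p>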
613 |
|
614 |
<p><tt>read_records()</tt> reads in and loops through the record descriptor |
615 |
tables, continuing until an empty descriptor, signifying the end of the |
616 |
table, is found. In order to avoid duplication of code, the descriptor |
617 |
table is loaded (when necessary) at the beginning of the loop; however, |
618 |
since the descriptor table must be loaded before the end-of-data check can |
619 |
be made, the loop termination check is performed in the middle of the loop, |
620 |
immediately after the descriptor table loading. The <tt>recnum</tt> loop |
621 |
variable indicates the current index in the descriptor table, with 1 |
622 |
meaning the first record descriptor (after the header) and 0 meaning that a |
623 |
new table has to be loaded; the modulo arithmetic in the loop variable |
624 |
update expression ensures that when the index reaches the end of the table, |
625 |
it will be reset to zero, causing the next table to be loaded.</p> |
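<p>The shape of this loop is perhaps easier to see in code. The sketch below is not the real routine: it walks descriptor tables held in memory rather than read from a file, and it assumes a hypothetical layout in which every table has <tt>SKETCH_TABLE_LEN</tt> entries, the first being the header, with a zero-length entry marking the end of the data.</p>

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_TABLE_LEN 4  /* entries per table, header included */

/* Hypothetical record descriptor. */
typedef struct {
    long offset;
    long length;  /* 0 marks the end of the data in this sketch */
} SketchDesc;

/* Counts records described by `ntables` consecutive descriptor tables
 * stored back to back in `tables`. Returns -1 if the tables run out
 * before an end marker is seen. */
int sketch_count_records(const SketchDesc *tables, size_t ntables)
{
    const SketchDesc *cur = NULL;
    size_t next_table = 0;
    int count = 0;
    int recnum;

    assert(tables != NULL);
    for (recnum = 0; ; recnum = (recnum + 1) % SKETCH_TABLE_LEN) {
        if (recnum == 0) {
            /* "Load" the next table, then step past its header. */
            if (next_table >= ntables)
                return -1;
            cur = tables + next_table * SKETCH_TABLE_LEN;
            next_table++;
            recnum = 1;
        }
        /* The termination check sits in the middle of the loop
         * because it needs the table that was just loaded. */
        if (cur[recnum].length == 0)
            break;
        count++;  /* a real reader would load the record here */
    }
    return count;
}
```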
626 |
|
627 |
<p>The table-saving routine <tt>standard_save_table()</tt> and its |
628 |
subroutines <tt>write_file_header()</tt>, <tt>write_field_list()</tt>, and |
629 |
<tt>write_records()</tt> operate in essentially the same way, although they |
630 |
are slightly simpler because there is no need to check for invalid data, as |
631 |
must be done while reading. The other point worth mentioning is that |
632 |
<tt>open_db()</tt> automatically writes the version number given as the |
633 |
third parameter into the file, so there is no need for |
634 |
<tt>write_file_header()</tt> to do so. <i>Implementation note: Yes, this |
635 |
is ugly; see <a href="#s5-1">section 6-5-1</a> for an explanation.</i></p> |
636 |
|
637 |
<p>Finally, the source file concludes with the standard module variables |
638 |
and routines, along with the <tt>DBModule</tt> structure required for |
639 |
registering the database module. However, for this module they are |
640 |
enclosed by <tt>#ifndef INCLUDE_IN_VERSION4</tt> and <tt>#endif</tt>; |
641 |
this is so that the source file can be directly included in the |
642 |
<tt>database/version4</tt> module (see <a href="#s4-2-2">section |
643 |
6-4-2-2</a> below) without causing identifier conflicts or other |
644 |
problems.</p> |
645 |
|
646 |
<p class="backlink"><a href="#top">Back to top</a></p> |
647 |
|
648 |
|
649 |
<h4 class="subsubsection-title" id="s4-2">6-4-2. <tt>database/version4</tt></h4> |
650 |
|
651 |
<p>This module is intended as a compatibility/transition module, and (as |
652 |
the name implies) supports database files in the format used by 4.x |
653 |
versions of Services, as well as the extended form of that format used in |
654 |
version 5.0. These versions did not support generic database tables, so |
655 |
any such tables which do not correspond to tables used in version 4/5 |
656 |
format files are simply written out in the same format used with the |
657 |
<tt>database/standard</tt> module.</p> |
658 |
|
659 |
<p class="backlink"><a href="#top">Back to top</a></p> |
660 |
|
661 |
|
662 |
<h5 class="subsubsubsection-title" id="s4-2-1">6-4-2-1. Data format</h5> |
663 |
|
664 |
<p>The pre-5.1 data file format is a rather complex beast, an extended form |
665 |
of the original database files which were simply binary dumps of the |
666 |
structures used in memory. The format is not documented outside of the |
667 |
code that implements it, and even I (the developer) often have to refer to |
668 |
the code when analyzing a database file from these versions.</p> |
669 |
|
670 |
<p>The database files used encompass all of the data handled by the |
671 |
standard pseudoclients, but the files are generally split by pseudoclient |
672 |
name rather than individual table: for example, the nickname, nickname |
673 |
group, and memo data are all stored in the same database. This, again, |
674 |
derives from the fact that such data was at one time all stored as part of |
675 |
the same structure (even now, memos are stored in the nickname group |
676 |
structure rather than having their own separate table in memory). The list |
677 |
of files and the tables they encompass follows:</p> |
678 |
|
679 |
<ul> |
680 |
<li class="spaced"><b><tt>nick.db</tt></b> contains the <tt>nick</tt>, |
681 |
<tt>nickgroup</tt>, <tt>nick-access</tt>, <tt>nick-autojoin</tt>, |
682 |
<tt>memo</tt>, and <tt>memo-ignore</tt> tables.</li> |
683 |
<li class="spaced"><b><tt>chan.db</tt></b> contains the <tt>chan</tt>, |
684 |
<tt>chan-access</tt>, and <tt>chan-akick</tt> tables.</li> |
685 |
<li class="spaced"><b><tt>oper.db</tt></b> contains the <tt>oper</tt> |
686 |
table.</li> |
687 |
<li class="spaced"><b><tt>news.db</tt></b> contains the <tt>news</tt> |
688 |
table.</li> |
689 |
<li class="spaced"><b><tt>akill.db</tt></b> contains the <tt>akill</tt> and |
690 |
<tt>exclude</tt> tables.</li> |
691 |
<li class="spaced"><b><tt>exception.db</tt></b> contains the |
692 |
<tt>exception</tt> table.</li> |
693 |
<li class="spaced"><b><tt>sline.db</tt></b> contains the <tt>sgline</tt>, |
694 |
<tt>sqline</tt>, and <tt>szline</tt> tables.</li> |
695 |
<li class="spaced"><b><tt>stats.db</tt></b> contains the |
696 |
<tt>stat-servers</tt> table.</li> |
697 |
</ul> |
698 |
|
699 |
<p>In general, the contents of each database file can be divided into three |
700 |
parts: the base data, the 5.0 extension data, and the 5.1 extension |
701 |
data, all concatenated together. This division of data was introduced in |
702 |
version 5.0 to allow databases written by version 5.0 to be read by version |
703 |
4.5 (minus the 5.0-specific features, of course); version 5.0 wrote the |
704 |
data in the format used by version 4.5, then appended the 5.0-specific data |
705 |
to the end of the file, so that the 4.5 code would simply ignore it, |
706 |
believing that it had reached the end of the data, while the 5.0 code would |
707 |
know to look for the extension data to supplement the (possibly inaccurate) |
708 |
data in the base part. Version 5.1 takes the same approach with respect to |
709 |
version 5.0, resulting in database files which are very convoluted but |
710 |
which can be used in any of versions 4.5, 5.0, or 5.1.</p> |
711 |
|
712 |
<p>Each part begins with a 32-bit version number identifying the format of |
713 |
the data (like the 5.1 standard format, all values are stored in big-endian |
714 |
byte order); this value is fixed at 11 for the base data and 27 for the 5.0 |
715 |
extension data, the file version numbers used in the final releases of |
716 |
these versions of Services. The file version is followed immediately by |
717 |
the data itself, whose format varies depending on the particular data being |
718 |
stored. Simple arrays like news and autokill data typically use a 16-bit |
719 |
count followed by the appropriate number of repetitions of the data |
720 |
structure, a format which is also used for sub-arrays such as access lists |
721 |
within nickname and channel data. Nicknames and channels, on the other |
722 |
hand, do not have a count field, and instead simply consist of a byte with |
723 |
value 1 followed by the nickname or channel data structure for as many |
724 |
structures as necessary, followed by 256 zero bytes indicating the end of |
725 |
the table. (The reason for 256 zero bytes instead of just one is that very |
726 |
old versions of Services, earlier than version 4.0, wrote out each |
727 |
collision list of the 256-element hash arrays separately, terminating each |
728 |
list with a zero; when this was changed, the fiction of 256 collision lists |
729 |
was kept in order to simplify the database reading logic.)</p> |
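<p>A reader for the nickname and channel layout might be structured as in the following sketch, which treats the data as 256 lists, each a run of marker-plus-record pairs closed by a zero byte. It operates on an in-memory buffer and assumes a fixed, hypothetical record size so that it stays self-contained; the real records are variable in size and are read through the <tt>fileutil.c</tt> routines.</p>

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RECORD_SIZE 4  /* hypothetical fixed payload size */

/* Counts records in a buffer laid out as 256 lists, each a run of
 * (byte 1, payload) pairs closed by a zero byte. Returns -1 if the
 * data ends early or contains an unexpected marker. */
int sketch_count_list_records(const unsigned char *buf, size_t len)
{
    size_t pos = 0;
    int count = 0;

    assert(buf != NULL);
    for (int list = 0; list < 256; list++) {
        for (;;) {
            unsigned char marker;

            if (pos >= len)
                return -1;              /* premature end of data */
            marker = buf[pos++];
            if (marker == 0)
                break;                  /* end of this list */
            if (marker != 1 || len - pos < SKETCH_RECORD_SIZE)
                return -1;              /* bad marker or short record */
            pos += SKETCH_RECORD_SIZE;  /* a real reader parses here */
            count++;
        }
    }
    return count;
}
```

<p>When all records are written in a single run, the first list holds every record and the remaining 255 lists are empty, which accounts for the 256 zero bytes at the end of the table.</p>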
730 |
|
731 |
<p>For cases where there is a difference in data format or content between |
732 |
the base, 5.0, and 5.1 data, the data is written so that if loaded by the |
733 |
corresponding version of Services, it will be interpreted as closely as |
734 |
possible to the true value. For example, the 32-bit nickname group ID is |
735 |
written into 16 bits of the nickname flags and the 16-bit registered |
736 |
channel limit in the base data, since 4.5 does not interpret these bits; |
737 |
however, since 5.0 does make use of them, the correct values of those two |
738 |
fields are then re-recorded in the 5.0 extension data. Similarly, channel |
739 |
access levels are recorded in the base data using the 4.5 access level |
740 |
system (a range from -9999 to 9999 with standard levels clustered from -2 |
741 |
to 10), and again in the 5.0 extension data using the current system.</p> |
742 |
|
743 |
<p class="backlink"><a href="#top">Back to top</a></p> |
744 |
|
745 |
|
746 |
<h5 class="subsubsubsection-title" id="s4-2-2">6-4-2-2. Module structure</h5> |
747 |
|
748 |
<p>The source file, <tt>version4.c</tt>, starts with a workaround for a |
749 |
limitation of static module compilation. As with the |
750 |
<tt>database/standard</tt> module, this module makes use of the utility |
751 |
routines in <tt>fileutil.c</tt>; however, if <tt>fileutil.c</tt> is simply |
752 |
linked into the module, as is done with the <tt>database/standard</tt> |
753 |
module, an error would occur at link time due to the symbols being defined |
754 |
in both modules. While it is possible to adjust the compilation process to |
755 |
avoid this problem, the <tt>database/version4</tt> module instead simply |
756 |
uses <tt>#define</tt> to rename all of the exported functions in |
757 |
<tt>fileutil.c</tt>, then includes that source file directly.</p> |
758 |
|
759 |
<p>The four version number defines indicate the file version numbers to be |
760 |
used with various parts of the data:</p> |
761 |
<ul> |
762 |
<li><b><tt>FILE_VERSION</tt>:</b> The file version used for the base data. |
763 |
Always 11 (the last value used in Services 4.5).</li> |
764 |
<li><b><tt>LOCAL_VERSION</tt>:</b> The file version used for the 5.1 |
765 |
extension data. Incremented when the 5.1 extension data format |
766 |
changes.</li> |
767 |
<li><b><tt>FIRST_VERSION_51</tt>:</b> The first file version used in |
768 |
Services 5.1. Used to ensure that the 5.1 extension data is |
769 |
valid.</li> |
770 |
<li><b><tt>LOCAL_VERSION_50</tt>:</b> The file version used for the 5.0 |
771 |
extension data. Always 27 (the last value used in Services 5.0).</li> |
772 |
</ul> |
773 |
|
774 |
<p>The <tt>CA_SIZE_4_5</tt>, <tt>ACCESS_INVALID_4_5</tt>, and |
775 |
<tt>def_levels_4_5[]</tt> constants and array are used when processing |
776 |
channel privilege level data as stored in the base data section. Since |
777 |
Services 4.5 always stored the privilege level array, even if all values |
778 |
were set to the defaults, this array is used to detect such a case when |
779 |
loading data and to supply data for channels using the default settings |
780 |
when saving. (The channel access levels themselves use a different scale |
781 |
in 4.5; this is handled by the <tt>convert_old_level()</tt> and |
782 |
<tt>convert_new_level()</tt> helper functions, defined later.)</p> |
783 |
|
784 |
<p>The last set of compatibility constants and variables, |
785 |
<tt>MAX_SERVADMINS</tt>, <tt>services_admins[]</tt> and so on, is used to |
786 |
handle loading and saving of the Services administrator and operator lists |
787 |
in <tt>oper.db</tt>. (Version 4.5 kept these separate from the nickname |
788 |
data, as opposed to the current method which stores the OperServ status |
789 |
level in the nickname group data.)</p> |
790 |
|
791 |
<p>Following these preliminary declarations are the main load and save |
792 |
routines, <tt>version4_load_table()</tt> and <tt>version4_save_table()</tt>, |
793 |
preceded by forward declarations of the individual table handling routines. |
794 |
For the most part, these consist of checking the name of the table to be |
795 |
loaded or saved and calling the appropriate routine; however, since most |
796 |
database files encompass two or more tables, the table pointers must be |
797 |
saved in local variables until all relevant tables are available. Also, |
798 |
several tables are simply ignored; this is because the load/save routines
799 |
access the corresponding data directly through the parent structures (for |
800 |
example, channel access and autokick lists are accessed via the |
801 |
<tt>ChannelInfo</tt> structures in the <tt>chan</tt> table). One other |
802 |
workaround required when loading data is the temporary setting of the |
803 |
global <tt>noexpire</tt> flag; as the comments in the code indicate, this |
804 |
is because the databases are loaded in several steps, and records' |
805 |
expiration timestamps may not be correct until the final step, so leaving |
806 |
expiration enabled could cause records to be improperly expired during the |
807 |
loading process (since expiration occurs when a record is accessed via the |
808 |
various pseudoclients' <tt>get()</tt>, <tt>first()</tt>, and <tt>next()</tt> |
809 |
functions).</p> |
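<p>The <tt>noexpire</tt> workaround follows the usual save, set, and restore pattern, sketched below. The variable <tt>sketch_noexpire</tt> and both functions are hypothetical stand-ins; the real code operates on the global flag directly inside the load routine.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the global flag named in the text. */
int sketch_noexpire = 0;

/* Demo load step: reports whether expiration was disabled when it ran. */
int sketch_demo_step(void)
{
    return sketch_noexpire;
}

/* Runs a load step with expiration disabled, then restores the
 * previous setting, so a caller that had already set the flag (for
 * example from the command line) does not lose it. */
int sketch_load_without_expiry(int (*step)(void))
{
    int saved = sketch_noexpire;
    int result;

    assert(step != NULL);
    sketch_noexpire = 1;
    result = step();
    sketch_noexpire = saved;
    return result;
}
```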
810 |
|
811 |
<p>Next are three short utility functions. The first, |
812 |
<tt>my_open_db_r()</tt>, calls <tt>open_db()</tt> from <tt>fileutil.c</tt> |
813 |
(see <a href="#s5-1">section 6-5-1</a>) to open the given database file for |
814 |
reading, then reads in the file version number and checks that it is within |
815 |
range for the base data section; the version number is then returned in |
816 |
<tt>*<i>ver_ret</i></tt>. (File versions below 5, corresponding to |
817 |
Services 3.0, are not supported because they stored numeric values in a |
818 |
machine-dependent format.) The other two utility routines, |
819 |
<tt>read_maskdata()</tt> and <tt>write_maskdata()</tt>, are used to read and
820 |
write lists of <tt>MaskData</tt> structures, used (for example) in |
821 |
autokills and S-lines.</p> |
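<p>The range check performed by <tt>my_open_db_r()</tt> might look like the sketch below. The lower bound of 5 comes from the text above; using the base data version (11) as the upper bound is an assumption, as are the macro and function names.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MIN_VERSION  5   /* oldest base-data version accepted */
#define SKETCH_FILE_VERSION 11  /* base-data version (assumed maximum) */

/* Returns 0 and stores the version in *ver_ret if it is in range, or
 * -1 (leaving *ver_ret untouched) if it is not. */
int sketch_check_base_version(int32_t ver, int32_t *ver_ret)
{
    assert(ver_ret != NULL);
    if (ver < SKETCH_MIN_VERSION || ver > SKETCH_FILE_VERSION)
        return -1;
    *ver_ret = ver;
    return 0;
}
```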
822 |
|
823 |
<p>The bulk of the module is taken up by the routines to load particular |
824 |
tables. Since each database file has its own particular format, the table |
825 |
load/save routines must be tailored for each file; the load routines, in |
826 |
particular, must be able to handle multiple versions of files, and as such |
827 |
are especially complex (for the nickname and channel tables, the load |
828 |
routine is broken up into several subroutines). For the sake of simplicity |
829 |
and speed, the routines access the relevant structures directly rather than |
830 |
going through the <tt>DBField</tt> entries of the table; this means that |
831 |
the module must be updated whenever the structures' formats or meanings |
832 |
change, but as the module is only intended as a transitional one, this is |
833 |
not seen to be a significant problem.</p> |
834 |
|
835 |
<p>The load/save routines also call some routines defined in the various |
836 |
pseudoclient modules, such as <tt>get()</tt>, <tt>first()</tt>, and |
837 |
<tt>next()</tt> routines for the various data structures. Since the |
838 |
database may be (and generally is) loaded before the pseudoclient modules, |
839 |
the symbols must be imported appropriately; this is handled by the |
840 |
<tt>extsyms.c</tt> and <tt>extsyms.h</tt> auxiliary files, though the |
841 |
handling is rather machine-dependent. See <a href="#s5-2">section |
842 |
6-5-2</a> for details.</p> |
843 |
|
844 |
<p>The routines used for loading and saving tables which do not correspond |
845 |
to any of the files listed above, <tt>load_generic_table()</tt> and |
846 |
<tt>save_generic_table()</tt>, are actually renamed versions of the |
847 |
<tt>standard_load_table()</tt> and <tt>standard_save_table()</tt> routines |
848 |
defined in the <tt>database/standard</tt> module. To avoid the |
849 |
difficulties involved in trying to load two database modules at once, this |
850 |
module simply includes the <tt>standard.c</tt> source file directly, after |
851 |
setting up <tt>#define</tt> directives to rename the load and save |
852 |
routines; a <tt>#ifndef INCLUDE_IN_VERSION4</tt> protects the parts of the
853 |
<tt>database/standard</tt> module not related to loading and saving, |
854 |
avoiding multiple definitions of module-related symbols.</p> |
855 |
|
856 |
<p class="backlink"><a href="#top">Back to top</a></p> |
857 |
|
858 |
<!------------------------------------------------------------------------> |
859 |
<hr/> |
860 |
|
861 |
<h3 class="subsection-title" id="s5">6-5. Auxiliary source files</h3> |
862 |
|
863 |
<h4 class="subsubsection-title" id="s5-1">6-5-1. <tt>fileutil.c</tt>, <tt>fileutil.h</tt></h4> |
864 |
|
865 |
<p><tt>fileutil.c</tt> (and its corresponding header, <tt>fileutil.h</tt>) |
866 |
provide utility functions used by both the <tt>database/standard</tt> and |
867 |
<tt>database/version4</tt> modules for reading and writing binary data |
868 |
files. The functions use a <tt>dbFILE</tt> structure to indicate the file |
869 |
to be read from or written to; this is analogous to the <tt>FILE</tt>
870 |
structure used by stdio-style functions, but includes extra fields used by |
871 |
the open and close functions to ensure that a valid copy of the file is |
872 |
retained even if a write error occurs (see the function descriptions below |
873 |
for details). The actual file pointer is also available in the structure's |
874 |
<tt>fp</tt> field for direct use with the stdio functions.</p> |
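<p>The idea of keeping a valid copy on disk can be illustrated with a minimal sketch built on plain stdio. It is not the <tt>fileutil.c</tt> implementation: the <tt>SketchFile</tt> structure and the three functions are hypothetical, and only the general approach (write to a temporary file with "<tt>.new</tt>" appended, then rename it over the original or discard it) follows the behavior documented for <tt>open_db()</tt>, <tt>close_db()</tt>, and <tt>restore_db()</tt>.</p>

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NAME_MAX 256

/* Hypothetical stand-in for dbFILE: the stdio handle plus both names. */
typedef struct {
    FILE *fp;
    char filename[SKETCH_NAME_MAX];      /* final name of the file */
    char tempname[SKETCH_NAME_MAX + 4];  /* final name plus ".new" */
} SketchFile;

/* Opens "<filename>.new" for writing; the real file is left untouched. */
SketchFile *sketch_open_for_write(const char *filename)
{
    SketchFile *f;

    assert(filename != NULL);
    if (strlen(filename) >= SKETCH_NAME_MAX)
        return NULL;
    f = malloc(sizeof(*f));
    if (!f)
        return NULL;
    strcpy(f->filename, filename);
    snprintf(f->tempname, sizeof(f->tempname), "%s.new", filename);
    f->fp = fopen(f->tempname, "wb");
    if (!f->fp) {
        free(f);
        return NULL;
    }
    return f;
}

/* Commits the write: the temporary file replaces the original only if
 * it was closed cleanly. Returns 0 on success, -1 on error. */
int sketch_close(SketchFile *f)
{
    int ok = (fclose(f->fp) == 0);

    if (ok)
        ok = (rename(f->tempname, f->filename) == 0);
    else
        remove(f->tempname);  /* sketch-only choice: drop a bad copy */
    free(f);
    return ok ? 0 : -1;
}

/* Abandons the write: drops the temporary file, keeps the original,
 * and preserves errno for the caller. */
void sketch_restore(SketchFile *f)
{
    int saved_errno = errno;

    fclose(f->fp);
    remove(f->tempname);
    free(f);
    errno = saved_errno;
}
```

<p>A caller that hits a write error calls the restore function instead of the close function, so the previous contents of the file survive.</p>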
875 |
|
876 |
<p>There are several preprocessor conditionals on <tt>CONVERT_DB</tt> |
877 |
scattered throughout the code. These are used to prevent unneeded portions |
878 |
of code, particularly log- and module-related functions, from being seen |
879 |
when the source file is compiled for the <tt>convert-db</tt> tool.</p> |
880 |
|
881 |
<p>The following functions are available. Note that all of the read/write |
882 |
functions (except <tt>get_file_version()</tt> and the raw read/write |
883 |
functions <tt>read_db()</tt>, <tt>write_db()</tt>, and <tt>getc_db()</tt>) |
884 |
share the property that they return 0 on success and -1 on error.</p> |
885 |
|
886 |
<dl> |
887 |
<dt><tt>int32 <b>get_file_version</b>(dbFILE *<i>f</i>)</tt></dt> |
888 |
<dd>Retrieves the file version number from the given file. Returns -1 |
889 |
if the file version could not be read.</dd> |
890 |
|
891 |
<dt><tt>int <b>write_file_version</b>(dbFILE *<i>f</i>, int32 <i>filever</i>)</tt></dt> |
892 |
<dd>Writes the specified file version number to the file. Returns 0 |
893 |
on success, -1 on failure.</dd> |
894 |
|
895 |
<dt><tt>dbFILE *<b>open_db</b>(const char *<i>filename</i>, const char *<i>mode</i>, int32 <i>version</i>)</tt></dt> |
896 |
<dd>Opens the given file for reading (<tt><i>mode</i>=="r"</tt>) or |
897 |
writing (<tt><i>mode</i>=="w"</tt>), returning the <tt>dbFILE</tt> |
898 |
structure pointer on success, <tt>NULL</tt> on failure. When |
899 |
opening a file for writing, the actual file created is a temporary |
900 |
file whose name is the given filename with "<tt>.new</tt>" |
901 |
appended; when <tt>close_db()</tt> is called, the <tt>rename()</tt> |
902 |
system call is used to overwrite any existing file with this |
903 |
temporary file. This ensures that a valid copy of the file will |
904 |
remain on disk even if the writing process is interrupted for some |
905 |
reason. The <tt><i>version</i></tt> parameter is used only when |
906 |
opening a file for writing, and is automatically written to the |
907 |
file using <tt>write_file_version()</tt>.</dd>
908 |
|
909 |
<dt><tt>int <b>close_db</b>(dbFILE *<i>f</i>)</tt></dt> |
910 |
<dd>Closes the given file. If the file was open for writing, the |
911 |
temporary file is renamed over the original (if any exists), |
912 |
generating an error if the rename operation fails. Returns 0 on |
913 |
success, -1 on failure.</dd> |
914 |
|
915 |
<dt><tt>void <b>restore_db</b>(dbFILE *<i>f</i>)</tt></dt> |
916 |
<dd>Closes the given file. If the file was open for writing, removes |
917 |
the temporary file, leaving the original file unchanged. This |
918 |
function never generates an error (errors returned from |
919 |
<tt>fclose()</tt> are ignored), and preserves the value of |
920 |
<tt>errno</tt>.</dd> |
921 |
|
922 |
<dt><tt>int <b>read_db</b>(dbFILE *<i>f</i>, void *<i>buf</i>, size_t <i>len</i>)</tt></dt> |
923 |
<dd>Reads the specified number of bytes from the file into |
924 |
<tt><i>buf</i></tt>, returning the number of bytes successfully |
925 |
read or -1 on error. Implemented as a macro in <tt>fileutil.h</tt>.</dd> |
926 |
|
927 |
<dt><tt>int <b>write_db</b>(dbFILE *<i>f</i>, const void *<i>buf</i>, size_t <i>len</i>)</tt></dt> |
928 |
<dd>Writes the specified number of bytes from <tt><i>buf</i></tt> into |
929 |
the file, returning the number of bytes successfully written or -1 |
930 |
on error. Implemented as a macro in <tt>fileutil.h</tt>.</dd> |
931 |
|
932 |
<dt><tt>int <b>getc_db</b>(dbFILE *<i>f</i>)</tt></dt> |
933 |
<dd>Reads a single byte from the file, returning the byte's value on |
934 |
success, -1 on error. Implemented as a macro in |
935 |
<tt>fileutil.h</tt>.</dd> |
936 |
|
937 |
<dt><tt>int <b>read_int8</b>(int8 *ret, dbFILE *<i>f</i>)</tt></dt> |
938 |
<dd>Reads an 8-bit integer from the file, storing it in the location |
939 |
pointed to by <tt><i>ret</i></tt>. Returns 0 on success, -1 on |
940 |
failure.</dd> |
941 |
|
942 |
<dt><tt>int <b>read_uint8</b>(uint8 *ret, dbFILE *<i>f</i>)</tt></dt> |
943 |
<dd>Reads an unsigned 8-bit integer from the file. Identical in |
944 |
behavior to <tt>read_int8()</tt>; this function is provided to |
945 |
avoid signed/unsigned type conversion warnings when compiling.</dd> |
946 |
|
947 |
<dt><tt>int <b>write_int8</b>(int8 val, dbFILE *<i>f</i>)</tt></dt> |
948 |
<dd>Writes the given 8-bit integer to the file. Returns 0 on success, |
949 |
-1 on failure.</dd> |
950 |
|
951 |
<dt><tt>int <b>read_int16</b>(int16 *ret, dbFILE *<i>f</i>)</tt></dt> |
952 |
<dd>Reads a 16-bit integer from the file, storing it in the location |
953 |
pointed to by <tt><i>ret</i></tt>. Returns 0 on success, -1 on |
954 |
failure.</dd> |
955 |
|
956 |
<dt><tt>int <b>read_uint16</b>(uint16 *ret, dbFILE *<i>f</i>)</tt></dt> |
957 |
<dd>Reads an unsigned 16-bit integer from the file. Identical in |
958 |
behavior to <tt>read_int16()</tt>.</dd> |
959 |
|
960 |
<dt><tt>int <b>write_int16</b>(int16 val, dbFILE *<i>f</i>)</tt></dt> |
961 |
<dd>Writes the given 16-bit integer to the file. Returns 0 on success, |
962 |
-1 on failure.</dd> |
963 |
|
964 |
<dt><tt>int <b>read_int32</b>(int32 *ret, dbFILE *<i>f</i>)</tt></dt> |
965 |
<dd>Reads a 32-bit integer from the file, storing it in the location |
966 |
pointed to by <tt><i>ret</i></tt>. Returns 0 on success, -1 on |
967 |
failure.</dd> |
968 |
|
969 |
<dt><tt>int <b>read_uint32</b>(uint32 *ret, dbFILE *<i>f</i>)</tt></dt> |
970 |
<dd>Reads an unsigned 32-bit integer from the file. Identical in |
971 |
behavior to <tt>read_int32()</tt>.</dd> |
972 |
|
973 |
<dt><tt>int <b>write_int32</b>(int32 val, dbFILE *<i>f</i>)</tt></dt> |
974 |
<dd>Writes the given 32-bit integer to the file. Returns 0 on success, |
975 |
-1 on failure.</dd> |
976 |
|
977 |
<dt><tt>int <b>read_time</b>(time_t *ret, dbFILE *<i>f</i>)</tt></dt> |
978 |
<dd>Reads a timestamp value from the file, storing it in the location |
979 |
pointed to by <tt><i>ret</i></tt>. Returns 0 on success, -1 on |
980 |
failure. Timestamp values are always stored using 64 bits, |
981 |
regardless of the size of the <tt>time_t</tt> type.</dd> |
982 |
|
983 |
<dt><tt>int <b>write_time</b>(time_t val, dbFILE *<i>f</i>)</tt></dt> |
984 |
<dd>Writes the given timestamp value to the file. Returns 0 on |
985 |
success, -1 on failure.</dd> |
986 |
|
987 |
<dt><tt>int <b>read_ptr</b>(void **ret, dbFILE *<i>f</i>)</tt></dt> |
988 |
<dd>Reads a pointer value from the file, storing it in the location |
989 |
pointed to by <tt><i>ret</i></tt>. The value will be either |
990 |
<tt>NULL</tt> or an arbitrary non-<tt>NULL</tt> value. Returns 0 |
991 |
on success, -1 on failure. <i>Implementation note: This function |
992 |
and its complement, <tt>write_ptr()</tt>, are included only for use |
993 |
by the <tt>database/version4</tt> module and the <tt>convert-db</tt> |
994 |
tool, which actually do have to deal with pointers written in this |
995 |
way.</i></dd> |
996 |
|
997 |
<dt><tt>int <b>write_ptr</b>(const void *ptr, dbFILE *<i>f</i>)</tt></dt> |
998 |
<dd>Writes the given pointer value to the file. The actual pointer |
999 |
itself is not stored, only a flag indicating whether the pointer is |
1000 |
<tt>NULL</tt> or not. Returns 0 on success, -1 on failure.</dd> |
1001 |
|
1002 |
<dt><tt>int <b>read_string</b>(char **ret, dbFILE *<i>f</i>)</tt></dt> |
1003 |
<dd>Reads a string from the file, allocating memory for the string |
1004 |
using <tt>malloc()</tt> and storing a pointer to the string in the |
1005 |
location pointed to by <tt><i>ret</i></tt>. Note that the value |
1006 |
stored may be <tt>NULL</tt>. Returns 0 on success, -1 on |
1007 |
failure.</dd> |
1008 |
|
1009 |
<dt><tt>int <b>write_string</b>(const char *s, dbFILE *<i>f</i>)</tt></dt> |
1010 |
<dd>Writes the given string (which may be <tt>NULL</tt>) to the file. |
1011 |
The string must be no longer than 65,534 bytes (if longer, the |
1012 |
value written will be silently truncated). Returns 0 on success, |
1013 |
-1 on failure.</dd> |
1014 |
|
1015 |
<dt><tt>int <b>read_buffer</b>(<i>buf</i>, dbFILE *<i>f</i>)</tt></dt> |
1016 |
<dd>Reads the given buffer (assumed to be declared as, <i>e.g.</i>, |
1017 |
a <tt>char</tt> array) from the file. Returns 0 on success, -1 on |
1018 |
failure. Implemented as a macro in <tt>fileutil.h</tt>.</dd> |
1019 |
|
1020 |
<dt><tt>int <b>write_buffer</b>(<i>buf</i>, dbFILE *<i>f</i>)</tt></dt> |
1021 |
<dd>Writes the given buffer (assumed to be declared as, <i>e.g.</i>, |
1022 |
a <tt>char</tt> array) to the file. Returns 0 on success, -1 on |
1023 |
failure. Implemented as a macro in <tt>fileutil.h</tt>.</dd> |
1024 |
</dl> |
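<p>As an illustration of the fixed-width encoding used for timestamps, the sketch below converts a <tt>time_t</tt> to and from eight bytes in big-endian order. It works on a byte array rather than a <tt>dbFILE</tt>, and the function names are hypothetical; it is meant only to show why the stored size does not depend on the platform's <tt>time_t</tt>.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Encodes a timestamp as 8 big-endian bytes, whatever sizeof(time_t)
 * happens to be on the current platform. */
void sketch_encode_time(time_t val, unsigned char out[8])
{
    uint64_t bits = (uint64_t)(int64_t)val;

    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(bits >> (56 - 8 * i));
}

/* Decodes 8 big-endian bytes back into a timestamp. On a platform
 * with a 32-bit time_t, values that do not fit are truncated. */
time_t sketch_decode_time(const unsigned char in[8])
{
    uint64_t bits = 0;

    assert(in != NULL);
    for (int i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];
    return (time_t)(int64_t)bits;
}
```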
1025 |
|
1026 |
<p class="backlink"><a href="#top">Back to top</a></p> |
1027 |
|
1028 |
|
1029 |
<h4 class="subsubsection-title" id="s5-2">6-5-2. <tt>extsyms.c</tt>, <tt>extsyms.h</tt></h4> |
1030 |
|
1031 |
<p><tt>extsyms.c</tt> and <tt>extsyms.h</tt> are used by the |
1032 |
<tt>database/version4</tt> module to import external symbols from other |
1033 |
modules which may not be loaded when the <tt>version4</tt> module is |
1034 |
initialized. The <tt>version4</tt> module makes use of a number of |
1035 |
functions and variables from the various pseudoclient modules, and adding |
1036 |
code at every use to check whether the appropriate module is loaded and |
1037 |
look up the symbol would only further complicate already complex code. For |
1038 |
this reason, the actual work of looking up the symbols is done in |
1039 |
<tt>extsyms.c</tt>, and <tt>extsyms.h</tt> provides redefinition macros to |
1040 |
allow the <tt>version4</tt> module to be written as if the functions and |
1041 |
variables were already present.</p> |
1042 |
|
1043 |
<p>The actual work of looking up and accessing (for values) or calling (for |
1044 |
functions) the external symbols is implemented by the <tt>IMPORT_FUNC()</tt>, |
1045 |
<tt>IMPORT_VAR()</tt>, and <tt>IMPORT_VAR_MAYBE()</tt> macros defined in |
1046 |
<tt>extsyms.c</tt>. These macros all have the same basic format: they |
1047 |
define a variable of the form <tt>__dblocal_<i>symbol</i>_ptr</tt> to hold |
1048 |
the value of the symbol (the address of the function or variable), followed |
1049 |
by a function which looks up the symbol's value if it is not yet known, |
1050 |
then accesses or calls it. (Module pointers are likewise cached in |
1051 |
file-local variables, declared separately.) If the symbol or its module |
1052 |
cannot be found, the local routine <tt>fatal_no_symbol()</tt> is called to |
1053 |
abort the program, except for <tt>IMPORT_VAR_MAYBE()</tt>, in which case a |
1054 |
default value is returned from the accessing function if the symbol is not |
1055 |
available.</p> |
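<p>The caching pattern behind these macros can be sketched as follows. This is a simplified illustration, not the code in <tt>extsyms.c</tt>: the lookup function is a hypothetical stand-in for the module symbol lookup, the macro takes the variable's type explicitly instead of deriving it, and all names use a <tt>sketch_</tt> prefix rather than the real <tt>__dblocal_</tt> one.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "exported" variable and a counter showing how often
 * the lookup actually runs. */
int sketch_exported_limit = 20;
int sketch_lookup_count = 0;

/* Stand-in for the module symbol lookup: knows one symbol only. */
void *sketch_find_symbol(const char *name)
{
    sketch_lookup_count++;
    if (strcmp(name, "limit") == 0)
        return &sketch_exported_limit;
    return NULL;
}

/* Defines a cached pointer plus an accessor that resolves the symbol
 * on first use and returns `dflt` if the symbol is unavailable. */
#define SKETCH_IMPORT_VAR_MAYBE(type, sym, dflt)                  \
    static type *sketch_##sym##_ptr = NULL;                       \
    type sketch_get_##sym(void)                                   \
    {                                                             \
        if (!sketch_##sym##_ptr)                                  \
            sketch_##sym##_ptr = sketch_find_symbol(#sym);        \
        return sketch_##sym##_ptr ? *sketch_##sym##_ptr : (dflt); \
    }

SKETCH_IMPORT_VAR_MAYBE(int, limit, -1)
SKETCH_IMPORT_VAR_MAYBE(int, missing, -1)
```

<p>Once the pointer is cached, later reads go straight through it, so changes to the underlying variable remain visible without another lookup; a symbol that cannot be found is looked up again on each access in this sketch.</p>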
1056 |
|
1057 |
<p>The logic for accessing an external variable is simple; a reference to |
1058 |
the variable is translated by macros in <tt>extsyms.h</tt> into a call to |
1059 |
the function defined by <tt>IMPORT_VAR()</tt> or <tt>IMPORT_VAR_MAYBE()</tt> |
1060 |
(whose name has the format <tt>__dblocal_get_<i>variable</i>()</tt>), which |
1061 |
accesses the variable's value through the pointer obtained from looking up |
1062 |
the symbol and returns it. The function's declaration uses the GCC |
1063 |
<tt>typeof()</tt> built-in operator to give the function's return value, as |
1064 |
well as the cache variable for the symbol value, the same type as the |
1065 |
variable itself.</p> |
1066 |
|
1067 |
<p>Calling an external function is a more complex task, due to the fact |
1068 |
that functions can take parameters or not and can return or not return a |
1069 |
value. Rather than explicitly writing out the symbol access functions for |
1070 |
each external function accessed, <tt>extsyms.c</tt> makes use of a GCC |
1071 |
feature which allows a function to call another function, passing along |
1072 |
the same parameters passed to the parent function, and return its return |
1073 |
value without knowing anything about either the parameters or the type of |
1074 |
return value. This feature is the builtin apply/return code, which takes |
1075 |
the general form:</p> |
1076 |
|
1077 |
<div class="code">__builtin_return(__builtin_apply( |
1078 |
<i>function_pointer</i>, |
1079 |
__builtin_apply_args(), |
1080 |
<i>parameter_buffer_size</i>))</div> |
1081 |
|
1082 |
<p>where <tt><i>function_pointer</i></tt> is a pointer to the function to |
1083 |
be called, and <tt><i>parameter_buffer_size</i></tt> is the maximum amount |
1084 |
of stack space expected to be used by the parameters to the function, if |
1085 |
any. If this feature is not available, for example because a compiler |
1086 |
other than GCC is in use, then the code tries to use another |
1087 |
(assembly-based) algorithm to accomplish the same thing if possible, or |
1088 |
generates a compilation error if no such substitute algorithm is |
1089 |
available.</p> |

<p>However, the use of the <tt>__builtin_apply()</tt> GCC feature in
Services has, over the course of Services' development, revealed a few bugs
in the implementation of that feature; as such, Services must sometimes
resort to an assembly-based algorithm even when using GCC. The necessity
of this is indicated by the preprocessor macro <tt>NEED_GCC3_HACK</tt>,
which is set by the <tt>configure</tt> script if it detects that this
workaround is required. The bugs which have been discovered are:</p>

<ul>
<li class="spaced">The generated code can access the wrong area of memory
    when setting up the stack for the called function
    (<a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=8028">GCC
    Bugzilla bug 8028</a>
    <span class="remotehost">[gcc.gnu.org]</span>).</li>
<li class="spaced">The generated code can fail to pass through the called
    function's return value
    (<a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11151">GCC
    Bugzilla bug 11151</a>
    <span class="remotehost">[gcc.gnu.org]</span>).</li>
<li class="spaced">A function calling <tt>__builtin_apply()</tt> can
    behave incorrectly if inlined in another function
    (<a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20076">GCC
    Bugzilla bug 20076</a>
    <span class="remotehost">[gcc.gnu.org]</span>). This is not
    directly relevant to <tt>extsyms.c</tt>, but caused problems at
    one time in the <tt>configure</tt> script.</li>
</ul>

<p>Finally, in order to avoid cached pointers going stale when a module is
unloaded, <tt>extsyms.c</tt> includes a callback function for the
"<tt>unload module</tt>" callback, which clears out all cached
pointers for a module when the module is unloaded.</p>
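
<p>The cache invalidation step might look something like the sketch below.
Everything here is an assumption made for illustration: the
<tt>Module</tt> structure, the <tt>cache_table</tt> layout, and the
callback's name and signature are hypothetical and do not reflect the
actual declarations in Services.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module handle; the real type in Services may differ. */
typedef struct {
    const char *name;
} Module;

/* One cached symbol pointer, tagged with the module that owns the symbol. */
typedef struct {
    const Module *owner;
    void **cache;
} CacheEntry;

/* Two example modules, each exporting one symbol (all hypothetical). */
static Module module_a = { "module_a" };
static Module module_b = { "module_b" };
static int value_in_a, value_in_b;
static void *cached_a, *cached_b;

static CacheEntry cache_table[] = {
    { &module_a, &cached_a },
    { &module_b, &cached_b },
};

/* Callback run when a module is unloaded: reset every cached pointer
 * that belongs to that module so it is looked up again on next use.
 * Returns the number of pointers cleared. */
static int on_unload_module(const Module *module)
{
    size_t i;
    int cleared = 0;

    assert(module != NULL);
    for (i = 0; i < sizeof(cache_table) / sizeof(cache_table[0]); i++) {
        if (cache_table[i].owner == module && *cache_table[i].cache) {
            *cache_table[i].cache = NULL;
            cleared++;
        }
    }
    return cleared;
}
```

<p>Pointers belonging to other modules are left untouched, so only the
symbols of the unloaded module need to be looked up again later.</p>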

<p class="backlink"><a href="#top">Back to top</a></p>

<!------------------------------------------------------------------------>
<hr/>

<p class="backlink"><a href="5.html">Previous section: IRC server interface</a> |
   <a href="index.html">Table of Contents</a> |
   <a href="7.html">Next section: Services pseudoclients</a></p>

</body>
</html>