comparison old-trunk/doc/manual.docbook.sgml @ 339:eb230fa7d28e

Prepare for migration to hg
author lost
date Fri, 19 Mar 2010 02:54:14 +0000
338:e7885b3ee266 339:eb230fa7d28e
1 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.5//EN">
2 <book>
3 <bookinfo>
4 <title>LW Tool Chain</title>
5 <author><firstname>William</firstname><surname>Astle</surname></author>
6 <copyright><year>2009</year><holder>William Astle</holder></copyright>
7 </bookinfo>
8 <chapter>
9
10 <title>Introduction</title>
11
12 <para>
13 The LW tool chain provides utilities for building binaries for MC6809 and
14 HD6309 CPUs. The tool chain includes a cross-assembler and a cross-linker
15 which support several styles of output.
16 </para>
17
18 <section>
19 <title>History</title>
20 <para>
21 For a long time, I have had an interest in creating an operating system for
22 the Coco3. I finally started working on that project around the beginning of
23 2006. I had a number of assemblers I could choose from. Eventually, I settled
24 on one and started tinkering. After a while, I realized that assembler was not
25 going to be sufficient due to lack of macros and issues with forward references.
26 Then I tried another which handled forward references correctly but still did
27 not support macros. I looked around at other assemblers and they all lacked
28 one feature or another that I really wanted for creating my operating system.
29 </para>
30
31 <para>
32 The solution seemed clear at that point. I am a fair programmer so I figured
33 I could write an assembler that would do everything I wanted an assembler to
34 do. Thus the LWASM project was born. After more than two years of on and off
35 work, version 1.0 of LWASM was released in October of 2008.
36 </para>
37
38 <para>
39 As the aforementioned operating system project progressed further, it became
40 clear that while assembling the whole project through a single file was doable,
41 it was not practical. When I found myself playing some fancy games with macros
42 in a bid to simulate sections, I realized I needed a means of assembling
43 source files separately and linking them later. This spawned a major development
44 effort to add object file support to LWASM. It also spawned the LWLINK
45 project to provide a means to actually link the files.
46 </para>
47
48 </section>
49
50 </chapter>
51
52 <chapter>
53 <title>Output Formats</title>
54
55 <para>
56 The LW tool chain supports multiple output formats. Each format has its
57 advantages and disadvantages. Each format is described below.
58 </para>
59
60 <section>
61 <title>Raw Binaries</title>
62 <para>
63 A raw binary is simply a string of bytes. There are no headers or other
64 niceties. Both LWLINK and LWASM support generating raw binaries. ORG directives
65 in the source code only serve to set the addresses that will be used for
66 symbols but otherwise have no direct impact on the resulting binary.
67 </para>
68
69 </section>
70 <section>
71 <title>DECB Binaries</title>
72
73 <para>A DECB binary is compatible with the LOADM command in Disk Extended
74 Color Basic on the CoCo and with CLOADM from Extended Color Basic. It
75 records a load address for its contents and encodes an execution address.
76 A DECB binary may contain multiple loadable sections, each of which has
77 its own load address.</para>
78
79 <para>
80 Each binary starts with a preamble. Each preamble is five bytes long. The
81 first byte is zero. The next two bytes specify the number of bytes to load
82 and the last two bytes specify the address to load the bytes at. Then, a
83 string of bytes follows. After this string of bytes, there may be another
84 preamble or a postamble. A postamble is also five bytes in length. The first
85 byte of the postamble is $FF, the next two are zero, and the last two are
86 the execution address for the binary.
87 </para>
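<para>
The layout just described can be walked with a short script. The following
is only a sketch in Python, not part of the LW tool chain: the function name
parse_decb and the sample image are hypothetical, and the sixteen bit fields
are assumed to be stored high byte first, which the description above does
not state explicitly.
</para>
<programlisting><![CDATA[
```python
import struct


def parse_decb(image: bytes):
    """Return (blocks, exec_address) for a DECB image.

    blocks is a list of (load_address, payload) tuples in file order.
    Assumption: 16-bit fields are big-endian (high byte first).
    """
    blocks = []
    pos = 0
    while True:
        if len(image) - pos < 5:
            raise ValueError("truncated image: no postamble found")
        # One flag byte followed by two 16-bit fields.
        flag, first, second = struct.unpack_from(">BHH", image, pos)
        pos += 5
        if flag == 0x00:
            # Preamble: byte count, then load address, then the data itself.
            length, load_address = first, second
            payload = image[pos:pos + length]
            if len(payload) != length:
                raise ValueError("truncated image: block runs past end of file")
            blocks.append((load_address, payload))
            pos += length
        elif flag == 0xFF:
            # Postamble: two zero bytes, then the execution address.
            return blocks, second
        else:
            raise ValueError(f"unexpected flag byte ${flag:02X} at offset {pos - 5}")


# Hypothetical three-byte program loaded and executed at $0E00.
image = bytes([0x00, 0x00, 0x03, 0x0E, 0x00,   # preamble: 3 bytes at $0E00
               0x12, 0x12, 0x39,               # payload (NOP, NOP, RTS)
               0xFF, 0x00, 0x00, 0x0E, 0x00])  # postamble: execute at $0E00
blocks, exec_address = parse_decb(image)
```
]]></programlisting>
<para>
For the sample image, the sketch yields a single block at $0E00 and an
execution address of $0E00. A binary with several loadable sections would
simply produce more entries in the block list.
</para>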
88
89 <para>
90 Both LWASM and LWLINK can output this format.
91 </para>
92 </section>
93
94 <section>
95 <title>OS9 Modules</title>
96 <para>
97
98 Since version 2.5, LWASM is able to generate OS9 modules. The syntax is
99 basically the same as for other assemblers. A module starts with the MOD
100 directive and ends with the EMOD directive. The OS9 directive is provided
101 as a shortcut for writing system calls.
102
103 </para>
104
105 <para>
106
107 LWASM does NOT provide an OS9Defs file. You must provide your own. Also note
108 that the common practice of using "ifp1" around the inclusion of the OS9Defs
109 file is discouraged as it is pointless and can lead to unintentional
110 problems and phasing errors. Because LWASM reads each file exactly once,
111 there is no benefit to restricting the inclusion to the first assembly pass.
112
113 </para>
114
115 <para>
116
117 It is also critical to understand that unlike many OS9 assemblers, LWASM
118 does NOT maintain a separate data address counter. Thus, you must define
119 all your data offsets and so on outside of the mod/emod segment. It is,
120 therefore, likely that source code targeted at other assemblers will require
121 edits to build correctly.
122
123 </para>
124
125 <para>
126
127 LWLINK does not, yet, have the ability to create OS9 modules from object
128 files.
129
130 </para>
131 </section>
132
133 <section>
134 <title>Object Files</title>
135 <para>LWASM supports generating a proprietary object file format which is
136 described in <xref linkend="objchap">. LWLINK is then used to link these
137 object files into a final binary in any of LWLINK's supported binary
138 formats.</para>
139
140 <para>Object files also support the concept of sections which are not valid
141 for other output types. This allows related code from each object file
142 linked to be collapsed together in the final binary.</para>
143
144 <para>
145 Object files are very flexible in that they allow references that are not
146 known at assembly time to be resolved at link time. However, because the
147 addresses of such references are not known at assembly time, there is no way
148 for the assembler to deduce that an eight bit addressing mode is possible.
149 That means the assembler will default to using sixteen bit addressing
150 whenever an external or cross-section reference is used.
151 </para>
152
153 <para>
154 As of LWASM 2.4, it is possible to force direct page addressing for an
155 external reference. Care must be taken to ensure the resulting addresses
156 are really in the direct page since the linker does not know what the direct
157 page is supposed to be and does not emit errors for byte overflows.
158 </para>
159
160 <para>
161 It is also possible to use external references in an eight bit immediate
162 mode instruction. In this case, only the low order eight bits will be used.
163 Again, no byte overflows will be flagged.
164 </para>
165
166
167 </section>
168
169 </chapter>
170
171 <chapter>
172 <title>LWASM</title>
173 <para>
174 The LWTOOLS assembler is called LWASM. This chapter documents the various
175 features of the assembler. It is not, however, a tutorial on 6x09 assembly
176 language programming.
177 </para>
178
179 <section>
180 <title>Command Line Options</title>
181 <para>
182 The binary for LWASM is called "lwasm". Note that the binary is in lower
183 case. lwasm takes the following command line arguments.
184 </para>
185
186 <variablelist>
187
188 <varlistentry>
189 <term><option>--6309</option></term>
190 <term><option>-3</option></term>
191 <listitem>
192 <para>
193 This will cause the assembler to accept the additional instructions available
194 on the 6309 processor. This is the default mode; this option is provided for
195 completeness and to override preset command arguments.
196 </para>
197 </listitem>
198 </varlistentry>
199
200 <varlistentry>
201 <term><option>--6809</option></term>
202 <term><option>-9</option></term>
203 <listitem>
204 <para>
205 This will cause the assembler to reject instructions that are only available
206 on the 6309 processor.
207 </para>
208 </listitem>
209 </varlistentry>
210
211 <varlistentry>
212 <term><option>--decb</option></term>
213 <term><option>-b</option></term>
214 <listitem>
215 <para>
216 Select the DECB output format target. Equivalent to <option>--format=decb</option>.
217 </para>
218 </listitem>
219 </varlistentry>
220
221 <varlistentry>
222 <term><option>--format=type</option></term>
223 <term><option>-f type</option></term>
224 <listitem>
225 <para>
226 Select the output format. Valid values are <option>obj</option> for the
227 object file target, <option>decb</option> for the DECB LOADM format,
228 <option>os9</option> for creating OS9 modules, and <option>raw</option> for
229 a raw binary.
230 </para>
231 </listitem>
232 </varlistentry>
233
234 <varlistentry>
235 <term><option>--list[=file]</option></term>
236 <term><option>-l[file]</option></term>
237 <listitem>
238 <para>
239 Cause LWASM to generate a listing. If <option>file</option> is specified,
240 the listing will go to that file. Otherwise it will go to the standard output
241 stream. By default, no listing is generated.
242 </para>
243 </listitem>
244 </varlistentry>
245
246 <varlistentry>
247 <term><option>--obj</option></term>
248 <listitem>
249 <para>
250 Select the proprietary object file format as the output target.
251 </para>
252 </listitem>
253 </varlistentry>
254
255 <varlistentry>
256 <term><option>--output=FILE</option></term>
257 <term><option>-o FILE</option></term>
258 <listitem>
259 <para>
260 This option specifies the name of the output file. If not specified, the
261 default is <option>a.out</option>.
262 </para>
263 </listitem>
264 </varlistentry>
265
266 <varlistentry>
267 <term><option>--pragma=pragma</option></term>
268 <term><option>-p pragma</option></term>
269 <listitem>
270 <para>
271 Specify assembler pragmas. Multiple pragmas are separated by commas. The
272 pragmas accepted are the same as for the PRAGMA assembler directive described
273 below.
274 </para>
275 </listitem>
276 </varlistentry>
277
278 <varlistentry>
279 <term><option>--raw</option></term>
280 <term><option>-r</option></term>
281 <listitem>
282 <para>
283 Select raw binary as the output target.
284 </para>
285 </listitem>
286 </varlistentry>
287
288 <varlistentry>
289 <term><option>--help</option></term>
290 <term><option>-?</option></term>
291 <listitem>
292 <para>
293 Present a help screen describing the command line options.
294 </para>
295 </listitem>
296 </varlistentry>
297
298 <varlistentry>
299 <term><option>--usage</option></term>
300 <listitem>
301 <para>
302 Provide a summary of the command line options.
303 </para>
304 </listitem>
305 </varlistentry>
306
307 <varlistentry>
308 <term><option>--version</option></term>
309 <term><option>-V</option></term>
310 <listitem>
311 <para>
312 Display the software version.
313 </para>
314 </listitem>
315 </varlistentry>
316
317 <varlistentry>
318 <term><option>--debug</option></term>
319 <term><option>-d</option></term>
320 <listitem>
321 <para>
322 Increase the debugging level. Only really useful to people hacking on the
323 LWASM source code itself.
324 </para>
325 </listitem>
326 </varlistentry>
327
328 </variablelist>
329
330 </section>
331
332 <section>
333 <title>Dialects</title>
334 <para>
335 LWASM supports all documented MC6809 instructions as defined by Motorola.
336 It also supports all known HD6309 instructions. While there is general
337 agreement on the mnemonics for most of the 6309 instructions, there is some
338 variance with the block transfer instructions. TFM for all four variations
339 seems to have gained the most traction and, thus, this is the form that is
340 recommended for LWASM. However, it also supports COPY, COPY-, IMP, EXP,
341 TFRP, TFRM, TFRS, and TFRR. It further adds COPY+ as a synonym for COPY,
342 IMPLODE for IMP, and EXPAND for EXP.
343 </para>
344
345 <para>By default, LWASM accepts 6309 instructions. However, using the
346 <parameter>--6809</parameter> parameter, you can cause it to throw errors on
347 6309 instructions instead.</para>
348
349 <para>
350 The standard addressing mode specifiers are supported. These are the
351 hash sign ("#") for immediate mode, the less than sign ("&lt;") for forced
352 eight bit modes, and the greater than sign ("&gt;") for forced sixteen bit modes.
353 </para>
354
355 <para>
356 Additionally, LWASM supports using the asterisk ("*") to indicate
357 base page addressing. This should not be used in hand-written source code,
358 however, because it is non-standard and may or may not be present in future
359 versions of LWASM.
360 </para>
361
362 </section>
363
364 <section>
365 <title>Source Format</title>
366
367 <para>
368 LWASM accepts plain text files in a relatively free form. It can handle
369 lines terminated with CR, LF, CRLF, or LFCR which means it should be able
370 to assemble files on any platform on which it compiles.
371 </para>
372 <para>
373 Each line may start with a symbol. If a symbol is present, there must not
374 be any whitespace preceding it. It is legal for a line to contain nothing
375 but a symbol.</para>
376 <para>
377 The opcode is separated from the symbol by whitespace. If there is
378 no symbol, there must be at least one whitespace character preceding it.
379 If applicable, the operand follows separated by whitespace. Following the
380 opcode and operand is an optional comment.
381 </para>
382 <para>
383 A comment can also be introduced with a * or a ;. The comment character is
384 optional for end of statement comments. However, if a symbol is the only
385 thing present on the line other than the comment, the comment character is
386 mandatory to prevent the assembler from interpreting the comment as an opcode.
387 </para>
388
389 <para>
390 For compatibility with the output generated by some C preprocessors, LWASM
391 will also ignore lines that begin with a #. This should not be used as a general
392 comment character, however.
393 </para>
394
395 <para>
396 Opcodes are not case sensitive. Neither are register names in
397 the operand fields. Symbols, however, are case sensitive.
398 </para>
399
400 <para>
401 LWASM does not support line numbers in the file.
402 </para>
403
404 </section>
405
406 <section>
407 <title>Symbols</title>
408
409 <para>
410 Symbols have no length restriction. They may contain letters, numbers, dots,
411 dollar signs, and underscores. They must start with a letter, dot, or
412 underscore.
413 </para>
414
415 <para>
416 LWASM also supports the concept of a local symbol. A local symbol is one
417 which contains either a "?" or a "@", which can appear anywhere in the symbol.
418 The scope of a local symbol is determined by a number of factors. First,
419 each included file gets its own local symbol scope. A blank line will also
420 be considered a local scope barrier. Macros each have their own local symbol
421 scope as well (which has a side effect that you cannot use a local symbol
422 as an argument to a macro). There are other factors as well. In general,
423 a local symbol is restricted to the block of code it is defined within.
424 </para>
425
426 <para>
427 By default, unless assembling to the os9 target, a "$" in the symbol will
428 also make it local. This can be controlled by the "dollarlocal" and
429 "nodollarlocal" pragmas. In the absence of a pragma to the contrary, a "$"
430 in the symbol does not make it local for the os9 target, while for all
431 other targets it does.
432 </para>
433
434 </section>
435
436 <section>
437 <title>Numbers and Expressions</title>
438 <para>
439
440 Numbers can be expressed in binary, octal, decimal, or hexadecimal. Binary
441 numbers may be prefixed with a "%" symbol or suffixed with a "b" or "B".
442 Octal numbers may be prefixed with "@" or suffixed with "Q", "q", "O", or
443 "o". Hexadecimal numbers may be prefixed with "$", "0x" or "0X", or suffixed
444 with "H". No prefix or suffix is required for decimal numbers but they can
445 be prefixed with "&amp;" if desired. Any constant which begins with a letter
446 must be expressed with the correct prefix base identifier or be prefixed
447 with a 0. Thus hexadecimal FF would have to be written either 0FFH or $FF.
448 Numbers are not case sensitive.
449
450 </para>
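<para>
The prefix and suffix rules above can be summarized as a small decoder. The
following Python sketch is illustrative only and is not taken from the LWASM
source: the names parse_number and _digits are hypothetical, prefixes are
checked before suffixes, and a lower case "h" suffix is accepted on the
reading that numbers are not case sensitive.
</para>
<programlisting><![CDATA[
```python
def _digits(body: str, base: int) -> int:
    """Convert body to an integer, allowing only digits valid in base."""
    allowed = "0123456789abcdef"[:base]
    if not body or any(ch not in allowed for ch in body.lower()):
        raise ValueError(f"invalid base {base} constant: {body!r}")
    return int(body, base)


def parse_number(text: str) -> int:
    """Decode a numeric constant using the prefix and suffix rules above."""
    if not text:
        raise ValueError("empty constant")
    # Single-character prefixes: binary, octal, hexadecimal, decimal.
    prefixes = {"%": 2, "@": 8, "$": 16, "&": 10}
    if text[0] in prefixes:
        return _digits(text[1:], prefixes[text[0]])
    if text[:2].lower() == "0x":
        return _digits(text[2:], 16)
    # Without a prefix, a constant must start with a digit (hence 0FFH).
    if not text[0].isdigit():
        raise ValueError(f"constant must start with a digit or prefix: {text!r}")
    # Suffix forms; assumption: the final character alone selects the base.
    suffixes = {"h": 16, "b": 2, "q": 8, "o": 8}
    base = suffixes.get(text[-1].lower())
    if base is not None:
        return _digits(text[:-1], base)
    return _digits(text, 10)
```
]]></programlisting>
<para>
Under these assumptions, $FF, 0FFH, and 0xFF all decode to 255, while FFH is
rejected because it begins with a letter and would otherwise be
indistinguishable from a symbol.
</para>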
451
452 <para> A symbol may appear at any point where a number is acceptable. The
453 special symbol "*" can be used to represent the starting address of the
454 current source line within expressions. </para>
455
456 <para>The ASCII value of a character can be included by prefixing it with a
457 single quote ('). The ASCII values of two characters can be included by
458 prefixing the characters with a double quote (").</para>
459
460 <para>
461
462 LWASM supports the following basic binary operators: +, -, *, /, and %.
463 These represent addition, subtraction, multiplication, division, and
464 modulus. It also supports unary negation and unary 1's complement (- and ^
465 respectively). It is also possible to use ~ for the unary 1's complement
466 operator. For completeness, a unary positive (+) is supported though it is
467 a no-op. LWASM also supports using |, &amp;, and ^ for bitwise or, bitwise and,
468 and bitwise exclusive or respectively.
469
470 </para>
471
472 <para>
473
474 Operator precedence follows the usual rules. Multiplication, division, and
475 modulus take precedence over addition and subtraction. Unary operators take
476 precedence over binary operators. Bitwise operators have lower precedence
477 than addition and subtraction. To force a specific order of evaluation,
478 parentheses can be used in the usual manner.
479
480 </para>
481
482 <para>
483
484 As of LWASM 2.5, the operators &amp;&amp; and || are recognized for boolean and and
485 boolean or respectively. They will return either 0 or 1 (false or true).
486 They have the lowest precedence of all the binary operators.
487
488 </para>
489
490 </section>
491
492 <section>
493 <title>Assembler Directives</title>
494 <para>
495 Various directives can be used to control the behaviour of the
496 assembler or to include non-code/data in the resulting output. Those directives
497 that are not described in detail in other sections of this document are
498 described below.
499 </para>
500
501 <section>
502 <title>Data Directives</title>
503 <variablelist>
504 <varlistentry><term>FCB <parameter>expr[,...]</parameter></term>
505 <term>.DB <parameter>expr[,...]</parameter></term>
506 <term>.BYTE <parameter>expr[,...]</parameter></term>
507 <listitem>
508 <para>Include one or more constant bytes (separated by commas) in the output.</para>
509 </listitem>
510 </varlistentry>
511
512 <varlistentry>
513 <term>FDB <parameter>expr[,...]</parameter></term>
514 <term>.DW <parameter>expr[,...]</parameter></term>
515 <term>.WORD <parameter>expr[,...]</parameter></term>
516 <listitem>
517 <para>Include one or more words (separated by commas) in the output.</para>
518 </listitem>
519 </varlistentry>
520
521 <varlistentry>
522 <term>FQB <parameter>expr[,...]</parameter></term>
523 <term>.QUAD <parameter>expr[,...]</parameter></term>
524 <term>.4BYTE <parameter>expr[,...]</parameter></term>
525 <listitem>
526 <para>Include one or more double words (separated by commas) in the output.</para>
527 </listitem>
528 </varlistentry>
529
530 <varlistentry>
531 <term>FCC <parameter>string</parameter></term>
532 <term>.ASCII <parameter>string</parameter></term>
533 <term>.STR <parameter>string</parameter></term>
534 <listitem>
535 <para>
536 Include a string of text in the output. The first character of the operand
537 is the delimiter which must appear as the last character and cannot appear
538 within the string. The string is included with no modifications.
539 </para>
540 </listitem>
541 </varlistentry>
542
543 <varlistentry>
544 <term>FCN <parameter>string</parameter></term>
545 <term>.ASCIZ <parameter>string</parameter></term>
546 <term>.STRZ <parameter>string</parameter></term>
547 <listitem>
548 <para>
549 Include a NUL terminated string of text in the output. The first character of
550 the operand is the delimiter which must appear as the last character and
551 cannot appear within the string. A NUL byte is automatically appended to
552 the string.
553 </para>
554 </listitem>
555 </varlistentry>
556
557 <varlistentry>
558 <term>FCS <parameter>string</parameter></term>
559 <term>.ASCIS <parameter>string</parameter></term>
560 <term>.STRS <parameter>string</parameter></term>
561 <listitem>
562 <para>
563 Include a string of text in the output with bit 7 of the final byte set. The
564 first character of the operand is the delimiter which must appear as the last
565 character and cannot appear within the string.
566 </para>
567 </listitem>
568 </varlistentry>
569
570 <varlistentry><term>ZMB <parameter>expr</parameter></term>
571 <listitem>
572 <para>
573 Include a number of NUL bytes in the output. The number must be fully resolvable
574 during pass 1 of assembly so no forward or external references are permitted.
575 </para>
576 </listitem>
577 </varlistentry>
578
579 <varlistentry><term>ZMD <parameter>expr</parameter></term>
580 <listitem>
581 <para>
582 Include a number of zero words in the output. The number must be fully
583 resolvable during pass 1 of assembly so no forward or external references are
584 permitted.
585 </para>
586 </listitem>
587 </varlistentry>
588
589 <varlistentry><term>ZMQ <parameter>expr</parameter></term>
590 <listitem>
591 <para>
592 Include a number of zero double-words in the output. The number must be fully
593 resolvable during pass 1 of assembly so no forward or external references are
594 permitted.
595 </para>
596 </listitem>
597 </varlistentry>
598
599 <varlistentry>
600 <term>RMB <parameter>expr</parameter></term>
601 <term>.BLKB <parameter>expr</parameter></term>
602 <term>.DS <parameter>expr</parameter></term>
603 <term>.RS <parameter>expr</parameter></term>
604 <listitem>
605 <para>
606 Reserve a number of bytes in the output. The number must be fully resolvable
607 during pass 1 of assembly so no forward or external references are permitted.
608 The value of the bytes is undefined.
609 </para>
610 </listitem>
611 </varlistentry>
612
613 <varlistentry><term>RMD <parameter>expr</parameter></term>
614 <listitem>
615 <para>
616 Reserve a number of words in the output. The number must be fully
617 resolvable during pass 1 of assembly so no forward or external references are
618 permitted. The value of the words is undefined.
619 </para>
620 </listitem>
621 </varlistentry>
622
623 <varlistentry><term>RMQ <parameter>expr</parameter></term>
624 <listitem>
625 <para>
626 Reserve a number of double-words in the output. The number must be fully
627 resolvable during pass 1 of assembly so no forward or external references are
628 permitted. The value of the double-words is undefined.
629 </para>
630 </listitem>
631 </varlistentry>
632
633 <varlistentry>
634 <term>INCLUDEBIN <parameter>filename</parameter></term>
635 <listitem>
636 <para>
637 Treat the contents of <parameter>filename</parameter> as a string of bytes to
638 be included literally at the current assembly point. This has the same effect
639 as converting the file contents to a series of FCB statements and including
640 those at the current assembly point.
641 </para>
642 </listitem>
643 </varlistentry>
644
645 </variablelist>
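<para>
The three string directives (FCC, FCN, and FCS) differ only in how the final
byte is handled. The Python sketch below illustrates the difference. It is
not LWASM code: the function name encode_string is hypothetical, the text is
assumed to be plain ASCII, and leaving an empty FCS string empty is an
assumption.
</para>
<programlisting><![CDATA[
```python
def encode_string(operand: str, directive: str = "FCC") -> bytes:
    """Encode a delimited string operand as FCC, FCN, or FCS describes."""
    if len(operand) < 2 or operand[-1] != operand[0]:
        raise ValueError("operand must start and end with the same delimiter")
    delimiter, body = operand[0], operand[1:-1]
    if delimiter in body:
        raise ValueError("delimiter cannot appear within the string")
    data = bytearray(body.encode("ascii"))
    kind = directive.upper()
    if kind == "FCN":
        data.append(0x00)        # NUL terminator
    elif kind == "FCS":
        if data:                 # assumption: an empty string is left empty
            data[-1] |= 0x80     # set bit 7 of the final byte
    elif kind != "FCC":
        raise ValueError(f"unknown directive {directive!r}")
    return bytes(data)
```
]]></programlisting>
<para>
With the operand /HELLO/, the FCC form produces the five bytes of HELLO
unchanged, the FCN form appends a $00 byte, and the FCS form changes the
final byte from $4F to $CF.
</para>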
646
647 </section>
648
649 <section>
650 <title>Address Definition</title>
651 <para>The directives in this section all control the addresses of symbols
652 or the assembly process itself.</para>
653
654 <variablelist>
655 <varlistentry><term>ORG <parameter>expr</parameter></term>
656 <listitem>
657 <para>Set the assembly address. The address must be fully resolvable on the
658 first pass so no external or forward references are permitted. ORG is not
659 permitted within sections when outputting to object files. For the DECB
660 target, each ORG directive after which output is generated will cause
661 a new preamble to be output. ORG is only used to determine the addresses
662 of symbols when the raw target is used.
663 </para>
664 </listitem>
665 </varlistentry>
666
667 <varlistentry>
668 <term><parameter>sym</parameter> EQU <parameter>expr</parameter></term>
669 <term><parameter>sym</parameter> = <parameter>expr</parameter></term>
670 <listitem>
671 <para>Define the value of <parameter>sym</parameter> to be <parameter>expr</parameter>.</para>
672 </listitem>
673 </varlistentry>
674
675 <varlistentry>
676 <term><parameter>sym</parameter> SET <parameter>expr</parameter></term>
677 <listitem>
678 <para>Define the value of <parameter>sym</parameter> to be <parameter>expr</parameter>.
679 Unlike EQU, SET permits symbols to be defined multiple times as long as SET
680 is used for all instances. Use of the symbol before the first SET statement
681 that sets its value is undefined.</para>
682 </listitem>
683 </varlistentry>
684
685 <varlistentry>
686 <term>SETDP <parameter>expr</parameter></term>
687 <listitem>
688 <para>Inform the assembler that it can assume the DP register contains
689 <parameter>expr</parameter>. This directive is only advice to the assembler
690 to determine whether an address is in the direct page and has no effect
691 on the contents of the DP register. The value must be fully resolved during
692 the first assembly pass because it affects the sizes of subsequent instructions.
693 </para>
694 <para>This directive has no effect in the object file target.
695 </para>
696 </listitem>
697 </varlistentry>
698
699 <varlistentry>
700 <term>ALIGN <parameter>expr</parameter>[,<parameter>value</parameter>]</term>
701 <listitem>
702
703 <para>Force the current assembly address to be a multiple of
704 <parameter>expr</parameter>. If <parameter>value</parameter> is not
705 specified, a series of NUL bytes is output to force the alignment, if
706 required. Otherwise, the low order 8 bits of <parameter>value</parameter>
707 will be used as the fill. The alignment value must be fully resolved on the
708 first pass because it affects the addresses of subsequent instructions.
709 However, <parameter>value</parameter> may include forward references; as
710 long as it resolves to a constant for the second pass, the value will be
711 accepted.</para>
712
713 <para>Unless <parameter>value</parameter> is specified as a harmless opcode such as $12
714 (NOP), this directive is not suitable for inclusion in the middle of actual code.
715 The default padding value is $00 which is intended to be used within data
716 blocks. </para>
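<para>
The amount of fill emitted follows directly from the current address and the
requested alignment. The following Python sketch shows the arithmetic only.
The function name align_padding is hypothetical and the code is not taken
from LWASM.
</para>
<programlisting><![CDATA[
```python
def align_padding(address: int, alignment: int, value: int = 0) -> bytes:
    """Return the fill bytes needed to make address a multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be a positive number")
    # Distance to the next multiple of alignment (zero if already aligned).
    count = (-address) % alignment
    # Only the low order 8 bits of value are used as the fill byte.
    return bytes([value & 0xFF]) * count
```
]]></programlisting>
<para>
For example, aligning address $0E01 to a multiple of 16 requires fifteen
fill bytes, and an address that is already aligned requires none.
</para>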
717
718 </listitem>
719 </varlistentry>
720
721 </variablelist>
722
723 </section>
724
725 <section>
726 <title>Conditional Assembly</title>
727 <para>
728 Portions of the source code can be excluded or included based on conditions
729 known at assembly time. Conditionals can be nested arbitrarily deeply. The
730 directives associated with conditional assembly are described in this section.
731 </para>
732 <para>All conditionals must be fully bracketed. That is, every conditional
733 statement must eventually be followed by an ENDC at the same level of nesting.
734 </para>
735 <para>Conditional expressions are only evaluated on the first assembly pass.
736 It is not possible to game the assembly process by having a conditional
737 change its value between assembly passes. Thus there is not and never will
738 be any equivalent of IFP1 or IFP2 as provided by other assemblers.</para>
739
740 <variablelist>
741 <varlistentry>
742 <term>IFEQ <parameter>expr</parameter></term>
743 <listitem>
744 <para>If <parameter>expr</parameter> evaluates to zero, the conditional
745 will be considered true.
746 </para>
747 </listitem>
748 </varlistentry>
749
750 <varlistentry>
751 <term>IFNE <parameter>expr</parameter></term>
752 <term>IF <parameter>expr</parameter></term>
753 <listitem>
754 <para>If <parameter>expr</parameter> evaluates to a non-zero value, the conditional
755 will be considered true.
756 </para>
757 </listitem>
758 </varlistentry>
759
760 <varlistentry>
761 <term>IFGT <parameter>expr</parameter></term>
762 <listitem>
763 <para>If <parameter>expr</parameter> evaluates to a value greater than zero, the conditional
764 will be considered true.
765 </para>
766 </listitem>
767 </varlistentry>
768
769 <varlistentry>
770 <term>IFGE <parameter>expr</parameter></term>
771 <listitem>
772 <para>If <parameter>expr</parameter> evaluates to a value greater than or equal to zero, the conditional
773 will be considered true.
774 </para>
775 </listitem>
776 </varlistentry>
777
778 <varlistentry>
779 <term>IFLT <parameter>expr</parameter></term>
780 <listitem>
781 <para>If <parameter>expr</parameter> evaluates to a value less than zero, the conditional
782 will be considered true.
783 </para>
784 </listitem>
785 </varlistentry>
786
787 <varlistentry>
788 <term>IFLE <parameter>expr</parameter></term>
789 <listitem>
790 <para>If <parameter>expr</parameter> evaluates to a value less than or equal to zero, the conditional
791 will be considered true.
792 </para>
793 </listitem>
794 </varlistentry>
795
796 <varlistentry>
797 <term>IFDEF <parameter>sym</parameter></term>
798 <listitem>
799 <para>If <parameter>sym</parameter> is defined at this point in the assembly
800 process, the conditional
801 will be considered true.
802 </para>
803 </listitem>
804 </varlistentry>
805
806 <varlistentry>
807 <term>IFNDEF <parameter>sym</parameter></term>
808 <listitem>
809 <para>If <parameter>sym</parameter> is not defined at this point in the assembly
810 process, the conditional
811 will be considered true.
812 </para>
813 </listitem>
814 </varlistentry>
815
816 <varlistentry>
817 <term>ELSE</term>
818 <listitem>
819 <para>
820 If the preceding conditional at the same level of nesting was false, the
821 statements following will be assembled. If the preceding conditional at
822 the same level was true, the statements following will not be assembled.
823 Note that the preceding conditional might have been another ELSE statement
824 although this behaviour is not guaranteed to be supported in future versions
825 of LWASM.
826 </para>
827 </listitem>
828 </varlistentry>
829 <varlistentry>
830 <term>ENDC</term>
831 <listitem>
832 <para>
833 This directive marks the end of a conditional construct. Every conditional
834 construct must end with an ENDC directive.
835 </para>
836 </listitem>
837 </varlistentry>
838
839 </variablelist>
840 </section>
841
842 <section>
843 <title>OS9 Target Directives</title>
844
845 <para>This section includes directives that apply solely to the OS9
846 target.</para>
847
848 <variablelist>
849
850 <varlistentry>
851 <term>OS9 <parameter>syscall</parameter></term>
852 <listitem>
853 <para>
854
855 This directive generates a call to the specified system call. <parameter>syscall</parameter> may be an arbitrary expression.
856
857 </para>
858 </listitem>
859 </varlistentry>
860
861 <varlistentry>
862 <term>MOD <parameter>size</parameter>,<parameter>name</parameter>,<parameter>type</parameter>,<parameter>flags</parameter>,<parameter>execoff</parameter>,<parameter>datasize</parameter></term>
863 <listitem>
864 <para>
865
866 This tells LWASM that the beginning of the actual module is here. It will
867 generate a module header based on the parameters specified. It will also
868 begin calculating the module CRC.
869
870 </para>
871
872 <para>
873
874 The precise meaning of the various parameters is beyond the scope of this
875 document since it is not a tutorial on OS9 module programming.
876
877 </para>
878
879 </listitem>
880 </varlistentry>
881
882 <varlistentry>
883 <term>EMOD</term>
884 <listitem>
885 <para>
886
887 This marks the end of a module and causes LWASM to emit the calculated CRC
888 for the module.
889 </para>
890 </listitem>
891 </varlistentry>
892
893 </variablelist>
894 </section>
895
896 <section>
897 <title>Miscellaneous Directives</title>
898
899 <para>This section includes directives that do not fit into the other
900 categories.</para>
901
902 <variablelist>
903
904 <varlistentry>
905 <term>INCLUDE <parameter>filename</parameter></term>
906 <term>USE <parameter>filename</parameter></term>
907
908 <listitem> <para> Include the contents of <parameter>filename</parameter> at
909 this point in the assembly as though it were a part of the file currently
910 being processed. Note that if whitespace appears in the name of the file,
911 you must enclose <parameter>filename</parameter> in quotes.
912 </para>
913
914 <para>
915 Note that the USE variation is provided only for compatibility with other
916 assemblers. It is recommended to use the INCLUDE variation.</para>
917
918 </listitem>
919 </varlistentry>
920
921 <varlistentry>
922 <term>END <parameter>[expr]</parameter></term>
923 <listitem>
924 <para>
925 This directive causes the assembler to stop assembling immediately as though
926 it ran out of input. For the DECB target only, <parameter>expr</parameter>
927 can be used to set the execution address of the resulting binary. For all
928 other targets, specifying <parameter>expr</parameter> will cause an error.
929 </para>
930 </listitem>
931 </varlistentry>
932
933 <varlistentry>
934 <term>ERROR <parameter>string</parameter></term>
935 <listitem>
936 <para>
937 Causes a custom error message to be printed at this line. This will cause
938 assembly to fail. This directive is most useful inside conditional constructs
939 to cause assembly to fail if some condition that is known bad happens.
940 </para>
941 </listitem>
942 </varlistentry>
943
944 <varlistentry>
945 <term>.MODULE <parameter>string</parameter></term>
946 <listitem>
947 <para>
948 This directive is ignored for most output targets. If the output target
949 supports encoding a module name into it, <parameter>string</parameter>
950 will be used as the module name.
951 </para>
952 <para>
953 As of version 2.2, no supported output targets support this directive.
954 </para>
955 </listitem>
956 </varlistentry>
957
958 </variablelist>
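<para>
As a rough sketch of how INCLUDE and ERROR might be combined, the fragment
below pulls in a definitions file and then refuses to assemble if a value
from it is out of range. The file name, the symbol BUFSIZE, and the message
text are hypothetical, and the fragment assumes the IFGT conditional
described with the other conditional directives.
</para>
<programlisting>
* The quotes are needed because the file name contains a space.
        INCLUDE "buffer defs.asm"

* Fail the assembly if BUFSIZE is larger than 256.
        IFGT BUFSIZE-256
        ERROR BUFSIZE is larger than 256
        ENDC
</programlisting>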
959 </section>
960
961 </section>
962
963 <section>
964 <title>Macros</title>
965 <para>
966 LWASM is a macro assembler. A macro is simply a name that stands in for a
967 series of instructions. Once a macro is defined, it is used like any other
968 assembler directive. Defining a macro can be considered equivalent to adding
969 additional assembler directives.
970 </para>
971 <para>Macros may accept parameters. These parameters are referenced within
972 a macro by a backslash ("\") followed by a digit 1 through 9 for the first
973 through ninth parameters. They may also be referenced by enclosing the
974 decimal parameter number in braces ("{num}"). These parameter references
975 are replaced with the verbatim text of the parameter passed to the macro. A
976 reference to a non-existent parameter will be replaced by an empty string.
977 Macro parameters are expanded everywhere on each source line. That means
978 the parameter to a macro could be used as a symbol or it could even appear
979 in a comment or could cause an entire source line to be commented out
980 when the macro is expanded.
981 </para>
982 <para>
983 Parameters passed to a macro are separated by commas and the parameter list
984 is terminated by any whitespace. This means that neither a comma nor whitespace
985 may be included in a macro parameter.
986 </para>
987 <para>
988 Macro expansion is done recursively. That is, within a macro, macros are
989 expanded. This can lead to infinite loops in macro expansion. If the assembler
990 hangs for a long time while assembling a file that uses macros, this may be
991 the reason.</para>
992
993 <para>Each macro expansion receives its own local symbol context which is not
994 inherited by any macros called by it nor is it inherited from the context
995 the macro was instantiated in. That means it is possible to use local symbols
996 within macros without having them collide with symbols in other macros or
997 outside the macro itself. However, this also means that using a local symbol
998 as a parameter to a macro, while legal, will not do what it would seem to do
999 as it will result in looking up the local symbol in the macro's symbol context
1000 rather than the enclosing context where it came from, likely yielding either
1001 an undefined symbol error or bizarre assembly results.
1002 </para>
1003 <para>
1004 Note that there is no way to define a macro as local to a symbol context. All
1005 macros are part of the global macro namespace. However, macros have a separate
1006 namespace from symbols so it is possible to have a symbol with the same name
1007 as a macro.
1008 </para>
1009
1010 <para>
1011 Macros are defined only during the first pass. Macro expansion also
1012 only occurs during the first pass. On the second pass, the macro
1013 definition is simply ignored. Macros must be defined before they are used.
1014 </para>
1015
1016 <para>The following directives are used when defining macros.</para>
1017
1018 <variablelist>
1019 <varlistentry>
1020 <term><parameter>macroname</parameter> MACRO</term>
1021 <listitem>
1022 <para>This directive is used to begin the definition of a macro called
1023 <parameter>macroname</parameter>. If <parameter>macroname</parameter> already
1024 exists, it is considered an error. Attempting to define a macro within a
1025 macro is undefined. It may work and it may not so the behaviour should not
1026 be relied upon.
1027 </para>
1028 </listitem>
1029 </varlistentry>
1030
1031 <varlistentry>
1032 <term>ENDM</term>
1033 <listitem>
1034 <para>
1035 This directive indicates the end of the macro currently being defined. It
1036 causes the assembler to resume interpreting source lines as normal.
1037 </para>
1038 </listitem></varlistentry>
1039 </variablelist>
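<para>
As a minimal sketch of parameter substitution, the hypothetical macro below
loads an immediate value into a register chosen by the caller. The macro name
ldimm and its arguments are invented for illustration. The expansions noted
in the comments follow from the verbatim text replacement described above.
</para>
<programlisting>
ldimm   MACRO
        ld\1 #{2}
        ENDM

* expands to: ldx #$1234
        ldimm x,$1234
* expands to: ldd #0
        ldimm d,0
</programlisting>
<para>
Because the parameter list ends at the first whitespace, an invocation
written as "ldimm x, $1234" would, by the rule above, leave the second
parameter empty.
</para>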
1040
1041 </section>
1042
1043 <section>
1044 <title>Object Files and Sections</title>
1045 <para>
1046 The object file target is very useful for large projects because it allows
1047 multiple files to be assembled independently and then linked into the final
1048 binary at a later time. It allows only the small portion of the project
1049 that was modified to be re-assembled rather than requiring the entire set
1050 of source code to be available to the assembler in a single assembly process.
1051 This can be particularly important if there are a large number of macros,
1052 symbol definitions, or other metadata that uses resources at assembly time.
1053 By far the largest benefit, however, is keeping the source files small enough
1054 for a mere mortal to find things in them.
1055 </para>
1056
1057 <para>
1058 With multi-file projects, there needs to be a means of resolving references to
1059 symbols in other source files. These are known as external references. The
1060 addresses of these symbols cannot be known until the linker joins all the
1061 object files into a single binary. This means that the assembler must be
1062 able to output the object code without knowing the value of the symbol. This
1063 places some restrictions on the code generated by the assembler. For
1064 example, the assembler cannot generate direct page addressing for instructions
1065 that reference external symbols because the address of the symbol may not
1066 be in the direct page. Similarly, relative branches and PC relative addressing
1067 cannot be used in their eight bit forms. Everything that must be resolved
1068 by the linker must be assembled to use the largest address size possible to
1069 allow the linker to fill in the correct value at link time. Note that the
1070 same problem applies to absolute address references as well, even those in
1071 the same source file, because the address is not known until link time.
1072 </para>
1073
1074 <para>
1075 It is often desired in multi-file projects to have code of various types grouped
1076 together in the final binary generated by the linker as well. The same applies
1077 to data. In order for the linker to do that, the bits that are to be grouped
1078 must be tagged in some manner. This is where the concept of sections comes in.
1079 Each chunk of code or data is part of a section in the object file. Then,
1080 when the linker reads all the object files, it coalesces all sections of the
1081 same name into a single section and then considers it as a unit.
1082 </para>
1083
1084 <para>
1085 The existence of sections, however, raises a problem for symbols even
1086 within the same source file. Thus, the assembler must treat symbols from
1087 different sections within the same source file in the same manner as external
1088 symbols. That is, it must leave them for the linker to resolve at link time,
1089 with all the limitations that entails.
1090 </para>
1091
1092 <para>
1093 In the object file target mode, LWASM requires all source lines that
1094 cause bytes to be output to be inside a section. Any directives that do
1095 not cause any bytes to be output can appear outside of a section. This includes
1096 such things as EQU or RMB. Even ORG can appear outside a section. ORG, however,
1097 makes no sense within a section because it is the linker that determines
1098 the starting address of the section's code, not the assembler.
1099 </para>
1100
1101 <para>
1102 All symbols defined globally in the assembly process are local to the
1103 source file and cannot be exported. All symbols defined within a section are
1104 considered local to the source file unless otherwise explicitly exported.
1105 Symbols referenced from external source files must be declared external,
1106 either explicitly or by asking the assembler to assume that all undefined
1107 symbols are external.
1108 </para>
1109
1110 <para>
1111 It is often handy to define a number of memory addresses that will be
1112 used for data at run-time but which need not be included in the binary file.
1113 These memory addresses are not initialized until run-time, either by the
1114 program itself or by the program loader, depending on the operating environment.
1115 Such sections are often known as BSS sections. LWASM supports generating
1116 sections with a BSS attribute set. The object file then contains the section
1117 definition, including symbols exported from that section and those symbols
1118 required to resolve references from the local file, but no actual code.
1119 It is illegal for any source lines within a BSS flagged section to cause any
1120 bytes to be output.
1121 </para>
1122
1123 <para>The following directives apply to section handling.</para>
1124
1125 <variablelist>
1126 <varlistentry>
1127 <term>SECTION <parameter>name[,flags]</parameter></term>
1128 <term>SECT <parameter>name[,flags]</parameter></term>
1129 <term>.AREA <parameter>name[,flags]</parameter></term>
1130 <listitem>
1131 <para>
1132 Instructs the assembler that the code following this directive is to be
1133 considered part of the section <parameter>name</parameter>. A section name
1134 may appear multiple times in which case it is as though all the code from
1135 all the instances of that section appeared adjacent within the source file.
1136 However, <parameter>flags</parameter> may only be specified on the first
1137 instance of the section.
1138 </para>
1139 <para>There is a single flag supported in <parameter>flags</parameter>. The
1140 flag <parameter>bss</parameter> will cause the section to be treated as a BSS
1141 section and, thus, no code will be included in the object file nor will any
1142 bytes be permitted to be output.</para>
1143 <para>
1144 If the section name is "bss" or ".bss" in any combination of upper and
1145 lower case, the section is assumed to be a BSS section. In that case,
1146 the flag <parameter>!bss</parameter> can be used to override this assumption.
1147 </para>
1148 <para>
1149 If assembly is already happening within a section, the section is implicitly
1150 ended and the new section started. This is not considered an error although
1151 it is recommended that all sections be explicitly closed.
1152 </para>
1153 </listitem>
1154 </varlistentry>
1155
1156 <varlistentry>
1157 <term>ENDSECTION</term>
1158 <term>ENDSECT</term>
1159 <term>ENDS</term>
1160 <listitem>
1161 <para>
1162 This directive ends the current section. This puts assembly outside of any
1163 sections until the next SECTION directive.</para>
1164 </listitem>
1165 </varlistentry>
1166
1167 <varlistentry>
1168 <term><parameter>sym</parameter> EXTERN</term>
1169 <term><parameter>sym</parameter> EXTERNAL</term>
1170 <term><parameter>sym</parameter> IMPORT</term>
1171 <listitem>
1172 <para>
1173 This directive defines <parameter>sym</parameter> as an external symbol.
1174 This directive may occur at any point in the source code. EXTERN definitions
1175 are resolved on the first pass so an EXTERN definition anywhere in the
1176 source file is valid for the entire file. The use of this directive is
1177 optional when the assembler is instructed to assume that all undefined
1178 symbols are external. In fact, in that mode, if the symbol is referenced
1179 before the EXTERN directive, an error will occur.
1180 </para>
1181 </listitem>
1182 </varlistentry>
1183
1184 <varlistentry>
1185 <term><parameter>sym</parameter> EXPORT</term>
1186 <term><parameter>sym</parameter> .GLOBL</term>
1187
1188 <term>EXPORT <parameter>sym</parameter></term>
1189 <term>.GLOBL <parameter>sym</parameter></term>
1190
1191 <listitem>
1192 <para>
1193 This directive defines <parameter>sym</parameter> as an exported symbol.
1194 This directive may occur at any point in the source code, even before the
1195 definition of the exported symbol.
1196 </para>
1197 <para>
1198 Note that <parameter>sym</parameter> may appear as the operand or as the
1199 statement's symbol. If there is a symbol on the statement, that will
1200 take precedence over any operand that is present.
1201 </para>
1202 </listitem>
1203 </varlistentry>
1204
1205 </variablelist>
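<para>
The directives above might be combined as in the following sketch of a
source file for the object file target. The section names, the symbols
counter, start, and buffer, and the instructions themselves are hypothetical
and chosen only to illustrate the layout.
</para>
<programlisting>
* counter is defined in some other object file.
counter EXTERN
* Make start visible to other object files.
        EXPORT start

        SECTION code
start   ldd counter
        addd #1
        std counter
        rts
        ENDSECTION

* The bss flag means nothing in this section may output bytes.
        SECTION scratch,bss
buffer  RMB 256
        ENDSECTION
</programlisting>
<para>
Since counter is external, the references to it are left for the linker to
fill in and so are assembled with full sixteen bit addresses.
</para>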
1206
1207 </section>
1208
1209 <section>
1210 <title>Assembler Modes and Pragmas</title>
1211 <para>
1212 There are a number of options that affect the way assembly is performed.
1213 Some of these options can only be specified on the command line because
1214 they determine something absolute about the assembly process. These include
1215 such things as the output target. Other things may be switchable during
1216 the assembly process. These are known as pragmas and are, by definition,
1217 not portable between assemblers.
1218 </para>
1219
1220 <para>LWASM supports a number of pragmas that affect code generation or
1221 otherwise affect the behaviour of the assembler. These may be specified by
1222 way of a command line option or by assembler directives. The directives
1223 are as follows.
1224 </para>
1225
1226 <variablelist>
1227 <varlistentry>
1228 <term>PRAGMA <parameter>pragma[,...]</parameter></term>
1229 <listitem>
1230 <para>
1231 Specifies that the assembler should bring into force all <parameter>pragma</parameter>s
1232 specified. Any unrecognized pragma will cause an assembly error. The new
1233 pragmas will take effect immediately. This directive should be used when
1234 the program will assemble incorrectly if the pragma is ignored or not supported.
1235 </para>
1236 </listitem>
1237 </varlistentry>
1238
1239 <varlistentry>
1240 <term>*PRAGMA <parameter>pragma[,...]</parameter></term>
1241 <listitem>
1242 <para>
1243 This is identical to the PRAGMA directive except no error will occur with
1244 unrecognized or unsupported pragmas. This directive, by virtue of starting
1245 with a comment character, will also be ignored by assemblers that do not
1246 support this directive. Use this variation if the pragma is not required
1247 for correct functioning of the code.
1248 </para>
1249 </listitem>
1250 </varlistentry>
1251 </variablelist>
1252
1253 <para>Each pragma supported has a positive version and a negative version.
1254 The positive version enables the pragma while the negative version disables
1255 it. The negative version is simply the positive version with "no" prefixed
1256 to it. For instance, "pragma" vs. "nopragma". Only the positive version is
1257 listed below.</para>
1258
1259 <para>Pragmas are not case sensitive.</para>
1260
1261 <variablelist>
1262 <varlistentry>
1263 <term>index0tonone</term>
1264 <listitem>
1265 <para>
1266 When in force, this pragma enables an optimization affecting indexed addressing
1267 modes. When the offset expression in an indexed mode evaluates to zero but is
1268 not explicity written as 0, this will replace the operand with the equivalent
1269 no offset mode, thus creating slightly faster code. Because of the advantages
1270 of this optimization, it is enabled by default.
1271 </para>
1272 </listitem>
1273 </varlistentry>
1274
1275 <varlistentry>
1276 <term>cescapes</term>
1277 <listitem>
1278 <para>
1279 This pragma will cause strings in the FCC, FCS, and FCN pseudo operations to
1280 have C-style escape sequences interpreted. The one departure from the official
1281 spec is that unrecognized escape sequences will return either the character
1282 immediately following the backslash or some undefined value. Do not rely
1283 on the behaviour of undefined escape sequences.
1284 </para>
1285 </listitem>
1286 </varlistentry>
1287
1288 <varlistentry>
1289 <term>importundefexport</term>
1290 <listitem>
1291 <para>
1292 This pragma is only valid for targets that support external references. When
1293 in force, it will cause the EXPORT directive to act as IMPORT if the symbol
1294 to be exported is not defined. This is provided for compatibility with the
1295 output of gcc6809 and should not be used in hand written code. Because of
1296 the confusion this pragma can cause, it is disabled by default.
1297 </para>
1298 </listitem>
1299 </varlistentry>
1300
1301 <varlistentry>
1302 <term>undefextern</term>
1303 <listitem>
1304 <para>
1305 This pragma is only valid for targets that support external references. When in
1306 force, if the assembler sees an undefined symbol on the second pass, it will
1307 automatically define it as an external symbol. This automatic definition will
1308 apply for the remainder of the assembly process, even if the pragma is
1309 subsequently turned off. Because this behaviour would be potentially surprising,
1310 this pragma defaults to off.
1311 </para>
1312 <para>
1313 The primary use for this pragma is for projects that share a large number of
1314 symbols between source files. In such cases, it is impractical to enumerate
1315 all the external references in every source file. This allows the assembler
1316 and linker to do the heavy lifting while not preventing a particular source
1317 module from defining a local symbol of the same name as an external symbol
1318 if it does not need the external symbol. (This pragma will not cause an
1319 automatic external definition if there is already a locally defined symbol.)
1320 </para>
1321 <para>
1322 This pragma will often be specified on the command line for large projects.
1323 However, depending on the specific dynamics of the project, it may be sufficient
1324 for one or two files to use this pragma internally.
1325 </para>
1326 </listitem>
1327 </varlistentry>
1328
1329 <varlistentry>
1330 <term>dollarlocal</term>
1331 <listitem>
1332
1333 <para>When set, a "$" in a symbol makes it local. When not set, "$" does not
1334 cause a symbol to be local. It is set by default except when using the OS9
1335 target.</para>
1336
1337 </listitem>
1338 </varlistentry>
1339
1340 <varlistentry>
1341 <term>dollarnotlocal</term>
1342 <listitem>
1343
1344 <para> This is the same as the "dollarlocal" pragma except its sense is
1345 reversed. That is, "dollarlocal" and "nodollarnotlocal" are equivalent and
1346 "nodollarlocal" and "dollarnotlocal" are equivalent. </para>
1347
1348 </listitem>
1349 </varlistentry>
1350
1351 </variablelist>
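<para>
As an illustrative sketch, the fragment below uses both directive forms. The
symbols greet and OFFSET are hypothetical. The comments describe the
behaviour documented above rather than verified assembler output.
</para>
<programlisting>
* The string below relies on C-style escapes, so the pragma is required.
        PRAGMA cescapes
greet   FCC "Hello\r"

* Only an optimization, so an assembler without it may ignore the request.
*PRAGMA index0tonone
OFFSET  EQU 0
* The offset evaluates to zero but is not written as 0: no offset mode is used.
        lda OFFSET,x
* The offset is written explicitly as 0: the operand is left alone.
        lda 0,x
</programlisting>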
1352
1353 </section>
1354
1355 </chapter>
1356
1357 <chapter>
1358 <title>LWLINK</title>
1359 <para>
1360 The LWTOOLS linker is called LWLINK. This chapter documents the various features
1361 of the linker.
1362 </para>
1363
1364 <section>
1365 <title>Command Line Options</title>
1366 <para>
1367 The binary for LWLINK is called "lwlink". Note that the binary is in lower
1368 case. lwlink takes the following command line arguments.
1369 </para>
1370 <variablelist>
1371 <varlistentry>
1372 <term><option>--decb</option></term>
1373 <term><option>-b</option></term>
1374 <listitem>
1375 <para>
1376 Selects the DECB output format target. This is equivalent to <option>--format=decb</option>.
1377 </para>
1378 </listitem>
1379 </varlistentry>
1380
1381 <varlistentry>
1382 <term><option>--output=FILE</option></term>
1383 <term><option>-o FILE</option></term>
1384 <listitem>
1385 <para>
1386 This option specifies the name of the output file. If not specified, the
1387 default is <option>a.out</option>.
1388 </para>
1389 </listitem>
1390 </varlistentry>
1391
1392 <varlistentry>
1393 <term><option>--format=TYPE</option></term>
1394 <term><option>-f TYPE</option></term>
1395 <listitem>
1396 <para>
1397 This option specifies the output format. Valid values are <option>decb</option>
1398 and <option>raw</option>.
1399 </para>
1400 </listitem>
1401 </varlistentry>
1402
1403 <varlistentry>
1404 <term><option>--raw</option></term>
1405 <term><option>-r</option></term>
1406 <listitem>
1407 <para>
1408 This option specifies the raw output format.
1409 It is equivalent to <option>--format=raw</option>.
1411 </para>
1412 </listitem>
1413 </varlistentry>
1414
1415 <varlistentry>
1416 <term><option>--script=FILE</option></term>
1417 <term><option>-s FILE</option></term>
1418 <listitem>
1419 <para>
1420 This option allows specifying a linking script to override the linker's
1421 built in defaults.
1422 </para>
1423 </listitem>
1424 </varlistentry>
1425
1426 <varlistentry>
1427 <term><option>--section-base=SECT=BASE</option></term>
1428 <listitem>
1429 <para>
1430 Cause section SECT to load at base address BASE. This will be prepended
1431 to the built-in link script. It is ignored if a link script is provided.
1432 </para>
1433 </listitem>
1434 </varlistentry>
1435
1436 <varlistentry>
1437 <term><option>--map=FILE</option></term>
1438 <term><option>-m FILE</option></term>
1439 <listitem>
1440 <para>
1441 This will output a description of the link result to FILE.
1442 </para>
1443 </listitem>
1444 </varlistentry>
1445
1446 <varlistentry>
1447 <term><option>--library=LIBSPEC</option></term>
1448 <term><option>-l LIBSPEC</option></term>
1449 <listitem>
1450 <para>
1451 Load a library using the library search path. LIBSPEC will have "lib" prepended
1452 and ".a" appended.
1453 </para>
1454 </listitem>
1455 </varlistentry>
1456
1457 <varlistentry>
1458 <term><option>--library-path=DIR</option></term>
1459 <term><option>-L DIR</option></term>
1460 <listitem>
1461 <para>
1462 Add DIR to the library search path.
1463 </para>
1464 </listitem>
1465 </varlistentry>
1466
1467 <varlistentry>
1468 <term><option>--debug</option></term>
1469 <term><option>-d</option></term>
1470 <listitem>
1471 <para>
1472 This option increases the debugging level. It is only useful for LWTOOLS
1473 developers.
1474 </para>
1475 </listitem>
1476 </varlistentry>
1477
1478 <varlistentry>
1479 <term><option>--help</option></term>
1480 <term><option>-?</option></term>
1481 <listitem>
1482 <para>
1483 This provides a listing of command line options and a brief description
1484 of each.
1485 </para>
1486 </listitem>
1487 </varlistentry>
1488
1489 <varlistentry>
1490 <term><option>--usage</option></term>
1491 <listitem>
1492 <para>
1493 This will display a usage summary.
1495 </para>
1496 </listitem>
1497 </varlistentry>
1498
1499
1500 <varlistentry>
1501 <term><option>--version</option></term>
1502 <term><option>-V</option></term>
1503 <listitem>
1504 <para>
1505 This will display the version of LWLINK.
1506 </para>
1507 </listitem>
1508 </varlistentry>
1509 </variablelist>
1510 </section>
1511
1512 <section>
1513 <title>Linker Operation</title>
1514
1515 <para>
1516
1517 LWLINK takes one or more files in supported input formats and links them
1518 into a single binary. Currently supported formats are the LWTOOLS object
1519 file format and the archive format used by LWAR. While the precise method is
1520 slightly different, linking can be conceptualized as the following steps.
1521
1522 </para>
1523
1524 <orderedlist>
1525 <listitem>
1526 <para>
1527 First, the linker loads a linking script. If no script is specified, it
1528 loads a built-in default script based on the output format selected. This
1529 script tells the linker how to lay out the various sections in the final
1530 binary.
1531 </para>
1532 </listitem>
1533
1534 <listitem>
1535 <para>
1536 Next, the linker reads all the input files into memory. At this time, it
1537 flags any format errors in those files. It constructs a table of symbols
1538 for each object at this time.
1539 </para>
1540 </listitem>
1541
1542 <listitem>
1543 <para>
1544 The linker then proceeds with organizing the sections loaded from each file
1545 according to the linking script. As it does so, it is able to assign addresses
1546 to each symbol defined in each object file. At this time, the linker may
1547 also collapse different instances of the same section name into a single
1548 section by appending the data from each subsequent instance of the section
1549 to the first instance of the section.
1550 </para>
1551 </listitem>
1552
1553 <listitem>
1554 <para>
1555 Next, the linker looks through every object file for every incomplete reference.
1556 It then attempts to fully resolve that reference. If it cannot do so, it
1557 throws an error. Once a reference is resolved, the value is placed into
1558 the binary code at the specified section. It should be noted that an
1559 incomplete reference can reference either a symbol internal to the object
1560 file or an external symbol which is in the export list of another object
1561 file.
1562 </para>
1563 </listitem>
1564
1565 <listitem>
1566 <para>
1567 If all of the above steps are successful, the linker opens the output file
1568 and actually constructs the binary.
1569 </para>
1570 </listitem>
1571 </orderedlist>
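<para>
As a sketch of an invocation that exercises the steps above, the command
below links two object files and a library into a DECB binary and writes a
map of the result. All file and directory names are hypothetical. With the
library handling described in the command line options, "util" would be
looked up as "libutil.a" in the "lib" directory.
</para>
<programlisting>
lwlink --format=decb --output=game.bin --map=game.map --library-path=lib --library=util main.o video.o
</programlisting>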
1572
1573 </section>
1574
1575 <section>
1576 <title>Linking Scripts</title>
1577 <para>
1578 A linker script is used to instruct the linker about how to assemble the
1579 various sections into a completed binary. It consists of a series of
1580 directives which are considered in the order they are encountered.
1581 </para>
1582 <para>
1583 The sections will appear in the resulting binary in the order they are
1584 specified in the script file. If a referenced section is not found, the linker will behave as though the
1585 section did exist but had a zero size, no relocations, and no exports.
1586 A section should only be referenced once. Any subsequent references will have
1587 an undefined effect.
1588 </para>
1589
1590 <para>
1591 All numbers in linking scripts are specified in hexadecimal. All directives
1592 are case sensitive although the hexadecimal numbers are not.
1593 </para>
1594
1595 <para>If a section name is specified as "*", any section not
1596 already matched by the script will be matched. The "*" can be followed
1597 by a comma and a flag to narrow the match.
1598 If the flag is "!bss", then any section that is not flagged as a bss section
1599 will be matched. If the flag is "bss", then any section that is flagged as
1600 bss will be matched.
1601 </para>
1602
1603 <para>The following directives are understood in a linker script.</para>
1604 <variablelist>
1605 <varlistentry>
1606 <term>section <parameter>name</parameter> load <parameter>addr</parameter></term>
1607 <listitem><para>
1608
1609 This causes the section <parameter>name</parameter> to load at
1610 <parameter>addr</parameter>. For the raw target, only one "load" entry is
1611 allowed for non-bss sections and it must be the first one. For raw targets,
1612 it affects the addresses the linker assigns to symbols but has no other
1613 effect on the output. bss sections may all have separate load addresses but
1614 since they will not appear in the binary anyway, this is okay.
1615 </para><para>
1616 For the decb target, each "load" entry will cause a new "block" to be
1617 output to the binary which will contain the load address. It is legal for
1618 sections to overlap in this manner - the linker assumes the loader will sort
1619 everything out.
1620 </para></listitem>
1621 </varlistentry>
1622
1623 <varlistentry>
1624 <term>section <parameter>name</parameter></term>
1625 <listitem><para>
1626
1627 This will cause the section <parameter>name</parameter> to load after the previously listed
1628 section.
1629 </para></listitem></varlistentry>
1630 <varlistentry>
1631 <term>exec <parameter>addr or sym</parameter></term>
1632 <listitem>
1633 <para>
1634 This will cause the execution address (entry point) to be the address
1635 specified (in hex) or the specified symbol name. The symbol name must
1636 match a symbol that is exported by one of the object files being linked.
1637 This has no effect for targets that do not encode the entry point into the
1638 resulting file. If not specified, the entry point is assumed to be address 0
1639 which is probably not what you want. The default link scripts for targets
1640 that support this directive automatically start at the beginning of the
1641 first section (usually "init" or "code") that is emitted in the binary.
1642 </para>
1643 </listitem>
1644 </varlistentry>
1645
1646 <varlistentry>
1647 <term>pad <parameter>size</parameter></term>
1648 <listitem><para>
1649 This will cause the output file to be padded with NUL bytes to be exactly
1650 <parameter>size</parameter> bytes in length. This only makes sense for a raw target.
1651 </para>
1652 </listitem>
1653 </varlistentry>
1654 </variablelist>
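<para>
Putting these directives together, a linking script for the decb target
might look like the sketch below. The section names and the load address are
illustrative only, and start is assumed to be a symbol exported by one of
the object files being linked. Recall that all numbers are hexadecimal, so
2000 here means address $2000.
</para>
<programlisting>
section init load 2000
section code
section *,!bss
section *,bss
exec start
</programlisting>
<para>
Each section is named only once, and the two "*" entries pick up whatever
sections the earlier lines did not match, non-bss sections first.
</para>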
1655
1656
1657
1658 </section>
1659
1660 </chapter>
1661
1662 <chapter>
1663 <title>Libraries and LWAR</title>
1664
1665 <para>
1666 LWTOOLS also includes a tool for managing libraries. These are analogous to
1667 the static libraries created with the "ar" tool on POSIX systems. Each library
1668 file contains one or more object files. The linker will treat the object
1669 files within a library as though they had been specified individually on
1670 the command line except when resolving external references. External references
1671 are looked up first within the object files within the library and then, if
1672 not found, the usual lookup based on the order the files are specified on
1673 the command line occurs.
1674 </para>
1675
1676 <para>
1677 The tool for creating these library files is called LWAR.
1678 </para>
1679
1680 <section>
1681 <title>Command Line Options</title>
1682 <para>
1683 The binary for LWAR is called "lwar". Note that the binary is in lower
1684 case. The options lwar understands are listed below. For archive manipulation
1685 options, the first non-option argument is the name of the archive. All other
1686 non-option arguments are the names of files to operate on.
1687 </para>
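<para>
As a sketch of this argument order, the commands below create an archive,
add another object file to it, and then list its contents, using options
described below. The archive and object file names are hypothetical.
</para>
<programlisting>
lwar --create libutil.a strings.o math.o
lwar --add libutil.a video.o
lwar --list libutil.a
</programlisting>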
1688
1689 <variablelist>
1690 <varlistentry>
1691 <term><option>--add</option></term>
1692 <term><option>-a</option></term>
1693 <listitem>
1694 <para>
1695 This option specifies that an archive is going to have files added to it.
1696 If the archive does not already exist, it is created. New files are added
1697 to the end of the archive.
1698 </para>
1699 </listitem>
1700 </varlistentry>
1701
1702 <varlistentry>
1703 <term><option>--create</option></term>
1704 <term><option>-c</option></term>
1705 <listitem>
1706 <para>
1707 This option specifies that an archive is going to be created and have files
1708 added to it. If the archive already exists, it is truncated.
1709 </para>
1710 </listitem>
1711 </varlistentry>
1712
1713 <varlistentry>
1714 <term><option>--merge</option></term>
1715 <term><option>-m</option></term>
1716 <listitem>
1717 <para>
1718 If specified, any files specified to be added to an archive will be checked
1719 to see if they are archives themselves. If so, their constituent members are
1720 added to the archive. This is useful for avoiding archives containing archives.
1721 </para>
1722 </listitem>
1723 </varlistentry>
1724
1725 <varlistentry>
1726 <term><option>--list</option></term>
1727 <term><option>-l</option></term>
1728 <listitem>
1729 <para>
1730 This will display a list of the files contained in the archive.
1731 </para>
1732 </listitem>
1733 </varlistentry>
1734
1735 <varlistentry>
1736 <term><option>--debug</option></term>
1737 <term><option>-d</option></term>
1738 <listitem>
1739 <para>
1740 This option increases the debugging level. It is only useful for LWTOOLS
1741 developers.
1742 </para>
1743 </listitem>
1744 </varlistentry>
1745
1746 <varlistentry>
1747 <term><option>--help</option></term>
1748 <term><option>-?</option></term>
1749 <listitem>
1750 <para>
1751 This provides a listing of command line options and a brief description
1752 of each.
1753 </para>
1754 </listitem>
1755 </varlistentry>
1756
1757 <varlistentry>
1758 <term><option>--usage</option></term>
1759 <listitem>
1760 <para>
1761 This will display a usage summary.
1763 </para>
1764 </listitem>
1765 </varlistentry>
1766
1767
1768 <varlistentry>
1769 <term><option>--version</option></term>
1770 <term><option>-V</option></term>
1771 <listitem>
1772 <para>
1773 This will display the version of LWAR.
1775 </para>
1776 </listitem>
1777 </varlistentry>
1778 </variablelist>
1779 </section>
1780
1781 </chapter>
1782
1783 <chapter id="objchap">
1784 <title>Object Files</title>
1785 <para>
1786 LWTOOLS uses a proprietary object file format. It is proprietary in the sense
1787 that it is specific to LWTOOLS, not that it is a hidden format. It would be
1788 hard to keep it hidden in an open source tool chain anyway. This chapter
1789 documents the object file format.
1790 </para>
1791
1792 <para>
1793 An object file consists of a series of sections each of which contains a
1794 list of exported symbols, a list of incomplete references, and a list of
1795 "local" symbols which may be used in calculating incomplete references. Each
1796 section also contains its object code, except for BSS sections, which carry none.
1797 </para>
1798
1799 <para>
1800 An exported symbol must be completely resolved to an address within the
1801 section it is exported from. That is, an exported symbol must be a constant
1802 rather than defined in terms of other symbols.</para>
1803
1804 <para>
1805 Each object file starts with a magic number and version number. The magic
1806 number is the string "LWOBJ16" for this 16 bit object file format. The only
1807 defined version number is currently 0. Thus, the first 8 bytes of the object
1808 file are <code>4C574F424A313600</code>.
1809 </para>
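<para>
As a minimal sketch, a reader of the format could verify the header as shown
below. The names <code>MAGIC</code>, <code>VERSION</code> and
<code>check_header</code> are made up for this example and are not part of
LWTOOLS.
</para>

```python
MAGIC = b"LWOBJ16"  # 7 ASCII bytes, as described in the text
VERSION = 0         # the only version number defined so far


def check_header(data: bytes) -> bool:
    """Return True if `data` starts with the 8 byte LWOBJ16 version 0 header."""
    return data[:8] == MAGIC + bytes([VERSION])


header = bytes.fromhex("4C574F424A313600")
print(check_header(header))
```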
1810
1811 <para>
1812 Each section has the following items in order:
1813 </para>
1814
1815 <itemizedlist>
1816 <listitem><para>section name</para></listitem>
1817 <listitem><para>flags</para></listitem>
1818 <listitem><para>list of local symbols (and addresses within the section)</para></listitem>
1819 <listitem><para>list of exported symbols (and addresses within the section)</para></listitem>
1820 <listitem><para>list of incomplete references along with the expressions to calculate them</para></listitem>
1821 <listitem><para>the actual object code (for non-BSS sections)</para></listitem>
1822 </itemizedlist>
1823
1824 <para>
1825 The section starts with the name of the section with a NUL termination
1826 followed by a series of flag bytes terminated by NUL. There are only two
1827 flag bytes defined. A NUL (0) indicates no more flags and a value of 1
1828 indicates the section is a BSS section. For a BSS section, no actual
1829 code is included in the object file.
1830 </para>
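<para>
Reading the start of a section could look roughly like the following sketch.
The function names are hypothetical, and decoding names as ASCII is an
assumption; the format description does not state a character encoding.
</para>

```python
def read_cstring(data, pos):
    """Read a NUL terminated name starting at `pos`.

    Returns (name, position just past the NUL). ASCII is assumed here.
    """
    end = data.index(0, pos)
    return data[pos:end].decode("ascii"), end + 1


def read_section_header(data, pos):
    """Read a section name and its flag bytes.

    Returns (name, is_bss, position just past the flag terminator).
    """
    name, pos = read_cstring(data, pos)
    is_bss = False
    while data[pos] != 0:      # a NUL byte means "no more flags"
        if data[pos] == 1:     # flag value 1 marks a BSS section
            is_bss = True
        pos += 1
    return name, is_bss, pos + 1


print(read_section_header(b"bss\x00\x01\x00", 0))
```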
1831
1832 <para>
1833 Either an empty section name (a lone NUL byte) or end of file indicates that
1834 there are no more sections.
1835 </para>
1836
1837 <para>
1838 Each entry in the exported and local symbols table consists of the symbol
1839 (NUL terminated) followed by two bytes which contain the value in big endian
1840 order. The end of a symbol table is indicated by an empty symbol name (a lone NUL byte).
1841 </para>
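<para>
The symbol table layout just described can be read with a loop like the one
sketched below. <code>read_symbol_table</code> is a hypothetical helper, and
the ASCII decoding of symbol names is an assumption.
</para>

```python
def read_symbol_table(data, pos):
    """Read symbol entries until an empty symbol name is found.

    Returns ({name: value}, position just past the terminating NUL).
    """
    symbols = {}
    while True:
        end = data.index(0, pos)
        name = data[pos:end].decode("ascii")  # encoding is an assumption
        pos = end + 1
        if name == "":
            return symbols, pos               # empty name ends the table
        # The value is stored in two bytes, big endian.
        symbols[name] = int.from_bytes(data[pos:pos + 2], "big")
        pos += 2


table = b"start\x00\x10\x00" + b"loop\x00\x00\x2a" + b"\x00"
print(read_symbol_table(table, 0))
```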
1842
1843 <para>
1844 Each entry in the incomplete references table consists of an expression
1845 followed by a 16 bit offset where the reference goes. Expressions are
1846 defined as a series of terms up to an "end of expression" term. Each term
1847 consists of a single byte which identifies the type of term (see below)
1848 followed by any data required by the term. The end of the list is flagged
1849 by an empty expression (one consisting only of an end of expression term).
1850 </para>
1851
1852 <table frame="all"><title>Object File Term Types</title>
1853 <tgroup cols="2">
1854 <thead>
1855 <row>
1856 <entry>TERMTYPE</entry>
1857 <entry>Meaning</entry>
1858 </row>
1859 </thead>
1860 <tbody>
1861 <row>
1862 <entry>00</entry>
1863 <entry>end of expression</entry>
1864 </row>
1865
1866 <row>
1867 <entry>01</entry>
1868 <entry>integer (16 bit in big endian order follows)</entry>
1869 </row>
1870 <row>
1871 <entry>02</entry>
1872 <entry>external symbol reference (NUL terminated symbol name follows)</entry>
1873 </row>
1874
1875 <row>
1876 <entry>03</entry>
1877 <entry>local symbol reference (NUL terminated symbol name follows)</entry>
1878 </row>
1879
1880 <row>
1881 <entry>04</entry>
1882 <entry>operator (1 byte operator number)</entry>
1883 </row>
1884 <row>
1885 <entry>05</entry>
1886 <entry>section base address reference</entry>
1887 </row>
1888
1889 <row>
1890 <entry>FF</entry>
1891 <entry>This term will set flags for the expression. Each one of these terms will set a single flag. All of them should be specified first in an expression. If they are not, the behaviour is undefined. The byte following is the flag. There is currently only one flag defined. Flag 01 indicates an 8 bit relocation.</entry>
1892 </row>
1893 </tbody>
1894 </tgroup>
1895 </table>
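<para>
The term types above suggest a decoder along the following lines. This is a
sketch under stated assumptions rather than the LWLINK reader: the function
names and tuple labels are invented for the example, term type 05 is assumed
to carry no data because the table lists none, and no offset is assumed to
follow the empty expression that ends the list.
</para>

```python
def read_expression(data, pos):
    """Read terms up to and including the end of expression term."""
    terms = []
    while True:
        kind = data[pos]
        pos += 1
        if kind == 0x00:                      # end of expression
            return terms, pos
        if kind == 0x01:                      # 16 bit integer, big endian
            terms.append(("int", int.from_bytes(data[pos:pos + 2], "big")))
            pos += 2
        elif kind in (0x02, 0x03):            # external / local symbol name
            end = data.index(0, pos)
            label = "external" if kind == 0x02 else "local"
            terms.append((label, data[pos:end].decode("ascii")))
            pos = end + 1
        elif kind == 0x04:                    # operator number, 1 byte
            terms.append(("op", data[pos]))
            pos += 1
        elif kind == 0x05:                    # section base (assumed: no data)
            terms.append(("section_base", None))
        elif kind == 0xFF:                    # expression flag, 1 byte
            terms.append(("flag", data[pos]))
            pos += 1
        else:
            raise ValueError("unknown term type %02X" % kind)


def read_references(data, pos):
    """Read (terms, offset) entries until an empty expression is found."""
    refs = []
    while True:
        terms, pos = read_expression(data, pos)
        if not terms:                         # empty expression ends the list
            return refs, pos
        refs.append((terms, int.from_bytes(data[pos:pos + 2], "big")))
        pos += 2
```

<para>
For example, the bytes <code>FF 01 02 "ext" 00 01 0004 04 01 00 0010 00</code>
would decode to a single 8 bit reference to <code>ext + 4</code> stored at
offset 16 in the section.
</para>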
1896
1897
1898 <para>
1899 External references are resolved using other object files while local
1900 references are resolved using the local symbol table(s) from this file. This
1901 allows local symbols that are not exported to have the same names as
1902 exported symbols or external references.
1903 </para>
1904
1905 <table frame="all"><title>Object File Operator Numbers</title>
1906 <tgroup cols="2">
1907 <thead>
1908 <row>
1909 <entry>Number</entry>
1910 <entry>Operator</entry>
1911 </row>
1912 </thead>
1913 <tbody>
1914 <row>
1915 <entry>01</entry>
1916 <entry>addition (+)</entry>
1917 </row>
1918 <row>
1919 <entry>02</entry>
1920 <entry>subtraction (-)</entry>
1921 </row>
1922 <row>
1923 <entry>03</entry>
1924 <entry>multiplication (*)</entry>
1925 </row>
1926 <row>
1927 <entry>04</entry>
1928 <entry>division (/)</entry>
1929 </row>
1930 <row>
1931 <entry>05</entry>
1932 <entry>modulus (%)</entry>
1933 </row>
1934 <row>
1935 <entry>06</entry>
1936 <entry>integer division (\) (same as division)</entry>
1937 </row>
1938
1939 <row>
1940 <entry>07</entry>
1941 <entry>bitwise and</entry>
1942 </row>
1943
1944 <row>
1945 <entry>08</entry>
1946 <entry>bitwise or</entry>
1947 </row>
1948
1949 <row>
1950 <entry>09</entry>
1951 <entry>bitwise xor</entry>
1952 </row>
1953
1954 <row>
1955 <entry>0A</entry>
1956 <entry>boolean and</entry>
1957 </row>
1958
1959 <row>
1960 <entry>0B</entry>
1961 <entry>boolean or</entry>
1962 </row>
1963
1964 <row>
1965 <entry>0C</entry>
1966 <entry>unary negation, 2's complement (-)</entry>
1967 </row>
1968
1969 <row>
1970 <entry>0D</entry>
1971 <entry>unary 1's complement (^)</entry>
1972 </row>
1973 </tbody>
1974 </tgroup>
1975 </table>
1976
1977 <para>
1978 An expression is represented in a postfix manner with both operands for
1979 binary operators preceding the operator and the single operand for unary
1980 operators preceding the operator.
1981 </para>
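<para>
A postfix expression of this kind is naturally evaluated with a stack. The
sketch below illustrates the idea using the operator numbers from the table
above. It is not the LWLINK evaluator: the term tuples, the
<code>evaluate</code> function, the wrapping of results to 16 bits, and the
use of 1 and 0 as boolean results are all assumptions made for illustration.
</para>

```python
BINARY = {
    0x01: lambda a, b: a + b,
    0x02: lambda a, b: a - b,
    0x03: lambda a, b: a * b,
    0x04: lambda a, b: a // b,
    0x05: lambda a, b: a % b,
    0x06: lambda a, b: a // b,                       # same as division
    0x07: lambda a, b: a & b,
    0x08: lambda a, b: a | b,
    0x09: lambda a, b: a ^ b,
    0x0A: lambda a, b: int(bool(a) and bool(b)),
    0x0B: lambda a, b: int(bool(a) or bool(b)),
}
UNARY = {
    0x0C: lambda a: -a,                              # 2's complement
    0x0D: lambda a: ~a,                              # 1's complement
}


def evaluate(terms, symbols):
    """Evaluate postfix terms; `symbols` maps symbol names to values."""
    stack = []
    for kind, value in terms:
        if kind == "int":
            stack.append(value)
        elif kind == "sym":
            stack.append(symbols[value])
        elif kind == "op" and value in UNARY:
            stack.append(UNARY[value](stack.pop()) & 0xFFFF)
        elif kind == "op" and value in BINARY:
            right = stack.pop()                      # second operand is on top
            left = stack.pop()
            stack.append(BINARY[value](left, right) & 0xFFFF)
        else:
            raise ValueError("unsupported term: %r" % ((kind, value),))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(hex(evaluate([("sym", "table"), ("int", 4), ("op", 0x01)], {"table": 0x2000})))
```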
1982
1983 </chapter>
1984 </book>
1985