comparison docs/manual.docbook.sgml @ 0:2c24602be78f

Initial import from lwtools 3.0.1 version, with new hand built build system and file reorganization
author lost@l-w.ca
date Wed, 19 Jan 2011 22:27:17 -0700
parents
children fd1ecc5d6e69
-1:000000000000 0:2c24602be78f
1 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.5//EN">
2 <book>
3 <bookinfo>
4 <title>LW Tool Chain</title>
5 <author><firstname>William</firstname><surname>Astle</surname></author>
6 <copyright><year>2009, 2010</year><holder>William Astle</holder></copyright>
7 </bookinfo>
8 <chapter>
9
10 <title>Introduction</title>
11
12 <para>
13 The LW tool chain provides utilities for building binaries for MC6809 and
14 HD6309 CPUs. The tool chain includes a cross-assembler and a cross-linker
15 which support several styles of output.
16 </para>
17
18 <section>
19 <title>History</title>
20 <para>
21 For a long time, I have had an interest in creating an operating system for
22 the Coco3. I finally started working on that project around the beginning of
23 2006. I had a number of assemblers I could choose from. Eventually, I settled
24 on one and started tinkering. After a while, I realized that assembler was not
25 going to be sufficient due to lack of macros and issues with forward references.
26 Then I tried another which handled forward references correctly but still did
27 not support macros. I looked around at other assemblers and they all lacked
28 one feature or another that I really wanted for creating my operating system.
29 </para>
30
31 <para>
32 The solution seemed clear at that point. I am a fair programmer so I figured
33 I could write an assembler that would do everything I wanted an assembler to
34 do. Thus the LWASM project was born. After more than two years of on and off
35 work, version 1.0 of LWASM was released in October of 2008.
36 </para>
37
38 <para>
39 As the aforementioned operating system project progressed further, it became
40 clear that while assembling the whole project through a single file was doable,
41 it was not practical. When I found myself playing some fancy games with macros
42 in a bid to simulate sections, I realized I needed a means of assembling
43 source files separately and linking them later. This spawned a major development
44 effort to add object file support to LWASM. It also spawned the LWLINK
45 project to provide a means to actually link the files.
46 </para>
47
48 </section>
49
50 </chapter>
51
52 <chapter>
53 <title>Output Formats</title>
54
55 <para>
56 The LW tool chain supports multiple output formats. Each format has its
57 advantages and disadvantages. Each format is described below.
58 </para>
59
60 <section>
61 <title>Raw Binaries</title>
62 <para>
63 A raw binary is simply a string of bytes. There are no headers or other
64 niceties. Both LWLINK and LWASM support generating raw binaries. ORG directives
65 in the source code only serve to set the addresses that will be used for
66 symbols but otherwise have no direct impact on the resulting binary.
67 </para>
68
69 </section>
70 <section>
71 <title>DECB Binaries</title>
72
73 <para>A DECB binary is compatible with the LOADM command in Disk Extended
74 Color Basic on the CoCo. They are also compatible with CLOADM from Extended
75 Color Basic. These binaries include the load address of the binary as well
76 as encoding an execution address. These binaries may contain multiple loadable
77 sections, each of which has its own load address.</para>
78
79 <para>
80 Each binary starts with a preamble. Each preamble is five bytes long. The
81 first byte is zero. The next two bytes specify the number of bytes to load
82 and the last two bytes specify the address to load the bytes at. Then, a
83 string of bytes follows. After this string of bytes, there may be another
84 preamble or a postamble. A postamble is also five bytes in length. The first
85 byte of the postamble is $FF, the next two are zero, and the last two are
86 the execution address for the binary.
87 </para>
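<para>
The preamble and postamble layout described above can be sketched in a few
lines of Python. This is only an illustration, not code from LWTOOLS: the
function names are hypothetical, and the sixteen bit fields are assumed to be
stored high byte first, as is native to the 6809, since the text above does
not state the byte order.
</para>

```python
import struct


def build_decb(segments, exec_addr):
    """Build a DECB image from (load_address, data) pairs and an execution address."""
    out = bytearray()
    for load_addr, data in segments:
        # Preamble: $00, 16-bit byte count, 16-bit load address (big-endian assumed).
        out += struct.pack(">BHH", 0x00, len(data), load_addr)
        out += data
    # Postamble: $FF, two zero bytes, 16-bit execution address.
    out += struct.pack(">BHH", 0xFF, 0x0000, exec_addr)
    return bytes(out)


def parse_decb(image):
    """Split a DECB image back into its loadable segments and execution address."""
    segments = []
    pos = 0
    while True:
        if pos + 5 > len(image):
            raise ValueError("truncated DECB image")
        marker, length, addr = struct.unpack_from(">BHH", image, pos)
        pos += 5
        if marker == 0xFF:
            # Postamble reached: addr is the execution address.
            return segments, addr
        if marker != 0x00:
            raise ValueError("unexpected marker byte in DECB image")
        if pos + length > len(image):
            raise ValueError("truncated DECB segment")
        segments.append((addr, image[pos:pos + length]))
        pos += length
```

<para>
Building an image with one two byte segment at $0E00 and parsing it back
returns the same segment list and execution address.
</para>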
88
89 <para>
90 Both LWASM and LWLINK can output this format.
91 </para>
92 </section>
93
94 <section>
95 <title>OS9 Modules</title>
96 <para>
97
98 Since version 2.5, LWASM is able to generate OS9 modules. The syntax is
99 basically the same as for other assemblers. A module starts with the MOD
100 directive and ends with the EMOD directive. The OS9 directive is provided
101 as a shortcut for writing system calls.
102
103 </para>
104
105 <para>
106
107 LWASM does NOT provide an OS9Defs file. You must provide your own. Also note
108 that the common practice of using "ifp1" around the inclusion of the OS9Defs
109 file is discouraged as it is pointless and can lead to unintentional
110 problems and phasing errors. Because LWASM reads each file exactly once,
111 there is no benefit to restricting the inclusion to the first assembly pass.
112
113 </para>
114
115 <para>
116
117 It is also critical to understand that unlike many OS9 assemblers, LWASM
118 does NOT maintain a separate data address counter. Thus, you must define
119 all your data offsets and so on outside of the mod/emod segment. It is,
120 therefore, likely that source code targeted at other assemblers will require
121 edits to build correctly.
122
123 </para>
124
125 <para>
126
127 LWLINK does not, yet, have the ability to create OS9 modules from object
128 files.
129
130 </para>
131 </section>
132
133 <section>
134 <title>Object Files</title>
135 <para>LWASM supports generating a proprietary object file format which is
136 described in <xref linkend="objchap">. LWLINK is then used to link these
137 object files into a final binary in any of LWLINK's supported binary
138 formats.</para>
139
140 <para>Object files also support the concept of sections which are not valid
141 for other output types. This allows related code from each object file
142 linked to be collapsed together in the final binary.</para>
143
144 <para>
145 Object files are very flexible in that they allow references that are not
146 known at assembly time to be resolved at link time. However, because the
147 addresses of such references are not known at assembly time, there is no way
148 for the assembler to deduce that an eight bit addressing mode is possible.
149 That means the assembler will default to using sixteen bit addressing
150 whenever an external or cross-section reference is used.
151 </para>
152
153 <para>
154 As of LWASM 2.4, it is possible to force direct page addressing for an
155 external reference. Care must be taken to ensure the resulting addresses
156 are really in the direct page since the linker does not know what the direct
157 page is supposed to be and does not emit errors for byte overflows.
158 </para>
159
160 <para>
161 It is also possible to use external references in an eight bit immediate
162 mode instruction. In this case, only the low order eight bits will be used.
163 Again, no byte overflows will be flagged.
164 </para>
165
166
167 </section>
168
169 </chapter>
170
171 <chapter>
172 <title>LWASM</title>
173 <para>
174 The LWTOOLS assembler is called LWASM. This chapter documents the various
175 features of the assembler. It is not, however, a tutorial on 6x09 assembly
176 language programming.
177 </para>
178
179 <section>
180 <title>Command Line Options</title>
181 <para>
182 The binary for LWASM is called "lwasm". Note that the binary is in lower
183 case. lwasm takes the following command line arguments.
184 </para>
185
186 <variablelist>
187
188 <varlistentry>
189 <term><option>--6309</option></term>
190 <term><option>-3</option></term>
191 <listitem>
192 <para>
193 This will cause the assembler to accept the additional instructions available
194 on the 6309 processor. This is the default mode; this option is provided for
195 completeness and to override preset command arguments.
196 </para>
197 </listitem>
198 </varlistentry>
199
200 <varlistentry>
201 <term><option>--6809</option></term>
202 <term><option>-9</option></term>
203 <listitem>
204 <para>
205 This will cause the assembler to reject instructions that are only available
206 on the 6309 processor.
207 </para>
208 </listitem>
209 </varlistentry>
210
211 <varlistentry>
212 <term><option>--decb</option></term>
213 <term><option>-b</option></term>
214 <listitem>
215 <para>
216 Select the DECB output format target. Equivalent to <option>--format=decb</option>.
217 </para>
218 <para>While this is the default output format currently, it is not safe to rely
219 on that fact. Future versions may have different defaults. It is also trivial
220 to modify the source code to change the default. Thus, it is recommended to specify
221 this option if you need DECB output.</para>
222 </listitem>
223 </varlistentry>
224
225 <varlistentry>
226 <term><option>--format=type</option></term>
227 <term><option>-f type</option></term>
228 <listitem>
229 <para>
230 Select the output format. Valid values are <option>obj</option> for the
231 object file target, <option>decb</option> for the DECB LOADM format,
232 <option>os9</option> for creating OS9 modules, and <option>raw</option> for
233 a raw binary.
234 </para>
235 </listitem>
236 </varlistentry>
237
238 <varlistentry>
239 <term><option>--list[=file]</option></term>
240 <term><option>-l[file]</option></term>
241 <listitem>
242 <para>
243 Cause LWASM to generate a listing. If <option>file</option> is specified,
244 the listing will go to that file. Otherwise it will go to the standard output
245 stream. By default, no listing is generated. Unless <option>--symbols</option>
246 is specified, the list will not include the symbol table.
247 </para>
248 </listitem>
249 </varlistentry>
250
251 <varlistentry>
252 <term><option>--symbols</option></term>
253 <term><option>-s</option></term>
254 <listitem>
255 <para>
256 Causes LWASM to generate a list of symbols when generating a listing.
257 It has no effect unless a listing is being generated.
258 </para>
259 </listitem>
260 </varlistentry>
261
262 <varlistentry>
263 <term><option>--obj</option></term>
264 <listitem>
265 <para>
266 Select the proprietary object file format as the output target.
267 </para>
268 </listitem>
269 </varlistentry>
270
271 <varlistentry>
272 <term><option>--output=FILE</option></term>
273 <term><option>-o FILE</option></term>
274 <listitem>
275 <para>
276 This option specifies the name of the output file. If not specified, the
277 default is <option>a.out</option>.
278 </para>
279 </listitem>
280 </varlistentry>
281
282 <varlistentry>
283 <term><option>--pragma=pragma</option></term>
284 <term><option>-p pragma</option></term>
285 <listitem>
286 <para>
287 Specify assembler pragmas. Multiple pragmas are separated by commas. The
288 pragmas accepted are the same as for the PRAGMA assembler directive described
289 below.
290 </para>
291 </listitem>
292 </varlistentry>
293
294 <varlistentry>
295 <term><option>--raw</option></term>
296 <term><option>-r</option></term>
297 <listitem>
298 <para>
299 Select raw binary as the output target.
300 </para>
301 </listitem>
302 </varlistentry>
303
304 <varlistentry>
305 <term><option>--includedir=path</option></term>
306 <term><option>-I path</option></term>
307 <listitem>
308 <para>
309 Add <option>path</option> to the end of the include path.
310 </para>
311 </listitem>
312 </varlistentry>
313
314 <varlistentry>
315 <term><option>--help</option></term>
316 <term><option>-?</option></term>
317 <listitem>
318 <para>
319 Present a help screen describing the command line options.
320 </para>
321 </listitem>
322 </varlistentry>
323
324 <varlistentry>
325 <term><option>--usage</option></term>
326 <listitem>
327 <para>
328 Provide a summary of the command line options.
329 </para>
330 </listitem>
331 </varlistentry>
332
333 <varlistentry>
334 <term><option>--version</option></term>
335 <term><option>-V</option></term>
336 <listitem>
337 <para>
338 Display the software version.
339 </para>
340 </listitem>
341 </varlistentry>
342
343 <varlistentry>
344 <term><option>--debug</option></term>
345 <term><option>-d</option></term>
346 <listitem>
347 <para>
348 Increase the debugging level. Only really useful to people hacking on the
349 LWASM source code itself.
350 </para>
351 </listitem>
352 </varlistentry>
353
354 </variablelist>
355
356 </section>
357
358 <section>
359 <title>Dialects</title>
360 <para>
361 LWASM supports all documented MC6809 instructions as defined by Motorola.
362 It also supports all known HD6309 instructions. While there is general
363 agreement on the mnemonics for most of the 6309 instructions, there is some
364 variance with the block transfer instructions. TFM for all four variations
365 seems to have gained the most traction and, thus, this is the form that is
366 recommended for LWASM. However, it also supports COPY, COPY-, IMP, EXP,
367 TFRP, TFRM, TFRS, and TFRR. It further adds COPY+ as a synonym for COPY,
368 IMPLODE for IMP, and EXPAND for EXP.
369 </para>
370
371 <para>By default, LWASM accepts 6309 instructions. However, using the
372 <parameter>--6809</parameter> parameter, you can cause it to throw errors on
373 6309 instructions instead.</para>
374
375 <para>
376 The standard addressing mode specifiers are supported. These are the
377 hash sign ("#") for immediate mode, the less than sign ("&lt;") for forced
378 eight bit modes, and the greater than sign ("&gt;") for forced sixteen bit modes.
379 </para>
380
381 <para>
382 Additionally, LWASM supports using the asterisk ("*") to indicate
383 base page addressing. This should not be used in hand-written source code,
384 however, because it is non-standard and may or may not be present in future
385 versions of LWASM.
386 </para>
387
388 </section>
389
390 <section>
391 <title>Source Format</title>
392
393 <para>
394 LWASM accepts plain text files in a relatively free form. It can handle
395 lines terminated with CR, LF, CRLF, or LFCR which means it should be able
396 to assemble files on any platform on which it compiles.
397 </para>
398 <para>
399 Each line may start with a symbol. If a symbol is present, there must not
400 be any whitespace preceding it. It is legal for a line to contain nothing
401 but a symbol.</para>
402 <para>
403 The op code is separated from the symbol by whitespace. If there is
404 no symbol, there must be at least one white space character preceding it.
405 If applicable, the operand follows separated by whitespace. Following the
406 opcode and operand is an optional comment.
407 </para>
408
409 <para> It is important to note that operands cannot contain any whitespace
410 except in the case of delimited strings. This is because the first
411 whitespace character will be interpreted as the separator between the
412 operand column and the comment. This behaviour is required for approximate
413 source compatibility with other 6x09 assemblers. </para>
414
415 <para>
416 A comment can also be introduced with a * or a ;. The comment character is
417 optional for end of statement comments. However, if a symbol is the only
418 thing present on the line other than the comment, the comment character is
419 mandatory to prevent the assembler from interpreting the comment as an opcode.
420 </para>
421
422 <para>
423 For compatibility with the output generated by some C preprocessors, LWASM
424 will also ignore lines that begin with a #. This should not be used as a general
425 comment character, however.
426 </para>
427
428 <para>
429 The opcode is not treated case sensitively. Neither are register names in
430 the operand fields. Symbols, however, are case sensitive.
431 </para>
432
433 <para> As of version 2.6, LWASM supports files with line numbers. If line
434 numbers are present, the line must start with a digit. The line number
435 itself must consist only of digits. The line number must then be followed
436 by either the end of the line or exactly one white space character. After
437 that white space character, the lines are interpreted exactly as above.
438 </para>
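<para>
The line number rule above can be illustrated with a short Python sketch.
The function name is hypothetical and the code is not taken from LWASM; it
assumes the line has already had its line terminator removed and treats a
space or a tab as the single white space character.
</para>

```python
import re

# Digits only, then either end of line or exactly one space/tab and the rest.
_NUMBERED = re.compile(r"([0-9]+)(?:[ \t](.*))?\Z", re.DOTALL)


def strip_line_number(line):
    """Return (line_number, rest); line_number is None for unnumbered lines."""
    if re.match(r"[0-9]", line) is None:
        # Lines that do not start with a digit carry no line number.
        return None, line
    m = _NUMBERED.match(line)
    if m is None:
        raise ValueError("line number must be followed by one white space or end of line")
    # Anything after the single separator is interpreted as an ordinary line,
    # so further leading white space still means "no symbol on this line".
    return int(m.group(1)), m.group(2) or ""
```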
439
440 </section>
441
442 <section>
443 <title>Symbols</title>
444
445 <para>
446 Symbols have no length restriction. They may contain letters, numbers, dots,
447 dollar signs, and underscores. They must start with a letter, dot, or
448 underscore.
449 </para>
450
451 <para>
452 LWASM also supports the concept of a local symbol. A local symbol is one
453 which contains either a "?" or a "@", which can appear anywhere in the symbol.
454 The scope of a local symbol is determined by a number of factors. First,
455 each included file gets its own local symbol scope. A blank line will also
456 be considered a local scope barrier. Macros each have their own local symbol
457 scope as well (which has a side effect that you cannot use a local symbol
458 as an argument to a macro). There are other factors as well. In general,
459 a local symbol is restricted to the block of code it is defined within.
460 </para>
461
462 <para>
463 By default, unless assembling to the os9 target, a "$" in the symbol will
464 also make it local. This can be controlled by the "dollarlocal" and
465 "nodollarlocal" pragmas. In the absence of a pragma to the contrary, for
466 the os9 target, a "$" in the symbol will not make it considered local while
467 for all other targets it will.
468 </para>
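<para>
As a rough illustration of the rules above, the following Python sketch
decides whether a symbol name would be treated as local. The function and
its parameters are hypothetical and only model the text of this section:
<parameter>target</parameter> stands for the output format and
<parameter>dollarlocal</parameter> for an explicit "dollarlocal" or
"nodollarlocal" pragma, with None meaning that no pragma was given.
</para>

```python
def is_local_symbol(name, target="decb", dollarlocal=None):
    """Return True if `name` would be a local symbol under the rules above."""
    # A "?" or "@" anywhere in the name always makes the symbol local.
    if "?" in name or "@" in name:
        return True
    # Without a pragma, "$" makes a symbol local for every target except os9.
    if dollarlocal is None:
        dollarlocal = target != "os9"
    return dollarlocal and "$" in name
```

<para>
For example, a name such as buf$1 is local for the DECB target but global
for the os9 target unless the "dollarlocal" pragma is in effect.
</para>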
469
470 </section>
471
472 <section>
473 <title>Numbers and Expressions</title>
474 <para>
475
476 Numbers can be expressed in binary, octal, decimal, or hexadecimal. Binary
477 numbers may be prefixed with a "%" symbol or suffixed with a "b" or "B".
478 Octal numbers may be prefixed with "@" or suffixed with "Q", "q", "O", or
479 "o". Hexadecimal numbers may be prefixed with "$", "0x" or "0X", or suffixed
480 with "H". No prefix or suffix is required for decimal numbers but they can
481 be prefixed with "&amp;" if desired. Any constant which begins with a letter
482 must be expressed with the correct prefix base identifier or be prefixed
483 with a 0. Thus hexadecimal FF would have to be written either 0FFH or $FF.
484 Numbers are not case sensitive.
485
486 </para>
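<para>
The prefix and suffix rules above can be sketched in Python as follows. This
is not the LWASM parser. The function name is hypothetical, the order in
which the forms are tested is an assumption (prefixes first, then the "H"
suffix before "B"), and Python's own integer conversion is somewhat more
lenient about its input than an assembler would be.
</para>

```python
def parse_number(text):
    """Convert one numeric constant, written as described above, to an int."""
    s = text.lower()  # numbers are not case sensitive
    # Prefix forms.
    if s.startswith("%"):
        return int(s[1:], 2)
    if s.startswith("@"):
        return int(s[1:], 8)
    if s.startswith("$"):
        return int(s[1:], 16)
    if s.startswith("0x"):
        return int(s[2:], 16)
    if s.startswith("&"):
        return int(s[1:], 10)
    # Suffix forms; "h" is tested before "b" because "b" is also a hex digit.
    if s.endswith("h"):
        return int(s[:-1], 16)
    if s.endswith(("q", "o")):
        return int(s[:-1], 8)
    if s.endswith("b"):
        return int(s[:-1], 2)
    # No prefix or suffix: plain decimal.
    return int(s, 10)
```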
487
488 <para> A symbol may appear at any point where a number is acceptable. The
489 special symbol "*" can be used to represent the starting address of the
490 current source line within expressions. </para>
491
492 <para>The ASCII value of a character can be included by prefixing it with a
493 single quote ('). The ASCII values of two characters can be included by
494 prefixing the characters with a double quote (").</para>
495
496 <para>
497
498 LWASM supports the following basic binary operators: +, -, *, /, and %.
499 These represent addition, subtraction, multiplication, division, and
500 modulus. It also supports unary negation and unary 1's complement (- and ^
501 respectively). It is also possible to use ~ for the unary 1's complement
502 operator. For completeness, a unary positive (+) is supported though it is
503 a no-op. LWASM also supports using |, &amp;, and ^ for bitwise or, bitwise and,
504 and bitwise exclusive or respectively.
505
506 </para>
507
508 <para>
509
510 Operator precedence follows the usual rules. Multiplication, division, and
511 modulus take precedence over addition and subtraction. Unary operators take
512 precedence over binary operators. Bitwise operators are lower precedence
513 than addition and subtraction. To force a specific order of evaluation,
514 parentheses can be used in the usual manner.
515
516 </para>
517
518 <para>
519
520 As of LWASM 2.5, the operators &amp;&amp; and || are recognized for boolean and and
521 boolean or respectively. They will return either 0 or 1 (false or true).
522 They have the lowest precedence of all the binary operators.
523
524 </para>
525
526 </section>
527
528 <section>
529 <title>Assembler Directives</title>
530 <para>
531 Various directives can be used to control the behaviour of the
532 assembler or to include non-code/data in the resulting output. Those directives
533 that are not described in detail in other sections of this document are
534 described below.
535 </para>
536
537 <section>
538 <title>Data Directives</title>
539 <variablelist>
540 <varlistentry><term>FCB <parameter>expr[,...]</parameter></term>
541 <term>.DB <parameter>expr[,...]</parameter></term>
542 <term>.BYTE <parameter>expr[,...]</parameter></term>
543 <listitem>
544 <para>Include one or more constant bytes (separated by commas) in the output.</para>
545 </listitem>
546 </varlistentry>
547
548 <varlistentry>
549 <term>FDB <parameter>expr[,...]</parameter></term>
550 <term>.DW <parameter>expr[,...]</parameter></term>
551 <term>.WORD <parameter>expr[,...]</parameter></term>
552 <listitem>
553 <para>Include one or more words (separated by commas) in the output.</para>
554 </listitem>
555 </varlistentry>
556
557 <varlistentry>
558 <term>FQB <parameter>expr[,...]</parameter></term>
559 <term>.QUAD <parameter>expr[,...]</parameter></term>
560 <term>.4BYTE <parameter>expr[,...]</parameter></term>
561 <listitem>
562 <para>Include one or more double words (separated by commas) in the output.</para>
563 </listitem>
564 </varlistentry>
565
566 <varlistentry>
567 <term>FCC <parameter>string</parameter></term>
568 <term>.ASCII <parameter>string</parameter></term>
569 <term>.STR <parameter>string</parameter></term>
570 <listitem>
571 <para>
572 Include a string of text in the output. The first character of the operand
573 is the delimiter which must appear as the last character and cannot appear
574 within the string. The string is included with no modifications.
575 </para>
576 </listitem>
577 </varlistentry>
578
579 <varlistentry>
580 <term>FCN <parameter>string</parameter></term>
581 <term>.ASCIZ <parameter>string</parameter></term>
582 <term>.STRZ <parameter>string</parameter></term>
583 <listitem>
584 <para>
585 Include a NUL terminated string of text in the output. The first character of
586 the operand is the delimiter which must appear as the last character and
587 cannot appear within the string. A NUL byte is automatically appended to
588 the string.
589 </para>
590 </listitem>
591 </varlistentry>
592
593 <varlistentry>
594 <term>FCS <parameter>string</parameter></term>
595 <term>.ASCIS <parameter>string</parameter></term>
596 <term>.STRS <parameter>string</parameter></term>
597 <listitem>
598 <para>
599 Include a string of text in the output with bit 7 of the final byte set. The
600 first character of the operand is the delimiter which must appear as the last
601 character and cannot appear within the string.
602 </para>
603 </listitem>
604 </varlistentry>
605
606 <varlistentry><term>ZMB <parameter>expr</parameter></term>
607 <listitem>
608 <para>
609 Include a number of NUL bytes in the output. The number must be fully resolvable
610 during pass 1 of assembly so no forward or external references are permitted.
611 </para>
612 </listitem>
613 </varlistentry>
614
615 <varlistentry><term>ZMD <parameter>expr</parameter></term>
616 <listitem>
617 <para>
618 Include a number of zero words in the output. The number must be fully
619 resolvable during pass 1 of assembly so no forward or external references are
620 permitted.
621 </para>
622 </listitem>
623 </varlistentry>
624
625 <varlistentry><term>ZMQ <parameter>expr</parameter></term>
626 <listitem>
627 <para>
628 Include a number of zero double-words in the output. The number must be fully
629 resolvable during pass 1 of assembly so no forward or external references are
630 permitted.
631 </para>
632 </listitem>
633 </varlistentry>
634
635 <varlistentry>
636 <term>RMB <parameter>expr</parameter></term>
637 <term>.BLKB <parameter>expr</parameter></term>
638 <term>.DS <parameter>expr</parameter></term>
639 <term>.RS <parameter>expr</parameter></term>
640 <listitem>
641 <para>
642 Reserve a number of bytes in the output. The number must be fully resolvable
643 during pass 1 of assembly so no forward or external references are permitted.
644 The value of the bytes is undefined.
645 </para>
646 </listitem>
647 </varlistentry>
648
649 <varlistentry><term>RMD <parameter>expr</parameter></term>
650 <listitem>
651 <para>
652 Reserve a number of words in the output. The number must be fully
653 resolvable during pass 1 of assembly so no forward or external references are
654 permitted. The value of the words is undefined.
655 </para>
656 </listitem>
657 </varlistentry>
658
659 <varlistentry><term>RMQ <parameter>expr</parameter></term>
660 <listitem>
661 <para>
662 Reserve a number of double-words in the output. The number must be fully
663 resolvable during pass 1 of assembly so no forward or external references are
664 permitted. The value of the double-words is undefined.
665 </para>
666 </listitem>
667 </varlistentry>
668
669 <varlistentry>
670 <term>INCLUDEBIN <parameter>filename</parameter></term>
671 <listitem>
672 <para>
673 Treat the contents of <parameter>filename</parameter> as a string of bytes to
674 be included literally at the current assembly point. This has the same effect
675 as converting the file contents to a series of FCB statements and including
676 those at the current assembly point.
677 </para>
678
679 <para> If <parameter>filename</parameter> begins with a /, the file name
680 will be taken as absolute. Otherwise, the current directory will be
681 searched followed by the search path in the order specified.</para>
682
683 <para> Please note that absolute path detection including drive letters will
684 not function correctly on Windows platforms. Non-absolute inclusion will
685 work, however.</para>
686
687 </listitem>
688 </varlistentry>
689
690 </variablelist>
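<para>
The difference between the three string directives (FCC, FCN, and FCS) can
be shown with a small Python sketch. It is an illustration of the
descriptions above rather than LWASM's implementation; the function name is
hypothetical and the text is assumed to be plain ASCII.
</para>

```python
def encode_string(directive, operand):
    """Return the bytes an FCC, FCN, or FCS operand would contribute."""
    # The first character is the delimiter and must also be the last one.
    if len(operand) < 2 or operand[-1] != operand[0]:
        raise ValueError("operand must begin and end with the same delimiter")
    delimiter, text = operand[0], operand[1:-1]
    if delimiter in text:
        raise ValueError("delimiter cannot appear within the string")
    data = bytearray(text.encode("ascii"))
    kind = directive.upper()
    if kind == "FCN":
        data.append(0x00)  # NUL terminator appended automatically
    elif kind == "FCS":
        if data:
            data[-1] |= 0x80  # set bit 7 of the final byte
    elif kind != "FCC":
        raise ValueError("unknown string directive")
    return bytes(data)
```

<para>
With the operand /HI/, FCC yields the two bytes $48 $49, FCN yields
$48 $49 $00, and FCS yields $48 $C9.
</para>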
691
692 </section>
693
694 <section>
695 <title>Address Definition</title>
696 <para>The directives in this section all control the addresses of symbols
697 or the assembly process itself.</para>
698
699 <variablelist>
700 <varlistentry><term>ORG <parameter>expr</parameter></term>
701 <listitem>
702 <para>Set the assembly address. The address must be fully resolvable on the
703 first pass so no external or forward references are permitted. ORG is not
704 permitted within sections when outputting to object files. For the DECB
705 target, each ORG directive after which output is generated will cause
706 a new preamble to be output. ORG is only used to determine the addresses
707 of symbols when the raw target is used.
708 </para>
709 </listitem>
710 </varlistentry>
711
712 <varlistentry>
713 <term><parameter>sym</parameter> EQU <parameter>expr</parameter></term>
714 <term><parameter>sym</parameter> = <parameter>expr</parameter></term>
715 <listitem>
716 <para>Define the value of <parameter>sym</parameter> to be <parameter>expr</parameter>.</para>
717 </listitem>
718 </varlistentry>
719
720 <varlistentry>
721 <term><parameter>sym</parameter> SET <parameter>expr</parameter></term>
722 <listitem>
723 <para>Define the value of <parameter>sym</parameter> to be <parameter>expr</parameter>.
724 Unlike EQU, SET permits symbols to be defined multiple times as long as SET
725 is used for all instances. Use of the symbol before the first SET statement
726 that sets its value is undefined.</para>
727 </listitem>
728 </varlistentry>
729
730 <varlistentry>
731 <term>SETDP <parameter>expr</parameter></term>
732 <listitem>
733 <para>Inform the assembler that it can assume the DP register contains
734 <parameter>expr</parameter>. This directive is only advice to the assembler
735 to determine whether an address is in the direct page and has no effect
736 on the contents of the DP register. The value must be fully resolved during
737 the first assembly pass because it affects the sizes of subsequent instructions.
738 </para>
739 <para>This directive has no effect in the object file target.
740 </para>
741 </listitem>
742 </varlistentry>
743
744 <varlistentry>
745 <term>ALIGN <parameter>expr</parameter>[,<parameter>value</parameter>]</term>
746 <listitem>
747
748 <para>Force the current assembly address to be a multiple of
749 <parameter>expr</parameter>. If <parameter>value</parameter> is not
750 specified, a series of NUL bytes is output to force the alignment, if
751 required. Otherwise, the low order 8 bits of <parameter>value</parameter>
752 will be used as the fill. The alignment value must be fully resolved on the
753 first pass because it affects the addresses of subsequent instructions.
754 However, <parameter>value</parameter> may include forward references; as
755 long as it resolves to a constant for the second pass, the value will be
756 accepted.</para>
757
758 <para>Unless <parameter>value</parameter> is specified as something like $12 (the NOP opcode),
759 this directive is not suitable for inclusion in the middle of actual code.
760 The default padding value is $00 which is intended to be used within data
761 blocks. </para>
762
763 </listitem>
764 </varlistentry>
765
766 </variablelist>
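<para>
The padding that ALIGN produces can be worked out with simple modular
arithmetic. The following Python sketch uses a hypothetical function name
and is meant only to illustrate the description above: it returns the fill
bytes needed to bring an address up to the next multiple of the alignment.
</para>

```python
def align_padding(address, alignment, value=0):
    """Return the fill bytes that move `address` to a multiple of `alignment`."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    # Distance to the next multiple; zero when the address is already aligned.
    count = (-address) % alignment
    # Only the low order 8 bits of the fill value are used.
    return bytes([value & 0xFF]) * count
```

<para>
At address $1003 with an alignment of 4, one NUL byte is emitted. At
address $0FFE with an alignment of 16 and a fill value of $12, two $12 bytes
are emitted.
</para>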
767
768 </section>
769
770 <section>
771 <title>Conditional Assembly</title>
772 <para>
773 Portions of the source code can be excluded or included based on conditions
774 known at assembly time. Conditionals can be nested arbitrarily deeply. The
775 directives associated with conditional assembly are described in this section.
776 </para>
777 <para>All conditionals must be fully bracketed. That is, every conditional
778 statement must eventually be followed by an ENDC at the same level of nesting.
779 </para>
780 <para>Conditional expressions are only evaluated on the first assembly pass.
781 It is not possible to game the assembly process by having a conditional
782 change its value between assembly passes. Due to the underlying architecture
783 of LWASM, there is no possible utility to IFP1 and IFP2, nor can they, as of LWASM 3.0, actually
784 be implemented meaningfully. Thus there is not and never will
785 be any equivalent of IFP1 or IFP2 as provided by other assemblers. Use of those opcodes
786 will throw a warning and be ignored.</para>
787
788 <para>It is important to note that if a conditional does not resolve to a constant
789 during the first parsing pass, an error will be thrown. This is unavoidable because the assembler
790 must make a decision about which source to include and which source to exclude at this stage.
791 Thus, expressions that work normally elsewhere will not work for conditions.</para>
792
793 <variablelist>
794 <varlistentry>
795 <term>IFEQ <parameter>expr</parameter></term>
796 <listitem>
797 <para>If <parameter>expr</parameter> evaluates to zero, the conditional
798 will be considered true.
799 </para>
800 </listitem>
801 </varlistentry>
802
803 <varlistentry>
804 <term>IFNE <parameter>expr</parameter></term>
805 <term>IF <parameter>expr</parameter></term>
806 <listitem>
807 <para>If <parameter>expr</parameter> evaluates to a non-zero value, the conditional
808 will be considered true.
809 </para>
810 </listitem>
811 </varlistentry>
812
813 <varlistentry>
814 <term>IFGT <parameter>expr</parameter></term>
815 <listitem>
816 <para>If <parameter>expr</parameter> evaluates to a value greater than zero, the conditional
817 will be considered true.
818 </para>
819 </listitem>
820 </varlistentry>
821
822 <varlistentry>
823 <term>IFGE <parameter>expr</parameter></term>
824 <listitem>
825 <para>If <parameter>expr</parameter> evaluates to a value greater than or equal to zero, the conditional
826 will be considered true.
827 </para>
828 </listitem>
829 </varlistentry>
830
831 <varlistentry>
832 <term>IFLT <parameter>expr</parameter></term>
833 <listitem>
834 <para>If <parameter>expr</parameter> evaluates to a value less than zero, the conditional
835 will be considered true.
836 </para>
837 </listitem>
838 </varlistentry>
839
840 <varlistentry>
841 <term>IFLE <parameter>expr</parameter></term>
842 <listitem>
843 <para>If <parameter>expr</parameter> evaluates to a value less than or equal to zero, the conditional
844 will be considered true.
845 </para>
846 </listitem>
847 </varlistentry>
848
849 <varlistentry>
850 <term>IFDEF <parameter>sym</parameter></term>
851 <listitem>
852 <para>If <parameter>sym</parameter> is defined at this point in the assembly
853 process, the conditional
854 will be considered true.
855 </para>
856 </listitem>
857 </varlistentry>
858
859 <varlistentry>
860 <term>IFNDEF <parameter>sym</parameter></term>
861 <listitem>
862 <para>If <parameter>sym</parameter> is not defined at this point in the assembly
863 process, the conditional
864 will be considered true.
865 </para>
866 </listitem>
867 </varlistentry>
868
869 <varlistentry>
870 <term>ELSE</term>
871 <listitem>
872 <para>
873 If the preceding conditional at the same level of nesting was false, the
874 statements following will be assembled. If the preceding conditional at
875 the same level was true, the statements following will not be assembled.
876 Note that the preceding conditional might have been another ELSE statement
877 although this behaviour is not guaranteed to be supported in future versions
878 of LWASM.
879 </para>
880 </listitem>
881 </varlistentry>
882 <varlistentry>
883 <term>ENDC</term>
884 <listitem>
885 <para>
886 This directive marks the end of a conditional construct. Every conditional
887 construct must end with an ENDC directive.
888 </para>
889 </listitem>
890 </varlistentry>
891
892 </variablelist>
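
<para>
As a rough illustration of the conditionals above, the following sketch
selects between two code paths and supplies a default value for a symbol.
The names DEBUG, BUFSIZE, and dbgout are hypothetical and exist only for
this example; dbgout is assumed to be defined elsewhere. Note that DEBUG is
defined before it is tested so that the condition can be evaluated during
the first pass.
</para>

<programlisting>
DEBUG   equ 1

        ifne DEBUG
        jsr dbgout      ; assembled because DEBUG is non-zero
        else
        nop             ; skipped because the IFNE above was true
        endc

        ifndef BUFSIZE
BUFSIZE equ 256         ; default used only if BUFSIZE is not yet defined
        endc
</programlisting>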
893 </section>
894
895 <section>
896 <title>OS9 Target Directives</title>
897
898 <para>This section includes directives that apply solely to the OS9
899 target.</para>
900
901 <variablelist>
902
903 <varlistentry>
904 <term>OS9 <parameter>syscall</parameter></term>
905 <listitem>
906 <para>
907
908 This directive generates a call to the specified system call. <parameter>syscall</parameter> may be an arbitrary expression.
909
910 </para>
911 </listitem>
912 </varlistentry>
913
914 <varlistentry>
915 <term>MOD <parameter>size</parameter>,<parameter>name</parameter>,<parameter>type</parameter>,<parameter>flags</parameter>,<parameter>execoff</parameter>,<parameter>datasize</parameter></term>
916 <listitem>
917 <para>
918
919 This tells LWASM that the beginning of the actual module is here. It will
920 generate a module header based on the parameters specified. It will also
921 begin calculating the module CRC.
922
923 </para>
924
925 <para>
926
927 The precise meaning of the various parameters is beyond the scope of this
928 document since it is not a tutorial on OS9 module programming.
929
930 </para>
931
932 </listitem>
933 </varlistentry>
934
935 <varlistentry>
936 <term>EMOD</term>
937 <listitem>
938 <para>
939
940 This marks the end of a module and causes LWASM to emit the calculated CRC
941 for the module.
942
943 </para></listitem>
944 </varlistentry>
945
946 </variablelist>
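
<para>
A minimal sketch of how these directives might fit together in a module is
shown below. The symbol names (eom, name, start, dsize) and the numeric
values used for the type, flags, data size, and system call number are
placeholders chosen for illustration only; consult OS9 documentation for
the values a real module requires.
</para>

<programlisting>
dsize   equ  256                           ; placeholder data area size
        mod  eom,name,$11,$81,start,dsize  ; header is generated, CRC calculation begins
name    fcs  /example/
start   clrb
        os9  $06                           ; call number given as an expression (assumed to be the exit call)
        emod                               ; emits the calculated CRC
eom     equ  *
</programlisting>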
947 </section>
948
949 <section>
950 <title>Miscellaneous Directives</title>
951
952 <para>This section includes directives that do not fit into the other
953 categories.</para>
954
955 <variablelist>
956
957 <varlistentry>
958 <term>INCLUDE <parameter>filename</parameter></term>
959 <term>USE <parameter>filename</parameter></term>
960
961 <listitem> <para> Include the contents of <parameter>filename</parameter> at
962 this point in the assembly as though it were a part of the file currently
963 being processed. Note that if whitespace appears in the name of the file,
964 you must enclose <parameter>filename</parameter> in quotes.
965 </para>
966
967 <para>
968 Note that the USE variation is provided only for compatibility with other
969 assemblers. It is recommended to use the INCLUDE variation.</para>
970
971 <para>If <parameter>filename</parameter> begins with a &quot;/&quot;, it is
972 interpreted as an absolute path. If it does not, the search path will be used
973 to find the file. First, the directory containing the file that contains this
974 directive is searched. (Includes within an included file are relative to the included file,
975 not the file that included it.) If the file is not found there, the include path
976 is searched. If it is still not found, an error will be thrown. Note that the
977 current directory as understood by your shell or operating system is not searched.
978 </para>
979
980 </listitem>
981 </varlistentry>
982
983 <varlistentry>
984 <term>END <parameter>[expr]</parameter></term>
985 <listitem>
986 <para>
987 This directive causes the assembler to stop assembling immediately as though
988 it ran out of input. For the DECB target only, <parameter>expr</parameter>
989 can be used to set the execution address of the resulting binary. For all
990 other targets, specifying <parameter>expr</parameter> will cause an error.
991 </para>
992 </listitem>
993 </varlistentry>
994
995 <varlistentry>
996 <term>ERROR <parameter>string</parameter></term>
997 <listitem>
998 <para>
999 Causes a custom error message to be printed at this line. This will cause
1000 assembly to fail. This directive is most useful inside conditional constructs
1001 to cause assembly to fail if some condition that is known bad happens. Everything
1002 from the directive to the end of the line is considered the error message.
1003 </para>
1004 </listitem>
1005 </varlistentry>
1006
1007 <varlistentry>
1008 <term>WARNING <parameter>string</parameter></term>
1009 <listitem>
1010 <para>
1011 Causes a custom warning message to be printed at this line. This will not cause
1012 assembly to fail. This directive is most useful inside conditional constructs
1013 or include files to alert the programmer to a deprecated feature being used
1014 or some other condition that may cause trouble later, but which may, in fact,
1015 not cause any trouble.
1016 </para>
1017 </listitem>
1018 </varlistentry>
1019
1020 <varlistentry>
1021 <term>.MODULE <parameter>string</parameter></term>
1022 <listitem>
1023 <para>
1024 This directive is ignored for most output targets. If the output target
1025 supports encoding a module name into it, <parameter>string</parameter>
1026 will be used as the module name.
1027 </para>
1028 <para>
1029 As of version 3.0, no supported output targets support this directive.
1030 </para>
1031 </listitem>
1032 </varlistentry>
1033
1034 </variablelist>
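
<para>
The following sketch shows how some of these directives might be combined.
The file name and the symbol TABLESIZE are hypothetical. Because everything
after ERROR up to the end of the line is taken as the message, no comment is
placed on that line; the WARNING line is written the same way to be safe.
</para>

<programlisting>
        include "my defs.asm"   ; quotes are required because of the space

TABLESIZE equ 128

        ifgt TABLESIZE-256
        error table is larger than 256 bytes
        endc

        ifgt TABLESIZE-64
        warning table is larger than 64 bytes
        endc
</programlisting>

<para>
With TABLESIZE set to 128, the first condition is false, so assembly does
not fail, while the second is true, so only the warning would be printed.
</para>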
1035 </section>
1036
1037 </section>
1038
1039 <section>
1040 <title>Macros</title>
1041 <para>
1042 LWASM is a macro assembler. A macro is simply a name that stands in for a
1043 series of instructions. Once a macro is defined, it is used like any other
1044 assembler directive. Defining a macro can be considered equivalent to adding
1045 additional assembler directives.
1046 </para>
1047 <para>Macros may accept parameters. These parameters are referenced within
1048 a macro by a backslash ("\") followed by a digit 1 through 9 for the first
1049 through ninth parameters. They may also be referenced by enclosing the
1050 decimal parameter number in braces ("{num}"). These parameter references
1051 are replaced with the verbatim text of the parameter passed to the macro. A
1052 reference to a non-existent parameter will be replaced by an empty string.
1053 Macro parameters are expanded everywhere on each source line. That means
1054 the parameter to a macro could be used as a symbol or it could even appear
1055 in a comment or could cause an entire source line to be commented out
1056 when the macro is expanded.
1057 </para>
1058 <para>
1059 Parameters passed to a macro are separated by commas and the parameter list
1060 is terminated by any whitespace. This means that neither a comma nor whitespace
1061 may be included in a macro parameter.
1062 </para>
1063 <para>
1064 Macro expansion is done recursively. That is, within a macro, macros are
1065 expanded. This can lead to infinite loops in macro expansion. If the assembler
1066 hangs for a long time while assembling a file that uses macros, this may be
1067 the reason.</para>
1068
1069 <para>Each macro expansion receives its own local symbol context which is not
1070 inherited by any macros called by it nor is it inherited from the context
1071 the macro was instantiated in. That means it is possible to use local symbols
1072 within macros without having them collide with symbols in other macros or
1073 outside the macro itself. However, this also means that using a local symbol
1074 as a parameter to a macro, while legal, will not do what it would seem to do
1075 as it will result in looking up the local symbol in the macro's symbol context
1076 rather than the enclosing context where it came from, likely yielding either
1077 an undefined symbol error or bizarre assembly results.
1078 </para>
1079 <para>
1080 Note that there is no way to define a macro as local to a symbol context. All
1081 macros are part of the global macro namespace. However, macros have a separate
1082 namespace from symbols so it is possible to have a symbol with the same name
1083 as a macro.
1084 </para>
1085
1086 <para>
1087 Macros are defined only during the first pass. Macro expansion also
1088 only occurs during the first pass. On the second pass, the macro
1089 definition is simply ignored. Macros must be defined before they are used.
1090 </para>
1091
1092 <para>The following directives are used when defining macros.</para>
1093
1094 <variablelist>
1095 <varlistentry>
1096 <term><parameter>macroname</parameter> MACRO</term>
1097 <listitem>
1098 <para>This directive is used to begin the definition of a macro called
1099 <parameter>macroname</parameter>. If <parameter>macroname</parameter> already
1100 exists, it is considered an error. Attempting to define a macro within a
1101 macro is undefined. It may work and it may not so the behaviour should not
1102 be relied upon.
1103 </para>
1104 </listitem>
1105 </varlistentry>
1106
1107 <varlistentry>
1108 <term>ENDM</term>
1109 <listitem>
1110 <para>
1111 This directive indicates the end of the macro currently being defined. It
1112 causes the assembler to resume interpreting source lines as normal.
1113 </para>
1114 </listitem></varlistentry>
1115 </variablelist>
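
<para>
As a small sketch, the hypothetical macro below takes two parameters and
uses both styles of parameter reference.
</para>

<programlisting>
ldxy    macro
        ldx  \1         ; first parameter, backslash form
        ldy  {2}        ; second parameter, brace form
        endm

        ldxy #$1000,#$2000
</programlisting>

<para>
With the parameters substituted verbatim, the invocation above should expand
to "ldx #$1000" followed by "ldy #$2000". No whitespace appears within the
parameter list since whitespace would terminate it.
</para>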
1116
1117 </section>
1118
1119 <section>
1120 <title>Structures</title>
1121 <para>
1122
1123 Structures are used to group related data in a fixed structure. A structure
1124 consists of a number of fields, defined in sequential order, each of which takes up
1125 a specified size. The assembler does not enforce any means of access within a
1126 structure; it assumes that whatever you are doing, you intended to do.
1127 There are two pseudo ops that are used for defining structures.
1128
1129 </para>
1130
1131 <variablelist>
1132 <varlistentry>
1133 <term><parameter>structname</parameter> STRUCT</term>
1134 <listitem>
1135 <para>
1136
1137 This directive is used to begin the definition of a structure with name
1138 <parameter>structname</parameter>. Subsequent statements all form part of
1139 the structure definition until the end of the structure is declared.
1140
1141 </para>
1142 </listitem>
1143 </varlistentry>
1144 <varlistentry>
1145 <term>ENDSTRUCT</term>
1146 <term>ENDS</term>
1147 <listitem>
1148 <para>
1149 This directive ends the definition of the structure. ENDSTRUCT is the
1150 preferred form. Prior to version 3.0 of LWASM, ENDS was used to end a
1151 section instead of a structure.
1152 </para>
1153 </listitem>
1154 </varlistentry>
1155 </variablelist>
1156
1157 <para>
1158
1159 Within a structure definition, only reservation pseudo ops are permitted.
1160 Anything else will cause an assembly error.
1161 </para>
1162
1163 <para> Once a structure is defined, you can reserve an area of memory with
1164 the same layout by using the structure name as the opcode. Structures can
1165 also contain fields that are themselves structures. See the example
1166 below.</para>
1167
1168 <programlisting>
1169 tstruct2 STRUCT
1170 f1 rmb 1
1171 f2 rmb 1
1172 ENDSTRUCT
1173
1174 tstruct STRUCT
1175 field1 rmb 2
1176 field2 rmb 3
1177 field3 tstruct2
1178 ENDSTRUCT
1179
1180 ORG $2000
1181 var1 tstruct
1182 var2 tstruct2
1183 </programlisting>
1184
1185 <para>Fields are referenced using a dot (.) as a separator. To refer to the
1186 generic offset within a structure, use the structure name to the left of the
1187 dot. If referring to a field within an actual variable, use the variable's
1188 symbol name to the left of the dot.</para>
1189
1190 <para>You can also refer to the actual size of a structure (or a variable
1191 declared as a structure) using the special symbol sizeof{structname} where
1192 structname will be the name of the structure or the name of the
1193 variable.</para>
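
<para>
Assuming the structure definitions and variables from the listing above,
field and size symbols might be used as in the following sketch.
</para>

<programlisting>
        ldd  var1.field1        ; var1.field1 is $2000
        lda  var1.field3.f2     ; var1.field3.f2 is $2006
        ldx  #var2
        ldb  tstruct2.f2,x      ; generic offset 1 from the address in X
        ldy  #sizeof{tstruct}   ; sizeof{tstruct} is 7
</programlisting>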
1194
1195 <para>Essentially, structures are a shortcut for defining a vast number of
1196 symbols. When a structure is defined, the assembler creates symbols for the
1197 various fields in the form structname.fieldname as well as the appropriate
1198 sizeof{structname} symbol. When a variable is declared as a structure, the
1199 assembler does the same thing using the name of the variable. You will see
1200 these symbols in the symbol table when the assembler is instructed to
1201 provide a listing. For instance, the above listing will create the
1202 following symbols (symbol values in parentheses): tstruct2.f1 (0),
1203 tstruct2.f2 (1), sizeof{tstruct2} (2), tstruct.field1 (0), tstruct.field2
1204 (2), tstruct.field3 (5), tstruct.field3.f1 (5), tstruct.field3.f2 (6),
1205 sizeof{tstruct.field3} (2), sizeof{tstruct} (7), var1 ($2000), var1.field1
1206 ($2000), var1.field2 ($2002), var1.field3 ($2005), var1.field3.f1 ($2005),
1207 var1.field3.f2 ($2006), sizeof{var1.field3} (2), sizeof{var1} (7), var2
1208 ($2007), var2.f1 ($2007), var2.f2 ($2008), sizeof{var2} (2). </para>
1209
1210 </section>
1211
1212 <section>
1213 <title>Object Files and Sections</title>
1214 <para>
1215 The object file target is very useful for large projects because it allows
1216 multiple files to be assembled independently and then linked into the final
1217 binary at a later time. It allows only the small portion of the project
1218 that was modified to be re-assembled rather than requiring the entire set
1219 of source code to be available to the assembler in a single assembly process.
1220 This can be particularly important if there are a large number of macros,
1221 symbol definitions, or other metadata that uses resources at assembly time.
1222 By far the largest benefit, however, is keeping the source files small enough
1223 for a mere mortal to find things in them.
1224 </para>
1225
1226 <para>
1227 With multi-file projects, there needs to be a means of resolving references to
1228 symbols in other source files. These are known as external references. The
1229 addresses of these symbols cannot be known until the linker joins all the
1230 object files into a single binary. This means that the assembler must be
1231 able to output the object code without knowing the value of the symbol. This
1232 places some restrictions on the code generated by the assembler. For
1233 example, the assembler cannot generate direct page addressing for instructions
1234 that reference external symbols because the address of the symbol may not
1235 be in the direct page. Similarly, relative branches and PC relative addressing
1236 cannot be used in their eight bit forms. Everything that must be resolved
1237 by the linker must be assembled to use the largest address size possible to
1238 allow the linker to fill in the correct value at link time. Note that the
1239 same problem applies to absolute address references as well, even those in
1240 the same source file, because the address is not known until link time.
1241 </para>
1242
1243 <para>
1244 It is often desired in multi-file projects to have code of various types grouped
1245 together in the final binary generated by the linker as well. The same applies
1246 to data. In order for the linker to do that, the bits that are to be grouped
1247 must be tagged in some manner. This is where the concept of sections comes in.
1248 Each chunk of code or data is part of a section in the object file. Then,
1249 when the linker reads all the object files, it coalesces all sections of the
1250 same name into a single section and then considers it as a unit.
1251 </para>
1252
1253 <para>
1254 The existence of sections, however, raises a problem for symbols even
1255 within the same source file. Thus, the assembler must treat symbols from
1256 different sections within the same source file in the same manner as external
1257 symbols. That is, it must leave them for the linker to resolve at link time,
1258 with all the limitations that entails.
1259 </para>
1260
1261 <para>
1262 In the object file target mode, LWASM requires all source lines that
1263 cause bytes to be output to be inside a section. Any directives that do
1264 not cause any bytes to be output can appear outside of a section. This includes
1265 such things as EQU or RMB. Even ORG can appear outside a section. ORG, however,
1266 makes no sense within a section because it is the linker that determines
1267 the starting address of the section's code, not the assembler.
1268 </para>
1269
1270 <para>
1271 All symbols defined globally in the assembly process are local to the
1272 source file and cannot be exported. All symbols defined within a section are
1273 considered local to the source file unless otherwise explicitly exported.
1274 Symbols referenced from external source files must be declared external,
1275 either explicitly or by asking the assembler to assume that all undefined
1276 symbols are external.
1277 </para>
1278
1279 <para>
1280 It is often handy to define a number of memory addresses that will be
1281 used for data at run-time but which need not be included in the binary file.
1282 These memory addresses are not initialized until run-time, either by the
1283 program itself or by the program loader, depending on the operating environment.
1284 Such sections are often known as BSS sections. LWASM supports generating
1285 sections with a BSS attribute set. For such a section, the object file contains
1286 the section definition, including symbols exported from that section and those
1287 symbols required to resolve references from the local file, but no actual code.
1288 It is illegal for any source lines within a BSS flagged section to cause any
1289 bytes to be output.
1290 </para>
1291
1292 <para>The following directives apply to section handling.</para>
1293
1294 <variablelist>
1295 <varlistentry>
1296 <term>SECTION <parameter>name[,flags]</parameter></term>
1297 <term>SECT <parameter>name[,flags]</parameter></term>
1298 <term>.AREA <parameter>name[,flags]</parameter></term>
1299 <listitem>
1300 <para>
1301 Instructs the assembler that the code following this directive is to be
1302 considered part of the section <parameter>name</parameter>. A section name
1303 may appear multiple times in which case it is as though all the code from
1304 all the instances of that section appeared adjacent within the source file.
1305 However, <parameter>flags</parameter> may only be specified on the first
1306 instance of the section.
1307 </para>
1308 <para>There is a single flag supported in <parameter>flags</parameter>. The
1309 flag <parameter>bss</parameter> will cause the section to be treated as a BSS
1310 section and, thus, no code will be included in the object file nor will any
1311 bytes be permitted to be output.</para>
1312 <para>
1313 If the section name is "bss" or ".bss" in any combination of upper and
1314 lower case, the section is assumed to be a BSS section. In that case,
1315 the flag <parameter>!bss</parameter> can be used to override this assumption.
1316 </para>
1317 <para>
1318 If assembly is already happening within a section, the section is implicitly
1319 ended and the new section started. This is not considered an error although
1320 it is recommended that all sections be explicitly closed.
1321 </para>
1322 </listitem>
1323 </varlistentry>
1324
1325 <varlistentry>
1326 <term>ENDSECTION</term>
1327 <term>ENDSECT</term>
1328 <listitem>
1329 <para>
1330 This directive ends the current section. This puts assembly outside of any
1331 sections until the next SECTION directive. ENDSECTION is the preferred form.
1332 Prior to version 3.0 of LWASM, ENDS could also be used to end a section but
1333 as of version 3.0, it is now an alias for ENDSTRUCT instead.</para>
1334 </listitem>
1335 </varlistentry>
1336
1337 <varlistentry>
1338 <term><parameter>sym</parameter> EXTERN</term>
1339 <term><parameter>sym</parameter> EXTERNAL</term>
1340 <term><parameter>sym</parameter> IMPORT</term>
1341 <listitem>
1342 <para>
1343 This directive defines <parameter>sym</parameter> as an external symbol.
1344 This directive may occur at any point in the source code. EXTERN definitions
1345 are resolved on the first pass so an EXTERN definition anywhere in the
1346 source file is valid for the entire file. The use of this directive is
1347 optional when the assembler is instructed to assume that all undefined
1348 symbols are external. In fact, in that mode, if the symbol is referenced
1349 before the EXTERN directive, an error will occur.
1350 </para>
1351 </listitem>
1352 </varlistentry>
1353
1354 <varlistentry>
1355 <term><parameter>sym</parameter> EXPORT</term>
1356 <term><parameter>sym</parameter> .GLOBL</term>
1357
1358 <term>EXPORT <parameter>sym</parameter></term>
1359 <term>.GLOBL <parameter>sym</parameter></term>
1360
1361 <listitem>
1362 <para>
1363 This directive defines <parameter>sym</parameter> as an exported symbol.
1364 This directive may occur at any point in the source code, even before the
1365 definition of the exported symbol.
1366 </para>
1367 <para>
1368 Note that <parameter>sym</parameter> may appear as the operand or as the
1369 statement's symbol. If there is a symbol on the statement, that will
1370 take precedence over any operand that is present.
1371 </para>
1372 </listitem>
1373
1374 </varlistentry>
1375
1376 <varlistentry>
1377 <term><parameter>sym</parameter> EXTDEP</term>
1378 <listitem>
1379
1380 <para>This directive forces an external dependency on
1381 <parameter>sym</parameter>, even if it is never referenced anywhere else in
1382 this file.</para>
1383
1384 </listitem>
1385 </varlistentry>
1386 </variablelist>
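
<para>
The sketch below shows how these directives might be combined in one source
file assembled to the object file target. The section name "code" and the
symbols main, msg, count, and printmsg are hypothetical; printmsg is assumed
to be exported by some other object file.
</para>

<programlisting>
printmsg extern                 ; resolved by the linker

        section code
main    export
main    ldx  #msg               ; msg is defined in this section
        lbsr printmsg           ; external reference, so the 16 bit form is used
        clr  count              ; count is in another section, so the linker resolves it
        rts
msg     fcn  "hello"
        endsection

        section bss             ; treated as a BSS section because of its name
count   rmb  2                  ; reserves space without outputting any bytes
        endsection
</programlisting>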
1387
1388 </section>
1389
1390 <section>
1391 <title>Assembler Modes and Pragmas</title>
1392 <para>
1393 There are a number of options that affect the way assembly is performed.
1394 Some of these options can only be specified on the command line because
1395 they determine something absolute about the assembly process. These include
1396 such things as the output target. Other things may be switchable during
1397 the assembly process. These are known as pragmas and are, by definition,
1398 not portable between assemblers.
1399 </para>
1400
1401 <para>LWASM supports a number of pragmas that affect code generation or
1402 otherwise affect the behaviour of the assembler. These may be specified by
1403 way of a command line option or by assembler directives. The directives
1404 are as follows.
1405 </para>
1406
1407 <variablelist>
1408 <varlistentry>
1409 <term>PRAGMA <parameter>pragma[,...]</parameter></term>
1410 <listitem>
1411 <para>
1412 Specifies that the assembler should bring into force all <parameter>pragma</parameter>s
1413 specified. Any unrecognized pragma will cause an assembly error. The new
1414 pragmas will take effect immediately. This directive should be used when
1415 the program will assemble incorrectly if the pragma is ignored or not supported.
1416 </para>
1417 </listitem>
1418 </varlistentry>
1419
1420 <varlistentry>
1421 <term>*PRAGMA <parameter>pragma[,...]</parameter></term>
1422 <listitem>
1423 <para>
1424 This is identical to the PRAGMA directive except no error will occur with
1425 unrecognized or unsupported pragmas. This directive, by virtue of starting
1426 with a comment character, will also be ignored by assemblers that do not
1427 support this directive. Use this variation if the pragma is not required
1428 for correct functioning of the code.
1429 </para>
1430 </listitem>
1431 </varlistentry>
1432 </variablelist>
1433
1434 <para>Each pragma supported has a positive version and a negative version.
1435 The positive version enables the pragma while the negative version disables
1436 it. The negative version is simply the positive version with "no" prefixed
1437 to it. For instance, "pragma" vs. "nopragma". Only the positive version is
1438 listed below.</para>
1439
1440 <para>Pragmas are not case sensitive.</para>
1441
1442 <variablelist>
1443 <varlistentry>
1444 <term>index0tonone</term>
1445 <listitem>
1446 <para>
1447 When in force, this pragma enables an optimization affecting indexed addressing
1448 modes. When the offset expression in an indexed mode evaluates to zero but is
1449 not explicitly written as 0, this will replace the operand with the equivalent
1450 no offset mode, thus creating slightly faster code. Because of the advantages
1451 of this optimization, it is enabled by default.
1452 </para>
1453 </listitem>
1454 </varlistentry>
1455
1456 <varlistentry>
1457 <term>cescapes</term>
1458 <listitem>
1459 <para>
1460 This pragma will cause strings in the FCC, FCS, and FCN pseudo operations to
1461 have C-style escape sequences interpreted. The one departure from the official
1462 spec is that unrecognized escape sequences will return either the character
1463 immediately following the backslash or some undefined value. Do not rely
1464 on the behaviour of undefined escape sequences.
1465 </para>
1466 </listitem>
1467 </varlistentry>
1468
1469 <varlistentry>
1470 <term>importundefexport</term>
1471 <listitem>
1472 <para>
1473 This pragma is only valid for targets that support external references. When
1474 in force, it will cause the EXPORT directive to act as IMPORT if the symbol
1475 to be exported is not defined. This is provided for compatibility with the
1476 output of gcc6809 and should not be used in hand written code. Because of
1477 the confusion this pragma can cause, it is disabled by default.
1478 </para>
1479 </listitem>
1480 </varlistentry>
1481
1482 <varlistentry>
1483 <term>undefextern</term>
1484 <listitem>
1485 <para>
1486 This pragma is only valid for targets that support external references. When in
1487 force, if the assembler sees an undefined symbol on the second pass, it will
1488 automatically define it as an external symbol. This automatic definition will
1489 apply for the remainder of the assembly process, even if the pragma is
1490 subsequently turned off. Because this behaviour would be potentially surprising,
1491 this pragma defaults to off.
1492 </para>
1493 <para>
1494 The primary use for this pragma is for projects that share a large number of
1495 symbols between source files. In such cases, it is impractical to enumerate
1496 all the external references in every source file. This allows the assembler
1497 and linker to do the heavy lifting while not preventing a particular source
1498 module from defining a local symbol of the same name as an external symbol
1499 if it does not need the external symbol. (This pragma will not cause an
1500 automatic external definition if there is already a locally defined symbol.)
1501 </para>
1502 <para>
1503 This pragma will often be specified on the command line for large projects.
1504 However, depending on the specific dynamics of the project, it may be sufficient
1505 for one or two files to use this pragma internally.
1506 </para>
1507 </listitem>
1508 </varlistentry>
1509
1510 <varlistentry>
1511 <term>dollarlocal</term>
1512 <listitem>
1513
1514 <para>When set, a "$" in a symbol makes it local. When not set, "$" does not
1515 cause a symbol to be local. It is set by default except when using the OS9
1516 target.</para>
1517
1518 </listitem>
1519 </varlistentry>
1520
1521 <varlistentry>
1522 <term>dollarnotlocal</term>
1523 <listitem>
1524
1525 <para> This is the same as the "dollarlocal" pragma except its sense is
1526 reversed. That is, "dollarlocal" and "nodollarnotlocal" are equivalent and
1527 "nodollarlocal" and "dollarnotlocal" are equivalent. </para>
1528
1529 </listitem>
1530 </varlistentry>
1531
1532 <varlistentry>
1533 <term>pcaspcr</term>
1534 <listitem>
1535
1536 <para> Normally, LWASM makes a distinction between PC and PCR in program
1537 counter relative addressing. In particular, the use of PC means an absolute
1538 offset from PC while PCR causes the assembler to calculate the offset to the
1539 specified operand and use that as the offset from PC. By setting this
1540 pragma, you can have PC treated the same as PCR. </para>
1541
1542
1543 </listitem>
1544 </varlistentry>
1545
1546 </variablelist>
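
<para>
The following sketch shows pragmas being switched within a source file. The
label msg is hypothetical. Comments are kept on their own lines here so that
nothing follows the pragma names.
</para>

<programlisting>
; an unrecognized pragma name on a PRAGMA line would be an assembly error
        pragma cescapes
; with cescapes in force, \n is interpreted as a C-style escape
msg     fcc  "line one\n"
; the negative form disables the pragma again
        pragma nocescapes

; the *PRAGMA form is skipped as a comment by assemblers that do not support it
*pragma pcaspcr
; with pcaspcr in force, PC is treated the same as PCR
        leax msg,pc
</programlisting>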
1547
1548 </section>
1549
1550 </chapter>
1551
1552 <chapter>
1553 <title>LWLINK</title>
1554 <para>
1555 The LWTOOLS linker is called LWLINK. This chapter documents the various features
1556 of the linker.
1557 </para>
1558
1559 <section>
1560 <title>Command Line Options</title>
1561 <para>
1562 The binary for LWLINK is called "lwlink". Note that the binary is in lower
1563 case. lwlink takes the following command line arguments.
1564 </para>
1565 <variablelist>
1566 <varlistentry>
1567 <term><option>--decb</option></term>
1568 <term><option>-b</option></term>
1569 <listitem>
1570 <para>
1571 Selects the DECB output format target. This is equivalent to <option>--format=decb</option>.
1572 </para>
1573 </listitem>
1574 </varlistentry>
1575
1576 <varlistentry>
1577 <term><option>--output=FILE</option></term>
1578 <term><option>-o FILE</option></term>
1579 <listitem>
1580 <para>
1581 This option specifies the name of the output file. If not specified, the
1582 default is <option>a.out</option>.
1583 </para>
1584 </listitem>
1585 </varlistentry>
1586
1587 <varlistentry>
1588 <term><option>--format=TYPE</option></term>
1589 <term><option>-f TYPE</option></term>
1590 <listitem>
1591 <para>
1592 This option specifies the output format. Valid values are <option>decb</option>
1593 and <option>raw</option>.
1594 </para>
1595 </listitem>
1596 </varlistentry>
1597
1598 <varlistentry>
1599 <term><option>--raw</option></term>
1600 <term><option>-r</option></term>
1601 <listitem>
1602 <para>
1603 This option specifies the raw output format.
1604 It is equivalent to <option>--format=raw</option>
1605 and <option>-f raw</option>.
1606 </para>
1607 </listitem>
1608 </varlistentry>
1609
1610 <varlistentry>
1611 <term><option>--script=FILE</option></term>
1612 <term><option>-s FILE</option></term>
1613 <listitem>
1614 <para>
1615 This option allows specifying a linking script to override the linker's
1616 built in defaults.
1617 </para>
1618 </listitem>
1619 </varlistentry>
1620
1621 <varlistentry>
1622 <term><option>--section-base=SECT=BASE</option></term>
1623 <listitem>
1624 <para>
1625 Cause section SECT to load at base address BASE. This will be prepended
1626 to the built-in link script. It is ignored if a link script is provided.
1627 </para>
1628 </listitem>
1629 </varlistentry>
1630
1631 <varlistentry>
1632 <term><option>--map=FILE</option></term>
1633 <term><option>-m FILE</option></term>
1634 <listitem>
1635 <para>
1636 This will output a description of the link result to FILE.
1637 </para>
1638 </listitem>
1639 </varlistentry>
1640
1641 <varlistentry>
1642 <term><option>--library=LIBSPEC</option></term>
1643 <term><option>-l LIBSPEC</option></term>
1644 <listitem>
1645 <para>
1646 Load a library using the library search path. LIBSPEC will have "lib" prepended
1647 and ".a" appended.
1648 </para>
1649 </listitem>
1650 </varlistentry>
1651
1652 <varlistentry>
1653 <term><option>--library-path=DIR</option></term>
1654 <term><option>-L DIR</option></term>
1655 <listitem>
1656 <para>
1657 Add DIR to the library search path.
1658 </para>
1659 </listitem>
1660 </varlistentry>
1661
1662 <varlistentry>
1663 <term><option>--debug</option></term>
1664 <term><option>-d</option></term>
1665 <listitem>
1666 <para>
1667 This option increases the debugging level. It is only useful for LWTOOLS
1668 developers.
1669 </para>
1670 </listitem>
1671 </varlistentry>
1672
1673 <varlistentry>
1674 <term><option>--help</option></term>
1675 <term><option>-?</option></term>
1676 <listitem>
1677 <para>
1678 This provides a listing of command line options and a brief description
1679 of each.
1680 </para>
1681 </listitem>
1682 </varlistentry>
1683
1684 <varlistentry>
1685 <term><option>--usage</option></term>
1686 <listitem>
1687 <para>
1688 This will display a usage summary
1689 of each command line option.
1690 </para>
1691 </listitem>
1692 </varlistentry>
1693
1694
1695 <varlistentry>
1696 <term><option>--version</option></term>
1697 <term><option>-V</option></term>
1698 <listitem>
1699 <para>
1700 This will display the version of LWLINK.
1701 </para>
1702 </listitem>
1703 </varlistentry>
1704 </variablelist>
1705 </section>
1706
1707 <section>
1708 <title>Linker Operation</title>
1709
1710 <para>
1711
1712 LWLINK takes one or more files in supported input formats and links them
1713 into a single binary. Currently supported formats are the LWTOOLS object
1714 file format and the archive format used by LWAR. While the precise method is
1715 slightly different, linking can be conceptualized as the following steps.
1716
1717 </para>
1718
1719 <orderedlist>
1720 <listitem>
1721 <para>
1722 First, the linker loads a linking script. If no script is specified, it
1723 loads a built-in default script based on the output format selected. This
1724 script tells the linker how to lay out the various sections in the final
1725 binary.
1726 </para>
1727 </listitem>
1728
1729 <listitem>
1730 <para>
1731 Next, the linker reads all the input files into memory. At this time, it
1732 flags any format errors in those files. It constructs a table of symbols
1733 for each object at this time.
1734 </para>
1735 </listitem>
1736
1737 <listitem>
1738 <para>
1739 The linker then proceeds with organizing the sections loaded from each file
1740 according to the linking script. As it does so, it is able to assign addresses
1741 to each symbol defined in each object file. At this time, the linker may
1742 also collapse different instances of the same section name into a single
1743 section by appending the data from each subsequent instance of the section
1744 to the first instance of the section.
1745 </para>
1746 </listitem>
1747
1748 <listitem>
1749 <para>
1750 Next, the linker looks through every object file for every incomplete reference.
1751 It then attempts to fully resolve that reference. If it cannot do so, it
1752 throws an error. Once a reference is resolved, the value is placed into
1753 the binary code at the specified section. It should be noted that an
1754 incomplete reference can reference either a symbol internal to the object
1755 file or an external symbol which is in the export list of another object
1756 file.
1757 </para>
1758 </listitem>
1759
1760 <listitem>
1761 <para>
1762 If all of the above steps are successful, the linker opens the output file
1763 and actually constructs the binary.
1764 </para>
1765 </listitem>
1766 </orderedlist>
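<para>
The layout step above can be pictured with a small model. The following
Python sketch is only an illustration and not LWLINK's actual implementation.
The function name <code>layout</code> and the data shapes (a script as a list
of section name and load address pairs, each object file as a dictionary of
sections) are assumptions made for this example.
</para>
<programlisting>
```python
def layout(script, objects):
    """Assign an address to every symbol (toy model of the layout step).

    script  -- list of (section_name, load_address_or_None), in script order
    objects -- list of dicts mapping section_name to (size, {symbol: offset})
    Both shapes are hypothetical; they are not LWLINK data structures.
    """
    symbols = {}
    addr = 0
    for name, load in script:
        if load is not None:
            addr = load  # an explicit load address restarts the layout here
        for obj in objects:
            if name not in obj:
                continue  # a missing section behaves like a zero-size section
            size, local_syms = obj[name]
            for sym, offset in local_syms.items():
                symbols[sym] = addr + offset
            addr += size  # later instances are appended after earlier ones
    return symbols


script = [("init", 0x2000), ("code", None)]
objects = [
    {"init": (4, {"start": 0}), "code": (0x10, {"main": 2})},
    {"code": (8, {"helper": 0})},
]
addresses = layout(script, objects)
```
</programlisting>
<para>
In this model, the second instance of the "code" section is placed directly
after the first, which mirrors the collapsing of same-named sections
described in the steps above.
</para>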
1767
1768 </section>
1769
1770 <section>
1771 <title>Linking Scripts</title>
1772 <para>
1773 A linker script is used to instruct the linker about how to assemble the
1774 various sections into a completed binary. It consists of a series of
1775 directives which are considered in the order they are encountered.
1776 </para>
1777 <para>
1778 The sections will appear in the resulting binary in the order they are
1779 specified in the script file. If a referenced section is not found, the linker will behave as though the
1780 section did exist but had a zero size, no relocations, and no exports.
1781 A section should only be referenced once. Any subsequent references will have
1782 an undefined effect.
1783 </para>
1784
1785 <para>
1786 All numbers in linking scripts are specified in hexadecimal. All directives
1787 are case sensitive although the hexadecimal numbers are not.
1788 </para>
1789
1790 <para>If a section name is specified as "*", any section not already
1791 matched by the script will be matched. The "*" can be followed by a
1792 comma and a flag to narrow the match. If the flag is "!bss", any
1793 section that is not flagged as a bss section will be matched. If the
1794 flag is "bss", any section that is flagged as bss will be matched. For
1795 example, "*,!bss" matches every remaining non-bss section.
1796 </para>
1797
1798 <para>The following directives are understood in a linker script.</para>
1799 <variablelist>
1800 <varlistentry>
1801 <term>section <parameter>name</parameter> load <parameter>addr</parameter></term>
1802 <listitem><para>
1803
1804 This causes the section <parameter>name</parameter> to load at
1805 <parameter>addr</parameter>. For the raw target, only one "load at" entry is
1806 allowed for non-bss sections and it must be the first one. For raw targets,
1807 it affects the addresses the linker assigns to symbols but has no other
1808 effect on the output. bss sections may all have separate load addresses but
1809 since they will not appear in the binary anyway, this is okay.
1810 </para><para>
1811 For the decb target, each "load" entry will cause a new "block" to be
1812 output to the binary which will contain the load address. It is legal for
1813 sections to overlap in this manner - the linker assumes the loader will sort
1814 everything out.
1815 </para></listitem>
1816 </varlistentry>
1817
1818 <varlistentry>
1819 <term>section <parameter>name</parameter></term>
1820 <listitem><para>
1821
1822 This will cause the section <parameter>name</parameter> to load after the previously listed
1823 section.
1824 </para></listitem></varlistentry>
1825 <varlistentry>
1826 <term>exec <parameter>addr or sym</parameter></term>
1827 <listitem>
1828 <para>
1829 This will cause the execution address (entry point) to be the address
1830 specified (in hex) or the specified symbol name. The symbol name must
1831 match a symbol that is exported by one of the object files being linked.
1832 This has no effect for targets that do not encode the entry point into the
1833 resulting file. If not specified, the entry point is assumed to be address 0
1834 which is probably not what you want. The default link scripts for targets
1835 that support this directive automatically start at the beginning of the
1836 first section (usually "init" or "code") that is emitted in the binary.
1837 </para>
1838 </listitem>
1839 </varlistentry>
1840
1841 <varlistentry>
1842 <term>pad <parameter>size</parameter></term>
1843 <listitem><para>
1844 This will cause the output file to be padded with NUL bytes to be exactly
1845 <parameter>size</parameter> bytes in length. This only makes sense for a raw target.
1846 </para>
1847 </listitem>
1848 </varlistentry>
1849 </variablelist>
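<para>
To make the directive syntax concrete, the following Python sketch reads a
small linking script into a simple structure. It is a hedged illustration
only: the function name <code>parse_link_script</code>, the sample script,
and the choice to try a hexadecimal address before a symbol name for the
exec directive are assumptions, not a description of how LWLINK parses
scripts. Directive spellings follow the list above.
</para>
<programlisting>
```python
def parse_link_script(text):
    """Parse the directives described above (illustrative sketch only).

    Returns (sections, exec_target, pad_size) where sections is a list of
    (name, load_address_or_None) in script order.
    """
    sections = []
    exec_target = None
    pad_size = None
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "section" and len(words) == 2:
            sections.append((words[1], None))
        elif words[0] == "section" and len(words) == 4 and words[2] == "load":
            # All numbers in a linking script are hexadecimal.
            sections.append((words[1], int(words[3], 16)))
        elif words[0] == "exec" and len(words) == 2:
            try:
                # Assumption: an argument that parses as hex is an address.
                exec_target = int(words[1], 16)
            except ValueError:
                exec_target = words[1]  # otherwise treat it as a symbol name
        elif words[0] == "pad" and len(words) == 2:
            pad_size = int(words[1], 16)
        else:
            raise ValueError("unrecognized directive: " + line)
    return sections, exec_target, pad_size


SAMPLE_SCRIPT = """\
section init load 2000
section code
section *,!bss
exec start
section *,bss
"""

sections, exec_target, pad_size = parse_link_script(SAMPLE_SCRIPT)
```
</programlisting>
<para>
The sample script places "init" at address 2000 (hexadecimal), follows it
with "code" and any other non-bss sections, and lists the bss sections last.
</para>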
1850
1851
1852
1853 </section>
1854
1855 </chapter>
1856
1857 <chapter>
1858 <title>Libraries and LWAR</title>
1859
1860 <para>
1861 LWTOOLS also includes a tool for managing libraries. These are analogous to
1862 the static libraries created with the "ar" tool on POSIX systems. Each library
1863 file contains one or more object files. The linker will treat the object
1864 files within a library as though they had been specified individually on
1865 the command line except when resolving external references. External references
1866 are looked up first within the object files within the library and then, if
1867 not found, the usual lookup based on the order the files are specified on
1868 the command line occurs.
1869 </para>
1870
1871 <para>
1872 The tool for creating these library files is called LWAR.
1873 </para>
1874
1875 <section>
1876 <title>Command Line Options</title>
1877 <para>
1878 The binary for LWAR is called "lwar". Note that the binary is in lower
1879 case. The options lwar understands are listed below. For archive manipulation
1880 options, the first non-option argument is the name of the archive. All other
1881 non-option arguments are the names of files to operate on.
1882 </para>
1883
1884 <variablelist>
1885 <varlistentry>
1886 <term><option>--add</option></term>
1887 <term><option>-a</option></term>
1888 <listitem>
1889 <para>
1890 This option specifies that an archive is going to have files added to it.
1891 If the archive does not already exist, it is created. New files are added
1892 to the end of the archive.
1893 </para>
1894 </listitem>
1895 </varlistentry>
1896
1897 <varlistentry>
1898 <term><option>--create</option></term>
1899 <term><option>-c</option></term>
1900 <listitem>
1901 <para>
1902 This option specifies that an archive is going to be created and have files
1903 added to it. If the archive already exists, it is truncated.
1904 </para>
1905 </listitem>
1906 </varlistentry>
1907
1908 <varlistentry>
1909 <term><option>--merge</option></term>
1910 <term><option>-m</option></term>
1911 <listitem>
1912 <para>
1913 If specified, any files specified to be added to an archive will be checked
1914 to see if they are archives themselves. If so, their constituent members are
1915 added to the archive. This is useful for avoiding archives containing archives.
1916 </para>
1917 </listitem>
1918 </varlistentry>
1919
1920 <varlistentry>
1921 <term><option>--list</option></term>
1922 <term><option>-l</option></term>
1923 <listitem>
1924 <para>
1925 This will display a list of the files contained in the archive.
1926 </para>
1927 </listitem>
1928 </varlistentry>
1929
1930 <varlistentry>
1931 <term><option>--debug</option></term>
1932 <term><option>-d</option></term>
1933 <listitem>
1934 <para>
1935 This option increases the debugging level. It is only useful for LWTOOLS
1936 developers.
1937 </para>
1938 </listitem>
1939 </varlistentry>
1940
1941 <varlistentry>
1942 <term><option>--help</option></term>
1943 <term><option>-?</option></term>
1944 <listitem>
1945 <para>
1946 This provides a listing of command line options and a brief description
1947 of each.
1948 </para>
1949 </listitem>
1950 </varlistentry>
1951
1952 <varlistentry>
1953 <term><option>--usage</option></term>
1954 <listitem>
1955 <para>
1956 This will display a usage summary
1957 of each command line option.
1958 </para>
1959 </listitem>
1960 </varlistentry>
1961
1962
1963 <varlistentry>
1964 <term><option>--version</option></term>
1965 <term><option>-V</option></term>
1966 <listitem>
1967 <para>
1968 This will display the version
1969 of LWAR.
1970 </para>
1971 </listitem>
1972 </varlistentry>
1973 </variablelist>
1974 </section>
1975
1976 </chapter>
1977
1978 <chapter id="objchap">
1979 <title>Object Files</title>
1980 <para>
1981 LWTOOLS uses a proprietary object file format. It is proprietary in the sense
1982 that it is specific to LWTOOLS, not that it is a hidden format. It would be
1983 hard to keep it hidden in an open source tool chain anyway. This chapter
1984 documents the object file format.
1985 </para>
1986
1987 <para>
1988 An object file consists of a series of sections each of which contains a
1989 list of exported symbols, a list of incomplete references, and a list of
1990 "local" symbols which may be used in calculating incomplete references. Each
1991 section will obviously also contain the object code.
1992 </para>
1993
1994 <para>
1995 Each exported symbol must be completely resolved to an address within the
1996 section it is exported from. That is, an exported symbol must be a constant
1997 rather than defined in terms of other symbols.</para>
1998
1999 <para>
2000 Each object file starts with a magic number and version number. The magic
2001 number is the string "LWOBJ16" for this 16 bit object file format. The only
2002 defined version number is currently 0. Thus, the first 8 bytes of the object
2003 file are <code>4C574F424A313600</code>.
2004 </para>
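<para>
As a small illustration of the header described above, the following Python
sketch checks the first 8 bytes of an object file. The function name
<code>check_header</code> is an assumption for this example and is not part
of LWTOOLS.
</para>
<programlisting>
```python
# "LWOBJ16" followed by the version byte 0, i.e. hex 4C574F424A313600.
MAGIC = b"LWOBJ16\x00"


def check_header(data):
    """Return the offset of the first section, or raise ValueError."""
    if data[:8] != MAGIC:
        raise ValueError("not an LWOBJ16 version 0 object file")
    return len(MAGIC)


first_section_offset = check_header(MAGIC + b"code\x00\x00")
```
</programlisting>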
2005
2006 <para>
2007 Each section has the following items in order:
2008 </para>
2009
2010 <itemizedlist>
2011 <listitem><para>section name</para></listitem>
2012 <listitem><para>flags</para></listitem>
2013 <listitem><para>list of local symbols (and addresses within the section)</para></listitem>
2014 <listitem><para>list of exported symbols (and addresses within the section)</para></listitem>
2015 <listitem><para>list of incomplete references along with the expressions to calculate them</para></listitem>
2016 <listitem><para>the actual object code (for non-BSS sections)</para></listitem>
2017 </itemizedlist>
2018
2019 <para>
2020 The section starts with the name of the section with a NUL termination
2021 followed by a series of flag bytes terminated by NUL. There are only two
2022 flag bytes defined. A NUL (0) indicates no more flags and a value of 1
2023 indicates the section is a BSS section. For a BSS section, no actual
2024 code is included in the object file.
2025 </para>
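<para>
The start of a section, as described above, could be read along the
following lines. This is a sketch under stated assumptions: the function
name <code>read_section_header</code> is hypothetical, names are assumed to
be ASCII, and unknown flag bytes are simply skipped.
</para>
<programlisting>
```python
FLAG_BSS = 0x01  # the only flag value defined besides the NUL terminator


def read_section_header(data, pos):
    """Read a NUL terminated section name and its flag bytes.

    Returns (name, is_bss, next_pos). An empty name means no more sections.
    """
    end = data.index(0, pos)
    name = data[pos:end].decode("ascii")
    pos = end + 1
    is_bss = False
    while data[pos] != 0:  # flag bytes run until a NUL byte
        if data[pos] == FLAG_BSS:
            is_bss = True
        pos += 1
    return name, is_bss, pos + 1  # skip the terminating NUL


name, is_bss, next_pos = read_section_header(b"bss\x00\x01\x00", 0)
```
</programlisting>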
2026
2027 <para>
2028 Either an empty section name or end of file indicates that there are no
2029 more sections.
2030 </para>
2031
2032 <para>
2033 Each entry in the exported and local symbols table consists of the symbol
2034 (NUL terminated) followed by two bytes which contain the value in big endian
2035 order. The end of a symbol table is indicated by a NULL symbol name.
2036 </para>
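<para>
The symbol table layout above can be decoded with a short loop. The
following Python sketch assumes ASCII symbol names, and the function name
<code>read_symbol_table</code> is hypothetical.
</para>
<programlisting>
```python
def read_symbol_table(data, pos):
    """Read entries until an empty symbol name is found.

    Each entry is a NUL terminated name followed by a 16 bit big endian
    value. Returns ({name: value}, next_pos).
    """
    symbols = {}
    while True:
        end = data.index(0, pos)
        name = data[pos:end].decode("ascii")
        pos = end + 1
        if name == "":
            return symbols, pos  # an empty name ends the table
        symbols[name] = int.from_bytes(data[pos:pos + 2], "big")
        pos += 2


table = b"start\x00\x20\x00loop\x00\x20\x10\x00"
symbols, next_pos = read_symbol_table(table, 0)
```
</programlisting>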
2037
2038 <para>
2039 Each entry in the incomplete references table consists of an expression
2040 followed by a 16 bit offset where the reference goes. Expressions are
2041 defined as a series of terms up to an "end of expression" term. Each term
2042 consists of a single byte which identifies the type of term (see below)
2043 followed by any data required by the term. The end of the list is flagged
2044 by a NULL expression (only an end of expression term).
2045 </para>
2046
2047 <table frame="all"><title>Object File Term Types</title>
2048 <tgroup cols="2">
2049 <thead>
2050 <row>
2051 <entry>TERMTYPE</entry>
2052 <entry>Meaning</entry>
2053 </row>
2054 </thead>
2055 <tbody>
2056 <row>
2057 <entry>00</entry>
2058 <entry>end of expression</entry>
2059 </row>
2060
2061 <row>
2062 <entry>01</entry>
2063 <entry>integer (16 bit in big endian order follows)</entry>
2064 </row>
2065 <row>
2066 <entry>02</entry>
2067 <entry> external symbol reference (NUL terminated symbol name follows)</entry>
2068 </row>
2069
2070 <row>
2071 <entry>03</entry>
2072 <entry>local symbol reference (NUL terminated symbol name follows)</entry>
2073 </row>
2074
2075 <row>
2076 <entry>04</entry>
2077 <entry>operator (1 byte operator number)</entry>
2078 </row>
2079 <row>
2080 <entry>05</entry>
2081 <entry>section base address reference</entry>
2082 </row>
2083
2084 <row>
2085 <entry>FF</entry>
2086 <entry>This term will set flags for the expression. Each one of these terms will set a single flag. All of them should be specified first in an expression. If they are not, the behaviour is undefined. The byte following is the flag. Flag 01 indicates an 8 bit relocation. Flag 02 indicates a zero-width relocation (see the EXTDEP pseudo op in LWASM).</entry>
2087 </row>
2088 </tbody>
2089 </tgroup>
2090 </table>
2091
2092
2093 <para>
2094 External references are resolved using other object files while local
2095 references are resolved using the local symbol table(s) from this file. This
2096 allows local symbols that are not exported to have the same names as
2097 exported symbols or external references.
2098 </para>
2099
2100 <table frame="all"><title>Object File Operator Numbers</title>
2101 <tgroup cols="2">
2102 <thead>
2103 <row>
2104 <entry>Number</entry>
2105 <entry>Operator</entry>
2106 </row>
2107 </thead>
2108 <tbody>
2109 <row>
2110 <entry>01</entry>
2111 <entry>addition (+)</entry>
2112 </row>
2113 <row>
2114 <entry>02</entry>
2115 <entry>subtraction (-)</entry>
2116 </row>
2117 <row>
2118 <entry>03</entry>
2119 <entry>multiplication (*)</entry>
2120 </row>
2121 <row>
2122 <entry>04</entry>
2123 <entry>division (/)</entry>
2124 </row>
2125 <row>
2126 <entry>05</entry>
2127 <entry>modulus (%)</entry>
2128 </row>
2129 <row>
2130 <entry>06</entry>
2131 <entry>integer division (\) (same as division)</entry>
2132 </row>
2133
2134 <row>
2135 <entry>07</entry>
2136 <entry>bitwise and</entry>
2137 </row>
2138
2139 <row>
2140 <entry>08</entry>
2141 <entry>bitwise or</entry>
2142 </row>
2143
2144 <row>
2145 <entry>09</entry>
2146 <entry>bitwise xor</entry>
2147 </row>
2148
2149 <row>
2150 <entry>0A</entry>
2151 <entry>boolean and</entry>
2152 </row>
2153
2154 <row>
2155 <entry>0B</entry>
2156 <entry>boolean or</entry>
2157 </row>
2158
2159 <row>
2160 <entry>0C</entry>
2161 <entry>unary negation, 2's complement (-)</entry>
2162 </row>
2163
2164 <row>
2165 <entry>0D</entry>
2166 <entry>unary 1's complement (^)</entry>
2167 </row>
2168 </tbody>
2169 </tgroup>
2170 </table>
2171
2172 <para>
2173 An expression is represented in a postfix manner with both operands for
2174 binary operators preceding the operator and the single operand for unary
2175 operators preceding the operator.
2176 </para>
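<para>
The postfix representation lends itself to evaluation with a stack. The
following Python sketch evaluates one encoded expression using the term
types and operator numbers from the tables above. It is an illustration
under assumptions, not LWLINK's evaluator: the function names, the use of
dictionaries for symbol lookup, the wrapping of results to 16 bits, and the
operand order for binary operators (the operand pushed first is the left
operand) are all choices made for this example.
</para>
<programlisting>
```python
import operator

WRAP = 0x10000  # assumption: results wrap to 16 bits

BINARY_OPS = {
    0x01: operator.add,
    0x02: operator.sub,
    0x03: operator.mul,
    0x04: operator.floordiv,
    0x05: operator.mod,
    0x06: operator.floordiv,  # integer division, same as division
    0x07: operator.and_,
    0x08: operator.or_,
    0x09: operator.xor,
    0x0A: lambda a, b: int(bool(a) and bool(b)),
    0x0B: lambda a, b: int(bool(a) or bool(b)),
}
UNARY_OPS = {
    0x0C: operator.neg,     # 2's complement negation
    0x0D: operator.invert,  # 1's complement
}


def read_name(data, pos):
    """Read a NUL terminated ASCII name; return (name, next_pos)."""
    end = data.index(0, pos)
    return data[pos:end].decode("ascii"), end + 1


def evaluate(data, externals, local_syms, section_base):
    """Evaluate one expression; return (value, flags, bytes_consumed)."""
    stack = []
    flags = []
    pos = 0
    while True:
        term = data[pos]
        pos += 1
        if term == 0x00:  # end of expression
            break
        if term == 0x01:  # 16 bit big endian integer
            stack.append(int.from_bytes(data[pos:pos + 2], "big"))
            pos += 2
        elif term == 0x02:  # external symbol reference
            name, pos = read_name(data, pos)
            stack.append(externals[name])
        elif term == 0x03:  # local symbol reference
            name, pos = read_name(data, pos)
            stack.append(local_syms[name])
        elif term == 0x04:  # operator number follows
            op = data[pos]
            pos += 1
            if op in UNARY_OPS:
                stack.append(UNARY_OPS[op](stack.pop()) % WRAP)
            else:
                right = stack.pop()
                left = stack.pop()
                stack.append(BINARY_OPS[op](left, right) % WRAP)
        elif term == 0x05:  # section base address reference
            stack.append(section_base)
        elif term == 0xFF:  # expression flag follows
            flags.append(data[pos])
            pos += 1
        else:
            raise ValueError("unknown term type")
    return stack.pop(), flags, pos


# Encodes "start + 2" where "start" is an external symbol.
expr = bytes([0x02]) + b"start\x00" + bytes([0x01, 0x00, 0x02, 0x04, 0x01, 0x00])
value, flags, used = evaluate(expr, {"start": 0x2000}, {}, 0)
```
</programlisting>
<para>
With "start" resolved to 2000 (hexadecimal), the example expression
evaluates to 2002 (hexadecimal) and sets no flags.
</para>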
2177
2178 </chapter>
2179 </book>
2180