view doc/manual.docbook.sgml @ 145:afe30454382f

Made development version of LWASM be 2.1, not 3.0, because the next release will be an incremental feature release
author lost
date Thu, 29 Jan 2009 06:13:00 +0000
parents f21a5593a661
children 6c0a30278982

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.5//EN">
<book>
<bookinfo>
<title>LW Tool Chain</title>
<author><firstname>William</firstname><surname>Astle</surname></author>
<copyright><year>2009</year><holder>William Astle</holder></copyright>
</bookinfo>
<chapter>

<title>Introduction</title>

<para>
The LW tool chain provides utilities for building binaries for MC6809 and
HD6309 CPUs. The tool chain includes a cross-assembler and a cross-linker
which support several styles of output.
</para>

<section>
<title>History</title>
<para>
For a long time, I have had an interest in creating an operating system for
the Coco3. I finally started working on that project around the beginning of
2006. I had a number of assemblers I could choose from. Eventually, I settled
on one and started tinkering. After a while, I realized that assembler was not
going to be sufficient due to lack of macros and issues with forward references.
Then I tried another which handled forward references correctly but still did
not support macros. I looked around at other assemblers and they all lacked
one feature or another that I really wanted for creating my operating system.
</para>

<para>
The solution seemed clear at that point. I am a fair programmer so I figured
I could write an assembler that would do everything I wanted an assembler to
do. Thus the LWASM project was born. After more than two years of on and off
work, version 1.0 of LWASM was released in October of 2008.
</para>

<para>
As the aforementioned operating system project progressed further, it became
clear that while assembling the whole project through a single file was doable,
it was not practical. When I found myself playing some fancy games with macros
in a bid to simulate sections, I realized I needed a means of assembling
source files separately and linking them later. This spawned a major development
effort to add object file support to LWASM. It also spawned the LWLINK
project to provide a means to actually link the files.
</para>

</section>

</chapter>

<chapter>
<title>Output Formats</title>

<para>
The LW tool chain supports multiple output formats. Each format has its
advantages and disadvantages. Each format is described below.
</para>

<section>
<title>Raw Binaries</title>
<para>
A raw binary is simply a string of bytes. There are no headers or other
niceties. Both LWLINK and LWASM support generating raw binaries. ORG directives
in the source code only serve to set the addresses that will be used for
symbols but otherwise have no direct impact on the resulting binary.
</para>

</section>
<section>
<title>DECB Binaries</title>

<para>DECB binaries are compatible with the LOADM command in Disk Extended
Color Basic on the CoCo. They are also compatible with CLOADM from Extended
Color Basic. These binaries include the load address of the binary as well
as encoding an execution address. These binaries may contain multiple loadable
sections, each of which has its own load address.</para>

<para>
Each binary starts with a preamble. Each preamble is five bytes long. The
first byte is zero. The next two bytes specify the number of bytes to load
and the last two bytes specify the address to load the bytes at. Then, a
string of bytes follows. After this string of bytes, there may be another
preamble or a postamble. A postamble is also five bytes in length. The first
byte of the postamble is $FF, the next two are zero, and the last two are
the execution address for the binary.
</para>
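<para>
As an illustration of this layout, consider a program that places three bytes
of code at $0E00 and uses the same address as its execution address. The
listing below is only a sketch: the byte values were worked out by hand from
the description above and the standard MC6809 opcodes rather than captured
from assembler output, and the symbol START is an arbitrary example name.
</para>
<programlisting>
        ORG     $0E00
START   LDA     #$01            ; assembles to 86 01
        RTS                     ; assembles to 39
        END     START           ; execution address $0E00
</programlisting>
<para>
Under those assumptions, the resulting DECB binary would consist of the
following thirteen bytes, shown in hexadecimal:
</para>
<programlisting>
00 00 03 0E 00      preamble: load three bytes at $0E00
86 01 39            the three bytes to load
FF 00 00 0E 00      postamble: execution address $0E00
</programlisting>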

<para>
Both LWASM and LWLINK can output this format.
</para>
</section>

<section>
<title>Object Files</title>
<para>LWASM supports generating a proprietary object file format which is
described in <xref linkend="objchap">. LWLINK is then used to link these
object files into a final binary in any of LWLINK's supported binary
formats.</para>

<para>Object files are very flexible in that they allow references that are not
known at assembly time to be resolved at link time. However, because the
addresses of such references are not known, the assembler
has to use sixteen bit addressing modes for these references. The linker
will always use sixteen bits when resolving a reference, which means any
instruction that requires an eight bit operand cannot use external references.
</para>

<para>Object files also support the concept of sections which are not valid
for other output types. This allows related code from each object file
linked to be collapsed together in the final binary.</para> 

</section>

</chapter>

<chapter>
<title>LWASM</title>
<para>
The LWTOOLS assembler is called LWASM. This chapter documents the various
features of the assembler. It is not, however, a tutorial on 6x09 assembly
language programming.
</para>

<section>
<title>Command Line Options</title>
<para>
The binary for LWASM is called "lwasm". Note that the binary is in lower
case. lwasm takes the following command line arguments.
</para>

<variablelist>
<varlistentry>
<term><option>--decb</option></term>
<term><option>-b</option></term>
<listitem>
<para>
Select the DECB output format target. Equivalent to <option>--format=decb</option>.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><option>--debug</option></term>
<term><option>-d</option></term>
<listitem>
<para>
Increase the debugging level. Only really useful to people hacking on the
LWASM source code itself.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><option>--format=type</option></term>
<term><option>-f type</option></term>
<listitem>
<para>
Select the output format. Valid values are <option>obj</option> for the object
file target, <option>decb</option> for the DECB LOADM format, and <option>raw</option>
for a raw binary.
</para>
</listitem>
</varlistentry>


<varlistentry>
<term><option>--list[=file]</option></term>
<term><option>-l[file]</option></term>
<listitem>
<para>
Cause LWASM to generate a listing. If <option>file</option> is specified,
the listing will go to that file. Otherwise it will go to the standard output
stream. By default, no listing is generated.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><option>--obj</option></term>
<listitem>
<para>
Select the proprietary object file format as the output target.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><option>--pragma=pragma</option></term>
<term><option>-p pragma</option></term>
<listitem>
<para>
Specify assembler pragmas. Multiple pragmas are separated by commas. The
pragmas accepted are the same as for the PRAGMA assembler directive described
below.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><option>--raw</option></term>
<term><option>-r</option></term>
<listitem>
<para>
Select raw binary as the output target.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><option>--help</option></term>
<term><option>-?</option></term>
<listitem>
<para>
Present a help screen describing the command line options.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><option>--usage</option></term>
<listitem>
<para>
Provide a summary of the command line options.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><option>--version</option></term>
<term><option>-V</option></term>
<listitem>
<para>
Display the software version.
</para>
</listitem>
</varlistentry>

</variablelist>

</section>

<section>
<title>Dialects</title>
<para>
LWASM supports all documented MC6809 instructions as defined by Motorola.
It also supports all known HD6309 instructions. There is some variation,
however, in the mnemonics used for the block transfer instructions. LWASM
uses TFM for all four of them as do several other assemblers. Others, such
as CCASM, use four separate opcodes for it (compare: copy+, copy-, implode,
and explode). There are advantages to both methods. However, it seems like
TFM has the most traction and thus, this is what LWASM supports. Support
for such variations may be added in the future.
</para>

<para>
The standard addressing mode specifiers are supported. These are the
hash sign ("#") for immediate mode, the less than sign ("&lt;") for forced
eight bit modes, and the greater than sign ("&gt;") for forced sixteen bit modes.
</para>
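<para>
A brief sketch of the three specifiers follows. The addresses and values are
arbitrary examples chosen for illustration.
</para>
<programlisting>
        LDA     #$10            ; immediate: load the constant $10 into A
        LDA     &lt;$10            ; forced eight bit address (direct page)
        LDA     &gt;$0010          ; forced sixteen bit address (extended)
</programlisting>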

</section>

<section>
<title>Source Format</title>

<para>
LWASM accepts plain text files in a relatively free form. It can handle
lines terminated with CR, LF, CRLF, or LFCR which means it should be able
to assemble files on any platform on which it compiles.
</para>
<para>
Each line may start with a symbol. If a symbol is present, there must not
be any whitespace preceding it. It is legal for a line to contain nothing
but a symbol.</para>
<para>
The opcode is separated from the symbol by whitespace. If there is
no symbol, there must be at least one whitespace character preceding the opcode.
If applicable, the operand follows, separated by whitespace. Following the
opcode and operand is an optional comment.
</para>
<para>
A comment can also be introduced with a * or a ;. The comment character is
optional for end of statement comments. However, if a symbol is the only
thing present on the line other than the comment, the comment character is
mandatory to prevent the assembler from interpreting the comment as an opcode.
</para>

<para>
Opcodes are not case sensitive. Neither are register names in
the operand fields. Symbols, however, are case sensitive.
</para>
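<para>
The following fragment is a small sketch of these rules, using made-up symbol
names and addresses. Symbols start in the first column, opcodes are preceded
by whitespace, and comments follow the operand. Because symbols are case
sensitive, START and start would be two different symbols.
</para>
<programlisting>
START   LDA     #1      comment text needs no comment character here
        sta     $0400   ; opcodes may be written in either case
        ldb     #2      * an asterisk also introduces a comment
WAIT                    ; a symbol alone on a line needs the comment character
        BRA     WAIT
</programlisting>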

<para>
LWASM does not support line numbers in the file.
</para>

</section>

<section>
<title>Symbols</title>

<para>
Symbols have no length restriction. They may contain letters, numbers, dots,
dollar signs, and underscores. They must start with a letter, dot, or
underscore.
</para>

<para>
LWASM also supports the concept of a local symbol. A local symbol is one
which contains either a "?" or a "@", which can appear anywhere in the symbol.
The scope of a local symbol is determined by a number of factors. First,
each included file gets its own local symbol scope. A blank line will also
be considered a local scope barrier. Macros each have their own local symbol
scope as well (which has a side effect that you cannot use a local symbol
as an argument to a macro). There are other factors as well. In general,
a local symbol is restricted to the block of code it is defined within.
</para>
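<para>
As a sketch of how this might look in practice, the two routines below each
use a local symbol named loop@. If the rules above are read literally, the
blank line between the routines acts as a scope barrier, so the two
definitions do not collide. The routine names and loop counts are arbitrary.
</para>
<programlisting>
DELAY   LDA     #100
loop@   DECA
        BNE     loop@           ; refers to the loop@ defined just above
        RTS

CLEAR   LDA     #32
loop@   CLR     ,X+
        DECA
        BNE     loop@           ; refers to the second loop@
        RTS
</programlisting>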

</section>

<section>
<title>Numbers and Expressions</title>
<para>
Numbers can be expressed in binary, octal, decimal, or hexadecimal.
Binary numbers may be prefixed with a "%" symbol or suffixed with a
"b" or "B". Octal numbers may be prefixed with "@" or suffixed with
"Q", "q", "O", or "o". Hexadecimal numbers may be prefixed with "$" or
suffixed with "H". No prefix or suffix is required for decimal numbers but
they can be prefixed with "&amp;" if desired. Any constant which begins with
a letter must be expressed with the correct prefix base identifier or be
prefixed with a 0. Thus hexadecimal FF would have to be written either 0FFH
or $FF. Numbers are not case sensitive.
</para>
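<para>
To illustrate the notations described above, each of the following lines
should emit the same byte, decimal 255. The FCB directive is used here only as
a convenient way to place a constant in the output.
</para>
<programlisting>
        FCB     %11111111       ; binary with prefix
        FCB     11111111B       ; binary with suffix
        FCB     @377            ; octal with prefix
        FCB     377Q            ; octal with suffix
        FCB     255             ; decimal
        FCB     &amp;255            ; decimal with the optional prefix
        FCB     $FF             ; hexadecimal with prefix
        FCB     0FFH            ; hexadecimal with suffix and leading 0
</programlisting>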

<para> A symbol may appear at any point where a number is acceptable. The
special symbol "*" can be used to represent the starting address of the
current source line within expressions. </para>

<para>The ASCII value of a character can be included by prefixing it with a
single quote ('). The ASCII values of two characters can be included by
prefixing the pair of characters with a double quote (").</para>

<para>
LWASM supports the following basic binary operators: +, -, *, /, and %.
These represent addition, subtraction, multiplication, division, and modulus.
It also supports unary negation and unary 1's complement (- and ^ respectively).
For completeness, a unary positive (+) is supported though it is a no-op.
</para>

<para>Operator precedence follows the usual rules. Multiplication, division,
and modulus take precedence over addition and subtraction. Unary operators
take precedence over binary operators. To force a specific order of evaluation,
parentheses can be used in the usual manner.
</para>
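<para>
The fragment below sketches how these rules combine. The symbol names are
invented for the example, and the values in the comments were computed by hand
from the precedence rules described above.
</para>
<programlisting>
WIDTH   EQU     32
HEIGHT  EQU     16
SIZE    EQU     WIDTH*HEIGHT    ; 512
MIXED   EQU     2+3*4           ; 14, multiplication is done first
GROUP   EQU     (2+3)*4         ; 20, parentheses force the addition first
NEG     EQU     -WIDTH+1        ; -31, unary negation is applied first
MSG     FCC     /HELLO/
MSGLEN  EQU     *-MSG           ; 5, since * is the address of this line
        LDA     #'A             ; ASCII value of the character A
        LDD     #"AB            ; ASCII values of the characters A and B
</programlisting>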
</section>

<section>
<title>Assembler Directives</title>
<para>
Various directives can be used to control the behaviour of the
assembler or to include non-code/data in the resulting output. Those directives
that are not described in detail in other sections of this document are
described below.
</para>

<section>
<title>Data Directives</title>
<variablelist>
<varlistentry><term>FCB <parameter>expr[,...]</parameter></term>
<listitem>
<para>Include one or more constant bytes (separated by commas) in the output.</para>
</listitem>
</varlistentry>

<varlistentry><term>FDB <parameter>expr[,...]</parameter></term>
<listitem>
<para>Include one or more words (separated by commas) in the output.</para>
</listitem>
</varlistentry>

<varlistentry><term>FQB <parameter>expr[,...]</parameter></term>
<listitem>
<para>Include one or more double words (separated by commas) in the output.</para>
</listitem>
</varlistentry>

<varlistentry><term>FCC <parameter>string</parameter></term>
<listitem>
<para>
Include a string of text in the output. The first character of the operand
is the delimiter which must appear as the last character and cannot appear
within the string. The string is included with no modifications.
</para>
</listitem>
</varlistentry>

<varlistentry><term>FCN <parameter>string</parameter></term>
<listitem>
<para>
Include a NUL terminated string of text in the output. The first character of
the operand is the delimiter which must appear as the last character and
cannot appear within the string. A NUL byte is automatically appended to
the string.
</para>
</listitem>
</varlistentry>

<varlistentry><term>FCS <parameter>string</parameter></term>
<listitem>
<para>
Include a string of text in the output with bit 7 of the final byte set. The
first character of the operand is the delimiter which must appear as the last
character and cannot appear within the string.
</para>
</listitem>
</varlistentry>

<varlistentry><term>ZMB <parameter>expr</parameter></term>
<listitem>
<para>
Include a number of NUL bytes in the output. The number must be fully resolvable
during pass 1 of assembly so no forward or external references are permitted.
</para>
</listitem>
</varlistentry>

<varlistentry><term>ZMD <parameter>expr</parameter></term>
<listitem>
<para>
Include a number of zero words in the output. The number must be fully
resolvable during pass 1 of assembly so no forward or external references are
permitted.
</para>
</listitem>
</varlistentry>

<varlistentry><term>ZMQ <parameter>expr</parameter></term>
<listitem>
<para>
Include a number of zero double-words in the output. The number must be fully
resolvable during pass 1 of assembly so no forward or external references are
permitted.
</para>
</listitem>
</varlistentry>

<varlistentry><term>RMB <parameter>expr</parameter></term>
<listitem>
<para>
Reserve a number of bytes in the output. The number must be fully resolvable
during pass 1 of assembly so no forward or external references are permitted.
The value of the bytes is undefined.
</para>
</listitem>
</varlistentry>

<varlistentry><term>RMD <parameter>expr</parameter></term>
<listitem>
<para>
Reserve a number of words in the output. The number must be fully
resolvable during pass 1 of assembly so no forward or external references are
permitted. The value of the words is undefined.
</para>
</listitem>
</varlistentry>

<varlistentry><term>RMQ <parameter>expr</parameter></term>
<listitem>
<para>
Reserve a number of double-words in the output. The number must be fully
resolvable during pass 1 of assembly so no forward or external references are
permitted. The value of the double-words is undefined.
</para>
</listitem>
</varlistentry>
</variablelist>
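<para>
A short sketch of several of these directives follows. The byte values in the
comments are derived by hand from the descriptions above and the ASCII table.
The slash is used as the string delimiter only as an example, and BUFFER is an
arbitrary symbol name.
</para>
<programlisting>
        FCB     1,2,3           ; three bytes: 01 02 03
        FDB     $1234,$5678     ; two words
        FQB     $12345678       ; one double word
        FCC     /HELLO/         ; 48 45 4C 4C 4F
        FCN     /HELLO/         ; 48 45 4C 4C 4F 00
        FCS     /HELLO/         ; 48 45 4C 4C CF, bit 7 set on the final byte
        ZMB     4               ; four NUL bytes
BUFFER  RMB     256             ; reserve 256 bytes, contents undefined
</programlisting>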

</section>

<section>
<title>Address Definition</title>
<para>The directives in this section all control the addresses of symbols
or the assembly process itself.</para>

<variablelist>
<varlistentry><term>ORG <parameter>expr</parameter></term>
<listitem>
<para>Set the assembly address. The address must be fully resolvable on the
first pass so no external or forward references are permitted. ORG is not
permitted within sections when outputting to object files. For the DECB
target, each ORG directive after which output is generated will cause
a new preamble to be output. ORG is only used to determine the addresses
of symbols when the raw target is used.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><parameter>sym</parameter> EQU <parameter>expr</parameter></term>
<term><parameter>sym</parameter> = <parameter>expr</parameter></term>
<listitem>
<para>Define the value of <parameter>sym</parameter> to be <parameter>expr</parameter>.</para>
</listitem>
</varlistentry>

<varlistentry>
<term><parameter>sym</parameter> SET <parameter>expr</parameter></term>
<listitem>
<para>Define the value of <parameter>sym</parameter> to be <parameter>expr</parameter>.
Unlike EQU, SET permits symbols to be defined multiple times as long as SET
is used for all instances. Use of the symbol before the first SET statement
that sets its value is undefined.</para>
</listitem>
</varlistentry>

<varlistentry>
<term>SETDP <parameter>expr</parameter></term>
<listitem>
<para>Inform the assembler that it can assume the DP register contains
<parameter>expr</parameter>. This directive is only advice to the assembler
to determine whether an address is in the direct page and has no effect
on the contents of the DP register. The value must be fully resolved during
the first assembly pass because it affects the sizes of subsequent instructions.
</para>
<para>This directive has no effect in the object file target.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>ALIGN <parameter>expr</parameter></term>
<listitem>
<para>Force the current assembly address to be a multiple of <parameter>expr</parameter>.
A series of NUL bytes is output to force the alignment, if required. The
alignment value must be fully resolved on the first pass because it affects
the addresses of subsequent instructions.</para>
<para>This directive is not suitable for inclusion in the middle of actual
code. It is intended to appear where the bytes output will not be executed.
</para>
</listitem>
</varlistentry>

</variablelist>
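<para>
The following sketch shows these directives working together for a raw or
DECB target. All names and addresses are illustrative. Note that SETDP only
informs the assembler. The program itself still has to load the DP register.
</para>
<programlisting>
SCREEN  EQU     $0400           ; fixed value, cannot be redefined
COUNT   SET     0               ; SET symbols may be redefined
COUNT   SET     COUNT+1         ; as long as SET is used every time

        ORG     $2000           ; assemble from address $2000
        SETDP   $20             ; tell the assembler that DP will hold $20
        LDA     #$20
        TFR     A,DP            ; actually load DP at run time
        LDA     $2010           ; the assembler may now use direct addressing
        RTS
        ALIGN   256             ; pad with NUL bytes up to $2100
TABLE   FCB     1,2,3           ; TABLE is at $2100
</programlisting>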

</section>

<section>
<title>Conditional Assembly</title>
<para>
Portions of the source code can be excluded or included based on conditions
known at assembly time. Conditionals can be nested arbitrarily deeply. The
directives associated with conditional assembly are described in this section.
</para>
<para>All conditionals must be fully bracketed. That is, every conditional
statement must eventually be followed by an ENDC at the same level of nesting.
</para>
<para>Conditional expressions are only evaluated on the first assembly pass.
It is not possible to game the assembly process by having a conditional
change its value between assembly passes. Thus there is not and never will
be any equivalent of IFP1 or IFP2 as provided by other assemblers.</para>

<variablelist>
<varlistentry>
<term>IFEQ <parameter>expr</parameter></term>
<listitem>
<para>If <parameter>expr</parameter> evaluates to zero, the conditional
will be considered true.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>IFNE <parameter>expr</parameter></term>
<term>IF <parameter>expr</parameter></term>
<listitem>
<para>If <parameter>expr</parameter> evaluates to a non-zero value, the conditional
will be considered true.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>IFGT <parameter>expr</parameter></term>
<listitem>
<para>If <parameter>expr</parameter> evaluates to a value greater than zero, the conditional
will be considered true.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>IFGE <parameter>expr</parameter></term>
<listitem>
<para>If <parameter>expr</parameter> evaluates to a value greater than or equal to zero, the conditional
will be considered true.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>IFLT <parameter>expr</parameter></term>
<listitem>
<para>If <parameter>expr</parameter> evaluates to a value less than zero, the conditional
will be considered true.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>IFLE <parameter>expr</parameter></term>
<listitem>
<para>If <parameter>expr</parameter> evaluates to a value less than or equal to zero, the conditional
will be considered true.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>IFDEF <parameter>sym</parameter></term>
<listitem>
<para>If <parameter>sym</parameter> is defined at this point in the assembly
process, the conditional
will be considered true.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>IFNDEF <parameter>sym</parameter></term>
<listitem>
<para>If <parameter>sym</parameter> is not defined at this point in the assembly
process, the conditional
will be considered true.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>ELSE</term>
<listitem>
<para>
If the preceding conditional at the same level of nesting was false, the
statements following will be assembled. If the preceding conditional at
the same level was true, the statements following will not be assembled.
Note that the preceding conditional might have been another ELSE statement
although this behaviour is not guaranteed to be supported in future versions
of LWASM.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>ENDC</term>
<listitem>
<para>
This directive marks the end of a conditional construct. Every conditional
construct must end with an ENDC directive.
</para>
</listitem>
</varlistentry>

</variablelist>
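<para>
As a sketch of typical usage, the fragment below selects between two code
paths and supplies a default value for a symbol. DEBUG, BUFSIZE, and TRACE are
hypothetical names, and TRACE is assumed to be a routine defined elsewhere in
the program.
</para>
<programlisting>
DEBUG   EQU     1

        IFNE    DEBUG
        JSR     TRACE           ; assembled only when DEBUG is non-zero
        ELSE
        NOP                     ; assembled only when DEBUG is zero
        ENDC

        IFNDEF  BUFSIZE
BUFSIZE EQU     128             ; default used when BUFSIZE is not yet defined
        ENDC
</programlisting>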
</section>

<section>
<title>Miscellaneous Directives</title>

<para>This section includes directives that do not fit into the other
categories.</para>

<variablelist>

<varlistentry>
<term>INCLUDE <parameter>filename</parameter></term>
<listitem>
<para>
Include the contents of <parameter>filename</parameter> at this point in
the assembly as though it were a part of the file currently being processed.
Note that whitespace cannot appear in the name of the file.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>END <parameter>[expr]</parameter></term>
<listitem>
<para>
This directive causes the assembler to stop assembling immediately as though
it ran out of input. For the DECB target only, <parameter>expr</parameter>
can be used to set the execution address of the resulting binary. For all
other targets, specifying <parameter>expr</parameter> will cause an error.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>ERROR <parameter>string</parameter></term>
<listitem>
<para>
Causes a custom error message to be printed at this line. This will cause
assembly to fail. This directive is most useful inside conditional constructs
to cause assembly to fail if some condition that is known bad happens.
</para>
</listitem>
</varlistentry>

</variablelist>
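<para>
The sketch below combines these directives for the DECB target. The file name
defs.asm and the symbol STACKTOP are hypothetical. The text above does not say
whether the ERROR string needs to be quoted, so the unquoted form shown here
is an assumption.
</para>
<programlisting>
        INCLUDE defs.asm        ; file name must not contain whitespace

        IFNDEF  STACKTOP
        ERROR   STACKTOP must be defined in defs.asm
        ENDC

        ORG     $2000
START   LDS     #STACKTOP
        RTS
        END     START           ; DECB target only: execution address
</programlisting>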
</section>

</section>

<section>
<title>Macros</title>
<para>
LWASM is a macro assembler. A macro is simply a name that stands in for a
series of instructions. Once a macro is defined, it is used like any other
assembler directive. Defining a macro can be considered equivalent to adding
additional assembler directives.
</para>
<para>Macros may accept parameters. These parameters are referenced within
a macro by a backslash ("\") followed by a digit 1 through 9 for the first
through ninth parameters. They may also be referenced by enclosing the
decimal parameter number in braces ("{num}"). These parameter references
are replaced with the verbatim text of the parameter passed to the macro. A
reference to a non-existent parameter will be replaced by an empty string.
Macro parameters are expanded everywhere on each source line. That means
the parameter to a macro could be used as a symbol or it could even appear
in a comment or could cause an entire source line to be commented out
when the macro is expanded.
</para>
<para>
Parameters passed to a macro are separated by commas and the parameter list
is terminated by any whitespace. This means that neither a comma nor whitespace
may be included in a macro parameter.
</para>
<para>
Macro expansion is done recursively. That is, within a macro, macros are
expanded. This can lead to infinite loops in macro expansion. If the assembler
hangs for a long time while assembling a file that uses macros, this may be
the reason.</para>

<para>Each macro expansion receives its own local symbol context which is not
inherited by any macros called by it nor is it inherited from the context
the macro was instantiated in. That means it is possible to use local symbols
within macros without having them collide with symbols in other macros or
outside the macro itself. However, this also means that using a local symbol
as a parameter to a macro, while legal, will not do what it would seem to do
as it will result in looking up the local symbol in the macro's symbol context
rather than the enclosing context where it came from, likely yielding either
an undefined symbol error or bizarre assembly results.
</para>
<para>
Note that there is no way to define a macro as local to a symbol context. All
macros are part of the global macro namespace. However, macros have a separate
namespace from symbols so it is possible to have a symbol with the same name
as a macro.
</para>

<para>
Macros are defined only during the first pass. Macro expansion also
only occurs during the first pass. On the second pass, the macro
definition is simply ignored. Macros must be defined before they are used.
</para>

<para>The following directives are used when defining macros.</para>

<variablelist>
<varlistentry>
<term><parameter>macroname</parameter> MACRO</term>
<listitem>
<para>This directive is used to begin the definition of a macro called
<parameter>macroname</parameter>. If <parameter>macroname</parameter> already
exists, it is considered an error. Attempting to define a macro within a
macro is undefined. It may work and it may not so the behaviour should not
be relied upon.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>ENDM</term>
<listitem>
<para>
This directive indicates the end of the macro currently being defined. It
causes the assembler to resume interpreting source lines as normal.
</para>
</listitem>
</varlistentry>
</variablelist>
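<para>
The following sketch defines two macros and then uses them. The macro names
WAIT and ADDM and the symbols TOTAL and DELTA are invented for the example.
TOTAL and DELTA are assumed to be defined elsewhere. Since each expansion
receives its own local symbol context, the local symbol loop@ should not
collide between the two uses of WAIT.
</para>
<programlisting>
WAIT    MACRO
        LDA     #\1             ; first parameter: loop count
loop@   DECA
        BNE     loop@
        ENDM

ADDM    MACRO
        LDA     \1              ; first parameter, backslash form
        ADDA    {2}             ; second parameter, brace form
        STA     \1
        ENDM

        WAIT    100
        ADDM    TOTAL,DELTA
        WAIT    50
</programlisting>
<para>
With the substitution rules described above, the ADDM line would expand to
LDA TOTAL, ADDA DELTA, and STA TOTAL.
</para>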

</section>

<section>
<title>Object Files and Sections</title>
<para>
The object file target is very useful for large projects because it allows
multiple files to be assembled independently and then linked into the final
binary at a later time. It allows only the small portion of the project
that was modified to be re-assembled rather than requiring the entire set
of source code to be available to the assembler in a single assembly process.
This can be particularly important if there are a large number of macros,
symbol definitions, or other metadata that uses resources at assembly time.
By far the largest benefit, however, is keeping the source files small enough
for a mere mortal to find things in them.
</para>

<para>
With multi-file projects, there needs to be a means of resolving references to
symbols in other source files. These are known as external references. The
addresses of these symbols cannot be known until the linker joins all the
object files into a single binary. This means that the assembler must be
able to output the object code without knowing the value of the symbol. This
places some restrictions on the code generated by the assembler. For
example, the assembler cannot generate direct page addressing for instructions
that reference external symbols because the address of the symbol may not
be in the direct page. Similarly, relative branches and PC relative addressing
cannot be used in their eight bit forms. Everything that must be resolved
by the linker must be assembled to use the largest address size possible to
allow the linker to fill in the correct value at link time. Note that the
same problem applies to absolute address references as well, even those in
the same source file, because the address is not known until link time.
</para>

<para>
It is often desired in multi-file projects to have code of various types grouped
together in the final binary generated by the linker as well. The same applies
to data. In order for the linker to do that, the bits that are to be grouped
must be tagged in some manner. This is where the concept of sections comes in.
Each chunk of code or data is part of a section in the object file. Then,
when the linker reads all the object files, it coalesces all sections of the
same name into a single section and then considers it as a unit.
</para>

<para>
The existence of sections, however, raises a problem for symbols even
within the same source file. Thus, the assembler must treat symbols from
different sections within the same source file in the same manner as external
symbols. That is, it must leave them for the linker to resolve at link time,
with all the limitations that entails.
</para>

<para>
In the object file target mode, LWASM requires all source lines that
cause bytes to be output to be inside a section. Any directives that do
not cause any bytes to be output can appear outside of a section. This includes
such things as EQU or RMB. Even ORG can appear outside a section. ORG, however,
makes no sense within a section because it is the linker that determines
the starting address of the section's code, not the assembler.
</para>

<para>
All symbols defined globally in the assembly process are local to the 
source file and cannot be exported. All symbols defined within a section are
considered local to the source file unless otherwise explicitly exported.
Symbols referenced from external source files must be declared external,
either explicitly or by asking the assembler to assume that all undefined
symbols are external.
</para>

<para>
It is often handy to define a number of memory addresses that will be
used for data at run-time but which need not be included in the binary file.
These memory addresses are not initialized until run-time, either by the
program itself or by the program loader, depending on the operating environment.
Such sections are often known as BSS sections. LWASM supports generating
sections with a BSS attribute set. For such a section, the object file contains
the section definition, including symbols exported from that section and those
symbols required to resolve references from the local file, but no actual code.
It is illegal for any source lines within a BSS flagged section to cause any
bytes to be output.
</para>

<para>The following directives apply to section handling.</para>

<variablelist>
<varlistentry>
<term>SECTION <parameter>name[,flags]</parameter></term>
<term>SECT <parameter>name[,flags]</parameter></term>
<listitem>
<para>
Instructs the assembler that the code following this directive is to be
considered part of the section <parameter>name</parameter>. A section name
may appear multiple times in which case it is as though all the code from
all the instances of that section appeared adjacent within the source file.
However, <parameter>flags</parameter> may only be specified on the first
instance of the section.
</para>
<para>There is a single flag supported in <parameter>flags</parameter>. The
flag <parameter>bss</parameter> will cause the section to be treated as a BSS
section and, thus, no code will be included in the object file nor will any
bytes be permitted to be output.</para>
<para>
If assembly is already happening within a section, the section is implicitly
ended and the new section started. This is not considered an error although
it is recommended that all sections be explicitly closed.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>ENDSECTION</term>
<term>ENDSECT</term>
<term>ENDS</term>
<listitem>
<para>
This directive ends the current section. This puts assembly outside of any
sections until the next SECTION directive.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><parameter>sym</parameter> EXTERN</term>
<term><parameter>sym</parameter> EXTERNAL</term>
<term><parameter>sym</parameter> IMPORT</term>
<listitem>
<para>
This directive defines <parameter>sym</parameter> as an external symbol.
This directive may occur at any point in the source code. EXTERN definitions
are resolved on the first pass so an EXTERN definition anywhere in the
source file is valid for the entire file. The use of this directive is
optional when the assembler is instructed to assume that all undefined
symbols are external. In fact, in that mode, if the symbol is referenced
before the EXTERN directive, an error will occur.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><parameter>sym</parameter> EXPORT</term>
<listitem>
<para>
This directive defines <parameter>sym</parameter> as an exported symbol.
This directive may occur at any point in the source code, even before the
definition of the exported symbol.
</para>
</listitem>
</varlistentry>

</variablelist>
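
<para>
The import and export directives might be combined as in the following sketch.
The symbol and section names are hypothetical: PUTCHR is assumed to be defined
in some other object file, and MAIN is made visible to other object files.
Note that the EXPORT line appears before the definition of MAIN, which the
description above permits.
</para>
<programlisting>
* Hypothetical names; PUTCHR is assumed to live in another object file.
        SECTION code
PUTCHR  EXTERN
MAIN    EXPORT
MAIN    LDA     #$41
        JSR     PUTCHR
        RTS
        ENDSECTION
</programlisting>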

</section>

<section>
<title>Assembler Modes and Pragmas</title>
<para>
There are a number of options that affect the way assembly is performed.
Some of these options can only be specified on the command line because
they determine something absolute about the assembly process, such as the
output target. Other options can be switched during the assembly process.
These are known as pragmas and are, by definition, not portable between
assemblers.
</para>

<para>LWASM supports a number of pragmas that affect code generation or
otherwise affect the behaviour of the assembler. These may be specified by
way of a command line option or by assembler directives. The directives
are as follows.
</para>

<variablelist>
<varlistentry>
<term>PRAGMA <parameter>pragma[,...]</parameter></term>
<listitem>
<para>
Specifies that the assembler should bring into force all <parameter>pragma</parameter>s
specified. Any unrecognized pragma will cause an assembly error. The new
pragmas will take effect immediately. This directive should be used when
the program will assemble incorrectly if the pragma is ignored or not supported.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term>*PRAGMA <parameter>pragma[,...]</parameter></term>
<listitem>
<para>
This is identical to the PRAGMA directive except no error will occur with
unrecognized or unsupported pragmas. This directive, by virtue of starting
with a comment character, will also be ignored by assemblers that do not
support this directive. Use this variation if the pragma is not required
for correct functioning of the code.
</para>
</listitem>
</varlistentry>
</variablelist>

<para>Each pragma supported has a positive version and a negative version.
The positive version enables the pragma while the negative version disables
it. The negative version is simply the positive version with "no" prefixed
to it. For instance, "pragma" vs. "nopragma". Only the positive version is
listed below.</para>

<para>Pragmas are not case sensitive.</para>
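
<para>
A brief sketch of the two directive forms follows. It uses only the pragma
names documented in this section; the particular combinations are chosen
purely for illustration.
</para>
<programlisting>
* Required form: an unrecognized pragma is an assembly error.
        PRAGMA  undefextern

* Several pragmas at once, here in their negative ("no") versions.
        PRAGMA  noindex0tonone,noundefextern

* Optional form: no error if the pragma is unrecognized, and the
* whole line is a comment to assemblers without this directive.
*PRAGMA index0tonone
</programlisting>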

<variablelist>
<varlistentry>
<term>index0tonone</term>
<listitem>
<para>
When in force, this pragma enables an optimization affecting indexed addressing
modes. When the offset expression in an indexed mode evaluates to zero but is
not explicitly written as 0, the assembler replaces the operand with the
equivalent no-offset mode, thus creating slightly faster code. Because of the
advantages of this optimization, it is enabled by default.
</para>
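<para>
As an illustration, consider the following sketch, in which OFFSET is a
hypothetical symbol that happens to evaluate to zero. The comments describe
the behaviour implied by the description above and are not taken from actual
assembler output.
</para>
<programlisting>
OFFSET  EQU     0

* The offset evaluates to zero but is not written as 0, so with
* index0tonone in force this is assembled as though it were LDA ,X
        LDA     OFFSET,X

* The offset is written explicitly as 0, so the pragma does not apply.
        LDA     0,X
</programlisting>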
</listitem>
</varlistentry>

<varlistentry>
<term>undefextern</term>
<listitem>
<para>
This pragma is only valid for targets that support external references. When in
force, if the assembler sees an undefined symbol on the second pass, it will
automatically define it as an external symbol. This automatic definition will
apply for the remainder of the assembly process, even if the pragma is
subsequently turned off. Because this behaviour would be potentially surprising,
this pragma defaults to off.
</para>
<para>
The primary use for this pragma is for projects that share a large number of
symbols between source files. In such cases, it is impractical to enumerate
all the external references in every source file. This allows the assembler
and linker to do the heavy lifting while not preventing a particular source
module from defining a local symbol of the same name as an external symbol
if it does not need the external symbol. (This pragma will not cause an
automatic external definition if there is already a locally defined symbol.)
</para>
<para>
This pragma will often be specified on the command line for large projects.
However, depending on the specific dynamics of the project, it may be sufficient
for one or two files to use this pragma internally.
</para>
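<para>
A minimal sketch of a file using this pragma internally follows. It assumes
an output target that supports external references. PUTCHR is a hypothetical
symbol that is neither defined nor declared with EXTERN anywhere in the file,
so it would be treated as external automatically.
</para>
<programlisting>
        PRAGMA  undefextern

        SECTION code
* PUTCHR is not defined in this file and has no EXTERN declaration.
START   JSR     PUTCHR
        RTS
        ENDSECTION
</programlisting>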
</listitem>
</varlistentry>
</variablelist>

</section>

</chapter>

<chapter>
<title>LWLINK</title>
<para>
</para>
</chapter>

<chapter id="objchap">
<title>Object Files</title>
<para>
LWTOOLS uses a proprietary object file format. It is proprietary in the sense
that it is specific to LWTOOLS, not that it is a hidden format. It would be
hard to keep it hidden in an open source tool chain anyway. This chapter
documents the object file format.
</para>
</chapter>
</book>