# HG changeset patch
# User lost
# Date 1233209580 0
# Node ID afe30454382fd83aac90fcc409aa99b68fb9fbc0
# Parent 006d737756fdc2530aaa5fcb37001d2163545179
Made development version of LWASM be 2.1, not 3.0, because the next release will be an incremental feature release
diff -r 006d737756fd -r afe30454382f configure.ac
--- a/configure.ac Thu Jan 29 06:12:21 2009 +0000
+++ b/configure.ac Thu Jan 29 06:13:00 2009 +0000
@@ -1,4 +1,4 @@
-AC_INIT([LWTOOLS], [3.0], [lost@l-w.ca])
+AC_INIT([LWTOOLS], [2.1], [lost@l-w.ca])
AM_INIT_AUTOMAKE([-Wall -Werror foreign])
AC_PROG_CC
AC_CONFIG_HEADERS([src/config.h])
diff -r 006d737756fd -r afe30454382f doc/manual.docbook.sgml
--- a/doc/manual.docbook.sgml Thu Jan 29 06:12:21 2009 +0000
+++ b/doc/manual.docbook.sgml Thu Jan 29 06:13:00 2009 +0000
@@ -114,9 +114,944 @@
+
+LWASM
+
+The LWTOOLS assembler is called LWASM. This chapter documents the various
+features of the assembler. It is not, however, a tutorial on 6x09 assembly
+language programming.
+
+
+
+Command Line Options
+
+The binary for LWASM is called "lwasm". Note that the binary is in lower
+case. lwasm takes the following command line arguments.
+
+
+
+
+
+
+
+
+Select the DECB output format target. Equivalent to .
+
+
+
+
+
+
+
+
+
+Increase the debugging level. Only really useful to people hacking on the
+LWASM source code itself.
+
+
+
+
+
+
+
+
+
+Select the output format. Valid values are for the object
+file target, for the DECB LOADM format, and
+for a raw binary.
+
+
+
+
+
+
+
+
+
+
+Cause LWASM to generate a listing. If is specified,
+the listing will go to that file. Otherwise it will go to the standard output
+stream. By default, no listing is generated.
+
+
+
+
+
+
+
+
+Select the proprietary object file format as the output target.
+
+
+
+
+
+
+
+
+
+Specify assembler pragmas. Multiple pragmas are separated by commas. The
+pragmas accepted are the same as for the PRAGMA assembler directive described
+below.
+
+
+
+
+
+
+
+
+
+Select raw binary as the output target.
+
+
+
+
+
+
+
+
+
+Present a help screen describing the command line options.
+
+
+
+
+
+
+
+
+Provide a summary of the command line options.
+
+
+
+
+
+
+
+
+
+Display the software version.
+
+
+
+
+
+
+
+
+
+Dialects
+
+LWASM supports all documented MC6809 instructions as defined by Motorola.
+It also supports all known HD6309 instructions. There is some variation,
+however, in the mnemonics used for the block transfer instructions. LWASM
+uses TFM for all four of them, as do several other assemblers. Others, such
+as CCASM, use four separate opcodes (compare: copy+, copy-, implode,
+and explode). There are advantages to both methods. However, TFM appears to
+have the most traction, so this is what LWASM supports. Support
+for such variations may be added in the future.
+
+
+
+The standard addressing mode specifiers are supported. These are the
+hash sign ("#") for immediate mode, the less than sign ("<") for forced
+eight bit modes, and the greater than sign (">") for forced sixteen bit modes.
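+
+As a brief illustration of these specifiers, the following sketch loads the A
+register three ways. The addresses are arbitrary values chosen for the example.
+
+```asm
+        lda     #$10            ; immediate: load the constant $10
+        lda     <$10            ; forced eight bit (direct page) address
+        lda     >$0010          ; forced sixteen bit (extended) address
+```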
+
+
+
+
+
+Source Format
+
+
+LWASM accepts plain text files in a relatively free form. It can handle
+lines terminated with CR, LF, CRLF, or LFCR, which means it should be able
+to assemble files on any platform on which it compiles.
+
+
+Each line may start with a symbol. If a symbol is present, there must not
+be any whitespace preceding it. It is legal for a line to contain nothing
+but a symbol.
+
+The opcode is separated from the symbol by whitespace. If there is
+no symbol, there must be at least one whitespace character preceding the opcode.
+If applicable, the operand follows, separated by whitespace. Following the
+opcode and operand is an optional comment.
+
+
+A comment can also be introduced with a * or a ;. The comment character is
+optional for end of statement comments. However, if a symbol is the only
+thing present on the line other than the comment, the comment character is
+mandatory to prevent the assembler from interpreting the comment as an opcode.
+
+
+
+Opcodes are not case sensitive. Neither are register names in
+the operand fields. Symbols, however, are case sensitive.
+
+
+
+LWASM does not support line numbers in the file.
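+
+The layout rules above can be sketched with a short fragment. The symbol
+names (start, done) are hypothetical and the instructions were chosen only
+to show where each field goes.
+
+```asm
+* a full line comment introduced with an asterisk
+start   lda     #$20    the marker is optional on an end of statement comment
+        sta     ,x+     ; no symbol, so whitespace precedes the opcode
+done    ; a symbol alone on a line needs the comment character
+        rts
+```
+
+Because symbols are case sensitive, a symbol named START would be distinct
+from start in this fragment, while LDA and lda name the same opcode.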
+
+
+
+
+
+Symbols
+
+
+Symbols have no length restriction. They may contain letters, numbers, dots,
+dollar signs, and underscores. They must start with a letter, dot, or
+underscore.
+
+
+
+LWASM also supports the concept of a local symbol. A local symbol is one
+which contains either a "?" or a "@", which can appear anywhere in the symbol.
+The scope of a local symbol is determined by a number of factors. First,
+each included file gets its own local symbol scope. A blank line will also
+be considered a local scope barrier. Macros each have their own local symbol
+scope as well (which has a side effect that you cannot use a local symbol
+as an argument to a macro). There are other factors as well. In general,
+a local symbol is restricted to the block of code it is defined within.
+
+
+
+
+
+Numbers and Expressions
+
+Numbers can be expressed in binary, octal, decimal, or hexadecimal.
+Binary numbers may be prefixed with a "%" symbol or suffixed with a
+"b" or "B". Octal numbers may be prefixed with "@" or suffixed with
+"Q", "q", "O", or "o". Hexadecimal numbers may be prefixed with "$" or
+suffixed with "H". No prefix or suffix is required for decimal numbers but
+they can be prefixed with "&" if desired. Any constant which begins with
+a letter must be expressed with the correct prefix base identifier or be
+prefixed with a 0. Thus hexadecimal FF would have to be written either 0FFH
+or $FF. Numbers are not case sensitive.
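+
+As an illustration of the notations described above, each line in the
+following sketch is intended to emit the same byte value, 255. FCB is the
+byte data directive described later in this chapter.
+
+```asm
+        fcb     %11111111       ; binary, "%" prefix
+        fcb     11111111b       ; binary, "b" suffix
+        fcb     @377            ; octal, "@" prefix
+        fcb     377q            ; octal, "q" suffix
+        fcb     255             ; decimal, no prefix
+        fcb     &255            ; decimal, optional "&" prefix
+        fcb     $FF             ; hexadecimal, "$" prefix
+        fcb     0FFH            ; hexadecimal, "H" suffix with the required leading 0
+```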
+
+
+A symbol may appear at any point where a number is acceptable. The
+special symbol "*" can be used to represent the starting address of the
+current source line within expressions.
+
+The ASCII value of a character can be included by prefixing it with a
+single quote ('). The ASCII values of two characters can be included by
+prefixing the characters with a double quote (").
+
+
+LWASM supports the following basic binary operators: +, -, *, /, and %.
+These represent addition, subtraction, multiplication, division, and modulus.
+It also supports unary negation and unary 1's complement (- and ^ respectively).
+For completeness, a unary positive (+) is supported though it is a no-op.
+
+
+Operator precedence follows the usual rules. Multiplication, division,
+and modulus take precedence over addition and subtraction. Unary operators
+take precedence over binary operators. To force a specific order of evaluation,
+parentheses can be used in the usual manner.
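+
+A few expressions, sketched under the rules above, may make the precedence
+and the special forms concrete. The symbol names are hypothetical, and EQU
+is the symbol definition directive described later in this chapter.
+
+```asm
+total   equ     2+3*4           ; multiplication binds first, giving 14
+group   equ     (2+3)*4         ; parentheses force the addition first, giving 20
+neg     equ     -total          ; unary negation
+inv     equ     ^total          ; unary 1's complement
+here    equ     *               ; starting address of this source line
+        lda     #'A             ; ASCII value of one character
+        ldd     #"AB            ; ASCII values of two characters
+```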
+
+
+
+
+Assembler Directives
+
+Various directives can be used to control the behaviour of the
+assembler or to include non-code/data in the resulting output. Those directives
+that are not described in detail in other sections of this document are
+described below.
+
+
+
+Data Directives
+
+FCB expr[,...]
+
+Include one or more constant bytes (separated by commas) in the output.
+
+
+
+FDB expr[,...]
+
+Include one or more words (separated by commas) in the output.
+
+
+
+FQB expr[,...]
+
+Include one or more double words (separated by commas) in the output.
+
+
+
+FCC string
+
+
+Include a string of text in the output. The first character of the operand
+is the delimiter which must appear as the last character and cannot appear
+within the string. The string is included with no modifications.
+
+
+
+
+FCN string
+
+
+Include a NUL terminated string of text in the output. The first character of
+the operand is the delimiter which must appear as the last character and
+cannot appear within the string. A NUL byte is automatically appended to
+the string.
+
+
+
+
+FCS string
+
+
+Include a string of text in the output with bit 7 of the final byte set. The
+first character of the operand is the delimiter which must appear as the last
+character and cannot appear within the string.
+
+
+
+
+ZMB expr
+
+
+Include a number of NUL bytes in the output. The number must be fully resolvable
+during pass 1 of assembly so no forward or external references are permitted.
+
+
+
+
+ZMD expr
+
+
+Include a number of zero words in the output. The number must be fully
+resolvable during pass 1 of assembly so no forward or external references are
+permitted.
+
+
+
+
+ZMQ expr
+
+
+Include a number of zero double-words in the output. The number must be fully
+resolvable during pass 1 of assembly so no forward or external references are
+permitted.
+
+
+
+
+RMB expr
+
+
+Reserve a number of bytes in the output. The number must be fully resolvable
+during pass 1 of assembly so no forward or external references are permitted.
+The value of the bytes is undefined.
+
+
+
+
+RMD expr
+
+
+Reserve a number of words in the output. The number must be fully
+resolvable during pass 1 of assembly so no forward or external references are
+permitted. The value of the words is undefined.
+
+
+
+
+RMQ expr
+
+
+Reserve a number of double-words in the output. The number must be fully
+resolvable during pass 1 of assembly so no forward or external references are
+permitted. The value of the double-words is undefined.
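+
+The data directives above might be used as in the following sketch. The
+symbol names and values are hypothetical. The "/" delimiter is just one
+possible choice, since any character not appearing in the string can serve.
+
+```asm
+table   fcb     1,2,3           ; three bytes
+vecs    fdb     $1234,$5678     ; two words
+msg     fcc     /HELLO/         ; five bytes, included unmodified
+msgz    fcn     /HELLO/         ; six bytes, a NUL is appended
+msgs    fcs     /HELLO/         ; final "O" ($4F) is emitted as $CF
+zeros   zmb     16              ; sixteen NUL bytes
+buf     rmb     80              ; eighty bytes reserved, contents undefined
+```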
+
+
+
+
+
+
+
+
+Address Definition
+The directives in this section all control the addresses of symbols
+or the assembly process itself.
+
+
+ORG expr
+
+Set the assembly address. The address must be fully resolvable on the
+first pass so no external or forward references are permitted. ORG is not
+permitted within sections when outputting to object files. For the DECB
+target, each ORG directive after which output is generated will cause
+a new preamble to be output. ORG is only used to determine the addresses
+of symbols when the raw target is used.
+
+
+
+
+
+sym EQU expr
+sym = expr
+
+Define the value of sym to be expr.
+
+
+
+
+sym SET expr
+
+Define the value of sym to be expr.
+Unlike EQU, SET permits symbols to be defined multiple times as long as SET
+is used for all instances. Use of the symbol before the first SET statement
+that sets its value is undefined.
+
+
+
+
+SETDP expr
+
+Inform the assembler that it can assume the DP register contains
+expr. This directive is only advice to the assembler
+to determine whether an address is in the direct page and has no effect
+on the contents of the DP register. The value must be fully resolved during
+the first assembly pass because it affects the sizes of subsequent instructions.
+
+This directive has no effect in the object file target.
+
+
+
+
+
+ALIGN expr
+
+Force the current assembly address to be a multiple of expr.
+A series of NUL bytes is output to force the alignment, if required. The
+alignment value must be fully resolved on the first pass because it affects
+the addresses of subsequent instructions.
+This directive is not suitable for inclusion in the middle of actual
+code. It is intended to appear where the bytes output will not be executed.
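+
+The following sketch combines several of these directives. The addresses and
+symbol names are hypothetical. It assumes a target where ORG sets the
+address at which bytes are placed.
+
+```asm
+        org     $4000           ; set the assembly address
+cols    equ     32              ; fixed value
+count   set     0               ; SET symbols may be given a new value...
+count   set     count+1         ; ...as long as SET is used every time
+flag    fcb     0               ; one byte at $4000
+        align   256             ; NUL bytes are output up to $4100
+page    rmb     256             ; starts on a multiple of 256
+```
+
+Note that the ALIGN here sits between data items rather than between
+instructions, in keeping with the caution above.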
+
+
+
+
+
+
+
+
+
+Conditional Assembly
+
+Portions of the source code can be excluded or included based on conditions
+known at assembly time. Conditionals can be nested arbitrarily deeply. The
+directives associated with conditional assembly are described in this section.
+
+All conditionals must be fully bracketed. That is, every conditional
+statement must eventually be followed by an ENDC at the same level of nesting.
+
+Conditional expressions are only evaluated on the first assembly pass.
+It is not possible to game the assembly process by having a conditional
+change its value between assembly passes. Thus there is not and never will
+be any equivalent of IFP1 or IFP2 as provided by other assemblers.
+
+
+
+IFEQ expr
+
+If expr evaluates to zero, the conditional
+will be considered true.
+
+
+
+
+
+IFNE expr
+IF expr
+
+If expr evaluates to a non-zero value, the conditional
+will be considered true.
+
+
+
+
+
+IFGT expr
+
+If expr evaluates to a value greater than zero, the conditional
+will be considered true.
+
+
+
+
+
+IFGE expr
+
+If expr evaluates to a value greater than or equal to zero, the conditional
+will be considered true.
+
+
+
+
+
+IFLT expr
+
+If expr evaluates to a value less than zero, the conditional
+will be considered true.
+
+
+
+
+
+IFLE expr
+
+If expr evaluates to a value less than or equal to zero, the conditional
+will be considered true.
+
+
+
+
+
+IFDEF sym
+
+If sym is defined at this point in the assembly
+process, the conditional
+will be considered true.
+
+
+
+
+
+IFNDEF sym
+
+If sym is not defined at this point in the assembly
+process, the conditional
+will be considered true.
+
+
+
+
+
+ELSE
+
+
+If the preceding conditional at the same level of nesting was false, the
+statements following will be assembled. If the preceding conditional at
+the same level was true, the statements following will not be assembled.
+Note that the preceding conditional might have been another ELSE statement
+although this behaviour is not guaranteed to be supported in future versions
+of LWASM.
+
+
+
+
+ENDC
+
+
+This directive marks the end of a conditional construct. Every conditional
+construct must end with an ENDC directive.
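+
+A minimal sketch of conditional assembly follows. The symbols DEBUG, SIZE,
+and trace are hypothetical, and ERROR is the directive described later in
+this chapter.
+
+```asm
+        ifdef   DEBUG
+        jsr     trace           ; assembled only if DEBUG is defined
+        else
+        nop                     ; assembled only if DEBUG is not defined
+        endc
+        ifgt    SIZE-256        ; true when SIZE exceeds 256
+        error   SIZE is too large
+        endc
+```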
+
+
+
+
+
+
+
+
+Miscellaneous Directives
+
+This section includes directives that do not fit into the other
+categories.
+
+
+
+
+INCLUDE filename
+
+
+Include the contents of filename at this point in
+the assembly as though it were a part of the file currently being processed.
+Note that whitespace cannot appear in the name of the file.
+
+
+
+
+
+END [expr]
+
+
+This directive causes the assembler to stop assembling immediately as though
+it ran out of input. For the DECB target only, expr
+can be used to set the execution address of the resulting binary. For all
+other targets, specifying expr will cause an error.
+
+
+
+
+
+ERROR string
+
+
+Causes a custom error message to be printed at this line. This will cause
+assembly to fail. This directive is most useful inside conditional constructs
+to cause assembly to fail if some condition that is known bad happens.
+
+
+
+
+
+
+
+
+
+
+Macros
+
+LWASM is a macro assembler. A macro is simply a name that stands in for a
+series of instructions. Once a macro is defined, it is used like any other
+assembler directive. Defining a macro can be considered equivalent to adding
+additional assembler directives.
+
+Macros may accept parameters. These parameters are referenced within
+a macro by a backslash ("\") followed by a digit 1 through 9 for the first
+through ninth parameters. They may also be referenced by enclosing the
+decimal parameter number in braces ("{num}"). These parameter references
+are replaced with the verbatim text of the parameter passed to the macro. A
+reference to a non-existent parameter will be replaced by an empty string.
+Macro parameters are expanded everywhere on each source line. That means
+the parameter to a macro could be used as a symbol or it could even appear
+in a comment or could cause an entire source line to be commented out
+when the macro is expanded.
+
+
+Parameters passed to a macro are separated by commas and the parameter list
+is terminated by any whitespace. This means that neither a comma nor whitespace
+may be included in a macro parameter.
+
+
+Macro expansion is done recursively. That is, within a macro, macros are
+expanded. This can lead to infinite loops in macro expansion. If the assembler
+hangs for a long time while assembling a file that uses macros, this may be
+the reason.
+
+Each macro expansion receives its own local symbol context which is not
+inherited by any macros called by it nor is it inherited from the context
+the macro was instantiated in. That means it is possible to use local symbols
+within macros without having them collide with symbols in other macros or
+outside the macro itself. However, this also means that using a local symbol
+as a parameter to a macro, while legal, will not do what it would seem to do
+as it will result in looking up the local symbol in the macro's symbol context
+rather than the enclosing context where it came from, likely yielding either
+an undefined symbol error or bizarre assembly results.
+
+
+Note that there is no way to define a macro as local to a symbol context. All
+macros are part of the global macro namespace. However, macros have a separate
+namespace from symbols so it is possible to have a symbol with the same name
+as a macro.
+
+
+
+Macros are defined only during the first pass. Macro expansion also
+only occurs during the first pass. On the second pass, the macro
+definition is simply ignored. Macros must be defined before they are used.
+
+
+The following directives are used when defining macros.
+
+
+
+macroname MACRO
+
+This directive is used to begin the definition of a macro called
+macroname. If macroname already
+exists, it is considered an error. Attempting to define a macro within a
+macro is undefined. It may work and it may not so the behaviour should not
+be relied upon.
+
+
+
+
+
+ENDM
+
+
+This directive indicates the end of the macro currently being defined. It
+causes the assembler to resume interpreting source lines as normal.
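+
+As a sketch of the macro facilities described in this section, the fragment
+below defines a hypothetical macro named delay and uses it twice. The local
+symbol loop@ does not collide between the two uses because each expansion
+receives its own local symbol context.
+
+```asm
+delay   macro
+        ldb     #\1             ; \1 is replaced by the first parameter
+loop@   decb
+        bne     loop@           ; local symbol, private to this expansion
+        endm
+        delay   10
+        delay   200
+```
+
+The parameter could equally be written as {1} inside the macro body.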
+
+
+
+
+
+
+
+Object Files and Sections
+
+The object file target is very useful for large projects because it allows
+multiple files to be assembled independently and then linked into the final
+binary at a later time. It allows only the small portion of the project
+that was modified to be re-assembled rather than requiring the entire set
+of source code to be available to the assembler in a single assembly process.
+This can be particularly important if there are a large number of macros,
+symbol definitions, or other metadata that uses resources at assembly time.
+By far the largest benefit, however, is keeping the source files small enough
+for a mere mortal to find things in them.
+
+
+
+With multi-file projects, there needs to be a means of resolving references to
+symbols in other source files. These are known as external references. The
+addresses of these symbols cannot be known until the linker joins all the
+object files into a single binary. This means that the assembler must be
+able to output the object code without knowing the value of the symbol. This
+places some restrictions on the code generated by the assembler. For
+example, the assembler cannot generate direct page addressing for instructions
+that reference external symbols because the address of the symbol may not
+be in the direct page. Similarly, relative branches and PC relative addressing
+cannot be used in their eight bit forms. Everything that must be resolved
+by the linker must be assembled to use the largest address size possible to
+allow the linker to fill in the correct value at link time. Note that the
+same problem applies to absolute address references as well, even those in
+the same source file, because the address is not known until link time.
+
+
+
+It is often desired in multi-file projects to have code of various types grouped
+together in the final binary generated by the linker as well. The same applies
+to data. In order for the linker to do that, the bits that are to be grouped
+must be tagged in some manner. This is where the concept of sections comes in.
+Each chunk of code or data is part of a section in the object file. Then,
+when the linker reads all the object files, it coalesces all sections of the
+same name into a single section and then considers it as a unit.
+
+
+
+The existence of sections, however, raises a problem for symbols even
+within the same source file. Thus, the assembler must treat symbols from
+different sections within the same source file in the same manner as external
+symbols. That is, it must leave them for the linker to resolve at link time,
+with all the limitations that entails.
+
+
+
+In the object file target mode, LWASM requires all source lines that
+cause bytes to be output to be inside a section. Any directives that do
+not cause any bytes to be output can appear outside of a section. This includes
+such things as EQU or RMB. Even ORG can appear outside a section. ORG, however,
+makes no sense within a section because it is the linker that determines
+the starting address of the section's code, not the assembler.
+
+
+
+All symbols defined globally in the assembly process are local to the
+source file and cannot be exported. All symbols defined within a section are
+considered local to the source file unless otherwise explicitly exported.
+Symbols referenced from external source files must be declared external,
+either explicitly or by asking the assembler to assume that all undefined
+symbols are external.
+
+
+
+It is often handy to define a number of memory addresses that will be
+used for data at run-time but which need not be included in the binary file.
+These memory addresses are not initialized until run-time, either by the
+program itself or by the program loader, depending on the operating environment.
+Such sections are often known as BSS sections. LWASM supports generating
+sections with a BSS attribute set. This causes the section definition, including
+symbols exported from that section and those symbols required to resolve
+references from the local file, to be written to the object file with no actual code.
+It is illegal for any source lines within a BSS flagged section to cause any
+bytes to be output.
+
+
+The following directives apply to section handling.
+
+
+
+SECTION name[,flags]
+SECT name[,flags]
+
+
+Instructs the assembler that the code following this directive is to be
+considered part of the section name. A section name
+may appear multiple times in which case it is as though all the code from
+all the instances of that section appeared adjacent within the source file.
+However, flags may only be specified on the first
+instance of the section.
+
+There is a single flag supported in flags. The
+flag bss will cause the section to be treated as a BSS
+section and, thus, no code will be included in the object file nor will any
+bytes be permitted to be output.
+
+If assembly is already happening within a section, the section is implicitly
+ended and the new section started. This is not considered an error although
+it is recommended that all sections be explicitly closed.
+
+
+
+
+
+ENDSECTION
+ENDSECT
+ENDS
+
+
+This directive ends the current section. This puts assembly outside of any
+sections until the next SECTION directive.
+
+
+
+
+sym EXTERN
+sym EXTERNAL
+sym IMPORT
+
+
+This directive defines sym as an external symbol.
+This directive may occur at any point in the source code. EXTERN definitions
+are resolved on the first pass so an EXTERN definition anywhere in the
+source file is valid for the entire file. The use of this directive is
+optional when the assembler is instructed to assume that all undefined
+symbols are external. In fact, in that mode, if the symbol is referenced
+before the EXTERN directive, an error will occur.
+
+
+
+
+
+sym EXPORT
+
+
+This directive defines sym as an exported symbol.
+This directive may occur at any point in the source code, even before the
+definition of the exported symbol.
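+
+A source file for the object file target might be laid out as in the
+following sketch. The section names (data, code) and the symbols (putc,
+main, buf) are hypothetical.
+
+```asm
+putc    extern                  ; defined in another source file
+main    export                  ; made visible to other source files
+        section data,bss        ; BSS section, no bytes may be output
+buf     rmb     80
+        endsection
+        section code
+main    ldx     #buf            ; cross section reference, left for the linker
+        jsr     putc            ; external reference, left for the linker
+        rts
+        endsection
+```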
+
+
+
+
+
+
+
+
+
+Assembler Modes and Pragmas
+
+There are a number of options that affect the way assembly is performed.
+Some of these options can only be specified on the command line because
+they determine something absolute about the assembly process. These include
+such things as the output target. Other things may be switchable during
+the assembly process. These are known as pragmas and are, by definition,
+not portable between assemblers.
+
+
+LWASM supports a number of pragmas that affect code generation or
+otherwise affect the behaviour of the assembler. These may be specified by
+way of a command line option or by assembler directives. The directives
+are as follows.
+
+
+
+
+PRAGMA pragma[,...]
+
+
+Specifies that the assembler should bring into force all pragmas
+specified. Any unrecognized pragma will cause an assembly error. The new
+pragmas will take effect immediately. This directive should be used when
+the program will assemble incorrectly if the pragma is ignored or not supported.
+
+
+
+
+
+*PRAGMA pragma[,...]
+
+
+This is identical to the PRAGMA directive except no error will occur with
+unrecognized or unsupported pragmas. This directive, by virtue of starting
+with a comment character, will also be ignored by assemblers that do not
+support this directive. Use this variation if the pragma is not required
+for correct functioning of the code.
+
+
+
+
+
+Each pragma supported has a positive version and a negative version.
+The positive version enables the pragma while the negative version disables
+it. The negative version is simply the positive version with "no" prefixed
+to it. For instance, "pragma" vs. "nopragma". Only the positive version is
+listed below.
+
+Pragmas are not case sensitive.
+
+
+
+index0tonone
+
+
+When in force, this pragma enables an optimization affecting indexed addressing
+modes. When the offset expression in an indexed mode evaluates to zero but is
+not explicitly written as 0, this will replace the operand with the equivalent
+no offset mode, thus creating slightly faster code. Because of the advantages
+of this optimization, it is enabled by default.
+
+
+
+
+
+undefextern
+
+
+This pragma is only valid for targets that support external references. When in
+force, if the assembler sees an undefined symbol on the second pass, it will
+automatically define it as an external symbol. This automatic definition will
+apply for the remainder of the assembly process, even if the pragma is
+subsequently turned off. Because this behaviour would be potentially surprising,
+this pragma defaults to off.
+
+
+The primary use for this pragma is for projects that share a large number of
+symbols between source files. In such cases, it is impractical to enumerate
+all the external references in every source file. This allows the assembler
+and linker to do the heavy lifting while not preventing a particular source
+module from defining a local symbol of the same name as an external symbol
+if it does not need the external symbol. (This pragma will not cause an
+automatic external definition if there is already a locally defined symbol.)
+
+
+This pragma will often be specified on the command line for large projects.
+However, depending on the specific dynamics of the project, it may be sufficient
+for one or two files to use this pragma internally.
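+
+The effect of the pragmas described above can be sketched as follows. The
+symbols zero and helper are hypothetical. The comments paraphrase the
+descriptions in this section and were not checked against assembler output.
+
+```asm
+        pragma  index0tonone,undefextern
+zero    equ     0
+        lda     zero,x          ; offset evaluates to zero, so treated as lda ,x
+        lda     0,x             ; written explicitly as 0, so not replaced
+        jsr     helper          ; undefined here, so assumed to be external
+        pragma  noindex0tonone  ; the "no" prefix turns the pragma off
+```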
+
+
+
+
+
+
+
+
+
+
+LWLINK
+
+
+
+
Object Files
-
+
+LWTOOLS uses a proprietary object file format. It is proprietary in the sense
+that it is specific to LWTOOLS, not that it is a hidden format. It would be
+hard to keep it hidden in an open source tool chain anyway. This chapter
+documents the object file format.
+