changeset 96:7fbccdd1defb

Added doc subdirectory to distribution
author lost
date Sat, 17 Jan 2009 07:09:02 +0000
parents f3497072ac44
children 2e8dda44027c
files Makefile.am doc/Makefile.am doc/internals.txt doc/lwasm.txt doc/object files.txt doc/pseudo ops.txt
diffstat 6 files changed, 226 insertions(+), 0 deletions(-)
--- a/Makefile.am	Sat Jan 17 06:57:58 2009 +0000
+++ b/Makefile.am	Sat Jan 17 07:09:02 2009 +0000
@@ -1,1 +1,2 @@
 SUBDIRS = src
+DIST_SUBDIRS = doc
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/Makefile.am	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,3 @@
+EXTRA_DIST = lwasm.txt internals.txt pseudo\ ops.txt object\ files.txt
+
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/internals.txt	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,49 @@
+LWASM Internals
+===============
+
+LWASM is a table-driven assembler that notionally uses two passes. However,
+it implements its assembly in several passes as follows.
+
+Pass 1 - Preprocessing & Parsing
+--------------------------------
+
+This pass reads the source file and all included source files. It handles
+macro definition and expansion.
+
+As it reads the various lines, it also identifies any symbol associated with
+the line, the operation code, and, based on the operation code, the operand,
+if any. Upon examination of the operand, any expressions are stored in an
+internal postfix notation for later evaluation. During this pass,
+preliminary values are assigned to all symbols, assuming the largest possible
+size for each instruction. For every symbol, a table of the lines that
+reference it is built for use in the following pass. Note that an instruction
+whose operand value is already known with certainty is generated in its
+smallest possible form.
+
+At this stage, simple optimizations are performed on expressions. This
+includes coalescing constants (1+2+x => 3+x). It also includes some basic
+algebra (x+x => 2*x, 2*x+4*x => 6*x, x-x => 0).
+
+Pass 2 - Optimization
+---------------------
+
+This pass sweeps the code looking for operations which could use a shorter
+instruction. If it finds one, it must then re-define all symbols defined
+subsequently and all symbols defined in terms of one of those symbols in a
+cascade. This process is repeated until no more possible reductions are
+discovered.
+
+If, in the process of implementing an instruction reduction, a phasing error
+or other conflict is encountered, the reduction is backed out and marked as
+forced. 
+
+The following may be candidates for reduction, depending on assembler
+options:
+
+- extended addressing -> direct addressing (not in obj target)
+- 16 bit offset -> 8 bit offset (indirect indexed)
+- 16 bit offset -> 8 bit or 5 bit offset (direct indexed)
+- 16 bit offset -> no offset (indexed)
+- 16 bit relative -> 8 bit relative (depending on configuration)
+
+
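Reviewer note (not part of the changeset): the expression simplifications listed under Pass 1 in doc/internals.txt can be illustrated with a minimal Python sketch. It is not LWASM's implementation. It assumes an expression has already been flattened into a sum of `(coefficient, symbol)` terms, and the name `simplify_sum` and that term representation are hypothetical.

```python
def simplify_sum(terms):
    """Combine like terms in a sum.

    Each term is (coefficient, symbol). A symbol of None marks a plain
    constant, so (3, None) is the constant 3 and (2, "x") is 2*x.
    Returns one entry per distinct symbol, dropping zero coefficients.
    """
    totals = {}
    for coeff, sym in terms:
        # Constants share the key None, so 1+2 coalesces to 3;
        # identical symbols share a key, so x+x becomes 2*x.
        totals[sym] = totals.get(sym, 0) + coeff
    result = [(c, s) for s, c in totals.items() if c != 0]
    # An expression that cancels completely (x-x) reduces to the constant 0.
    return result or [(0, None)]
```

For example, `simplify_sum([(1, None), (2, None), (1, "x")])` returns `[(3, None), (1, "x")]`, which corresponds to 1+2+x => 3+x, and `simplify_sum([(1, "x"), (-1, "x")])` returns `[(0, None)]`, which corresponds to x-x => 0.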
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/lwasm.txt	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,43 @@
+LWASM 2.0
+=========
+
+LWASM is a cross-assembler for the MC6809 and HD6309 CPUs. It should
+assemble most reasonable EDTASM compatible source code. This document is not
+intended to teach assembly language for these CPUs but rather to document
+the behaviour of LWASM.
+
+
+TARGETS
+-------
+
+LWASM supports several targets for assembly. These are decb, raw, and obj.
+
+The raw target generates a raw binary output. This is useful for building
+ROMs and other items that are not intended to be loaded by any kind of
+loader. In this mode, the ORG directive is merely advisory and does not
+affect the output except for the addresses symbols are defined to have.
+
+The decb target generates output that can be loaded with the CLOADM or LOADM
+commands in Color Basic. There will be approximately one segment in the
+output file for every ORG statement after which any code is emitted. (That
+is, two ORG statements in a row will not generate two output segments.)
+This is approximately equivalent to running A/AO in EDTASM.
+
+The obj target generates output that is intended to be linked later with
+LWLINK. This target disallows the use of ORG for defining anything other
+than constants. In this target, source files consist of a number of sections
+(SECTION/ENDSECTION). Nothing outside of a section is permitted to cause any
+output at all. Use of an ORG statement within a section is an error. This
+target also permits tagging symbols for export (EXPORT) and marking a symbol
+as externally defined (IMPORT/EXTERN). The linker will resolve any external
+references at link time. Additionally, any inter-section references will be
+resolved by the linker. All code in each section is assembled with an
+implicit origin of 0. SETDP has no effect because the assembler has no idea
+what address the linker will assign to the code when it is linked. Any
+direct addressing modes will default to extended to allow for the linker to
+perform relocations. Inter-section references and external references will
+use 16 bit relative addressing but intra-section internal references may use
+8 bit relative addressing. Forced 8 bit direct modes are probably an error
+but are permitted on the theory that the programmer might know something the
+assembler doesn't.
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/object files.txt	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,79 @@
+An object file consists of a series of sections each of which contains a
+list of exported symbols, a list of incomplete references, and a list of
+"local" symbols which may be used in calculating incomplete references. Each
+section will obviously also contain the object code.
+
+An exported symbol must be completely resolved to an address within the
+section it is exported from.
+
+Each object file starts with a magic number and version number. The magic
+number is the string "LWOBJ16" for this 16 bit object file format. The only
+defined version number is currently 0. Thus, the first 8 bytes of the object
+file are:
+
+4C574F424A313600
+
+Each section has the following items in order:
+
+* section name
+* flags
+* list of local symbols (and addresses within the section)
+* list of exported symbols (and addresses within the section)
+* list of incomplete references along with the expressions to calculate them
+* the actual object code
+
+The section starts with the name of the section with a NUL termination
+followed by a series of flag bytes terminated by NUL. The following flag
+bytes are defined:
+
+Byte	Meaning
+00		no more flags
+01		section is BSS - no actual code is present
+
+Either a NULL section name or end of file indicates that there are no more
+sections.
+
+Each entry in the exported and local symbols table consists of the symbol
+(NUL terminated) followed by two bytes which contain the value in big endian
+order. The end of a symbol table is indicated by a NULL symbol name.
+
+Each entry in the incomplete references table consists of an expression
+followed by a 16 bit offset where the reference goes. Expressions are
+defined as a series of terms up to an "end of expression" term. Each term
+consists of a single byte which identifies the type of term (see below)
+followed by any data required by the term. The end of the list is flagged
+by a NULL expression (only an end of expression term).
+
+TERMTYPE	Meaning
+00			end of expression
+01			integer (16 bit in big endian order follows)
+02			external symbol reference (NUL term symbol)
+03			local symbol reference (NUL term symbol)
+04			operator (1 byte operator number - see below)
+05			section base address reference
+
+External references are resolved using other object files while local
+references are resolved using the local symbol table(s) from this file. This
+allows local symbols that are not exported to have the same names as
+exported symbols or external references.
+
+The operator numbers are:
+
+NUM	OP
+01	+ (plus)
+02	- (minus)
+03	* (times)
+04	/ (divide)
+05	% (modulus)
+06	\ (integer division)
+07	bitwise and
+08	bitwise or
+09	bitwise xor
+0A	boolean and
+0B	boolean or
+0C	- (unary negation, 2's complement)
+0D	^ (unary 1's complement)
+
+An expression is represented in a postfix manner with both operands for
+binary operators preceding the operator and the single operand for unary
+operators preceding the operator.
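Reviewer note (not part of the changeset): the incomplete-reference encoding described in doc/object files.txt can be exercised with a small Python sketch. It is not LWLINK's code, and all function names are hypothetical. It makes three assumptions that the text does not state: term type 05 carries no trailing data, results wrap to 16 bits, and only the operators whose semantics are unambiguous (plus, minus, times, the bitwise operators, and the two unary operators) are handled.

```python
import struct

MAGIC = bytes.fromhex("4C574F424A313600")  # "LWOBJ16" followed by version byte 0

def read_cstring(data, pos):
    """Return (text, next_pos) for the NUL-terminated string at pos."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("ascii"), end + 1

def read_expression(data, pos):
    """Decode terms up to and including the end-of-expression term (00)."""
    terms = []
    while True:
        kind = data[pos]
        pos += 1
        if kind == 0x00:                      # end of expression
            return terms, pos
        if kind == 0x01:                      # integer, 16 bit big endian
            (value,) = struct.unpack_from(">H", data, pos)
            terms.append(("int", value))
            pos += 2
        elif kind == 0x02:                    # external symbol reference
            name, pos = read_cstring(data, pos)
            terms.append(("external", name))
        elif kind == 0x03:                    # local symbol reference
            name, pos = read_cstring(data, pos)
            terms.append(("local", name))
        elif kind == 0x04:                    # operator number in next byte
            terms.append(("op", data[pos]))
            pos += 1
        elif kind == 0x05:                    # section base (assumed: no data)
            terms.append(("base", None))
        else:
            raise ValueError(f"unknown term type {kind:#04x}")

BINARY_OPS = {
    0x01: lambda a, b: a + b,
    0x02: lambda a, b: a - b,
    0x03: lambda a, b: a * b,
    0x07: lambda a, b: a & b,
    0x08: lambda a, b: a | b,
    0x09: lambda a, b: a ^ b,
}
UNARY_OPS = {
    0x0C: lambda a: -a,   # 2's complement negation
    0x0D: lambda a: ~a,   # 1's complement
}

def evaluate(terms, local_symbols, external_symbols, section_base):
    """Evaluate postfix terms with a stack; results wrap to 16 bits (assumed)."""
    stack = []
    for kind, value in terms:
        if kind == "int":
            stack.append(value)
        elif kind == "local":
            stack.append(local_symbols[value])
        elif kind == "external":
            stack.append(external_symbols[value])
        elif kind == "base":
            stack.append(section_base)
        elif value in UNARY_OPS:
            stack.append(UNARY_OPS[value](stack.pop()) & 0xFFFF)
        elif value in BINARY_OPS:
            right = stack.pop()               # both operands precede the operator
            left = stack.pop()
            stack.append(BINARY_OPS[value](left, right) & 0xFFFF)
        else:
            raise NotImplementedError(f"operator {value:#04x} not handled here")
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]
```

As a usage example, the bytes `03 'foo' 00 01 00 02 04 01 05 04 01 00` decode to the postfix form of `(foo + 2) + section base`. With `foo` at 0x0010 and a section base of 0x4000, `evaluate` returns 0x4012. The 16 bit offset where the reference goes follows the end-of-expression term and would be read from the position that `read_expression` returns.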
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/pseudo ops.txt	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,51 @@
+The following pseudo operations are understood by LWASM.
+
+SECTION <name>
+
+This introduces a section called <name>. This is only valid when assembling
+to an object file. Only one section can be open at any given time. A
+section may be ended with ENDSECTION; a subsequent SECTION directive also
+ends the previous section. It is important to note that an end of file
+does not close the currently open section. There cannot be a symbol on a
+SECTION line.
+
+ENDSECTION
+
+Specifies the end of a section. This is optional. There cannot be a symbol
+on an ENDSECTION line.
+
+ORG <addr>
+
+Specifies the assembly address. For the raw target, this is advisory and
+only affects the addresses of symbols. For the object file target, this can
+only appear outside of all sections. For the DECB target, each ORG statement
+after which any output is generated will generate a segment in the output
+file. <addr> must be completely resolved during pass 1 of the assembly
+process and thus may not refer to forward references or external symbols, or
+other symbols that refer to such.
+
+<symbol> EQU <value>
+
+Makes <symbol> equivalent to <value>. <value> may be an external reference
+in which case any references to <symbol> will also be external references.
+
+EXPORT <symbol>[ as <name>]
+
+Marks previously defined <symbol> for export. If <name> is specified, it
+will be exported as <name>. <symbol> must not be an external reference and
+must be defined before EXPORT.
+
+EXTERN <symbol>[ as <name>]
+IMPORT <symbol>[ as <name>]
+
+Marks <symbol> as an external reference. If <name> is specified, <name> is
+the local name the symbol is referenced as in this assembly file while
+<symbol> is the actual symbol to be referenced externally.
+
+END [<addr>]
+
+Marks the end of the assembly process. Immediately terminates assembly
+without processing any other lines in this file or any others. It is
+optional. <addr> is only allowed for the DECB target in which case it
+specifies the execution address. If it is not specified, the address
+defaults to 0.