# HG changeset patch
# User lost
# Date 1232176142 0
# Node ID 7fbccdd1defb076e6f3fa292df6153a4ba77620c
# Parent f3497072ac44f8b4d5bc56a5fcaf7de2f00687c6
Added doc subdirectory to distribution

diff -r f3497072ac44 -r 7fbccdd1defb Makefile.am
--- a/Makefile.am	Sat Jan 17 06:57:58 2009 +0000
+++ b/Makefile.am	Sat Jan 17 07:09:02 2009 +0000
@@ -1,1 +1,2 @@
 SUBDIRS = src
+DIST_SUBDIRS = doc
diff -r f3497072ac44 -r 7fbccdd1defb doc/Makefile.am
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/Makefile.am	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,3 @@
+EXTRA_DIST = lwasm.txt internals.txt pseudo\ ops.txt object\ files.txt
+
+
diff -r f3497072ac44 -r 7fbccdd1defb doc/internals.txt
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/internals.txt	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,49 @@
+LWASM Internals
+===============
+
+LWASM is a table-driven assembler that notionally uses two passes. However,
+it implements its assembly in several passes as follows.
+
+Pass 1 - Preprocessing & Parsing
+--------------------------------
+
+This pass reads the source file and all included source files. It handles
+macro definition and expansion.
+
+As it reads the various lines, it also identifies any symbol associated with
+the line, the operation code, and, based on the operation code, the operand,
+if any. Upon examination of the operand, any expressions are stored in an
+internal postfix notation for later evaluation. During this pass,
+preliminary values are assigned to all symbols using the largest possible
+instruction size. A table of lines that reference every symbol is generated
+to be used in the following pass. Note that any symbols for which the value
+is known with no uncertainty factor will be generated with the smallest
+possible instruction.
+
+At this stage, simple optimizations are performed on expressions. This
+includes coalescing constants (1+2+x => 3+x). It also includes some basic
+algebra (x+x => 2*x, 2*x+4*x => 6*x, x-x => 0).
+
+Pass 2 - Optimization
+---------------------
+
+This pass sweeps the code looking for operations which could use a shorter
+instruction. If it finds one, it must then re-define all symbols defined
+subsequently and all symbols defined in terms of one of those symbols in a
+cascade. This process is repeated until no more possible reductions are
+discovered.
+
+If, in the process of implementing an instruction reduction, a phasing error
+or other conflict is encountered, the reduction is backed out and marked as
+forced.
+
+The following may be candidates for reduction, depending on assembler
+options:
+
+- extended addressing -> direct addressing (not in obj target)
+- 16 bit offset -> 8 bit offset (indirect indexed)
+- 16 bit offset -> 8 bit or 5 bit offset (direct indexed)
+- 16 bit offset -> no offset (indexed)
+- 16 bit relative -> 8 bit relative (depending on configuration)
+
+
diff -r f3497072ac44 -r 7fbccdd1defb doc/lwasm.txt
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/lwasm.txt	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,43 @@
+LWASM 2.0
+=========
+
+LWASM is a cross-assembler for the MC6809 and HD6309 CPUs. It should
+assemble most reasonable EDTASM compatible source code. This document is not
+intended to teach assembly language for these CPUs but rather to document
+the behaviour of LWASM.
+
+
+TARGETS
+-------
+
+LWASM supports several targets for assembly. These are decb, raw, and obj.
+
+The raw target generates a raw binary output. This is useful for building
+ROMs and other items that are not intended to be loaded by any kind of
+loader. In this mode, the ORG directive is merely advisory and does not
+affect the output except for the addresses symbols are defined to have.
+
+The decb target generates output that can be loaded with the CLOADM or LOADM
+commands in Color Basic. There will be approximately one segment in the
+output file for every ORG statement after which any code is emitted. (That
+is, two ORG statements in a row will not generate two output segments.)
+This is approximately equivalent to running A/AO in EDTASM.
+
+The obj target generates output that is intended to be linked later with
+LWLINK. This target disallows the use of ORG for defining anything other
+than constants. In this target, source files consist of a number of sections
+(SECTION/ENDSECTION). Nothing outside of a section is permitted to cause any
+output at all. Use of an ORG statement within a section is an error. This
+target also permits tagging symbols for export (EXPORT) and marking a symbol
+as externally defined (IMPORT/EXTERN). The linker will resolve any external
+references at link time. Additionally, any inter-section references will be
+resolved by the linker. All code in each section is assembled with an
+implicit origin of 0. SETDP has no effect because the assembler has no idea
+what address the linker will assign to the code when it is linked. Any
+direct addressing modes will default to extended to allow for the linker to
+perform relocations. Intersegment references and external references will
+use 16 bit relative addressing but intrasegment internal references may use
+8 bit relative addressing. Forced 8 bit direct modes are probably an error
+but are permitted on the theory that the programmer might know something the
+assembler doesn't.
+
diff -r f3497072ac44 -r 7fbccdd1defb doc/object files.txt
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/object files.txt	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,79 @@
+An object file consists of a series of sections each of which contains a
+list of exported symbols, a list of incomplete references, and a list of
+"local" symbols which may be used in calculating incomplete references. Each
+section will obviously also contain the object code.
+
+Exported symbols must be completely resolved to an address within the
+section they are exported from.
+
+Each object file starts with a magic number and version number. The magic
+number is the string "LWOBJ16" for this 16 bit object file format. The only
+defined version number is currently 0. Thus, the first 8 bytes of the object
+file are:
+
+4C574F424A313600
+
+Each section has the following items in order:
+
+* section name
+* flags
+* list of local symbols (and addresses within the section)
+* list of exported symbols (and addresses within the section)
+* list of incomplete references along with the expressions to calculate them
+* the actual object code
+
+The section starts with the name of the section with a NUL termination
+followed by a series of flag bytes terminated by NUL. The following flag
+bytes are defined:
+
+Byte    Meaning
+00      no more flags
+01      section is BSS - no actual code is present
+
+Either a NULL section name or end of file indicates the presence of no more
+sections.
+
+Each entry in the exported and local symbols table consists of the symbol
+(NUL terminated) followed by two bytes which contain the value in big endian
+order. The end of a symbol table is indicated by a NULL symbol name.
+
+Each entry in the incomplete references table consists of an expression
+followed by a 16 bit offset where the reference goes. Expressions are
+defined as a series of terms up to an "end of expression" term. Each term
+consists of a single byte which identifies the type of term (see below)
+followed by any data required by the term. The end of the list is flagged
+by a NULL expression (only an end of expression term).
+
+TERMTYPE    Meaning
+00          end of expression
+01          integer (16 bit in big endian order follows)
+02          external symbol reference (NUL term symbol)
+03          local symbol reference (NUL term symbol)
+04          operator (1 byte operator number - see below)
+05          section base address reference
+
+External references are resolved using other object files while local
+references are resolved using the local symbol table(s) from this file. This
+allows local symbols that are not exported to have the same names as
+exported symbols or external references.
+
+The operator numbers are:
+
+NUM    OP
+01     + (plus)
+02     - (minus)
+03     * (times)
+04     / (divide)
+05     % (modulus)
+06     \ (integer division)
+07     bitwise and
+08     bitwise or
+09     bitwise xor
+0A     boolean and
+0B     boolean or
+0C     - (unary negation, 2's complement)
+0D     ^ (unary 1's complement)
+
+An expression is represented in a postfix manner with both operands for
+binary operators preceding the operator and the single operand for unary
+operators preceding the operator.
diff -r f3497072ac44 -r 7fbccdd1defb doc/pseudo ops.txt
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/pseudo ops.txt	Sat Jan 17 07:09:02 2009 +0000
@@ -0,0 +1,51 @@
+The following pseudo operations are understood by LWASM.
+
+SECTION
+
+This introduces a section called . This is only valid if assembling to
+an object file. Only one section can be open at any given time. Sections
+may be ended with ENDSECTION. Only one section can be open at any given
+time. A subsequent SECTION directive will end the previous section. It is
+important to note that an end of file does not close the currently open
+section. There cannot be a symbol on a SECTION line.
+
+ENDSECTION
+
+Specifies the end of a section. This is optional. There cannot be a symbol
+on an ENDSECTION line.
+
+ORG
+
+Specifies the assembly address. For the raw target, this is advisory and
+only affects the addresses of symbols. For the object file target, this can
+only appear outside of all sections. For the DECB target, each ORG statement
+after which any output is generated will generate a segment in the output
+file. must be completely resolved during pass 1 of the assembly
+process and thus may not refer to forward references or external symbols, or
+other symbols that refer to such.
+
+ EQU
+
+Makes equivalent to . may be an external reference
+in which case any references to will also be external references.
+
+EXPORT [ as ]
+
+Marks previously defined for export. If is specified, it
+will be exported as . must not be an external reference and
+must be defined before EXPORT.
+
+EXTERN [ as ]
+IMPORT [ as ]
+
+Marks as an external reference. If is specified, is
+the local name the symbol is referenced as in this assembly file while
+ is the actual symbol to be referenced externally.
+
+END []
+
+Marks the end of the assembly process. Immediately terminates assembly
+without processing any other lines in this file or any others. It is
+optional. is only allowed for the DECB target in which case it
+specifies the execution address. If it is not specified, the address
+defaults to 0.
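
The expression encoding described in doc/object files.txt above (term types
00 to 05 followed by their data, in postfix order) can be illustrated with a
small decoder. This is only a sketch in Python and is not part of LWASM or
LWLINK. The function names, the tuple layout, and the short operator names
are hypothetical. It also assumes that integers are unsigned and that a
section base address reference (05) carries no data bytes, neither of which
the text states explicitly.

```python
import struct

# Term type bytes from the TERMTYPE table in doc/object files.txt.
TERM_END, TERM_INT, TERM_EXTERN, TERM_LOCAL, TERM_OPER, TERM_SECBASE = range(6)

# Operator numbers from the NUM/OP table; the short names are illustrative.
OPERATORS = {
    0x01: "+", 0x02: "-", 0x03: "*", 0x04: "/", 0x05: "%", 0x06: "\\",
    0x07: "and", 0x08: "or", 0x09: "xor", 0x0A: "booland", 0x0B: "boolor",
    0x0C: "neg", 0x0D: "com",
}


def read_cstring(data, pos):
    """Return (text, next_pos) for a NUL-terminated string starting at pos."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("ascii"), end + 1


def decode_expression(data, pos=0):
    """Decode one postfix expression; return (terms, next_pos)."""
    terms = []
    while True:
        kind = data[pos]
        pos += 1
        if kind == TERM_END:
            return terms, pos
        if kind == TERM_INT:
            # 16 bit big endian value; unsigned is an assumption.
            (value,) = struct.unpack_from(">H", data, pos)
            terms.append(("int", value))
            pos += 2
        elif kind in (TERM_EXTERN, TERM_LOCAL):
            name, pos = read_cstring(data, pos)
            terms.append(("extern" if kind == TERM_EXTERN else "local", name))
        elif kind == TERM_OPER:
            terms.append(("op", OPERATORS[data[pos]]))
            pos += 1
        elif kind == TERM_SECBASE:
            # Assumption: no payload bytes follow a section base reference.
            terms.append(("secbase", None))
        else:
            raise ValueError(f"unknown term type {kind:#04x} at offset {pos - 1}")


# "start 4 +" in postfix: the local symbol "start" plus the integer 4.
encoded = b"\x03start\x00" + b"\x01\x00\x04" + b"\x04\x01" + b"\x00"
terms, consumed = decode_expression(encoded)
print(terms)
```

In an incomplete references table, each decoded expression would be followed
by the 16 bit offset where the reference goes, and an expression that decodes
to an empty list would mark the end of the table.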