# HG changeset patch
# User lost
# Date 1267490658 0
# Node ID ebff3a3e8fa6d0403e34ad2b6871177967dc2926
# Parent 67224d8d1024085e8081243e809baf7ee59bdd07
Updated internals to describe the multi-pass architecture

diff -r 67224d8d1024 -r ebff3a3e8fa6 doc/README
--- a/doc/README Tue Mar 02 00:10:32 2010 +0000
+++ b/doc/README Tue Mar 02 00:44:18 2010 +0000
@@ -11,3 +11,6 @@
 
 docbook2html -u manual.docbook.sgml && mv manual.docbook.html manual/manual.html
 
+PDF can be generated by doing:
+
+docbook2pdf -u manual.docbook.sgml && mv manual.docbook.pdf manual/manual.pdf
\ No newline at end of file
diff -r 67224d8d1024 -r ebff3a3e8fa6 doc/internals.txt
--- a/doc/internals.txt Tue Mar 02 00:10:32 2010 +0000
+++ b/doc/internals.txt Tue Mar 02 00:44:18 2010 +0000
@@ -4,46 +4,55 @@
 LWASM is a table-driven assembler that notionally uses two passes. However,
 it implements its assembly in several passes as follows.
 
-Pass 1 - Preprocessing & Parsing
---------------------------------
+Pass 1
+------
 
-This pass reads the source file and all included source files. It handles
-macro definition and expansion.
+This pass reads the entire source code and parses each line into an internal
+representation. Macros, file inclusions, and conditional assembly
+instructions are resolved at this point as well.
+
+Pass 2
+------
 
-As it reads the various lines, it also identifies any symbol associated with
-the line, the operation code, and, based on the operation code, the operand,
-if any. Upon examination of the operand, any expressions are stored in an
-internal postfix notation for later evaluation. During this pass,
-preliminary values are assigned to all symbols using the largest possible
-instruction size. A table of lines that reference every symbol is generated
-to be used in the following pass. Note that any symbols for which the value
-is known with no uncertainty factor will be generated with the smallest
-possible instruction.
+This pass assigns instruction sizes to all invariate instructions. Invariate
+instructions are any instructions with a fixed size, including those with
+forced addressing modes.
 
-At this stage, simple optimizations are performed on expressions. This
-includes coalescing constants (1+2+x => 3+x). It also includes some basic
-algebra (x+x => 2*x, 2*x+4*x => 6*x, x-x => 0).
+Pass 3
+------
+
+This pass resolves all instruction sizes that can be resolved without
+setting addresses for instructions. This process is repeated until no
+further instructions sizes are resolved.
 
-Pass 2 - Optimization
----------------------
+Pass 4
+------
 
-This pass sweeps the code looking for operations which could use a shorter
-instruction. If it finds one, it must then re-define all symbols defined
-subsequently and all symbols defined in terms of one of those symbols in a
-cascade. This process is repeated until no more possible reductions are
-discovered.
+This pass assigns addresses to all symbols where values are known. It does
+the same for instructions. Then a repeat of similar algorithms as in the
+previous pass is used to resolve as many operands as possible.
+
+This pass is repeated multiple times until no further instructions or
+symbols are resolved.
 
-If, in the process of implementing an instruction reduction, a phasing error
-or other conflict is encountered, the reduction is backed out and marked as
-forced.
+Pass 5
+------
 
-The following may be candidates for reduction, depending on assembler
-options:
+Finalization of all instruction sizes by forcing them to the maximum
+addressing mode. Then all remaining instruction addresses and symbol values
+are resolved.
 
-- extended addressing -> direct addressing (not in obj target)
-- 16 bit offset -> 8 bit offset (indirect indexed)
-- 16 bit offset -> 8 bit or 5 bit offset (direct indexed)
-- 16 bit offset -> no offset (indexed)
-- 16 bit relative -> 8 bit relative (depending on configuration)
+Pass 6
+------
+
+This pass does actual code generation.
 
+Expression Evaluation
+=====================
+
+Each expression carries a certainty flag. Any expression in which any term
+is flagged as uncertain is, itself, uncertain. There are a few specific
+cases where such uncertainty can cancel out. For instance, X-X where X is
+uncertain is guaranteed to be 0 and so there is no uncertainty.
+