The H1ASM assembler (v.0)

H1ASM v.0 is a rudimentary two-pass assembler designed to directly produce executable code for the Heritage/1 minicomputer. The tool is written in PHP for the command-line interface (CLI) and runs on a modern personal computer under Linux, Windows or Mac OS X. The resulting executable code must be transferred to the target minicomputer as a binary file.


The H1ASM v.0 assembler is designed to work without a companion linker: it produces executable code directly. This means it does not export symbols and cannot work with precompiled object code or libraries, which is indeed a limitation.

It does, however, support source file inclusion, which allows a large project to be organized into manageable source modules. It also implements the notion of variables which, combined with file inclusion, will hopefully help in writing structured, reusable, maintainable code.

Macros, conditional compilation, pseudoinstructions and other advanced features are not implemented in this version, but there are plans to provide them in the near future.

Launching the assembler program

The assembler's main file is a PHP script named h1asm.php. Assuming you have execute permission on that script, you launch the assembler with the following command:

h1asm.php  main_src_file_path  include_dir

The first argument is the path to the main source file. The second argument is optional and names the directory where included files are located; if it is missing, the assembler takes the include directory from the first argument, assuming that the main file sits in the same directory as all included files.
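
The fallback rule for the include directory can be sketched in a few lines. This is only an illustration written in Python (H1ASM itself is a PHP script); the function name resolve_include_dir and the paths used with it are hypothetical.

```python
import os

def resolve_include_dir(main_src_path, include_dir=None):
    """Return the include directory (hypothetical helper).

    If the second command-line argument was given, use it as is;
    otherwise fall back to the directory holding the main source file.
    """
    if include_dir:
        return include_dir
    return os.path.dirname(os.path.abspath(main_src_path))
```

For instance, resolve_include_dir("/proj/main.asm") yields "/proj" on a Unix-like system, while resolve_include_dir("/proj/main.asm", "/proj/inc") yields "/proj/inc".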

Anatomy of assembly code

The assembler processes the source file one line at a time. Lines are trimmed before processing, so leading spaces and tab characters can be inserted for better formatting if desired. Each line is one of the following types:

- Blank line                : Ignored
- Comment                   : Starts with a semicolon (;)
- Instruction               : Syntactically valid Heritage/1 instruction
- Label                     : Symbolic address. Starts with a colon (:)
- Symbol definition         : In the form SYMBOL equ VALUE, or: #define SYMBOL = VALUE
- Origin directive          : In the form #org ADDRESS
- Data directive            : In the form #data EXPRESSION
- Output Format directive   : In the form #format FORMAT
- Include directive         : In the form #include FILE_NAME

Comments cannot share a line with anything else; a comment always occupies a line of its own. The following is an example of acceptable assembly code.

; Listing #1
; ----------------------------------------------------
; ----------------------------------------------------
; This code implements INT 7 (both vector and handler code)
;   for copying BUFF_SIZE words from BUFF_SRC to BUFF_DES
#format H12
#include vectors.equ
        INT_7_CODE      equ     0x0400
        BUFF_SRC        equ     0x4000
        BUFF_DES        equ     0x0800
        BUFF_SIZE       equ     0x0020
#org    INT_7_VECTOR
#data   INT_7_CODE

#org    INT_7_CODE
         mvi        c, BUFF_SIZE
         mvi        d, BUFF_SRC
         mvi        e, BUFF_DES
:LOOP
         ldx        a, d
         stox       a, e
         inc        d
         inc        e
         dec        c
         jnz        LOOP
         jp        QUIT

; garbage, just to justify the forward jump
#data    0xffff
:QUIT
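
The line types listed earlier can be told apart by looking at the first character of the trimmed line. The following Python sketch only illustrates that idea; classify_line and the type names it returns are hypothetical and do not come from the H1ASM source.

```python
def classify_line(line):
    """Classify one source line (illustrative sketch, not H1ASM code)."""
    text = line.strip()              # lines are trimmed before processing
    if not text:
        return "blank"
    if text.startswith(";"):
        return "comment"
    if text.startswith(":"):
        return "label"
    if text.startswith("#"):
        # "#org 0x400" -> "directive:org"; "#define" lands here as well
        words = text[1:].split()
        return "directive:" + (words[0] if words else "")
    words = text.split()
    if len(words) >= 3 and words[1] == "equ":
        return "symbol definition"
    # Anything else is assumed to be an instruction; a real assembler
    # would still have to validate it against the instruction set.
    return "instruction"
```

Because a comment is recognized only when the semicolon is the first character, a line such as "inc d ; next word" would be treated as a (malformed) instruction, which matches the rule that comments need a line of their own.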


Assembler Output

The assembly process results in two files:

- Code file (.bin)     : Executable (binary) code.
- Listing file (.lst)  : As-built (ASCII) listing of the source with numeric
                         addresses and error messages, if any, as well as
                         extra information such as the symbol table.

The assembler creates their filenames from the source filename by replacing the file extension with .bin and .lst respectively.
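
As a small illustration of the naming rule, here is a Python sketch; output_paths is a hypothetical name, and the handling of a source file without any extension is an assumption.

```python
import os

def output_paths(source_path):
    """Return (code_file, listing_file) for a given source file."""
    base, _ext = os.path.splitext(source_path)   # drop the old extension
    return base + ".bin", base + ".lst"
```

With this sketch, output_paths("int7.asm") gives ("int7.bin", "int7.lst").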

The code file (.bin) is built according to one of the following formats (selected in code by the directive #format FORMAT):

- H10 : Exact image of code in memory using relative branch instructions only.
- H11 : Monolithic code with header specifying the ORG address.
- H12 : Fragmented code with header specifying different ORG addresses.
- H13 : Relocatable code with header indicating the words (offsets) in need
        of relocation.

These formats target the different operational environments that the target computer may offer. At an early stage of development no operational environment exists, so formats H10 and H11 are the most appropriate. Format H12 is useful for full memory dumps including vectors (such as in Listing #1). Format H13 is better suited for applications run in the presence of a loader capable of relocation.

If #format H10 is specified, the assembler automatically translates absolute branch instructions into relative ones. If the directive is missing, format H12 is assumed.

Error reporting

The assembler reports errors in the Listing file (.lst) in the form of single-line messages inserted right after the line where the error was found. The code file (.bin) will not be produced if errors have occurred.

There is no distinction between "errors" and "warnings"; errors are simply errors and will be reported as such.

Error checking depends on the targeted format. For example, duplicated #org directives will be reported if the H11 format is requested.

Syntax Details


Expressions

Expressions are used to directly represent numbers or strings. Since arithmetic operations are not included in the current assembly language, expressions are nothing but "literals".

As Heritage/1 is a strict 16-bit machine (memory is organized in 16-bit words), numerical expressions are always 16-bit integers (signed or unsigned), and values greater than 65,535 generate overflow errors at assembly time. Bytes (8 bits) are NOT represented within assembly code in any way. NOTE: A notable exception is the instruction INT, which takes an 8-bit vector number as its argument in assembly code (this byte is actually embedded within the op. code); in this case the argument is written as a normal 16-bit word, although the assembler will report an overflow error if it is greater than 0x00ff.

The following numerical expressions are valid:
0x03ff    Hexadecimal
0b010101  Binary
44        Decimal
-44       Decimal negative
65000     Decimal. Valid, but it may represent a negative number (two's complement)

The following expressions are illegal:
-0x03ff   Only decimals can be explicitly signed.
68456     Overflow.
64,000    Commas as delimiters in numbers are not accepted.
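
The rules above can be condensed into a short Python sketch. It is not taken from the H1ASM source: parse_number is a hypothetical name, and the lower bound of -32768 for signed decimals is an assumption, since the text only states the upper limit.

```python
import re

_HEX = re.compile(r"0x[0-9a-fA-F]+\Z")
_BIN = re.compile(r"0b[01]+\Z")
_DEC = re.compile(r"-?[0-9]+\Z")      # only decimals may carry a sign

def parse_number(text):
    """Return the 16-bit word for a numeric literal, or raise ValueError."""
    if _HEX.match(text):
        value = int(text, 16)
    elif _BIN.match(text):
        value = int(text, 2)
    elif _DEC.match(text):
        value = int(text, 10)
    else:
        raise ValueError("illegal numeric expression: " + text)
    # Assumption: negative values below -32768 are rejected as overflow.
    if value > 0xFFFF or value < -0x8000:
        raise ValueError("overflow: " + text)
    return value & 0xFFFF             # negatives become two's complement
```

Here parse_number("-44") returns 0xffd4, the two's complement word for -44, while "-0x03ff", "68456" and "64,000" all raise ValueError.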

A string is a null-terminated sequence of ASCII characters, each occupying one 16-bit word (the assembler fills the MSB with zeros). The null termination consists of a word with all bits cleared.

Strings are written in assembly code starting with a dollar sign ($), as in the following example:

    #data $Drive not on-line
    #data $Volume not mounted on drive
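
To make the memory layout concrete, here is a minimal Python sketch of how such a literal could be expanded into words. encode_string is a hypothetical helper, and rejecting characters above 0x7f is an assumption about what "ASCII" means here.

```python
def encode_string(literal):
    """Expand a '$'-prefixed literal into a list of 16-bit words."""
    if not literal.startswith("$"):
        raise ValueError("string literals start with '$'")
    words = [ord(ch) for ch in literal[1:]]   # one word per character
    if any(w > 0x7F for w in words):
        raise ValueError("non-ASCII character in string")
    return words + [0x0000]                   # null terminator word
```

A 14-character message such as "File not found" therefore occupies 15 words: 14 character words plus the terminator.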

A word about signed and unsigned integers

The Heritage/1 ALU makes use of two's complement arithmetic for adding and subtracting 16-bit numbers. However, this does not limit the programmer to the use of signed integers in assembly code, as signed or unsigned is mostly a matter of interpretation.

For instance, using the instruction ADD to add 0xf000 to 0x0001 results in 0xf001, which can be interpreted either as positive 61441 or as negative 4095 (-4095). Moreover, when building more powerful arithmetic in software (such as BCD or floating point), the programmer is responsible for defining data types and for representing and interpreting the arithmetic sign correctly.
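
The two readings of the same word can be checked with a few lines of Python. This models only the 16-bit arithmetic, not the ADD instruction itself (flags are ignored); add16 and as_signed are hypothetical names.

```python
def add16(a, b):
    """Add two words and wrap the result to 16 bits."""
    return (a + b) & 0xFFFF

def as_signed(word):
    """Interpret a 16-bit word as a two's complement integer."""
    return word - 0x10000 if word & 0x8000 else word

result = add16(0xF000, 0x0001)
print(hex(result), result, as_signed(result))   # 0xf001 61441 -4095
```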


Symbols

Simply put, symbols represent expressions. You define symbols by using the #define directive or the equ construct. The assembler also auto-defines some symbols (labels in particular) while parsing the code; in the following example, the value of the label QUIT is not given explicitly but calculated by the assembler.

; Listing #2
; Examples of symbols used as both address and data
STU_STOP     equ   0x0000
CMD_RWD      equ   0x0004
TAPE_REG     equ   0x4000
START        equ   0x0400

#org   START
       ; Check if tape is stopped.
       ld     a, TAPE_REG
       cmp    STU_STOP      
       jnz    QUIT

       ; Send Rewind command to the tape driver:
       mvi     a, CMD_RWD
       sto     a, TAPE_REG
:QUIT

As seen in the code, a valid symbol contains nothing but alphabetic and numeric characters and underscores ( _ ). Its length is restricted to 40 characters and the first character cannot be numeric.

Each symbol must be defined only once in the entire project scope, so caution must be observed, especially with large projects composed of many files. Duplicated symbols are reported as errors.
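
Both rules, the naming constraints and the single-definition requirement, can be sketched as follows. This is illustrative Python rather than H1ASM code; is_valid_symbol and define_symbol are hypothetical names, and allowing an underscore as the first character is an assumption (the text only forbids a leading digit).

```python
import re

_SYMBOL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")
MAX_SYMBOL_LENGTH = 40

def is_valid_symbol(name):
    """Letters, digits and underscores only; no leading digit; max 40 chars."""
    return len(name) <= MAX_SYMBOL_LENGTH and _SYMBOL.match(name) is not None

def define_symbol(table, name, value):
    """Add a symbol to the project-wide table, rejecting duplicates."""
    if not is_valid_symbol(name):
        raise ValueError("invalid symbol name: " + name)
    if name in table:
        raise ValueError("duplicated symbol: " + name)
    table[name] = value
```

Since the table is shared by the main file and every included file, defining BUFF_SIZE in two different source modules would raise the duplicate error in this sketch.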


Variables

Variables differ from symbols in that they can be assigned more than once within the code. You assign a value (a symbol, an expression or another variable) to a variable by using the #set directive.

Variables don't need to be defined as they get auto-defined with the first assignment encountered during the parsing process.

The following statements are valid:

#set foo = 0x044
#set foo = SOME_SYMBOL
#set other_var = foo
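
The behaviour of #set can be pictured as a table that, unlike the symbol table, allows reassignment. The Python sketch below is only an illustration: set_variable is a hypothetical name, the two dictionaries stand in for whatever tables the assembler keeps internally, and literal parsing is reduced to Python's own int() for brevity.

```python
def set_variable(variables, symbols, name, value_text):
    """Assign a value to a variable, creating it on first use."""
    if name in symbols:
        # symbols and variables are not allowed to share names
        raise ValueError("name already used by a symbol: " + name)
    if value_text in variables:        # another variable (or itself)
        value = variables[value_text]
    elif value_text in symbols:        # a previously defined symbol
        value = symbols[value_text]
    else:                              # a numeric literal (simplified)
        value = int(value_text, 0)
    variables[name] = value
```

Applied to the three statements above, with SOME_SYMBOL assumed to equal 0x4000, foo first holds 0x44, then 0x4000, and other_var ends up holding 0x4000 as well.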


Directives

By directives we mean those commands placed in assembly code that are addressed to the assembler, as opposed to instructions, which are addressed to the computer running the resulting binary code.

Directives start with the sharp character (#) followed by the directive's name, then the argument. These three parts are separated from each other by means of spaces or tabs.

The following directives are currently available.



#include FILE_NAME

This directive causes the given file (FILE_NAME) to be opened and processed immediately, as if it were part of the current file. This action is recursive, so further #include directives found in an included file are processed in the order in which they are encountered.

The #include argument (FILE_NAME) is the name of a source file expected to be in the "include directory"; the latter was either passed explicitly on the command line (second argument) when the assembler was launched, or extracted automatically from the source-file path (first argument) at that time.

You can also specify a full path in the #include directive. This is useful when reusable code lives in files placed in separate directories for better organization. The assembler detects whether the #include argument is a filename or a full path and acts accordingly. In either case, if the file does not exist, an error is reported.
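
One way to picture the filename-versus-path decision is the Python sketch below. It is not the assembler's actual logic: include_path is a hypothetical name, and treating "full path" as "absolute path" is an assumption.

```python
import os

def include_path(argument, include_dir):
    """Map an #include argument to the file that should be opened."""
    if os.path.isabs(argument):
        return argument                         # full path: use it as given
    return os.path.join(include_dir, argument)  # bare name: look in include dir
```

A real implementation would then check that the resulting file exists and report an error otherwise.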

A given file can be #included more than once. This might be used to compensate for the lack of Macros, as in the following example:

      ; Using reusable code for sorting a list in memory.
      ; The included code accepts arguments in registers d, c.

      ; List in memory to be sorted:
      mvi d, BUFF
      ; Variable in memory holding the buffer's size:
      ld  c, BUFF_SIZE

      ; The included code does the job...
      #include /home/armando/src/lib/quick_sort.asm




#org ADDRESS

Sets the origin address, effective from the first instruction following the directive line. ADDRESS can be either a symbol or a numerical expression.

The following statements are valid:

START  equ 0x400
#org   START
#org   0x400

The following will result in error: "Illegal use of string":

#org  $START




#data EXPRESSION

This directive is useful for filling areas of memory with fixed data such as lookup tables and string messages, as illustrated below.

#define   MSG_TABLE = 0x800

#org      MSG_TABLE
#data     $File not found
#data     $Stack overflow

The (null-terminated) string "File not found" is placed starting at address 0x800, followed by the string "Stack overflow" (30 words in total).



#define SYMBOL = VALUE

This directive defines a symbol by indicating the expression (VALUE) it represents. The same can also be done with the equ construct for numeric expressions as illustrated in previous examples.

Support for equ was introduced in H1ASM for compatibility with that "traditional" construct. However, the #define directive is conceptually more robust and more powerful in practice, since it also allows symbolic strings.

The previous example could also be written this way:

#define   MSG_TABLE         = 0x800
#define   ERR_MSG_F_NOFOUND = $File not found
#define   ERR_MSG_S_OV      = $Stack overflow

#org      MSG_TABLE
#data     ERR_MSG_F_NOFOUND
#data     ERR_MSG_S_OV



#set VAR = VALUE

The #set directive is used to assign a value to a variable. If the variable does not exist, it is created at that time. VALUE can be a symbol, an expression, another variable or even the same variable (which is not very useful, but it is legal anyway).

Once a variable is created, it can be used in instructions in place of the operand, as illustrated in the following example.

#define    BUFF_TAPE_1    = 0x4000
#define    BUFF_TAPE_2    = 0x4100

; Call subroutine passing argument in variable:
#set  @_buff = BUFF_TAPE_1

; Call subroutine again passing a different argument in the same variable:
#set  @_buff = BUFF_TAPE_2

      ; Variable used as operand:
      ld   e, @_buff

Symbols and variables cannot share names; failure to observe this will result in errors reported by the assembler. To work within this limitation, a naming convention should be employed. We suggest writing symbols in uppercase and variables in lowercase starting with the '@' character, as illustrated in the previous example.

The need for variables comes from the lack of macros in H1ASM. In fact, you can "encapsulate" reusable generic code into separate source files, then #include it in your "client code" passing arguments through variables. Here is an example:

#set  @_quick_sort_buff       = BUFF_ADDR
#set  @_quick_sort_buff_size  = BUFF_SIZE_VAR

; The included code does the job. It uses the above variables:
#include /home/armando/src/h1_lib/quick_sort.asm



#format FORMAT

Specifies the output format, as explained in the Assembler Output section.


Output File Format Details
