Branch Content

Table of Content

Branch Up

Machine Cycles

Point, Open, Take

Any data transfer within the computer occurs always between two storage devices. By "storage" we mean any kind of device capable of holding data, such as a CPU register, a memory cell or an UART buffer.

Usually, the transfer takes three steps in time. First, both devices needs to be addressed. Second, the source devices needs to be opened to the bus. Third, the destination device needs to be clocked to latch the data from the bus. We call these steps: Point, Open and Take respectively. Each step has a duraction calculated to allow data to estabilize before commiting the operation.

The sequence of these steps is called a Machine Cycle (MC). There are different types of MCs as we shall see, but they are all essencially the same: Point (address), Open (three-state buffer) and Take (clock latch device). Some times we can skip one or two steps, making the MC shorter.

Fetch and Execute Cycles

In the following time diagram, T1, T2 and T3 constitute the steps mentioned before: Point, Open, Take respectively. Green traces represent sigals in bus lines whereas blue ones represent internal control signals (name indicated between brackets).

The naming convention for internal control signals is this:

OP_DEV

Where OP (operation) can be either RD (read, that is open the three-state buffer to the bus) or WR (write, that is clock the device to latch the input from the bus). And DEV (device) can be a register or any other storage device.

The diagram shows a complete instruction cycle for an indirect address operation. The memory location pointed by 20-bits register Y will be written from register A.

This instruction completes in two machine cycles: Fetch and Execute. The thing to notice is how different control signals appear in time to perform "Point", "Open", "Take" operations over different devices. The duration of T pulses should be big enough to allow data to stabilize before commitment.

Fetch Operand Cycle

A memory-read cycle for 16-bits data reads a word from memory. An instruction such as

    movi    a, value   ; a <- value (16-bits)

can complete in three machine cycles: Fetch OP-Code, Fetch Operand, Execute. For 20-bits operands, the mechanism is different. A 20-bits operand is needed for direct address instructions such as this:

    sto    a, addr     ; [addr] <- a (addr is 20-bits)

It is also needed for Immmediate 20-bits instructions such as this:

    movi   sp, value   ; sp <- value (20-bits)

In both cases, the least significant 16 bits word (LSW) of the 20-bits address or value is given in the word following the instruction. The most significant 4-bits nibble (MSN) is embedded within the OP-CODE. Thus such instructions can complete in three machine cycles as the following figure shows.

Control signals RD_ID and WR_ID applies to a special 20-bits register that I called "ID" in the drawing but I will call it "MDR" instead. Its sole mission is to hold the operand so it is available to the Exectute cycle. Yes, in the Heritage/1 the MDR register is the size of the Address Bus and there is no MAR register.

Transfer between registers don't require the "Point" step because registers are implicitly addressed by the OP CODE. In the time diagram below, EXEC machine cycle, T1 is an "Open" step and T2 is a "Take" step.

The following figure illustrates how this could be made in practice. Please know that this is not a definite design but only an illustration to the idea.

The instruction code shown in the figure presents two 3-bits fields to encode the destination and source register respectively. This bits are wired to respective decoders which in turn provide all S_reg. (select, that is, open) and WR_reg. (write, that is clock to latch) for all registers. In the example we have assumed eight registers in the CPU.

Note that one of these decoders (the one in the top) is dedicated to S_reg. signals whereas the other is dedicated to WR_reg. signals. The encoders also have an "strobe" input, so they will output nothing until the strobe signal is present and this comes from the Instruction Decoder Logic.

This logic takes the OP COD as input and it also takes the synchronization signals T1, T2, T3, T4. Those signals are obtaining from a 2-to-4 decoder fed from a 2-bits counter which in turn is clocked from the master clock.

We have seen that most instructions takes three steps (T1, T2, T3), but this takes only two. The solution is to provide a "shortcut" by reseting the counter so the decoder can jump to T1.

For that purpose, the Instruction have a 2-bits field termed "last T". The content is constantly compared with the counter output as we can see in the diagram. In this case, last T is equal 2, so as soon as T3 tries to appear, it will be forced to become T1.

The Logic is designed so signals S_B and WR_A appear as shown in the timing diagram. Then the counter is reset, as it is IR, so the next cycle will be, with no choice, a fetch cycle.

Registers suitable for increment/decrement operations are implemented with binary counter chips. This means that increment/decrement instructions (such as INCR Y) do not involve data transfer and can be therefore performed in one single step, that is during T1.

Synchronizing with slow Peripherals

Peripherals for the Heritage/1 must be designed so they communicate with the Bus via three-state high-speed buffers such as the 74HC374. This is because the CPU will read and write the device (that is the buffer) at the same speed it does for static RAM memory. This means that, when reading from a peripheral, any delay introduced by the device itself must lie between the moment it is addressed and that in which the buffer got a valid data read from the device.

During this period (from addressing to data ready) the device must activate and keep active a bus line termed WAIT. The device will deactivate that line once the data has been safely transferred to the buffer.

The CPU samples the WAIT line during both "Open" and "Take" steps of every Memory Read cycle, that is T2 and T3 (remember that peripherals are memory mapped). If WAIT is active, the RD and WR signals are not activated, the IR register is not cleared and PC not incremented. This results into the same Machine Cycle being repeated over and over until the WAIT signal is removed by the device so the data (from the buffer) can be read as normal.

If an interrupt occurs while the CPU is waiting, it will be served. Upon return a new Machine Cycle will automatically resume the waiting condition.

Similar mechanism apply to write operations. The CPU will sample the WAIT line before rasing the WR signal in T2. Most likely the first time it will success since the data is written to a high speed buffer in the Peripheral circuitry. As soon as it is received, the peripheral will activate the WAIT line and will keep it active until it is ready for the next one. This implies that, when writting bulk data to a peripheral, the CPU will success the first but it will have to insert wait machine cycles for all others.

Proper design of peripherals will include, of course, the use of multibyte FIFO buffers. Those need to be 16 bits wide (the size of the Heritage/1 Data Bus); remember that this computer cannot address bytes individually.

The WAIT signal is then used to indicate buffer full condition.

Homebuilt CPUs WebRing

JavaScript by Qirien Dhaela

Join the ring?

David Brooks, the designer of the Simplex-III homebrew computer, has founded the Homebuilt CPUs Web Ring. To join, drop David a line, mentioning your page's URL. He will then add it to the list.
You will need to copy this code fragment into your page.

Project start date: May 13 of 2009