Select

From Vlsiwiki
Revision as of 00:35, 28 March 2009 by Test

Select Module Design Specifications

The select stage sits between the Scheduler and the Compute Engine (CE). Its main purpose is to avoid read-after-write (RAW) hazards between clusters and to resolve structural hazards caused by the limited number of units available. The select unit has two stages, SE0 and SE1, and each stage takes one clock cycle. A further stage, CE0, overlaps the Select module and the CE module. The select module takes in 6 instructions per clock cycle, and each instruction is put into a FIFO queue. There are 6 queues, and each holds 6 instructions.

Instruction FIFO queues
FIFO 6:  I36 I35 I34 I33 I32 I31
FIFO 5:  I30 I29 I28 I27 I26 I25
FIFO 4:  I24 I23 I22 I21 I20 I19
FIFO 3:  I18 I17 I16 I15 I14 I13
FIFO 2:  I12 I11 I10 I9  I8  I7
FIFO 1:  I6  I5  I4  I3  I2  I1
Select Module stages: SE0, SE1
Overlapping Module stage: CE0

Each of the four clusters in the CE holds 128 registers in its Register File, for a total of 512 registers. A 9-bit binary number is assigned to each register. Because each cluster holds 128 registers, the highest 2 bits identify the CE cluster/unit to which the register belongs, and the lowest 7 bits select the register within that unit.

Register Numbers and Corresponding Units
Register Number    Highest 2 bits    Compute Engine (CE)
0 - 127            00                A Unit RF
128 - 255          01                B Unit RF
256 - 383          10                C Unit RF
384 - 511          11                M Unit RF
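
The mapping in the table above can be sketched in a few lines of Python. This is only an illustration: the names `UNIT_BY_TOP_BITS` and `decode_register` are hypothetical and do not come from the specification.

```python
# Sketch only: these names are illustrative, not part of the specification.
UNIT_BY_TOP_BITS = {0b00: "A", 0b01: "B", 0b10: "C", 0b11: "M"}


def decode_register(reg):
    """Split a 9-bit register number into (unit, index within that unit's RF)."""
    if not 0 <= reg <= 511:
        raise ValueError("register number must be in 0..511")
    top_bits = reg >> 7        # highest 2 of the 9 bits select the unit
    local_index = reg & 0x7F   # lowest 7 bits select one of 128 registers
    return UNIT_BY_TOP_BITS[top_bits], local_index
```

For example, `decode_register(300)` returns `("C", 44)`, since 300 lies in the range 256 - 383 and 300 - 256 = 44.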

There are three vectors called READY, SCHEDULING, and PENDING, and each has 512 rows and one column. Once set, the value of each element is either 0 or 1. The index into each vector is the register name/position. The READY vector is initialized to all x's. The SCHEDULING and PENDING vectors are both initialized to zeros.

READY bit

The READY bit indicates that the contents of a register are valid. When the destination register of an instruction is not yet ready, the READY bit is set to 0. When the DONE signal from the CE is received and the contents of the destination register become valid, it is set to 1. In other words:

@ SE0:
      READY [DINST.DEST] = 0
@ WB stage of the CE:
      READY [DINST.DEST] = 1

SCHEDULING bit

At the first stage, SE0, the SCHEDULING bit of the destination register is set to 1 when the instruction is read. When the instruction has been successfully sent out and the corresponding CE resource is not busy, the SCHEDULING bit of the destination register is reset to 0.

@ SE0:
        SCHEDULING [DINST.DEST] = 1
@ CE0: //if sent & !busy
        SCHEDULING [DINST.DEST] = 0
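
The READY and SCHEDULING updates can be modelled together as a small software sketch. The class and method names below are hypothetical, and `None` is used as an assumed stand-in for the initial x value of the READY vector.

```python
# Sketch only: a behavioural model, not the hardware implementation.
NUM_REGS = 512


class SelectBits:
    def __init__(self):
        self.ready = [None] * NUM_REGS    # None models the initial 'x' state
        self.scheduling = [0] * NUM_REGS  # initialized to zeros

    def se0(self, dest):
        """SE0: the destination is about to be written, so it is not valid."""
        self.ready[dest] = 0
        self.scheduling[dest] = 1

    def ce0(self, dest, sent, busy):
        """CE0: clear SCHEDULING only if the instruction was sent and the resource was free."""
        if sent and not busy:
            self.scheduling[dest] = 0

    def writeback(self, dest):
        """WB stage of the CE: the destination register is now valid."""
        self.ready[dest] = 1
```

Calling `se0(5)`, then `ce0(5, sent=True, busy=False)`, then `writeback(5)` walks register 5 through the sequence described above.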

PENDING bit

The PENDING bit lets the CE know when the contents of a register held in another unit are needed. For example,

     RA1 ← RA2 + RB1 

The arithmetic operation takes place in the A unit of the CE, but the source register RB1 is located in the Register File of the B unit. A signal is therefore needed to tell the B unit in advance to forward that register to the A unit. The signal is simply the address of RB1, the register to be forwarded. Before the forwarding is performed, the PENDING bit is set to 1. The PENDING bit is thus set or reset based on the source registers of an instruction. The register must also be ready: its READY bit has to be set before any forwarding is issued. The complete check for forwarding is therefore:

IF (READY & PENDING)
        SEND it to be forwarded //send the address

The pending bit is cleared when forwarding is finished.

@ SE0: //setting to zero
         IF forwarding is needed
            PENDING [fwd] = 0
@ CE0: //setting to 1
         IF (DINST.SRC1.POS != DINST.DEST.POS)
            PENDING [DINST.SRC1] = 1
         IF (DINST.SRC2.POS != DINST.DEST.POS)
            PENDING [DINST.SRC2] = 1
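
The PENDING logic can be sketched in Python under two assumptions: that `.POS` refers to the unit given by the highest 2 bits of the register number, and that the second check sets the bit for SRC2. All function names here are hypothetical.

```python
# Sketch only: assumes POS is the unit encoded in the highest 2 bits.
def unit_of(reg):
    return reg >> 7


def mark_pending(pending, src1, src2, dest):
    """Set PENDING for each source register that lives in a different unit than dest."""
    for src in (src1, src2):
        if unit_of(src) != unit_of(dest):
            pending[src] = 1


def forward_candidates(ready, pending):
    """Registers that pass the IF (READY & PENDING) check."""
    return [r for r in range(len(pending)) if ready[r] == 1 and pending[r] == 1]


def clear_forwarded(pending, reg):
    """Clear PENDING once the register has been forwarded."""
    pending[reg] = 0
```

For RA1 ← RA2 + RB1, taking RA1 = 1, RA2 = 2 and RB1 = 129 as example register numbers, only register 129 is marked pending, and it becomes a forwarding candidate once its READY bit is 1.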

Up to 4 forwarding signals can be sent in each clock cycle. The predetermined order is

  1. LOAD
  2. unit – ALU
  3. A unit
  4. C unit
  5. B unit – Branch
  6. STORE
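
One way to read the ordering above is as a fixed-priority selection with a limit of 4 grants per cycle. The sketch below assumes that each source raises at most one forwarding request per cycle, which the specification does not state. The labels are copied from the list as written, and the function name is hypothetical.

```python
# Sketch only: labels follow the list above; one request per source is an assumption.
FORWARD_ORDER = ["LOAD", "unit - ALU", "A unit", "C unit", "B unit - Branch", "STORE"]
MAX_FORWARDS_PER_CYCLE = 4


def select_forwards(requests):
    """Pick up to 4 requests in priority order.

    requests maps a source label to the register address it wants forwarded.
    """
    granted = []
    for source in FORWARD_ORDER:
        if source in requests and len(granted) < MAX_FORWARDS_PER_CYCLE:
            granted.append((source, requests[source]))
    return granted
```

With five sources requesting in the same cycle, the lowest-priority one is left to wait for a later cycle.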