Term Definitions
SCOORE Definitions
Physical Structures Terms
Reservation Station
An instruction window located inside each cluster. Reservation stations (RS) are part of the issue logic, together with the scheduler and select.
Scheduler
The main instruction window (scheduler), placed after rename and before select. The scheduler is a big instruction window with an efficient FPGA mapping. It requires less hardware per entry than the reservation stations.
The scheduler can be activated/deactivated at run-time or compile time and SCOORE still works (with a lower IPC).
Logic Terms
Window Logic
A structure that can insert and remove elements out of order, as opposed to a FIFO or LIFO, which constrain the order of insertion and removal.
Wakeup/Select
Before an element can be removed from a window logic, it is necessary to wake up (set ready) the entry and then select which of the currently ready entries should be removed from the window logic.
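As a rough behavioral sketch of these two steps, the Python model below keeps a small window whose entries wait on source tags. The names `Entry`, `Window`, `wakeup` and `select` are hypothetical, and the oldest-first select policy is an assumption made for illustration; nothing above states which policy SCOORE uses.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    age: int                                   # insertion order, used by select
    name: str                                  # illustrative instruction label
    waiting: set = field(default_factory=set)  # source tags not yet produced

    @property
    def ready(self) -> bool:
        return not self.waiting


class Window:
    """Window logic: entries can be inserted and removed in any order."""

    def __init__(self):
        self.entries = []
        self._next_age = 0

    def insert(self, name, sources):
        self.entries.append(Entry(self._next_age, name, set(sources)))
        self._next_age += 1

    def wakeup(self, tag):
        """Mark `tag` as produced; entries with nothing left to wait on are ready."""
        for entry in self.entries:
            entry.waiting.discard(tag)

    def select(self):
        """Remove and return the oldest ready entry, or None (assumed policy)."""
        ready = [entry for entry in self.entries if entry.ready]
        if not ready:
            return None
        chosen = min(ready, key=lambda entry: entry.age)
        self.entries.remove(chosen)
        return chosen


w = Window()
w.insert("i0", ["p1"])        # waits on p1
w.insert("i1", [])            # ready immediately
w.insert("i2", ["p1", "p2"])  # waits on p1 and p2

first = w.select()            # i1 leaves before the older i0: out of order
w.wakeup("p1")                # i0 becomes ready, i2 still waits on p2
second = w.select()
third = w.select()            # nothing else is ready yet
print(first.name, second.name, third)  # i1 i0 None
```

A FIFO would have forced i0 out first; here the younger i1 leaves while i0 is still waiting for its wakeup.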
Busy Paradigm
In SCOORE we can set a "busy" signal, which means that the receiving unit has notified the sending unit that it can NOT accept any more inputs. For a "busy" implementation, this means that the sending unit moves on to new data relative to the data it sent when busy initially went high, and continues to send this new data until 1 cycle AFTER busy goes low. E.g., the channel control between Unit A and Unit B could be as shown in the Busy Paradigm trace further below (the values are available only after clock posedge).
Retry Paradigm
In SCOORE we can also use the "retry" paradigm, which is slightly different from the busy paradigm. For retry, the sending unit must remember the output data that it sent to the receiving unit on the cycle that busy went high. It will "retry", i.e. keep resending this data, until busy goes low and for one cycle after that.
A is the sending unit; B is the receiving unit.
ALL signals are in reference to what A sees as input, or sets as output, on the rising edge of the clock.
Description of the signals below: CLK = clock. The top row of numbers to the right of CLK identifies the clock period. The values underneath each number are F (false) or T (true) as seen as input by A for "B_to_A busy", and the instruction set as output by A for "A_to_B DInst".
Retry Paradigm*************************
CLK            0 1 2 3 4 5   -   0 1 2 3 4 5   -   1 2 3 4 5
B_to_A busy    F F T T T F       F F T F F F       F T T F F
A_to_B DInst   A B C C C C       D E F F G H       I J J J K
               busy > 2 cycles   busy 1 cycle      busy 2 cycles
Busy Paradigm*************************
CLK            0 1 2 3 4 5   -   0 1 2 3 4 5   -   1 2 3 4 5
B_to_A busy    F F T T T F       F F T F F F       F T T F F
A_to_B DInst   A B C D D C       D E F G H I       I J K K L
               busy > 2 cycles   busy 1 cycle      busy 2 cycles
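The sender-side rules behind the two traces can be sketched as a small Python model. This is one reading of the traces, not the SCOORE RTL: the function names are hypothetical, and the hold condition used for the busy paradigm (busy seen high in the two preceding cycles) is an assumption inferred from the 1-cycle and 2-cycle busy segments. In both functions A reacts to the busy value of the previous cycle, since values are only available after the clock posedge.

```python
def parse(trace):
    """Turn a string such as "FFT" into a list of booleans."""
    return [c == "T" for c in trace]


def retry_sender(busy, items):
    """Retry paradigm: resend the previous output whenever busy was high
    in the previous cycle; otherwise send the next new item."""
    out, idx = [], 0
    for t in range(len(busy)):
        if t > 0 and busy[t - 1]:
            out.append(out[-1])       # retry the same data
        else:
            out.append(items[idx])    # send new data
            idx += 1
    return out


def busy_sender(busy, items):
    """Busy paradigm (assumed reading): the cycle after busy first goes high
    the sender already moves to new data, then holds that data for as long
    as busy was high in the two preceding cycles."""
    out, idx = [], 0
    for t in range(len(busy)):
        if t > 1 and busy[t - 1] and busy[t - 2]:
            out.append(out[-1])       # hold the new data
        else:
            out.append(items[idx])    # send new data
            idx += 1
    return out


# Whole retry trace (17 cycles), then the last two segments of the busy trace.
retry_out = "".join(retry_sender(parse("FFTTTFFFTFFFFTTFF"), "ABCDEFGHIJK"))
busy_1 = "".join(busy_sender(parse("FFTFFF"), "DEFGHI"))
busy_2 = "".join(busy_sender(parse("FTTFF"), "IJKL"))

print(retry_out)  # ABCCCCDEFFGHIJJJK
print(busy_1)     # DEFGHI
print(busy_2)     # IJKKL
```

The visible difference is which value gets repeated: with retry the data sent on the cycle busy went high (J) is resent, while with busy the sender has already advanced to the next value (K) and repeats that one.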
In-Flight Instructions
In-flight instructions refers to the total number of instructions being processed by SCOORE. An instruction is counted from the moment it is fetched until it is retired.
NOTE that it is NOT the same as the ROB size, which accounts for instructions from the rename stage until the retire stage.
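A small worked example of the difference, as a Python sketch. The stage names and per-stage counts are made up for illustration and do not describe the actual SCOORE pipeline.

```python
# Hypothetical snapshot: instructions currently held in each pipeline stage,
# listed in program-flow order (names and numbers are illustrative only).
occupancy = {
    "fetch": 4,
    "decode": 4,
    "rename": 2,
    "schedule": 10,
    "execute": 6,
    "retire": 2,
}
stages = list(occupancy)


def count_from(first_stage):
    """Sum the instructions from `first_stage` through the last stage."""
    start = stages.index(first_stage)
    return sum(occupancy[s] for s in stages[start:])


in_flight = count_from("fetch")     # fetch until retire
rob_entries = count_from("rename")  # rename until retire

print(in_flight, rob_entries)  # 28 20
```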
RAMP Relation
RAMP is a collaborative research community that aims at using FPGAs to accelerate processor simulations. Most RAMP projects aim at multiprocessor acceleration.
This is a similar goal to SCOORE. A major difference is that RAMP does not emphasize ASIC or synthesis, but rather FPGA acceleration.
RTL Layers
The RAMP common interface divides the RTL implementation into 3 blocks:
-Model RTL: The RTL required to implement a CPU/Memory/... model. Equivalent to the synos/scoore directory.
-Unmodel RTL: The RTL required to gather statistics or control the modeled architecture. It may not be necessary for executing instructions in the modeled architecture, but it is a requirement for researchers.
-Platform RTL: Each model and unmodel RTL needs to be synthesized for a specific FPGA platform. The platform RTL is the RTL specific to each platform.
The RAMP project aims at providing a common interface to avoid rewriting the platform and unmodel RTL. SCOORE can benefit, as it could potentially reuse both.
Unit / Channel
RAMP partitions the design into units and channels. Main characteristics:
-Units only communicate through channels
-Units wait until all the inputs (in channels) are ready before generating outputs (out channels)
-Channels have a fixed latency
-Channels are unidirectional
-Channels have an initialization
-A firing happens when a unit reads/consumes all the inputs and generates all the outputs.
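The characteristics listed above can be illustrated with a minimal Python sketch. It is not the RAMP interface itself: the `Channel` and `Unit` classes are hypothetical, and modeling a fixed latency of N by pre-filling the channel with N initialization values is an assumption made for this example.

```python
from collections import deque
from itertools import count


class Channel:
    """Unidirectional, fixed-latency channel, modeled as a pre-filled FIFO."""

    def __init__(self, latency, init=None):
        # Assumption: a latency of N is modeled by N initialization values.
        self.queue = deque([init] * latency)

    def has_data(self):
        return len(self.queue) > 0

    def read(self):
        return self.queue.popleft()

    def write(self, value):
        self.queue.append(value)


class Unit:
    """A unit talks to other units only through its in/out channels."""

    def __init__(self, inputs, outputs, fn):
        self.inputs, self.outputs, self.fn = inputs, outputs, fn
        self.firings = 0

    def can_fire(self):
        # Wait until every input channel has data.
        return all(ch.has_data() for ch in self.inputs)

    def fire(self):
        # One firing: consume all inputs, then generate all outputs.
        args = [ch.read() for ch in self.inputs]
        results = self.fn(*args)
        for ch, value in zip(self.outputs, results):
            ch.write(value)
        self.firings += 1


link = Channel(latency=2)
counter = count()
seen = []


def produce():
    return (next(counter),)   # one value for the single output channel


def record(value):
    seen.append(value)
    return ()                 # the sink has no output channels


source = Unit([], [link], produce)
sink = Unit([link], [], record)

for _ in range(4):
    for unit in (source, sink):
        if unit.can_fire():
            unit.fire()

print(seen)  # [None, None, 0, 1]
```

The sink first sees the two initialization values and then the source's data two firings late, which is the fixed channel latency in this model.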
The RAMP channels are similar to the SCOORE inter-block communication. In SCOORE:
-All the inputs are read every cycle. The unit or block is responsible for reading them. A key difference is the handling of the busy signal explained previously.
-RAMP uses an "enable" signal to notify that the channel has data. We use a "valid" signal. E.g.:
typedef struct packed {
  BoolType valid;  // equivalent to the RAMP enable
  BoolType fault;
  RIDType  RID;
  OPType   op;
  PRegType psrc1;
  PRegType psrc2;
  IMMType  imm;
  PRegType pdest;
} DInst_SEType;