Exercise 2: Compute the average CPI and MIPS for Ultra3
Using your final Ultra3 machine from Lab 3, run the following software program (PC starts at 10). Check to make sure your machine functions properly on the program.
3 DATA 2
4 DATA 1
10 LOADI R3,2
11 LOADD R2,2
12 LOAD R1,3
13 BNE R1,R2,2
14 ADD R1,R1,R2
15 STORE R1,5
16 ADD R1,R0,R2
17 BEQ R1,R2,2
18 JUMP 10
19 STORED R1,6
20 LOADI R1,4
21 BLT R2,R1,1
22 JUMPD 21
23 JUMP 24
24 SUB R1,R2,R3
25 JUMPI -15
Hand in the working Verilog code and output for the above software
program, along with comments that show what it does and why you think
it is correct.
Answer the following three questions (to be handed in).
A. What is the number of Clock Periods per Instruction (CPI) for each instruction? That is, how many clock periods does each instruction take? An easy way to find this is to run the above program and count the clock periods needed for each instruction. Note that the CPI includes the time to fetch the instruction.
Use a $display() to print out the start of fetching an instruction and other $display()s to print the name of each instruction.
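The counting described above can also be done mechanically once the $display() output has been collected. The following is a minimal Python sketch, assuming the output has already been turned into (time, label) pairs where the label "FETCH" marks the start of a fetch and any other label names the instruction. That trace format, the function name cpi_from_trace, and the times in the example are all hypothetical and are not taken from any real Ultra3 run.

```python
def cpi_from_trace(trace, clock_period=1):
    """Count clock periods per instruction from a list of (time, label) pairs.

    Assumed (hypothetical) format: label "FETCH" marks the start of an
    instruction fetch; any other label names the instruction being executed.
    An instruction's count runs from its own fetch to the next fetch, so the
    fetch time is included. The last instruction in the trace is not counted
    because no later fetch closes it.
    """
    counts = {}
    fetch_time = None
    name = None
    for time, label in trace:
        if label == "FETCH":
            if fetch_time is not None and name is not None:
                periods = (time - fetch_time) // clock_period
                counts.setdefault(name, []).append(periods)
            fetch_time, name = time, None
        else:
            name = label
    return counts


# Made-up trace for illustration only; real times come from your own run.
example_trace = [(0, "FETCH"), (3, "LOADI"), (5, "FETCH"), (8, "ADD"), (11, "FETCH")]
example_counts = cpi_from_trace(example_trace)
```

If an instruction shows different counts on different executions (for example, a branch taken versus not taken), the list for that instruction will hold more than one value, which is worth noting in your answer.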
Instruction  CPI       Frequency (in %)
Add          ________  20
BEQ          ________  10
BLT          ________  5
BNE          ________  10
Load         ________  25
LoadI        ________  10
LoadD        ________  4
Store        ________  5
StoreD       ________  5
Sub          ________  2
Jump         ________  2
JumpD        ________  1
JumpI        ________  1
B. Assume the instructions have the above frequency of use from a representative workload.
What is the average CPI? ________________
C. Assume the clock period is 100 nanoseconds. How many MIPS (Millions of Instructions Per Second) does your Ultra3 do? Show work.
What is the MIPS? ________________
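The arithmetic behind questions B and C can be sketched in Python. The average CPI is the frequency-weighted sum of the per-instruction CPIs, and MIPS follows from the average CPI and the clock period. The frequencies come from the table in this exercise; the function names and the CPI value of 5 used in the example are placeholders, not measurements of any Ultra3, and should be replaced with the counts from your own run.

```python
# Frequencies (in %) from the table in this exercise.
FREQ_PERCENT = {
    "Add": 20, "BEQ": 10, "BLT": 5, "BNE": 10, "Load": 25, "LoadI": 10,
    "LoadD": 4, "Store": 5, "StoreD": 5, "Sub": 2, "Jump": 2,
    "JumpD": 1, "JumpI": 1,
}


def average_cpi(cpi, freq_percent=FREQ_PERCENT):
    """Weighted average CPI: sum of CPI_i * frequency_i, divided by 100."""
    if sum(freq_percent.values()) != 100:
        raise ValueError("frequencies must sum to 100 percent")
    return sum(cpi[name] * pct for name, pct in freq_percent.items()) / 100.0


def mips(avg_cpi, clock_period_ns=100.0):
    """Millions of instructions per second.

    One instruction takes avg_cpi * clock_period_ns nanoseconds, so the
    rate is 1e9 / (avg_cpi * clock_period_ns) instructions per second,
    which is 1000 / (avg_cpi * clock_period_ns) in millions.
    """
    return 1000.0 / (avg_cpi * clock_period_ns)


# Placeholder CPIs (every instruction assumed to take 5 clock periods).
placeholder_cpi = {name: 5 for name in FREQ_PERCENT}
example_avg = average_cpi(placeholder_cpi)
example_mips = mips(example_avg)
```

With the placeholder values, a 100 ns clock period corresponds to a 10 MHz clock, and 10 million clock periods per second divided by 5 clock periods per instruction gives 2 MIPS.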
Exercise 3: Speeding Up Your Ultra3
Copy your Ultra3 to a file called Ultra4.v. Modify your Ultra4 to make it faster by removing instances of "#clock" and combining register transfers.
We strongly urge you to read and study pages 11 and 12 in Realization of Verilog HDL Computation Model on how to speed up Verilog code.
When one removes a #clock, one must analyze the effect of doing the statement in the same clock period as the previous statement, no matter how one branched there.
Important: After making a speed improvement to your Ultra4 make sure that it still works by running it on veriwell. We recommend that initially you use #clocks liberally in your designs. Once your design is working properly then remove one (1) #clock at a time and check the behavior of the machine after each removal.
For example, in the sequence
    #clock PC <= PC + 1;
    if ( IR[0] & IR[1] )
        #clock MA <= PC;
removing the #clock on the MA <= PC; register transfer means that if both IR[0] and IR[1] are 1, the old value of PC will be transferred to the MA during the same clock period that the PC is incremented.
In the control sequence below, the test of the first four bits of the IR uses the old value of the IR. If one intends to test the new value of the IR, one needs to add a #clock before the if.
    #clock IR <= MD;
    if ( IR[0:3] == 4'b0000 )
        #clock MA <= IR[8:15];
NOTE: Avoid doing anything like this!
    while( P[4] )
    begin
        P <= P + 1;
    end
The above is potentially an infinite loop which takes no time periods in the simulation. Why? Because there are no #clocks inside the loop to move time forward. The old value of P[4] is always used, not the updated one. Therefore, if P[4] is 1, a Verilog simulator will go into an infinite loop and just spin with this code.
You may combine register transfers and do them in the same clock period, such as
    #clock A <= B;
    #clock C <= D;
changed to
    #clock A <= B;
           C <= D;
as long as they do not interfere. You can't remove a #clock if the second transfer depends on the result of the first. For example,
    #clock MA <= PC;
    #clock B <= MA + 1;
can't be combined, as the new value of MA is used in the second statement. This is called a data dependency.
However, you could rewrite the above to be faster by substituting MA with PC in the second register transfer as shown:
    #clock MA <= PC; B <= PC + 1;
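The data dependency described above can be checked with a small Python model of register transfers. This is only a sketch under a simplifying assumption: all transfers that share one clock period read the old register values and commit their results together. The function clock_period and the starting register values are made up for illustration and are not part of any Ultra4.

```python
def clock_period(regs, transfers):
    """Apply one clock period of register transfers.

    Every right-hand side is evaluated against the OLD register values,
    then all results are committed together, mimicking `<=` transfers
    that share a single #clock.
    """
    new_values = {dest: rhs(regs) for dest, rhs in transfers}
    updated = dict(regs)
    updated.update(new_values)
    return updated


# Illustrative starting values only.
start = {"PC": 10, "MA": 0, "B": 0}

# Original: two clock periods, so B sees the new value of MA.
after_first = clock_period(start, [("MA", lambda r: r["PC"])])
two_clocks = clock_period(after_first, [("B", lambda r: r["MA"] + 1)])

# Naive merge into one clock period: B is computed from the old MA.
naive = clock_period(start, [("MA", lambda r: r["PC"]),
                             ("B", lambda r: r["MA"] + 1)])

# Rewritten merge: PC is substituted for MA, so one clock period suffices.
rewritten = clock_period(start, [("MA", lambda r: r["PC"]),
                                 ("B", lambda r: r["PC"] + 1)])
```

In this model the rewritten version ends with the same register values as the two-clock original, while the naive merge leaves B computed from the stale MA.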
Hand In:
Hand in printouts of code and runs of the original Ultra3 and your enhanced Ultra4. On the printout of the enhanced Ultra4, explain your improvements by underlining and adding written comments.
For your enhanced Ultra4, answer the following four questions:
A. What is the CPI for each instruction?
Instruction  CPI       Frequency (in %)
Add          ________  20
BEQ          ________  10
BLT          ________  5
BNE          ________  10
Load         ________  25
LoadI        ________  10
LoadD        ________  4
Store        ________  5
StoreD       ________  5
Sub          ________  2
Jump         ________  2
JumpD        ________  1
JumpI        ________  1
B. Assume the instructions have the same frequencies of use as before.
What is the average CPI? ________________
C. Assume the clock period is 100 nanoseconds. How many MIPS (Millions of Instructions Per Second) does your Ultra4 do? Show work!
What is the MIPS? ________________
D. Compare the old and the enhanced versions of your machine by computing the speedup.
What is the speedup of the enhanced Ultra4 over the original Ultra3? ___________
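One common way to compute the speedup in question D is the ratio of the old average time per instruction to the new one, which equals the ratio of the new MIPS to the old MIPS. The Python sketch below assumes that definition; the function name and the CPI values in the example are placeholders, not results from any real Ultra3 or Ultra4.

```python
def speedup(old_avg_cpi, new_avg_cpi, old_clock_ns=100.0, new_clock_ns=100.0):
    """Speedup = old time per instruction / new time per instruction.

    Time per instruction is average CPI times the clock period. With the
    same clock period for both machines this reduces to old CPI / new CPI.
    """
    old_time_ns = old_avg_cpi * old_clock_ns
    new_time_ns = new_avg_cpi * new_clock_ns
    return old_time_ns / new_time_ns


# Placeholder averages: 5.0 for the original machine, 4.0 for the enhanced one.
example_speedup = speedup(5.0, 4.0)
```

A result above 1 means the enhanced machine is faster; with the placeholder values the enhanced machine would be 1.25 times as fast as the original.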
Category 1: Fastest Ultra4 in MIPS which adheres to our memory model of always accessing memory through the MA and MD registers.
Category 2: Fastest Ultra4 in MIPS with the above restriction relaxed.
Any legal operation of the simulator is allowed but the machine must
still work properly.
______________________________
You must state which category you wish to enter. You may submit entries to both categories. An entry must include a listing of Verilog code, a run of the program in Exercise 2 and correctly computed MIPS. All entries to be considered for judging must be in by Tuesday 1 PM September 25, 2007. For each category, the prize for first place is a half dozen chocolate chip cookies. The machine must produce the correct results. Incorrect machines are automatically disqualified. Decisions of the judges are final. Employees of the company that sponsors the contest and their family members are not eligible to enter. Contest is void in states where regulations prohibit such contests.