LegUp HLS implementation of SUBLEQ and SUBLEQREV computers
janders; 2 November 2015; 24 November 2015; 24 December 2015
Experimental settings:
Quartus 15
LegUp head branch pulled November 2015 (4.0+ with improvements to bitwidth minimization made by Julie)
No false-path settings in the .sdc file
Cyclone V, 28nm FPGA, logic elements based on ALMs (fracturable 6-input LUTs)
Loop pipelining ON for subleq processors
Area reports below are solely for the SUBLEQ/SUBLEQREV machines. Reporting ALMs NEEDED (the Altera tools also report ALMs used in final placement, which is a larger number).
Performance reports below are for the entire system (including memory)
Power measurements reflect the SUBLEQ/SUBLEQREV machines ONLY (no memory); 15% toggle rate for all signals; 50 MHz clock rate
Scenarios considered:
Key findings:
II = 3 with single cycle memory access (can start a new subleq(rev) instruction every 3 cycles)
II = 5 with dual-cycle memory access (can start a new subleq(rev) instruction every 5 cycles)
The above hold for both subleq and subleqrev
NO latency difference in cycles between subleq and subleqrev computers
Need to make minor code changes for subleqrev implementation to allow loop pipelining to work
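For readers unfamiliar with the machine, here is a minimal C sketch of a SUBLEQ fetch/execute loop, the kind of loop that loop pipelining targets. It is NOT the LegUp source used for these measurements: the function names, the three-word (a, b, c) instruction layout, the halt convention and the pipeline depth parameter are all assumptions made for illustration, and SUBLEQREV is not modeled. The second helper applies the usual pipelined-loop cycle estimate, depth + (n - 1) * II, to the II values above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical SUBLEQ interpreter sketch; not the LegUp source code.
 * Assumed instruction format: three consecutive words (a, b, c).
 * Semantics: mem[b] -= mem[a]; if (mem[b] <= 0) pc = c; else pc += 3.
 * Assumed halt convention: a negative pc, or a pc too close to the end
 * of memory to hold a full instruction.
 * Returns the number of instructions executed. */
long subleq_run(int32_t *mem, size_t mem_words, int32_t pc, long max_steps)
{
    long steps = 0;
    assert(mem != NULL);
    while (pc >= 0 && (size_t)pc + 2 < mem_words && steps < max_steps) {
        int32_t a = mem[pc];      /* address of subtrahend */
        int32_t b = mem[pc + 1];  /* address of minuend and result */
        int32_t c = mem[pc + 2];  /* branch target */
        /* Stop on out-of-range operand addresses. */
        if (a < 0 || b < 0 || (size_t)a >= mem_words || (size_t)b >= mem_words)
            break;
        mem[b] -= mem[a];
        pc = (mem[b] <= 0) ? c : pc + 3;
        steps++;
    }
    return steps;
}

/* Cycle estimate for a pipelined loop: the first iteration takes `depth`
 * cycles, each later iteration starts `ii` cycles after the previous one.
 * `depth` is an assumed parameter; it is not reported in these notes. */
long pipelined_cycles(long n_instr, long ii, long depth)
{
    if (n_instr <= 0)
        return 0;
    return depth + (n_instr - 1) * ii;
}
```

As a usage example under these assumptions, pipelined_cycles(1000, 3, 3) gives 3 + 999 * 3 = 3000 cycles for single-cycle memory, and pipelined_cycles(1000, 5, 5) gives 5000 cycles for dual-cycle memory.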
SUBLEQ
1 cycle memory
2 cycle memory
SUBLEQREV
1 cycle memory
2 cycle memory
Implementation results for the Tiger MIPS
Experimental settings:
Same Quartus version and device settings as above.
Tiger MIPS is implemented WITHOUT the divider units (janders: should I add it back? we had done this for the EUC paper, because at that time, we didn't support the MIPS division instruction)
Only look at area/power consumed within the Tiger MIPS core (not including system and cache)
Performance measurements (MHz) are for JUST the MIPS (no memory)
Power measurements assume a 15% toggle rate; 50MHz clock frequency
Tiger MIPS:
Area: 1737 ALMs needed, 6 DSP blocks (see Wong, Rose, Betz for tile-area ratio between DSP tiles and LAB tiles in Altera)
-
Power: 23.99 mW
Energy: ~479.8 pJ / cycle (23.99 mW / 50 MHz). We would expect that for the MIPS architecture, IPC is close to 1, so energy per instruction should be close to this figure.
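The energy figure above follows directly from the power and clock rate. A small C sketch of the arithmetic is given here as a check; the function names are hypothetical, and the IPC argument is an assumed input rather than a measured value.

```c
#include <assert.h>

/* Energy per clock cycle in picojoules.
 * power_mw / freq_mhz yields nanojoules per cycle (1e-3 W / 1e6 Hz = 1e-9 J),
 * so multiply by 1000 to get picojoules. */
double energy_pj_per_cycle(double power_mw, double freq_mhz)
{
    assert(freq_mhz > 0.0);
    return power_mw / freq_mhz * 1000.0;
}

/* Energy per instruction, given an assumed instructions-per-cycle value. */
double energy_pj_per_instruction(double pj_per_cycle, double ipc)
{
    assert(ipc > 0.0);
    return pj_per_cycle / ipc;
}
```

For the Tiger MIPS numbers, energy_pj_per_cycle(23.99, 50.0) evaluates to about 479.8 pJ, and with an assumed IPC of 1 the energy per instruction is the same value.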