milestones

Tasks - Summer 2011

  • Comparison with manually-designed RTL:
    • Take a few manually-designed RTL designs and analyze them: inform decisions about what is needed in LegUp.
    • Take C implementations of such designs and synthesize with LegUp; compare to manually-designed RTL.
    • Examples: FFT, Jason Luu's biomedical project, other UofT projects, MATLAB/Simulink, Wei Zhang's matrix multiply, OpenCores
  • LegUp enhancements:
    • Loop pipelining (Andrew's going to take a first-cut in a course project)
    • Loop unrolling
    • Resource sharing
    • Scheduling
      • Try out the LP-driven scheduling that is used by AutoPilot
    • Improved resource estimates:
      • RAMs and FIFOs need to be added (weren't done by Ahmed last summer)
  • SystemC support:
    • AutoPilot builds a SystemC model to ease debugging and analysis of the synthesized design. This helps with back-end verification.
  • Bit accurate data-types:
    • Allows the synthesized HW to be smaller/faster, and tailored to the bit-widths of the application.
    • Look into Synopsys library.
    • Look into what Catapult C is doing.
  • Pragmas:
    • New pragmas that allow finer control of HW synthesis. Examples:
      • Control scheduling: ordering of operations in the schedule; number of cycles allowed.
      • Loop initiation interval for loop pipelining.
  • Pattern matching:
    • Look for “patterns” in the LLVM DFG that are suitable for ALTR HW blocks, like the DSP units.
  • Debugging framework:
    • Many things to do.
    • Li might do this for his M.A.Sc. thesis.
  • Stratix IV:
    • Enhance LegUp so it can also target Stratix IV. The main issue seems to be how the Tiger uP system talks to the off-chip memory on the DE4 board (which is DDR2?).
  • Concurrent processor/accelerator execution. James is working on this for his thesis. We need benchmarks that contain computations that can be executed in parallel.
  • Memory enhancements (current architecture):
    • Look into using Eric LaForest's multi-ported memory to reduce memory contention when accelerators and the processor work concurrently.
    • Look into adapting LaForest's work so that each port can have a separate clock.
  • New memory architectures:
    • Based on profiled memory access patterns in software, automatically synthesize a “good” memory architecture for the hybrid system. Perhaps includes auto-synthesis of cache coherency hardware. This is a BIG project.
  • GUI
    • Summer student could build a GUI front-end to LegUp to illustrate the system architecture trade-offs, help with design space exploration.
    • Increases the “sell-ability” and “marketability” of LegUp
    • Perhaps the GUI can allow the C code to be connected with the synthesized HW
  • FPGA architecture-specific HLS:
    • Research how the underlying FPGA architecture should influence HLS algorithms, such as resource sharing.
    • Look into whether sharing is useful in any cases: for example, there may be cases wherein the MUX'ing logic needed to facilitate sharing can be “rolled into” the operation being shared.
  • Add floating/fixed point support
  • Research alternative HW architectures (besides FSM/datapath):
    • Systolic arrays
    • Hand-shaking pipelining
    • “No instruction” processor.
  • Impact of compiler-like optimizations:
    • Do all of the LLVM compiler optimizations make sense for LegUp?
  • Speculative execution in the synthesized HW:
    • Early prefetching of memory items.
    • Early operator execution based on most common branch outcomes.
    • Borrowing ideas from the computer architecture realm – bring into LegUp to improve the average-case execution performance.
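
As a rough illustration of why the loop initiation interval (II) mentioned under the pragma and loop pipelining items matters, the sketch below compares cycle counts for a loop executed sequentially and the same loop pipelined. It is a back-of-the-envelope model, not LegUp code: the function names are hypothetical, and it assumes a fixed body latency with no stalls.

```python
def sequential_cycles(iterations: int, body_latency: int) -> int:
    """Cycles when each iteration finishes before the next one starts."""
    return max(iterations, 0) * body_latency


def pipelined_cycles(iterations: int, body_latency: int, ii: int) -> int:
    """Cycles when a new iteration is started every `ii` cycles.

    The first iteration takes `body_latency` cycles; every later
    iteration completes `ii` cycles after its predecessor.
    """
    if ii < 1:
        raise ValueError("initiation interval must be at least 1")
    if iterations <= 0:
        return 0
    return body_latency + (iterations - 1) * ii


# Hypothetical loop: 100 iterations, 4-cycle body.
print(sequential_cycles(100, 4))    # 400
print(pipelined_cycles(100, 4, 1))  # 103
print(pipelined_cycles(100, 4, 2))  # 202
```

Under these assumptions, halving the II roughly halves the execution time for long loops, which is why a pragma that lets the user set it gives useful control over the performance/area trade-off.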

Tasks - Summer 2010

  • Adding new C benchmarks to the suite (for example, choose a benchmark or two from SPEC or elsewhere and get it through LegUp; this will also give experience using LegUp).
    • Victor
  • Build parameterizable area/delay/power models for the most common high-level synthesis operators implemented in ALTR FPGAs: addition, subtraction, shift, multiply. We also need models for wide MUXes.
    • Ahmed
  • Defining/designing the processor/computer-accelerator interface. Evaluate ALTR Avalon, IBM CoreConnect, Wishbone, etc.
    • James
  • Find RTL implementations that are equivalent to the CHStone benchmarks: dfadd, mips, dfmul, etc. Compare their performance/area/power to the LegUp-synthesized RTL. They should use the test vectors from CHStone, so we will get a feel for integrating hardware blocks with C code.
  • Profile the CHStone benchmarks to determine H/W and S/W partitioning
  • Use the LegUp infrastructure to implement the algorithms in these papers:
    • Optimal Allocation and Binding in High-Level Synthesis
    • Force-Directed Scheduling for the Behavioral Synthesis of ASIC's
    • An Efficient and Versatile Scheduling Algorithm Based on SDC Formulation
  • LegUp GUI that would help users visualize the hardware. Could allow parameterized sweeps of timing constraints to illustrate the tradeoff between performance and area/power.
    • Li
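
Force-directed scheduling, one of the papers listed above, starts from the ASAP and ALAP schedules of the data-flow graph and the resulting mobility (slack) of each operation. The sketch below shows only that first step, not the force calculation itself and not LegUp's scheduler. It assumes unit-latency operations, no resource constraints, and an operation list already in topological order; all names are hypothetical.

```python
def asap(ops, deps):
    """Earliest start cycle per op. `deps` maps an op to its predecessors."""
    start = {}
    for op in ops:  # ops must be in topological order
        start[op] = max((start[p] + 1 for p in deps.get(op, [])), default=0)
    return start


def alap(ops, deps, last_cycle):
    """Latest start cycle per op such that everything ends by `last_cycle`."""
    succs = {op: [] for op in ops}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)
    start = {}
    for op in reversed(ops):
        start[op] = min((start[s] - 1 for s in succs[op]), default=last_cycle)
    return start


def mobility(ops, deps):
    """Slack of each op: ALAP start minus ASAP start."""
    early = asap(ops, deps)
    last = max(early.values(), default=0)
    late = alap(ops, deps, last)
    return {op: late[op] - early[op] for op in ops}


# Hypothetical DFG: a1 = m1 + m2, a2 = a1 + m3.
ops = ["m1", "m2", "m3", "a1", "a2"]
deps = {"a1": ["m1", "m2"], "a2": ["a1", "m3"]}
print(mobility(ops, deps))
# {'m1': 0, 'm2': 0, 'm3': 1, 'a1': 0, 'a2': 0}
```

Only m3 has slack here: it can run in cycle 0 or cycle 1, so a scheduler is free to place it wherever it balances multiplier usage best.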

Potential Milestones

Andrew:

  • April 30 - Scheduling infrastructure done
  • May 7 - Allocation infrastructure done
  • May 14 - Binding infrastructure done
  • May 21 - C pragma infrastructure done
  • May 28 - H/W and S/W partitioning infrastructure done
  • June 4 - Force-directed scheduling done
  • June 11 - Optimal binding/allocation done
  • June 18 - Optimizing performance for CHStone done
  • June 25 - Optimizing area for CHStone done
  • July 2 - Release LegUp. Results for paper complete.

Mark:

  • May 7 - adpcm running on MIPS processor
  • May 14 - all of CHStone running on MIPS processor
  • May 21 - profile CHStone in software for H/W and S/W partitioning
  • May 28 - build simple hardware profiler
  • June 4 - hardware profiler results for the CHStone mips benchmark
  • June 11 - hardware profiler results for all of CHStone
  • June 18 - integrate profiler with LegUp
  • June 25 - improvements to hardware profiler
  • July 2 - release profiler with LegUp.

Victor:

  • May 14 - get a new C benchmark running through LegUp and add to test suite.
  • May 21 - find RTL versions of CHStone
  • May 28 - allow integration of hardware blocks with C code
  • June 4 - results of RTL vs LegUp and detailed analysis
  • June 11 - use analysis to improve scheduler
  • June 18 - LegUp GUI to illustrate hardware
  • June 25 - LegUp GUI parameterized sweeps with tradeoff graphs
  • July 2 - release LegUp.

Ahmed:

  • May 7 - evaluate ALTR Avalon, IBM CoreConnect, Wishbone, etc.
  • May 14 - propose interconnect and integration method
  • May 21 - demonstrate a simple H/W block integrated with MIPS processor
  • May 28 - gather metrics on Altera operators: +, -, <<, …
  • June 4 - create parameterizable model in LegUp
  • June 11 - update parameterizable model to include muxes
  • June 18 - modify scheduling to account for parameterizable model
  • June 25 - modify binding to account for parameterizable model
  • July 2 - release LegUp

The rest of the summer can be spent documenting features, coming up with novel ideas, and writing the paper.

milestones.txt · Last modified: 2011/04/14 12:59 by zhangvi1