Difference between revisions of "Leon3"
Latest revision as of 11:46, 28 August 2009
Tom Golubev's Leon3 Based Logic Analyzer
Using NIOSII Cyclone dev board
Leon3 grlib-gpl-1.0.20-b3403
FPGA Usage (1): EP1C20F400C7
Configuration               | LUT Usage (%) | Mem Usage (%) | System MHz
(1) Default: noFPU, noMMU   | 56            | 36            | 135.9
(2) noFPU, MMU              | 65            | 37            | 134.8
(3) Bare                    | 38            | 18            | 126.1
(1) Command Used: make distclean && time `make synplify && make quartus-synp`
Adding ADCDAC Tutorial
1) Edit config.in to add options to xconfig
2) Add to leon3XX.vhd in the designs/your_board directory:
library gleichmann;
use gleichmann.hpi.all;
use gleichmann.miscellaneous.all;
use gleichmann.multiio.all;
use gleichmann.dac.all;
use gleichmann.sspi.all;
use gleichmann.ge_clkgen.all;
use gleichmann.ac97.all;
3) Add signals for ADC in structure (leon3XX.vhd)
4) Add generate section into structure (leon3XX.vhd)
5) Remove gleichmann from LIBSKIP (Makefile)
6) Copy the leon3hpe.h file into the project dir, because the config.vhd.in file imports it
Programming Board
Using grmon:
Add grmon to the path: #export PATH=$PATH:/home/tom/leon3/grmon-eval/linux
Add libjtag.so from the 32-bit Quartus: #export LD_LIBRARY_PATH=/software/altera/quartus/linux
Invoke grmon: #grmon-eval -altjtag
To use grmon-rcp, a 32-bit libwidget_gtk2.so needs to be present.
A 32-bit Java also needs to be used:
Install 32-bit openjdk
Switch from 64-bit to 32-bit java:
#alternatives --config java
#alternatives --config javac
Hardware in the Loop (HIL) Consideration
The idea for the ML510 Board is outlined:
1) Have the board run a Leon3 core, with many peripherals.
2) Run an MMU-less Linux version (MMU-less so that software is SCOORE compatible)
3) Create an AHB component that will allow communication with a DUT (the module under test).
4) Extend ATC to allow VPI testbenches to run on the Leon3
5) Current testbenches can test modules within the FPGA, running at native speeds
6) On error, transfer state information back to computer.
7) Rerun testbench with current state information -- essentially resuming from where the Leon3 left off.
8) Fix error, resynthesize, load back onto the Leon3, resume testing.
Essentially, an HDL module would be tested in hardware. This allows speeds in excess of 100 MHz, much faster than the 10 kHz simulation speed of PC-based simulators.
By saving state information for, say, the last 1000 cycles, the hope is that "rewinding" that far back allows the bug to be reproduced.
Debugging can then proceed in simulation, using VCS or ModelSim.
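The "last 1000 cycles" idea can be modelled in software as a fixed-depth history buffer. The sketch below is only an illustration: the class name StateHistory, the dict-based state and the default depth of 1000 are assumptions, not part of any existing tool, and a hardware version would more likely use a block RAM as a ring buffer.

```python
from collections import deque


class StateHistory:
    """Keeps only the most recent `depth` state snapshots (hypothetical helper)."""

    def __init__(self, depth=1000):
        # A deque with maxlen silently drops the oldest entry once it is full.
        self._snapshots = deque(maxlen=depth)

    def record(self, cycle, state):
        """Store a copy of the state observed at `cycle`."""
        self._snapshots.append((cycle, dict(state)))

    def rewind_point(self):
        """Return the oldest retained (cycle, state) pair to resume from."""
        if not self._snapshots:
            raise LookupError("no state has been recorded yet")
        return self._snapshots[0]

    def __len__(self):
        return len(self._snapshots)


# Usage: an error is detected after 5000 cycles; resume from 1000 cycles back.
history = StateHistory(depth=1000)
for cycle in range(5000):
    history.record(cycle, {"counter": cycle})
restart_cycle, restart_state = history.rewind_point()
print(restart_cycle, restart_state)
```

The simulator would then be seeded with the restart state and run forward over the retained window.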
The hardware is used to accelerate HDL verification. If more transparency is needed, either FPGA-based scopes such as ChipScope can be used, or switching to the PC simulator can happen almost transparently. The end goal is to save time in discovering the presence of bugs, and to focus mostly on debugging the bugs themselves.
A mechanism such as DSP-Builder's Hardware in the Loop (HIL) allows for the instantiation of a module within the FPGA, while the logic to interface with it is created by the tool automatically. The main difference is that our design executes the testbench on the FPGA as well, saving bandwidth and avoiding potential problems in interfacing the PC to the module.
HIL Ideas
Use internal RAM blocks to provide input vectors and store output vectors.
Use a FIFO within the Sim module to increase bandwidth / allow decoupling of clocks.
When using FIFOs, does size matter?
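One way to explore the FIFO size question is a small cycle-by-cycle model. The sketch below is a toy with an invented traffic pattern (a producer that emits bursts of 8 words every 32 cycles and a consumer that drains one word every 3 cycles); none of these numbers come from the actual design. It counts how many cycles the producer has to stall because the FIFO is full.

```python
from collections import deque


def count_producer_stalls(depth, cycles=1000, burst_len=8,
                          burst_period=32, drain_every=3):
    """Count cycles in which the producer has data but the FIFO is full.

    All traffic parameters are illustrative assumptions.
    """
    fifo = deque()
    pending = 0   # words the producer still wants to push
    stalls = 0
    for cycle in range(cycles):
        # A new burst of words becomes available at the start of each period.
        if cycle % burst_period == 0:
            pending += burst_len
        # The consumer removes at most one word every `drain_every` cycles.
        if cycle % drain_every == 0 and fifo:
            fifo.popleft()
        # The producer pushes one word per cycle while there is room.
        if pending:
            if len(fifo) < depth:
                fifo.append(cycle)
                pending -= 1
            else:
                stalls += 1
    return stalls


for depth in (1, 2, 4, 8):
    print(f"depth {depth}: {count_producer_stalls(depth)} stalled cycles")
```

In this model the consumer keeps up on average, so a FIFO deep enough to absorb one whole burst removes the stalls entirely, while shallower FIFOs throttle the producer. Size matters as long as the traffic is bursty relative to the drain rate.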
Hardware in the Loop (HIL) Consideration
The case for HILs
The number one motivation is speed. Imagine running SpecInt 2000 on a simulated processor. VCS might well chug along at an "ultra fast" speed of ~100 cycles per second, while SpecInt takes billions of instructions. Achieving a speedup of several orders of magnitude is no small feat. Scalability is a major simulator disadvantage. Simulating a complete processor at the RTL level
Read the HIL article from embedded.com [ http://www.embedded.com/15201692 ]
Example product: PCI accelerator card for Simulink DSP. [ http://www.gidel.com/ProcHils.htm ]
HIL Downsides
HIL is a tool, which has its pros and cons. Strengths and weaknesses should be weighed when determining if it is right for you.
Big HDL designs may take ~20 hours to synthesize. A redesign may necessitate a full recompile, requiring a full day.
Intel implemented a full Pentium processor in a Virtex 4. The synthesis took 20-24 hours, and the design achieved 25 MHz while utilizing 46% of the slices of a V4LX200. Source [ http://portal.acm.org/ft_gateway.cfm?id=1331901&type=pdf ]
Multiple clocks and clock gating pose a challenge.
Links: (1) http://www.altera.co.jp/literature/ug/ug_dsp_design_flow.pdf (See HIL section)
Design Questions
1) How easy is it to have mixed VHDL / Verilog Design using Synplify?
->Very easy
2) How much throughput is needed?
3) Which bus to use? AHB or APB? Or tightly couple to the processor (difficult)?
Wishbone / AHB / CoreConnect http://es.elfak.ni.ac.yu/Papers/ICEST%20'06.pdf
BEE2 Leon2 AHB bus explanation http://cadlab.cs.ucla.edu/software_release/bee2leon3port/files/Timothy_Wong-MS_Report.pdf
AMBA BUS Notes (from the 1999 ARM IHI 0011A document)
AMBA signal names
All AMBA signals are named such that the first letter of the name indicates which bus the signal is associated with. A lower case n in the signal name indicates that the signal is active LOW, otherwise signal names are always all upper case. Test signals have a prefix T regardless of the bus type. More information on test signals can be found in Chapter 6 AMBA Test Methodology.
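The naming rule can be expressed as a small classifier. This is a sketch only: the prefix table reflects my reading of the signal prefix list in the AMBA 2.0 specification (H for AHB, A/B/D for ASB, P for APB, T for test) and should be checked against the document itself. The function name describe_signal is made up for illustration.

```python
# First-letter prefixes, as I understand the AMBA 2.0 spec (verify before relying on it).
PREFIX_TO_BUS = {
    "H": "AHB",
    "A": "ASB",   # signals between ASB masters and the arbiter
    "B": "ASB",
    "D": "ASB",   # ASB decoder signals
    "P": "APB",
    "T": "test",  # test signals use T regardless of bus type
}


def describe_signal(name):
    """Return (bus, active_low) for an AMBA-style signal name."""
    if not name:
        raise ValueError("signal name must not be empty")
    bus = PREFIX_TO_BUS.get(name[0], "unknown")
    # Names are otherwise all upper case, so a lower-case 'n' marks active LOW.
    active_low = "n" in name
    return bus, active_low


for signal in ("HRESETn", "HCLK", "PENABLE", "BnRES", "TCLK"):
    print(signal, describe_signal(signal))
```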
Important Sources / Documents:
Nios II / Leon2 Comparison CoSCPU.pdf
Leon2 / MicroBlaze / OpenRISC1200 Comparison Evaluation_of_synthesizable_CPU_cores.pdf
The Linux boot itself lasts about 1 billion cycles - http://eve-team.com/demos/linux-boot-demo.php
3 Reviews of Tharas accelerator - http://www.deepchip.com/items/0408-03.html
Slides of EVE innovations - http://university.eve-team.com/files/ASAP2009.pdf
Links
VHDL For Verilog Designers Cheat Sheet: vhdl4verilog_summary.pdf
Comparison of many soft cpus - http://www.1-core.com/library/digital/soft-cpu-cores/
Great Class, APB bus verilog LEON peripheral, includes verilog Leon top & much more http://classes.engineering.wustl.edu/cse465/
SOC Leon Verilog APB & simulation HOWTO, includes AMBA verilog code - http://vcag.ecen.okstate.edu/projects/scells/download/MOSIS_SCMOS/socks/socks_documentation.doc
Evaluation of synthesizable CPU cores - Paper of several cores
NIOS2 to Leon2 - http://www.altera.com/literature/dc/2007/in_2007_multiproc.pdf
Tutorial of adding AHB Slave to Leon2 - http://www.ece.ncsu.edu/muse/soc_information/tutorial/leon/
Another paper of Leon2 APB addon - http://www.iberchip.org/iberchip2006/ponencias/45.pdf
Dutch university research into SOCs, includes NIOS2 Cyclone Dev board - http://emsys.denayer.wenk.be/?project=empro&page=about
Porting to AHB / Wishbone buses report - http://emsys.denayer.wenk.be/empro/OCIDEC%20case%20report.pdf
Leon 2 dev board seller (mostly Xilinx parts) - http://www.pender.ch/products.shtml
Stratix 1S80 Leon board adaptation - http://www.nouiz.org/leon2.html
Print
GRADCDAC - tmtc.pdf (13-24)
AMBA Spec IHI0011A_AMBA_SPEC.pdf
GRADCDAC RTEMS Driver rtems-gaisler-drivers-1.1.99.4.0.pdf (92-101)
Gaisler IP grip.pdf (533-558,563-570)
Sources
(ASAP 2006). S. Tillich and J. Großschädl. Instruction Set Extensions for Efficient AES Implementation on 32-bit Processors. In Cryptographic Hardware and Embedded Systems - CHES 2006, vol. 4249 of Lecture Notes in Computer Science, pp. 270–284. Springer Verlag, 2006.
A. Hodjat and I. Verbauwhede. Interfacing a high speed crypto accelerator to an embedded CPU. In Proceedings of the 38th Asilomar Conference on Signals, Systems, and Computers, vol. 1, pp. 488–492. IEEE, New York, 2004.
P. Schaumont, K. Sakiyama, A. Hodjat, and I. Verbauwhede. Embedded Software Integration for Coarse-Grain Reconfigurable Systems. In Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), pp. 137–142. IEEE CS Press, 2004.
An AES Tightly Coupled Hardware Accelerator in an FPGA-based Embedded Processor Core