The Design Cycle
Developing a coprocessing system involves maintaining two pieces of C/C++ code, and compiling each of them separately. The host program is a plain user-space program, compiled with a regular compiler (e.g. gcc). The procedure for the synthesized function is significantly longer.
We’ll start with running through the procedure for compiling C into an FPGA binary, and then look at how and where to make the modifications for your own application.
If it’s not already open, start Vivado HLS, and open the HLS project: Pick “Open Project” on the welcome page, navigate to where the HLS project bundle was unzipped to, and choose the folder with the name “coprocess”.
Change the project's part number and target frequency: Pick Solution > Solution Settings... > Synthesis and change the "Part Selection" to the FPGA target of the (non-HLS) Vivado project.
Set the Clock Period to match the frequency of bus_clk of the relevant FPGA bundle as follows:
- All Xillinux (Zynq) targets: 10 ns
- Artix-7 targets: 8 ns
- Others: 4 ns
Leave the Clock uncertainty blank (default).
Compile (“synthesize”) the project by picking Solution > Synthesis > Active Solution (or click on the corresponding icon on the toolbar). A lot of text will appear on the console, including several warnings (which is normal). No errors should occur.
A successful compilation is easily recognized by the following message among the last few lines in HLS’ console tab:
@I [HLS-10] Finished generating all RTL models.
or possibly just (depends on HLS version)
Finished C synthesis.
A synthesis report also appears above the console tab, but only if the synthesis was successful.
For more information about Vivado HLS, please refer to its user guide.
Integration with the FPGA project
Vivado HLS produced a number of files in Verilog. In this next step we’ll integrate these files with the project that generated the Xillybus FPGA binary. One step in that direction was made when the xillydemo.v was modified in the previous part.
The first step is to export the HLS project into a Design CheckPoint (DCP) file, which consists of the logic that was generated during the HLS compilation in a format which is easily included in an FPGA project.
In Vivado HLS, select Solution > Export RTL and pick "Synthesized Checkpoint (.dcp)" as Format Selection. For "Evaluate Generated RTL" choose Verilog, and don't check either of the checkboxes under it. Click OK.
Vivado HLS responds by launching a (non-HLS) Vivado synthesis, which breaks down the Verilog files into simpler logic elements, and ultimately packs the result as xillybus_wrapper.dcp.
This can take several minutes, and ends with something like
#=== Final timing ===
CP required: 10.000
CP achieved: 8.049
Timing met
INFO: [Common 17-206] Exiting Vivado at Wed Dec 28 09:51:39 2016...
Finished export RTL.
The figures may vary. If the output doesn't say "Timing met", there might be a problem completing the implementation in the next stage.
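The two figures relate by simple arithmetic: the slack is the required clock period minus the achieved one, so the example above leaves 10.000 - 8.049 = 1.951 ns to spare. As a rough sketch, assuming the log text has been captured into a string and uses the same "CP required:" and "CP achieved:" labels as shown above, the slack could be extracted as follows. The function name timing_slack_ns is hypothetical and not part of any tool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: look for the "CP required:" and "CP achieved:"
 * figures in a captured log and compute the slack in nanoseconds.
 * Returns 0 on success, or -1 if either figure can't be found.
 */
int timing_slack_ns(const char *log, double *slack)
{
  const char *req = strstr(log, "CP required:");
  const char *ach = strstr(log, "CP achieved:");
  double required, achieved;

  if (!req || !ach)
    return -1;

  if (sscanf(req, "CP required: %lf", &required) != 1)
    return -1;

  if (sscanf(ach, "CP achieved: %lf", &achieved) != 1)
    return -1;

  *slack = required - achieved; /* Negative means timing was not met */
  return 0;
}
```

A positive result corresponds to "Timing met". The exact wording of the log may differ between Vivado versions, so the labels searched for here are an assumption based on the output above.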
In (non-HLS) Vivado, open the xillybus bundle project that was set up previously. Add the DCP file to the project as follows:
Choose File > Add Sources... and pick "Add or create design sources" (and click Next). Click on "Add Files" and navigate to the HLS project's directory, and to coprocess/example/impl/ip from there. Pick xillybus_wrapper.dcp. Verify that the "Copy sources into project" checkbox is unchecked. Click Finish.
To be on the safe side, reset the project, so it's fresh from the previous build: At the bottom, pick the "Design Runs" tab, right-click somewhere in the region above the tabs, and pick "Reset Runs".
Next, click "Generate Bitstream" in the left bar. This kicks off a build process, which takes about the same time as the previous one (typically 10-20 minutes).
Lots of output will appear on the console pane, including warning messages, which is normal. No errors should occur. The process should terminate successfully.
Using the bitfile
The bitfile is used as before, when the plain Xillybus bundle was implemented. For PCIe-based platforms, the bitfile is loaded through JTAG to the FPGA (note that the host should be turned off while this is done). If a Xillinux platform is targeted, replace xillydemo.bit on the SD card's boot partition.
Applying your own C/C++ code
It's recommended to develop applications on the basis of the sample code in the sample project bundle, and the host program shown at the bottom of part II. Changes in the synthesized function are up to the application, but as far as data transport goes, it's important to stick to *in++ and *out++ operations only for communicating with the host. The number of such operations should be changed to match the desired amount of data for transmission, of course. The host program must be adjusted accordingly, so that it sends and expects the same amount of data through the pipes.
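On the host side, matching the amount of data means writing and reading an exact number of bytes. Since write() and read() may transfer fewer bytes than requested, a pair of helpers along these lines is one way to do it. This is a minimal sketch using plain POSIX calls, not the host program from part II, and the names write_all and read_all are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Write exactly len bytes to fd. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
  const unsigned char *p = buf;

  while (len > 0) {
    ssize_t rc = write(fd, p, len);

    if (rc < 0) {
      if (errno == EINTR)
        continue; /* Interrupted by a signal, just try again */
      return -1;
    }

    p += rc;
    len -= (size_t) rc;
  }
  return 0;
}

/* Read exactly len bytes from fd. Returns 0 on success, -1 on error or EOF. */
int read_all(int fd, void *buf, size_t len)
{
  unsigned char *p = buf;

  while (len > 0) {
    ssize_t rc = read(fd, p, len);

    if (rc < 0) {
      if (errno == EINTR)
        continue;
      return -1;
    }

    if (rc == 0)
      return -1; /* End of file before all data arrived */

    p += rc;
    len -= (size_t) rc;
  }
  return 0;
}
```

If the synthesized function consumes two 32-bit words and produces two, the host would call write_all() with an array of two uint32_t elements on the file descriptor of the write pipe, and then read_all() with an array of the same size on the read pipe. Which device files to open depends on the Xillybus bundle in use.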
The sample C code resides in coprocess/example/src. In particular, the wrapper function and the small function that demonstrates the coprocessing are in main.c in this directory. This file can (and probably should) be edited to implement the desired algorithm. Additional source files can be added to the project, just like any C/C++ project.
When the C/C++ sources are modified as necessary, just start over from the HLS compilation step above and go all the way to implementation with Vivado. That is:
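To illustrate the data transport rule, here is a sketch of what an edited wrapper could look like. It is not the sample code itself: the calculation (scale_and_offset) is made up for illustration, the pointer types of the arguments are an assumption, and the interface pragmas that appear in the bundle's main.c are omitted and should be kept as they are in the original file.

```c
#include <stdint.h>

/* Hypothetical calculation, standing in for the real algorithm */
static uint32_t scale_and_offset(uint32_t x, uint32_t gain)
{
  return x * gain + 1;
}

/* Keep the interface pragmas from the bundle's main.c here (omitted) */
void xillybus_wrapper(int *in, int *out)
{
  uint32_t x, gain;

  /* Two words arrive from the host, in this order */
  x = (uint32_t) *in++;
  gain = (uint32_t) *in++;

  /* Two words are sent back to the host */
  *out++ = (int) scale_and_offset(x, gain);
  *out++ = (int) x;
}
```

With this sketch, the host must write exactly two 32-bit words for each invocation and read exactly two in return. Adding or removing an *in++ or *out++ operation changes these counts.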
- Vivado HLS: Compile the project in HLS. The HLS synthesizer always cleans up the files generated in previous compilations before starting a new one.
- Vivado HLS: Export into a DCP file
- Vivado: Reset design runs
- Vivado: Generate bitstream
When updating the DCP file, it's recommended to verify that the bitstream is generated from the correct one: Sometimes a slight mistake in Vivado's project settings causes it to work on a local copy of the DCP file, ignoring the one recently exported from HLS. An easy way to check this is to delete the old DCP file before exporting the new one, and then attempt to generate a bitfile. This should fail, as a crucial file is missing. If it doesn't, remove the DCP file from the project's source list, and re-add it after exporting the new one.
Before including another coprocessing function, it’s recommended to test its syntax and functionality with a plain compiler that generates binaries running on a computer, preferably gcc. The debugging cycle with the FPGA is relatively long.
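As a sketch of such a test, the code below checks a candidate function against a slow but obviously correct reference on a few test vectors. The function itself (a bit counter) is only a hypothetical example, chosen because it's easy to get subtly wrong. Nothing here is specific to HLS, so it compiles with gcc as is.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical candidate for synthesis: count the '1' bits in a word */
uint32_t popcount32(uint32_t x)
{
  x = x - ((x >> 1) & 0x55555555u);
  x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
  x = (x + (x >> 4)) & 0x0F0F0F0Fu;
  return (x * 0x01010101u) >> 24;
}

/* Straightforward reference implementation, for comparison only */
static uint32_t popcount32_ref(uint32_t x)
{
  uint32_t count = 0;

  while (x) {
    count += x & 1;
    x >>= 1;
  }
  return count;
}

/* Returns the number of test vectors on which the two versions disagree */
int check_popcount32(void)
{
  static const uint32_t vectors[] = {
    0, 1, 0xFF, 0x80000000u, 0xFFFFFFFFu, 0x12345678u
  };
  int mismatches = 0;
  unsigned int i;

  for (i = 0; i < sizeof(vectors) / sizeof(vectors[0]); i++)
    if (popcount32(vectors[i]) != popcount32_ref(vectors[i])) {
      printf("Mismatch for input 0x%08x\n", (unsigned int) vectors[i]);
      mismatches++;
    }

  return mismatches;
}
```

Calling check_popcount32() from a small main() and verifying that it returns zero takes seconds, compared with the minutes required for each round through HLS and Vivado.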
Xilinx’ HLS guide suggests performing an intermediate verification stage, called a cosimulation. This is not part of the flow suggested here, since the debugging functions allow printing out intermediate results, should the synthesized function behave differently on a processor and on an FPGA. Accordingly, no C/C++ test benches are included in the project bundle, as these belong to the cosimulation flow.
Removing debugging output
When they are not needed anymore, the calls to the debug functions (the xilly_* functions) should be removed from any C/C++ code that is fed into the HLS compiler. It’s also possible, even though not necessary, to remove the #include directive for xilly_debug.h.
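One possible way to make the debug calls easy to remove and restore is to route them through a macro. The sketch below is only a suggestion: USE_XILLY_DEBUG and DBG_PUTS are hypothetical names, and it's assumed that xilly_debug.h declares a xilly_puts() function that takes a string. The accumulate() function is likewise just an example.

```c
#include <stdint.h>

#ifdef USE_XILLY_DEBUG /* Hypothetical switch, not part of the bundle */
#include "xilly_debug.h"
#define DBG_PUTS(s) xilly_puts(s) /* Assumed to exist in xilly_debug.h */
#else
#define DBG_PUTS(s) ((void) 0) /* Debug calls vanish altogether */
#endif

/* Example function with debug messages that can be switched off */
uint32_t accumulate(const uint32_t *v, int n)
{
  uint32_t sum = 0;
  int i;

  DBG_PUTS("accumulate: start\n");

  for (i = 0; i < n; i++)
    sum += v[i];

  DBG_PUTS("accumulate: done\n");

  return sum;
}
```

When USE_XILLY_DEBUG is not defined, no xilly_* function is called, so from the HLS compiler's point of view this is the same as having removed the calls by hand. Note that the switch only affects the C side.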
When no debug functions are used, the HLS compiler automatically removes the Verilog ports that are related to sending data to the host via Xillybus. As these ports are referenced in xillydemo.v, this will lead to implementation errors, unless these references are removed.
Accordingly, the following three lines should be commented out in xillydemo.v when no debug calls are made. The syntax for comments in Verilog is the same as in C, so a // marker at the beginning of each of these lines will do the trick.
.debug_ready(!debug_out_full || !user_r_read_8_open),
.debug_out(debug_out_din),
.debug_out_ap_vld(debug_out_write),
If these lines are left in place, Vivado will fail during the synthesis stage, complaining that it can't find debug_ready, debug_out and debug_out_ap_vld in xillybus_wrapper().
Note that Vivado HLS generates an instantiation template for xillybus_wrapper as coprocess/example/impl/ip/xillybus_wrapper.veo.