The guide to Xillybus Block Design Flow for non-HDL users (deprecated)

3 Integrating with application logic

3.1 The basics

Integration with application logic is done by using Vivado’s block design GUI: IP blocks are added to the block design and connected as required.

For integration of IP blocks that are generated by Vivado’s High Level Synthesis (HLS), please refer to section 6.

Xillybus’ documentation often says that the application logic in the FPGA communicates with the host through FIFOs. This applies only to the Verilog / VHDL design flow, and not to the Block Design Flow. The glue logic in Xillybus’ IP Core that generates the AXI Stream interfaces already includes FIFOs (among other things, for clock domain crossing between bus_clk and ap_clk). As a result, the application logic is not required to deploy any FIFOs in order to interface with Xillybus’ IP core when the Block Design Flow is used.

For data exchange between the FPGA and host, connect the application logic to the dedicated AXI Stream ports (possibly after disconnecting loopbacks). These ports expose only the TDATA, TVALID and TREADY signals. In particular, there is no TLAST signal. Consequently, each AXI Stream port carries an infinite data stream (as opposed to a packet interface, which the TLAST signal would have allowed). This is consistent with the infinite stream nature of Xillybus’ device files in general.
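The practical consequence of the infinite stream on the host side can be illustrated with a minimal Python sketch. It assumes that the host reads the stream through an ordinary file object; the helper name read_exactly and the device file name in the comment are assumptions made for illustration. Since the stream carries no packet boundaries, a read may return fewer bytes than requested, so the helper keeps reading until the requested amount has arrived.

```python
import io


def read_exactly(stream, count):
    """Read exactly `count` bytes from a byte stream, or raise EOFError.

    A stream has no packet boundaries, so a single read() may return
    fewer bytes than requested. Keep reading until all bytes arrive.
    """
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream ended before %d bytes were read" % count)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Stand-in for a device file, so that the sketch runs without hardware.
# With real hardware, one would open a device file instead, for example
# open("/dev/xillybus_read_32", "rb", buffering=0). That file name is an
# assumption and depends on the Xillybus bundle in use.
demo_stream = io.BytesIO(b"\x01\x00\x00\x00\x02\x00\x00\x00")
first_word = read_exactly(demo_stream, 4)
second_word = read_exactly(demo_stream, 4)
```

The same helper works unchanged on any file-like object that returns bytes, which is why an in-memory buffer can stand in for the device file here.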

Xillybus streams can be used to exchange packets between the FPGA and the host, as explained in section 6.3 in either of these two guides:

3.2 Clocking

3.2.1 General

For the sake of simplicity, all signals connecting the user application logic to the Xillybus IP Core must be driven by a single clock, which is generated by the Clocking Wizard block in the block design. This clock (the “application clock”) is the Clocking Wizard’s clk_out1, which is also the Xillybus IP Core block’s ap_clk input.

It’s often convenient to drive the entire user application block with this single clock, so that all of its internal logic as well as its interface depend on it. For example, logic which is generated by Vivado HLS’ synthesizer has a single clock input (named ap_clk). Connecting this clock input to the Clocking Wizard’s output guarantees that the AXI Stream port connections with the Xillybus IP Core’s block work properly.

Note that FPGA tools sometimes refer to a clock’s frequency in terms of the frequency itself, typically in MHz, and sometimes as the clock period, commonly in ns. The clock’s frequency is the reciprocal of the clock period, so e.g. 100 MHz is equivalent to a clock period of 10 ns.
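The conversion between the two notations can be written as a short Python sketch. The function names are chosen here for illustration only. The constant 1000 follows from the units: 1 / (1 MHz) = 1000 ns.

```python
def mhz_to_ns(freq_mhz):
    """Convert a clock frequency in MHz to a clock period in ns."""
    return 1000.0 / freq_mhz


def ns_to_mhz(period_ns):
    """Convert a clock period in ns to a clock frequency in MHz."""
    return 1000.0 / period_ns


period = mhz_to_ns(100.0)   # 100 MHz corresponds to a period of 10 ns
frequency = ns_to_mhz(4.0)  # A 4 ns period corresponds to 250 MHz
```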

3.2.2 Setting the application clock

The application clock’s frequency can be set to increase performance or as a step towards achieving a working bitfile: A faster clock yields a higher processing throughput (unless some other bottleneck limits the performance), but it also demands more from the FPGA’s logic elements and makes it harder for AMD’s tools to implement the design.

If the application clock’s frequency is set too high, the compilation of the project into an FPGA bitstream file fails because the timing constraints are not met. This is also referred to as a “timing failure”. It means that the tools that carry out the implementation failed to arrange the logic in a way that ensures reliable operation when the logic is driven by clocks at the defined frequencies. The “timing constraints” in this context are the requirements on the frequencies of the clocks in the system.

Reducing the application clock’s frequency is always allowed (within the clock generator’s limits), but slows down the operation of the logic it drives.
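The trade-off between clock frequency and timing can be illustrated with a simplified Python sketch. It assumes that a design is summarized by a single number, the delay of its slowest logic path, and it ignores everything else that a real timing analysis takes into account (setup times, clock skew, clock uncertainty and so on). The function names and the 8 ns delay are hypothetical. The difference between the clock period and the path delay is commonly called slack, and Vivado reports the worst case as Worst Negative Slack (WNS).

```python
def slack_ns(clock_mhz, worst_path_delay_ns):
    """Simplified slack: the clock period minus the slowest path's delay."""
    period_ns = 1000.0 / clock_mhz
    return period_ns - worst_path_delay_ns


def meets_timing(clock_mhz, worst_path_delay_ns):
    """The timing constraint is met when the slack is not negative."""
    return slack_ns(clock_mhz, worst_path_delay_ns) >= 0.0


# A hypothetical design whose slowest path takes 8 ns:
ok_at_100 = meets_timing(100.0, 8.0)  # 10 ns period, 2 ns of slack
ok_at_200 = meets_timing(200.0, 8.0)  # 5 ns period, slack is -3 ns
```

In this simplified model, lowering the clock frequency lengthens the period and therefore always increases the slack, which matches the statement that reducing the frequency is always allowed.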

In order to set the frequency of the application clock, double-click the block of the Clocking Wizard (stream_clk_gen) in the block design view. A configuration window opens in Vivado. Choose the “Output Clocks” tab and change the “Output Clock Requested” frequency for clk_out1. The frequency in the “Actual” column shows the frequency that will be generated by the clock synthesizer. It may be slightly different from the requested frequency, since the output clock is derived by multiplying the input clock by a rational number, which is picked from a limited set of allowed values.

A small deviation from the requested frequency is harmless when the clock is used only for the application logic and its interface with the Xillybus IP core.
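The reason why the actual frequency may differ from the requested one can be sketched in Python as a search for the closest ratio of two integers. This is only an illustration of the principle, not the Clocking Wizard’s algorithm: the function name and the ranges of the multiplier and divider are hypothetical, and real clock synthesizers impose further constraints that are ignored here. The “Actual” column in the Clocking Wizard remains the authoritative source.

```python
from fractions import Fraction


def closest_output_mhz(input_mhz, requested_mhz, max_mult=64, max_div=128):
    """Find the ratio mult / div that brings the output closest to the request.

    The limits max_mult and max_div are hypothetical and do not describe
    any specific FPGA family.
    Returns a tuple of (actual frequency in MHz, mult, div).
    """
    target = Fraction(requested_mhz)
    best = None
    for mult in range(1, max_mult + 1):
        for div in range(1, max_div + 1):
            actual = Fraction(input_mhz) * mult / div
            error = abs(actual - target)
            # Keep the first candidate with the smallest error.
            if best is None or error < best[0]:
                best = (error, actual, mult, div)
    _, actual, mult, div = best
    return float(actual), mult, div


# 150 MHz from a 100 MHz input is reachable exactly, with the ratio 3 / 2.
exact = closest_output_mhz(100, 150)

# 133.33 MHz is not reachable exactly. The closest ratio is 4 / 3.
approximate = closest_output_mhz(100, 133.33)
```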

Other parameters in the Clocking Wizard should not be changed.

3.2.3 The bus_clk signal

Xillybus IP Core’s internal logic is driven by bus_clk, which is exposed in the block design merely to allow the derivation of the application clock from bus_clk. There is usually no other use for this signal, since the application logic only needs ap_clk for its internal logic and for interfacing with the Xillybus IP Core.

bus_clk’s frequency may however be of interest for the sake of spotting throughput bottlenecks. For example, if bus_clk runs at 100 MHz, the theoretical maximum bandwidth of a 32-bit wide data interface is 400 MB/s (4 bytes on each clock cycle), since Xillybus’ internal data pipe runs at bus_clk’s rate. If ap_clk runs at a higher frequency and data is pushed on each cycle of ap_clk, it’s likely that the data flow will be slowed down by the AXI Stream flow control signals (TREADY and TVALID).

For this reason, bus_clk’s frequency should be taken into consideration when attempting to maximize an application’s throughput, in particular if the data interfaces are expected to contain long bursts of (or continuous) data traffic.
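The bandwidth calculation above can be written as a short Python sketch. It gives an upper bound only, assuming that one data word is transferred on every clock cycle and that 1 MB is 10^6 bytes. Bottlenecks elsewhere in the system (on the host side, for example) are ignored, and the function names are chosen here for illustration.

```python
def max_bandwidth_mb_s(clk_mhz, width_bits):
    """Upper bound in MB/s when one word is transferred per clock cycle."""
    return clk_mhz * width_bits / 8


def sustained_bandwidth_mb_s(bus_clk_mhz, ap_clk_mhz, width_bits):
    """The slower of the two clock domains limits the sustained data rate."""
    return min(
        max_bandwidth_mb_s(bus_clk_mhz, width_bits),
        max_bandwidth_mb_s(ap_clk_mhz, width_bits),
    )


# The example from the text: bus_clk at 100 MHz with a 32-bit interface.
bus_limit = max_bandwidth_mb_s(100, 32)

# With ap_clk at 150 MHz, bus_clk is still the limiting factor.
with_fast_ap_clk = sustained_bandwidth_mb_s(100, 150, 32)
```

In this model, raising ap_clk above bus_clk doesn’t increase the sustained data rate of a single stream. The flow control signals hold the application logic back instead.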

The frequency of bus_clk can be found under the “Clocking Options” tab of the Clocking Wizard’s configuration window, as the frequency of the primary input clock, which is clk_in1. This parameter tells the Clocking Wizard what frequency to expect at its input, and can therefore be used to find out the frequency of bus_clk for a specific Xillybus bundle.