4 Modifications

4.1 Integration with custom logic

The Xillybus demo bundle is constructed for easy integration with application logic. The place for connecting data is the xillydemo.v or xillydemo.vhd file (depending on the preferred language). All other HDL files in the bundle can be ignored for the purpose of using the Xillybus IP core for transporting data between the host (Linux or Windows) and the FPGA.

Additional HDL files with custom logic designs may be added to the project that was prepared as described in paragraph 3.5 or 3.3, and then rebuilt by clicking “Generate Programming File” or “Generate Bitstream”. There is no need to repeat the other steps of the initial deployment, so the development cycle for logic is fairly quick and simple.

When attaching the Xillybus IP core to custom application logic, it is strongly recommended to interact with the Xillybus IP core only through FIFOs, and not to attempt to mimic their behavior with logic, at least not in the first stage.

An exception to this is when connecting memories or register arrays to Xillybus, in which case the example that is shown in the xillydemo module should be followed.

In the xillydemo module, FIFOs are used to perform a data loopback. In other words, the data that arrives from the host is sent back to it. Both of the FIFO’s sides are connected to the Xillybus IP core, so the core is both the source of the data and the consumer of the data.

In a real-life usage scenario, only one of the FIFO’s sides is connected to the Xillybus IP core. The FIFO’s other side is connected to application logic, which supplies or consumes data.

The FIFOs that are used in the xillydemo module work with only one common clock for both sides, as both sides are driven by Xillybus’ main clock. In a real-life application, it may be desirable to replace them with FIFOs that have separate clocks for reading and writing. This allows driving the data sources and data consumers with a clock other than bus_clk. By doing this, the FIFOs serve not just as mediators, but also for proper clock domain crossing.

Note that the Xillybus IP core expects a plain FIFO interface (as opposed to First Word Fall Through, FWFT) for streams from the FPGA to the host.

The following documents are related to integrating custom logic:

4.2 Inclusion in a custom project

If desired, it’s possible to include the Xillybus IP core in an existing Vivado / ISE project, or create a new project from scratch.

If the project doesn’t exist already, start a new project, and set it up based on your preferred HDL language and intended FPGA.

To include the Xillybus IP core in a Vivado project, it’s recommended to edit xillydemo-vivado.tcl to reflect the custom project’s source files and settings, and create a fresh project by running this script.

To include the Xillybus IP core in an ISE project:

  • When working with FPGAs belonging to the Virtex-5/6 or series-7 families, the PCIe wrapper files need to be generated separately, as detailed in paragraph 3.4 (and also paragraph 4.3.5 for Virtex-5 FPGAs). The generated Verilog files should be added to the custom project (but not the XCO file).

  • Add all files in one of the two src/ subdirectories (depending on your language preference) into the project.

  • Add a directory to the Macro search path: In the process menu, under “Implementation”, right-click “Translate” and choose “Process Properties...”. Add the ’core’ subdirectory to the Macro Search Path property (browsing with the button to the far right). Failing to set this property will cause the Translate stage to fail during the implementation, because it won’t find the xillybus_core.ngc file.

  • If the xillydemo module isn’t the top level module of the project, connect its ports to the top level.

  • To attach the Xillybus IP core to custom application logic, edit the xillydemo module, replacing the existing application logic with the desired one.

4.3 Using other boards

4.3.1 General

When working with a board which doesn’t appear in the list of demo bundles, some slight modifications in the bundle are necessary.

The core generates a few GPIO LED outputs. It’s recommended to connect these to LEDs on the board, if any are vacant.

4.3.2 Using Xillybus for PCIe

Most purchased boards have their own example of an FPGA design, which shows how the PCIe interface is used on that board. It’s often easiest to locate the relevant pin assignments in the intended board’s XDC / UCF file, and modify the pins’ names to those that are used in Xillybus’ XDC / UCF file. Then it’s possible to replace the relevant rows in the XDC / UCF file that is used in Xillybus’ project.

The details about how to place the pins are given below.

Note that the most common mistake is with the reference clock of the PCIe bus. Connecting just any clock with the same frequency will not work: The tiny frequency difference between the motherboard’s clock and any other clock will make the transceiver lose lock sporadically, resulting in unreliable communication, and possibly a failure to detect the FPGA as a PCIe device.

Since the Xillybus core is based upon AMD’s PCIe core, AMD’s user guide is a valid source for considerations that are related to PCIe’s physical layer.

4.3.3 Working with Spartan-6 PCIe boards

For Spartan-6, the Xillybus core interfaces with the host through a PCIe bus port, which consists of 7 physical wires as follows:

  • A pair of differential wires for the reference clock, with the names PCIE_250M_P and PCIE_250M_N: A clock with a frequency of 125 MHz (despite the net’s name), which is derived from the PCIe bus clock, is expected on these wires. If a different clock is applied, the AMD PCIe Coregen core (defined by pcie.xco in the bundle) must be reconfigured to expect the real clock frequency. Additionally, a timing constraint must be updated, so that the TS_PCIE_CLK specification reflects the change.

  • The host’s master bus reset on PCIE_PERST_B_LS

  • Serial data input pair, PCIE_RX0_P and PCIE_RX0_N

  • Serial data output pair, PCIE_TX0_P and PCIE_TX0_N

These pins’ assignments are set according to the board’s wiring.

4.3.4 Working with Virtex-6 PCIe boards

For Virtex-6 the wiring is similar:

  • A pair of differential wires for the reference clock, with the names PCIE_REFCLK_P and PCIE_REFCLK_N: A clock with a frequency of 250 MHz, which is derived from the PCIe bus clock, is expected on these wires. If a different clock is applied, the AMD PCIe Coregen core (defined by pcie_v6_4x.xco in the bundle) must be reconfigured to expect the real clock frequency. Such a change may also involve changes in the constraints. Please refer to the example UCF file, generated by Coregen.

  • The host’s master bus reset on PCIE_PERST_B_LS

  • Serial data input vector pair, PCIE_RX_P and PCIE_RX_N (4 wires each)

  • Serial data output vector pair, PCIE_TX_P and PCIE_TX_N (4 wires each)

The pin assignment is made implicitly by placing the transceiver logic. The constraints defining the GTX placements in the UCF file force a certain pinout. Likewise, the placement of the reference clock’s pins is implicitly set by constraining the position of the clock buffer (pcieclk_ibuf). The UCF file in Xillybus’ bundle contains guiding comments.

The UCF file must be edited so that the pin placements of these match those of the intended board.

4.3.5 Working with Virtex-5 PCIe boards

There are two groups of devices within the Virtex-5 family, each requiring a slightly different PCIe interface. To handle this simply, there are two different XCO files in the ’blockplus’ subdirectory, only one of which should be used.

Accordingly, it’s necessary to rename a file in that subdirectory as follows before building the PCIe core:

  • For Virtex-5 LX or Virtex-5 SX: Rename pcie_v5_gtp.xco to pcie_v5.xco

  • For Virtex-5 FX or Virtex-5 TX: Rename pcie_v5_gtx.xco to pcie_v5.xco
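The renaming rule above can be sketched as a small helper script. This is only an illustration: the function name select_v5_xco and the dictionary are hypothetical, and the script assumes that both XCO files are still present under their original names in the ’blockplus’ subdirectory.

```python
import os

# Source XCO file for each Virtex-5 subfamily, as listed in the guide
XCO_BY_SUBFAMILY = {
    "LX": "pcie_v5_gtp.xco",
    "SX": "pcie_v5_gtp.xco",
    "FX": "pcie_v5_gtx.xco",
    "TX": "pcie_v5_gtx.xco",
}

def select_v5_xco(blockplus_dir, subfamily):
    """Rename the XCO file matching the subfamily to pcie_v5.xco.

    Returns the path of the renamed file. Refuses to overwrite an
    existing pcie_v5.xco, since that hints at an earlier rename.
    """
    source = os.path.join(blockplus_dir, XCO_BY_SUBFAMILY[subfamily.upper()])
    target = os.path.join(blockplus_dir, "pcie_v5.xco")
    if os.path.exists(target):
        raise FileExistsError(target + " already exists")
    os.rename(source, target)
    return target
```

For example, select_v5_xco("blockplus", "FX") renames pcie_v5_gtx.xco and leaves pcie_v5_gtp.xco untouched.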

IMPORTANT:
The version of PCIe Block Plus generator should be 1.14, and definitely not 1.15. ISE 13.1 has the correct version for this purpose, but the one that arrives with ISE 13.2 will produce faulty code.

If a version other than ISE 13.1 is desired for the overall implementation, it’s possible to generate the Verilog files with the correct version of PCIe Block Plus (included in ISE 13.1). The implementation of the entire project can then be done in the preferred version of ISE.

The UCF file has guiding comments on how the pins should be set up. The placement of the PCIe pins is implicit, and is forced by the constraint on the position of the GTP/GTX component.

A clock with a frequency of 100 MHz is expected on the PCIE_REFCLK wire pair. If a different clock is applied, the AMD PCIe Coregen core (defined by pcie_v5.xco) must be reconfigured to expect the real clock frequency. Additionally, a timing constraint must be updated, so that the TS_MGTCLK specification reflects the change.

4.3.6 Working with Kintex-7, Virtex-7 and Artix-7 boards (PCIe)

All FPGAs in the series-7 family have the same PCIe interface.

  • A pair of differential wires for the reference clock, with the names PCIE_REFCLK_P and PCIE_REFCLK_N: A clock with a frequency of 100 MHz, which is derived from the PCIe bus clock (or connected directly), is expected on these wires.

    If a different clock is applied, the PCIe block (defined by pcie_k7_vivado.xci or similar in the demo bundle) must be reconfigured to expect the real clock frequency. This file appears in the project’s list of sources. Such a change may also involve changes in the timing constraints. Please refer to the example XDC file, generated by AMD’s tools.

    If ISE is used, AMD’s PCIe core is defined by e.g. pcie_k7_8x.xco. The UCF file, rather than XDC, may need adjustment.

  • The host’s master bus reset on PCIE_PERST_B_LS

  • Serial data input vector pair, PCIE_RX_P and PCIE_RX_N (8 or 4 wires each)

  • Serial data output vector pair, PCIE_TX_P and PCIE_TX_N (8 or 4 wires each)

The pin assignment is made implicitly by placing the transceiver logic. The constraints defining the GTX placements in the UCF/XDC file force a certain pinout. Likewise, the placement of the reference clock’s pins is implicitly set by constraining the position of the clock buffer (pcieclk_ibuf). The UCF / XDC file in Xillybus’ demo bundle contains guiding comments.

The UCF / XDC file must be edited so that the pin placements of these match those of the intended board.

4.3.7 Working with Ultrascale and Ultrascale+ boards (PCIe)

All of these FPGAs have the same PCIe interface.

  • A pair of differential wires for the reference clock, with the names PCIE_REFCLK_P and PCIE_REFCLK_N: A clock with a frequency of 100 MHz, which is connected directly to the PCIe bus’ clock.

    If a different clock is applied, the PCIe block (defined by pcie_ku_vivado.xci or similar in the demo bundle) must be reconfigured to expect the real clock frequency, and the timing constraint for this clock must be updated in xillydemo.xdc. These files appear in the project’s list of sources.

  • The host’s master bus reset on PCIE_PERST_B_LS

  • Serial data input vector pair, PCIE_RX_P and PCIE_RX_N (8 or 4 wires each)

  • Serial data output vector pair, PCIE_TX_P and PCIE_TX_N (8 or 4 wires each)

The pin assignment is made implicitly by placing the transceiver logic. The constraints defining the GTX placements in the XDC file force a certain pinout. Likewise, the placement of the reference clock’s pins is implicitly set by constraining the position of the clock buffer (pcieclk_ibuf). The XDC file in Xillybus’ demo bundle contains guiding comments.

The XDC file must be edited so that the pin placements of these match those of the intended board.

4.3.8 Working with Versal ACAP boards (PCIe)

These FPGAs have the following PCIe interface:

  • A pair of differential wires for the reference clock, with the names PCIE_REFCLK_P and PCIE_REFCLK_N: A clock with a frequency of 100 MHz, which is connected directly to the PCIe bus’ clock.

    If a different clock is applied, the PCIe controller in the CPM block (which is inside the CIPS IP in the pcie_versal block design) must be reconfigured to expect the real clock frequency.

  • Serial data input vector pair, PCIE_RX_P and PCIE_RX_N (8 wires each)

  • Serial data output vector pair, PCIE_TX_P and PCIE_TX_N (8 wires each)

The host’s master bus reset is connected directly to PMC MIO 38. If another MIO pin is used for this signal, the CIPS IP must be configured accordingly (see Xillybus’ tutorial page for more information).

The XDC doesn’t contain any constraints that are related to the PCIe block. Both the timing constraints and the pin placement constraints are supplied by the CIPS IP implicitly. Note that the pin placements can’t be moved, since the CPM is used for the PCIe interface.

The procedure for changing the PCIe block’s parameters is different from other FPGAs: The PCIe part is implemented as a block design. In this block design, there is a block that is named pcie_block_support. This block contains the transceivers and clock resources that are used by the PCIe block. It is therefore required to update pcie_block_support after changing the number of lanes or the link speed. It’s not enough to only update pcie_block.

The method for updating pcie_block_support is to delete this block and let Vivado regenerate it. This way, the block is created with the updated parameters of the PCIe block.

It is recommended to create a visual copy of the block design before starting this procedure. This will be helpful, because it is required to reconstruct the same block design with the updated pcie_block_support block.

The steps for this procedure are as follows:

  • Delete the pcie_block_support block and all its external ports. This means deleting ports that are not connected to anything (e.g. “pcie_mgt”), as well as ports that are connected to a block (e.g. “m_axis_cq_0”).

  • Delete the connection of the reset signal between versal_cips_0 and pcie_block.

  • In response to the previous step, Vivado should suggest “Run Block Automation”. Click on that suggestion. Vivado will open pop-up windows in response. Verify that the PCIe parameters are correct, and click “OK”. Note that Vivado will also suggest “Run Connection Automation”, however this option is not sufficient.

  • Vivado will add a new pcie_block_support block and make several connections.

  • Remove the external port that is related to “sys_reset”. Instead, connect versal_cips_0’s pl_pcie0_resetn to the sys_reset inputs of two blocks: pcie_block and pcie_block_support. After doing this, “sys_reset” is connected like it was before.

  • Select all pins of the PCIe block that aren’t connected to anything (it’s possible to use CTRL-click for this purpose). Make these ports external, by using right-click>Make External. Vivado will create ports for all pins. The name of each external port will be like the net’s name, with a “_0” suffix added.

4.3.9 Working with XillyUSB

XillyUSB can be used on other boards that have an SFP+ interface. In this case it’s just a matter of setting the design’s constraints to use the MGT that is wired to the SFP+ connector.

The board should also supply a 125 MHz reference clock with low jitter for the MGT. Despite the requirement in the USB specification, Spread Spectrum Clocking (SSC) should not be enabled (if such an option exists): The MGT doesn’t lock properly on the received signal if an SSC reference clock is used.

For custom boards, it’s recommended to refer to the sfp2usb module’s schematics, as the pins of the SFP+ connector are connected directly to the FPGA’s MGT. It is optional to swap the SSRX wires, as done on the sfp2usb module. This is recommended only if it simplifies the PCB design.

Swapping the SSTX wires is also possible, if desired. This requires editing the *_frontend.v file so that the polarity of the transmitted bits is reversed, and hence compensates for the wire pair swap. Note that there’s a good chance that the USB connection will operate properly even without this edit, since the USB specification requires the link partners to work properly even with a polarity swap. It’s however recommended to not rely on this.

4.4 PRSNT pins for indicating the number of PCIe lanes

According to the PCIe spec, there are one or more pins on the PCIe connector, which indicate the presence of the peripheral in the PCIe slot, as well as the number of lanes. These are the PRSNT pins. Most development boards have DIP switches for adjusting how many lanes the host is informed about, by virtue of these pins.

The typical default setting of these pins is the maximal number of lanes that is possible with the board. This setting usually works, even if fewer lanes are actually used. This is because an initial negotiation between the host and the peripheral (which is required by the PCIe spec) ensures the correct detection of the actual number of lanes.

Please refer to your board’s reference manual about how to set these DIP switches. It’s important not to set these DIP switches to fewer lanes than are actually used, since some hosts may ignore lanes as a result of a faulty setting.

4.5 Changing the number of PCIe lanes and/or link speed

4.5.1 Introduction

IMPORTANT:
Changing the link’s parameters may require adjustments in the timing constraints. Failing to pay attention to this issue can lead to a PCIe link that works, but in an unreliable manner.
Always be sure to have properly adjusted the timing constraints, if necessary, after making changes. This topic is detailed below.

Xillybus’ FPGA demo bundles are typically set to the maximal number of lanes available on the intended board, and a link speed of 2.5 GT/s (Gen1).

The rationale is that if an FPGA board fits into the PCIe connector of a motherboard, one can expect that all lanes will be used in the connection with the host. On the other hand, in almost all cases, the bandwidth that is achieved by these lanes is higher than the Xillybus IP core may utilize, even with 2.5 GT/s, and it’s hence pointless to set a higher link speed.

As the PCIe specification requires a fallback capability to lower speeds from all bus components that are involved, picking 2.5 GT/s ensures a uniform behavior on all motherboards.

It’s however often desired to change the number of lanes and their link speed, in particular when using the Xillybus IP core on a custom board. Fewer lanes with higher link speed is a common requirement.

The Xillybus IP core relies on AMD’s PCIe block for the low-level interface with the PCIe bus. Accordingly, the IP core works properly as long as AMD’s PCIe block operates properly, regardless of the number of lanes or their link speed.

If the PCIe block is configured with a low number of lanes, combined with a link speed, such that its bandwidth capability is lower than Xillybus IP core’s, it will still work properly. In this case, the aggregate bandwidth offered by Xillybus’ streams approximately equals the bandwidth limit imposed by the settings of the PCIe block. It’s a question of what becomes the bottleneck.
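To get a feeling for where the bottleneck lies, the raw bandwidth of a PCIe link can be estimated from the number of lanes and the link speed. The sketch below is only an illustration: the function name is hypothetical, and the line encoding figures (8b/10b at 2.5 and 5 GT/s, 128b/130b at 8 GT/s) are taken from the PCIe specification rather than from this guide. The result ignores packet overhead, so the bandwidth that is actually available for data is lower.

```python
# Fraction of transmitted bits that carry data, per link speed in GT/s.
# These encoding figures come from the PCIe specification, not this guide.
ENCODING_EFFICIENCY = {2.5: 8 / 10, 5.0: 8 / 10, 8.0: 128 / 130}

def raw_link_bandwidth_mb_s(lanes, gt_per_s):
    """Estimate raw bandwidth in MB/s per direction, before packet overhead."""
    if gt_per_s not in ENCODING_EFFICIENCY:
        raise ValueError("unsupported link speed: %r GT/s" % gt_per_s)
    data_bits_per_s = lanes * gt_per_s * 1e9 * ENCODING_EFFICIENCY[gt_per_s]
    return data_bits_per_s / 8 / 1e6

# 8 lanes at 2.5 GT/s and 4 lanes at 5 GT/s give the same raw figure,
# which is why fewer lanes with a higher link speed can be a fair trade.
print(raw_link_bandwidth_mb_s(8, 2.5), raw_link_bandwidth_mb_s(4, 5.0))
```

Comparing this estimate with the data rate that the application requires gives a first indication of which combinations of lanes and link speed are adequate.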

4.5.2 The work procedure

For Versal ACAP FPGAs, please refer to section 4.3.8. The description below is intended for all other FPGAs.

In principle, changing the number of lanes and/or the link speed consists of making changes in the configuration of the PCIe block as desired. However there are a few issues to pay attention to:

  • The modification may influence other parameters of the PCIe block, which may prevent it from operating correctly. Among others, there’s a bug in AMD’s GUI tool to watch out for, as detailed below.

  • The modification may change the frequencies of the clocks driving the PCIe block (the PIPE clocks), and hence requires changes in timing constraints.

  • The said change in clock frequencies may also require changes in Verilog code that supports the PCIe block (the instantiation of the pipe_clock module, where applicable), or else the PCIe block will not function.

The stages are hence as follows:

  1. Make a copy of the XCO or XCI file on an active project (i.e. after Vivado or ISE has upgraded the IP as necessary). This will allow comparing the changes with a diff tool afterwards, and spot unwanted changes, if such occur.

  2. Open the IP of the PCIe block in Vivado (or ISE) for configuration (with Versal FPGAs, open the CPM unit in the CIPS IP, which is the only block in the pcie_versal block design).

  3. Change the number of lanes and/or Maximum Link Speed as desired, while paying attention not to change the (AXI) interface width. If it’s possible to avoid changing the (AXI) Interface Frequency by choosing a combination of lanes and link speed that is adequate for the task, that is preferable.

  4. After making the desired changes, verify that the Vendor ID and Device ID haven’t changed (nor their Subsystem counterparts). Some revisions of Vivado may reset some parameters to their defaults as a result of unrelated modifications (this is a bug).

  5. Confirm the changes (typically click “OK” at the bottom of the dialog box). There is no need to generate output products, if this is suggested following the confirmation.

  6. Compare the updated and previous XCO or XCI files with a textual diff tool, and verify that only relevant parameters have changed. More on this below.

  7. Adjust the signal vector width of PCIE_* in xillybus.v and xillydemo.v/.vhd, so they reflect the new number of lanes.

  8. Adjust the PIPE clock module’s instantiation if necessary, as described in paragraph 4.5.3 below.

  9. Adjust timing constraints if necessary, as described in paragraph 4.5.4 below.

  10. Update the PIPE clock module, as explained in paragraph 4.5.5.

The last three steps are not required when working with Ultrascale and later.

When comparing new and old XCI files, PARAM_VALUE.Device_ID should get special attention, as it’s often changed accidentally.

The differences in the parameters of the XCI files should match those desired. This is a short list of possible parameters for which changes are acceptable in accordance with the changes made in Vivado. The names of the parameters should be taken with a grain of salt, as different revisions of Vivado (and hence different revisions of the PCIe block) may represent the attributes of the PCIe block with different XML parameters.

  • Related to the number of lanes:

    • PARAM_VALUE.Maximum_Link_Width

    • MODELPARAM_VALUE.max_lnk_wdt

  • Related to the link speed:

    • PARAM_VALUE.Link_Speed

    • PARAM_VALUE.Trgt_Link_Speed

    • MODELPARAM_VALUE.c_gen1

    • MODELPARAM_VALUE.max_lnk_spd

  • Related to the interface frequency. A change in these parameters is a strong indication that the stages in paragraphs 4.5.3 and 4.5.4 are necessary.

    • PARAM_VALUE.User_Clk_Freq

    • MODELPARAM_VALUE.pci_exp_int_freq
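The comparison of the old and new XCI files can be partly automated. The following is a minimal sketch, not a replacement for reading the diff: it assumes that each parameter's name and value appear on the same line of the XCI file, and it only looks for the PARAM_VALUE.* and MODELPARAM_VALUE.* names on lines that differ. The function name and the EXPECTED set are introduced here for illustration; the set simply repeats the list above, and may need adjustment for other revisions of Vivado.

```python
import difflib
import re

# Parameters that the list above names as acceptable to change
EXPECTED = {
    "PARAM_VALUE.Maximum_Link_Width", "MODELPARAM_VALUE.max_lnk_wdt",
    "PARAM_VALUE.Link_Speed", "PARAM_VALUE.Trgt_Link_Speed",
    "MODELPARAM_VALUE.c_gen1", "MODELPARAM_VALUE.max_lnk_spd",
    "PARAM_VALUE.User_Clk_Freq", "MODELPARAM_VALUE.pci_exp_int_freq",
}
NAME_RE = re.compile(r"\b(?:MODEL)?PARAM_VALUE\.\w+")

def changed_parameters(old_text, new_text):
    """Return (expected, unexpected) sets of parameter names on changed lines."""
    changed = set()
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        # ndiff marks removed lines with "- " and added lines with "+ "
        if line.startswith(("- ", "+ ")):
            changed.update(NAME_RE.findall(line))
    return changed & EXPECTED, changed - EXPECTED
```

Reading both files with open(...).read() and passing their contents to changed_parameters() yields two sets. Anything in the second set, such as PARAM_VALUE.Device_ID, deserves a closer look before continuing.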

4.5.3 Has the PIPE frequency changed?

When working with Ultrascale FPGAs and later, the considerations and actions below are unnecessary, as their PCIe block supplies the timing constraints as an integral part of the IP itself. Same goes for the PIPE module.

As for other FPGA families, it’s important to verify that the PIPE clock settings are correct as follows:

Generate an example project for the PCIe block after the changes, and run a synthesis of that project. In Vivado, this is typically done by right-clicking the PCIe block in the project’s source hierarchy and selecting “Open IP Example Design...”. Select a location for the design, and once it has been generated, launch the synthesis by clicking “Run Synthesis” in the left column.

Next, obtain the PIPE clock module’s instantiation parameters in the synthesis report (in Vivado, it’s found as something like pcie_example/pcie_example.runs/synth_1/runme.log). In this report, search for a segment like:

INFO: [Synth 8-638] synthesizing module 'example_pipe_clock' [...]
    Parameter PCIE_ASYNC_EN bound to: FALSE - type: string
    Parameter PCIE_TXBUF_EN bound to: FALSE - type: string
    Parameter PCIE_CLK_SHARING_EN bound to: FALSE - type: string
    Parameter PCIE_LANE bound to: 4 - type: integer
    Parameter PCIE_LINK_SPEED bound to: 3 - type: integer
    Parameter PCIE_REFCLK_FREQ bound to: 0 - type: integer
    Parameter PCIE_USERCLK1_FREQ bound to: 4 - type: integer
    Parameter PCIE_USERCLK2_FREQ bound to: 4 - type: integer
    Parameter PCIE_OOBCLK_MODE bound to: 1 - type: integer
    Parameter PCIE_DEBUG_MODE bound to: 0 - type: integer

The parameters in this report must match those in the instantiation of pipe_clock, as they appear in xillybus.v, which is of the form:

  pcie_[...]_pipe_clock #
     (
       .PCIE_ASYNC_EN      ( "FALSE" ),
       .PCIE_TXBUF_EN      ( "FALSE" ),
       .PCIE_LANE          ( 6'h08 ),
       .PCIE_LINK_SPEED    ( 3 ),
       .PCIE_REFCLK_FREQ   ( 0 ),
       .PCIE_USERCLK1_FREQ ( 4 ),
       .PCIE_USERCLK2_FREQ ( 4 ),
       .PCIE_DEBUG_MODE    ( 0 )
     )
     pipe_clock
       (
         [ ... ]
       );

The three parameters to compare are PCIE_LINK_SPEED, PCIE_USERCLK1_FREQ and PCIE_USERCLK2_FREQ, which must match. If they do (as shown in the example), all is set correctly, including the timing constraints. If not, two actions must be taken:

  • The instantiation parameters in xillybus.v must be updated to match those in the example project’s synthesis report.

  • The timing constraints must be adapted to the example project’s. This is more difficult, because failing to do this correctly doesn’t necessarily cause a problem immediately, but may impact the design’s reliability.

If the PCIE_LANE parameter in xillybus.v is larger than the example project’s, there is no problem leaving it that way, and it’s often easier to do so.
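The comparison between the synthesis report and the instantiation can be sketched as a short script. It is a rough aid under a few assumptions: the report lines have the form shown above ("Parameter NAME bound to: VALUE"), the module's name ends with pipe_clock, the values are written as plain decimal numbers on both sides, and only the text of the pipe_clock instantiation (not all of xillybus.v) is passed to the script. All function names are hypothetical.

```python
import re

# The three parameters that the guide says must match
KEYS = ("PCIE_LINK_SPEED", "PCIE_USERCLK1_FREQ", "PCIE_USERCLK2_FREQ")

LOG_RE = re.compile(r"Parameter\s+(\w+)\s+bound to:\s*(\S+)")
INST_RE = re.compile(r"\.(\w+)\s*\(\s*([^\s)]+)\s*\)")

def log_params(log_text):
    """Collect the parameters reported for the module ending with pipe_clock."""
    params, inside = {}, False
    for line in log_text.splitlines():
        if "synthesizing module" in line:
            # Track whether the following Parameter lines belong to pipe_clock
            inside = bool(re.search(r"module '\w*pipe_clock'", line))
            continue
        match = LOG_RE.search(line)
        if inside and match:
            params[match.group(1)] = match.group(2)
    return params

def inst_params(instantiation_text):
    """Collect .NAME ( VALUE ) pairs from the instantiation's parameter list."""
    return dict(INST_RE.findall(instantiation_text))

def mismatches(log_text, instantiation_text):
    """Map each differing key to a (report value, instantiation value) pair."""
    log, inst = log_params(log_text), inst_params(instantiation_text)
    return {k: (log.get(k), inst.get(k)) for k in KEYS if log.get(k) != inst.get(k)}
```

An empty dictionary from mismatches() means that the three parameters agree. A non-empty one lists the values that must be updated in xillybus.v.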

4.5.4 Adapting the timing constraints

It’s mandatory to adjust the timing constraints to reflect changes in the clocks of the PCIe block, if such have occurred.

As the constraints depend on the chosen FPGA as well as Vivado’s revision, it may be somewhat difficult to get this done correctly. Avoiding this adjustment is the main motivation for attempting to keep the PIPE clock’s frequency unchanged by selecting a combination of link speed and number of lanes (when possible). However even if the PIPE clock’s frequency remains unchanged, updating the constraints may still be necessary.

Once again, when an FPGA of the Ultrascale family (and later) is used, there is no need to deal with timing constraints, because the IPs of their PCIe block handle this internally.

In order to adjust the timing constraints, first find the constraints of the example project. With Vivado, it’s typically a file of the form
example.srcs/constrs_1/imports/example_design/xilinx_*.xdc.

It’s highly recommended to generate one example project with the PCIe block’s setting before the changes in number of lanes and/or link speed, and one example project after these changes. A simple diff between the two example projects’ constraint files gives the definite answer to whether adaptation of the constraints is required, and if so, in what way.

Compare the constraint file’s “Timing constraints” section with those in xillydemo.xdc. The example project selects logic elements by their absolute position in the logic hierarchy, so some editing is necessary. For example, suppose a timing constraint like this in the example project:

set_false_path -to [get_pins {pcie_vivado_support_i/pipe_clock_i/pclk_i1_bufgctrl.pclk_i1/S0}]

In xillydemo.xdc it should be written as:

set_false_path -to [get_pins -match_style ucf */pipe_clock/pclk_i1_bufgctrl.pclk_i1/S0]

The main differences are with the relative paths used in Xillybus’ constraints. There might also be other slight differences, as some constraints are necessary with earlier revisions of AMD’s tools, and become superfluous with later ones.
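The path rewrite in the example above can be sketched as a small text transformation. This is a heuristic that is derived from that single example only: it assumes that the example project names the instance pipe_clock_i while Xillybus names it pipe_clock, and it touches nothing but get_pins expressions that contain this instance. The function name is hypothetical, and the output should always be reviewed by hand.

```python
import re

# Matches [get_pins PATH] or [get_pins {PATH}] and captures PATH
PIN_RE = re.compile(r"\[get_pins\s+\{?([^\]{}]+?)\}?\]")

def to_relative(constraint, instance="pipe_clock"):
    """Rewrite absolute pin paths into the relative form used in xillydemo.xdc."""
    marker = instance + "_i/"

    def fix(match):
        path = match.group(1).strip()
        index = path.find(marker)
        if index < 0:
            return match.group(0)  # Not related to the instance: leave as is
        tail = path[index + len(marker):]
        return "[get_pins -match_style ucf */%s/%s]" % (instance, tail)

    return PIN_RE.sub(fix, constraint)
```

Applying to_relative() to each line of the example project's "Timing constraints" section gives a starting point for the corresponding lines in xillydemo.xdc.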

After making the changes in the timing constraints, it’s important to verify that they took effect on the logic by reviewing the timing report after the design’s implementation.

Finally, it’s worth explaining the following two constraints, which are present in xillydemo.xdc of some demo bundles:

set_case_analysis 1 [get_pins -match_style ucf */pipe_clock/pclk_i1_bufgctrl.pclk_i1/S0]
set_case_analysis 0 [get_pins -match_style ucf */pipe_clock/pclk_i1_bufgctrl.pclk_i1/S1]

These constraints are required for Gen1 PCIe blocks, when using quite old revisions of Vivado, as explained in Xilinx AR #62296. Hence they may be omitted with recent AMD tools.

4.5.5 Updating the PIPE clock module

As mentioned above, this step isn’t required for Ultrascale FPGAs and later.

In some cases, in particular when changing the link speed from or to 2.5 GT/s (Gen1), it’s required to update the pcie_*_vivado_pipe_clock.v file, which resides in the vivado-essentials/ directory.

This file is generated automatically as part of the initial project setup by virtue of executing xillydemo-vivado.tcl. It may change slightly depending on the configuration of the PCIe block, in particular if it’s limited to 2.5 GT/s or not.

The recommended way is to regenerate the Vivado project with the xillydemo-vivado.tcl script. Namely, start from a fresh demo bundle, and copy the files that were changed during the previous stages into it:

  • The XCI file of the PCIe block

  • xillydemo.xdc

  • The Verilog / VHDL files that were edited for adapting to the new number of lanes and link speed.

With these files copied into the new demo bundle, generating the project with the xillydemo-vivado.tcl script ensures that the PIPE clock module is in accordance with the settings of the PCIe block, and also that the project doesn’t depend on any leftover files from before the change.

Alternatively, update the pcie_*_vivado_pipe_clock.v file from the example project that was created in paragraph 4.5.3. The file to be used has exactly the same name as the one in vivado-essentials/, and is typically found deep inside the example project’s file hierarchy. Copy this file into vivado-essentials/ (overwriting the existing one).

4.6 Changing the FPGA part number

When migrating from one FPGA family to another, it’s necessary to start from a different demo bundle. There are differences (which are sometimes subtle) that relate to the PCIe block (as well as the MGT block with XillyUSB). These differences require a different Xillybus IP core as well as different wrapper modules.

Attempting to just change the project’s part in Vivado / ISE may result in a project with errors during its implementation. Even if the implementation is successful, the logic may not work, or work unreliably.

However when remaining within the same FPGA family, changing the part number is often sufficient (along with the considerations mentioned above regarding pin placement and constraints).

It’s important to note that for some FPGA families (Ultrascale in particular), the position (site) of the PCIe block in the logic fabric is an attribute of the PCIe block itself, and it may therefore require modification. Moreover, each specific FPGA, and each specific package, has its own set of valid sites. Hence the change of FPGA may reset the PCIe IP’s attributes, if the site that is selected for the PCIe block doesn’t exist on the new FPGA.

Vivado’s reaction to an invalid site in the PCIe block’s position (site) is quite destructive. If this is the case, the “upgrade” of the PCIe block (the operation which is always required to unlock the IP after changing the FPGA) results in resetting several attributes of the PCIe block, to arbitrary values. While doing so, a Critical Warning like the following is generated:

CRITICAL WARNING: [IP_Flow 19-3419] Update of 'pcie_ku' to current project
options has resulted in an incomplete parameterization. Please review the
message log, and recustomize this instance before continuing with your design.

Attempting an implementation of the project without paying attention to this issue is completely pointless: It leads to a large number of misleading warnings, Critical Warnings and possibly errors, and the result, if any, will be far from functional.

The solution is to assign a PCIe block site that is valid on the new FPGA, before changing the part number attribute for the project. This might require editing the XCI file manually if there is no common site between the FPGA part before and after the change (because in this case, it will not be possible to make this change in the GUI, which limits the setting to the sites allowed on the current FPGA part).