Xilinx Partial Reconfiguration over PCIe / USB 3.x with Xillybus

Published: 4 March 2022

Xillybus and Dynamic Function eXchange (DFX) / Partial Reconfiguration

This page describes how to use a stream of Xillybus / XillyUSB IP cores to partially reconfigure a Xilinx device. This boils down to generating a .bin file from the .bit file for partial configuration, with a Vivado Tcl command like

write_cfgmem -format BIN -interface SMAPx32 -loadbit "up 0x0 partial.bit" partial.bin

and then writing the resulting .bin file to the device file with a single command at the shell prompt:

$ cat partial.bin > /dev/xillybus_write_32

An equivalent command can be used on Windows. So can any simple program, on Linux or Windows, that copies data from one file to another.
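
As a rough illustration of such a program, here is a minimal Python sketch that copies one file to another in chunks. The function name copy_file and the chunk size are arbitrary choices for this example, not something Xillybus requires.

```python
import shutil


def copy_file(src_path, dst_path, chunk_size=65536):
    """Copy src_path to dst_path, chunk_size bytes at a time."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # The buffered writer completes partial writes and flushes on close.
        shutil.copyfileobj(src, dst, chunk_size)


# Example (Linux): copy_file("partial.bin", "/dev/xillybus_write_32")
```

On Linux, calling copy_file() as shown in the last comment should have the same effect as the "cat" command above.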

For this to work, the FPGA to be reconfigured needs to be installed as a PCIe device in the computer, with a Xillybus IP core implemented in its logic. Alternatively, it can be connected to one of the computer's USB 3.x ports, with a XillyUSB IP core included in the FPGA's logic.

Since both the configuration logic and the IP core are implemented in the FPGA's logic fabric, a full configuration is not possible: it would halt the logic's operation at an early stage of the configuration, and hence never finish. Partial Reconfiguration can nevertheless be used for programming virtually the entire FPGA, as any logic element can be reprogrammed (on Ultrascale devices and later), except for the PCIe interface, the Xillybus IP core and the reconfiguration logic. For non-Ultrascale FPGAs, only CLBs, memories and multipliers can be replaced this way, which is often enough for practical purposes.

Setting up a Vivado project for obtaining bitstreams for Partial Reconfiguration is not a trivial task. Xilinx' user guide, UG909, is the authoritative resource on this matter, and there's also a comprehensive tutorial that is easier to start with.

This page assumes basic knowledge of Verilog and of Vivado usage (in particular, generating IPs for clocking and a FIFO).

The ICAP interface

There are several ways to supply a configuration bitstream to a Xilinx FPGA. The most commonly used are the JTAG interface for lab operation, and the SPI interface for flash memories, which the FPGA uses to load the bitstream on powerup. A slightly less common interface is SelectMAP, which allows an external microprocessor to push a bitstream into the FPGA through parallel data pins that form an 8-, 16- or 32-bit wide word, a configuration clock and a few other control signals.

For partial reconfiguration from within the FPGA itself, Xilinx devices have a block called Internal Configuration Access Port (ICAP), which can be instantiated as a primitive in logic designs. The ICAP essentially implements the SelectMAP interface, with some differences. The important notion regarding external SelectMAP vs. ICAP is that the data transmission protocol, as well as the data format of the bitstream, are exactly the same.

As for differences, the most notable one is that the ICAP primitive has separate ports for input and output parallel data, while the external SelectMAP's data pins are bidirectional. This is quite natural, given that bidirectional pins are an unnecessary complication inside the logic fabric, whereas on a physical interface they are a common method to save external pins. Regardless, reading data from the configuration interface is unnecessary for the sake of configuring or reconfiguring a device. Reading from this interface is primarily used for bitstream verification.

When not used for reading data, the ICAP's output pins present status information, as elaborated below.

Modifying the xillydemo project

The goal is to integrate a configuration port (ICAP) into xillydemo.v (the top-level Verilog module in Xillybus demo bundles) so that the "cat" shell command shown above can be used for Partial Reconfiguration.

The starting point for this procedure is a fresh demo bundle for the targeted FPGA, as can be downloaded from this page for PCIe and this page for USB 3.x.

For those without experience with Xillybus, it's highly recommended to first implement the demo bundle as is, following the Getting Started Guide for Xilinx. Once done, perform the "Hello, world" test in the Getting Started Guide for Linux or section 4.3 (or similar) in the Getting Started Guide for Windows, as applicable. Completing this gets the Xillybus-related preparations out of the way, so that the focus can be on the topics of this page.

The stages for modifying the demo bundle, each elaborated in the following sections, are as follows:

Generating a clock for the ICAP module
Creating a dual-clock FWFT FIFO for insertion between the Xillybus IP core and the ICAP module
Removal of Xillydemo's loopback FIFO (fifo_32x512), and instantiation of the new FWFT FIFO instead
Instantiation of the ICAP module
Adding logic for reporting configuration status (optional)

This guide utilizes two Xillybus streams that are already in place in the demo bundle, xillybus_write_32 and xillybus_read_32. For a real-life application, it's recommended to set up a custom IP core with dedicated Xillybus streams for this purpose. This allows not only to select meaningful names for the streams (e.g. xillybus_reconfig) but also to set their attributes to better suit their purpose.

Note that code segments are given throughout this page as part of the explanations. The complete Verilog code, for convenient copy-paste, is given at the end of this page.

1. Clock for ICAP module

Xillybus' bus_clk on demo bundles is 250 MHz on all series-7 and Ultrascale FPGAs, except for Artix-7, for which bus_clk is 125 MHz.

The ICAP's maximum clock frequency is however 100 MHz for series-7 FPGAs, and 200 MHz for Ultrascale (see UG909's Table 17). A separate clock, icap_clk, is therefore required. Any clock can be used as long as its frequency is within the limit allowed for the ICAP.
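
To make the arithmetic concrete, the following Python sketch checks a candidate icap_clk frequency against the limits quoted above, and finds the smallest integer divider of bus_clk that respects them. Dividing by an integer is only an assumption made for this illustration; the clocking IP can typically produce other ratios as well. The names in the sketch are hypothetical.

```python
import math

# Maximum ICAP clock frequencies as quoted in the text above (MHz).
ICAP_MAX_MHZ = {"series-7": 100.0, "ultrascale": 200.0}


def icap_clk_allowed(family, freq_mhz):
    """True if freq_mhz is a legal ICAP clock frequency for this family."""
    return 0.0 < freq_mhz <= ICAP_MAX_MHZ[family]


def smallest_integer_divider(family, bus_clk_mhz):
    """Smallest integer N such that bus_clk / N is within the ICAP limit."""
    return math.ceil(bus_clk_mhz / ICAP_MAX_MHZ[family])


for family, bus_clk in (("series-7", 250.0), ("series-7", 125.0),
                        ("ultrascale", 250.0)):
    n = smallest_integer_divider(family, bus_clk)
    print(family, bus_clk, "MHz divided by", n, "gives", bus_clk / n, "MHz")
```

For a 250 MHz bus_clk on a series-7 device, for example, this suggests dividing by 3, which gives roughly 83.3 MHz.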

For simplicity, Vivado's Clocking Wizard IP can be used to generate a PLL/MMCM module, which relies on bus_clk as a reference. The Verilog code to add looks like this:

   wire        icap_clk;
   wire        icap_clk_locked;

   icap_clk_gen icap_clk_gen_ins
     (
      .clk_out1(icap_clk),
      .reset(quiesce),
      .locked(icap_clk_locked),
      .clk_in1(bus_clk));

The clock generator is reset by Xillybus' "quiesce" signal, so bus_clk is guaranteed to be stable when it goes out of reset.

Note that using bus_clk as the reference for icap_clk will make Vivado consider them as related, and unnecessarily apply timing constraints to paths between these two. This is likely to cause a failure to meet timing on paths that cross clock domains. Hence it's necessary to add this (or equivalent) to xillydemo.xdc in order to define these two clocks as independent:

set_clock_groups -asynchronous \
    -group [get_clocks -include_generated_clocks -of_objects [get_pins -hier -filter {name=~*icap_clk_gen_ins/clk_in1}]] \
    -group [get_clocks -include_generated_clocks -of_objects [get_pins -hier -filter {name=~*icap_clk_gen_ins/clk_out1}]]

2. Creation of FWFT FIFO

The demo bundle includes a FIFO named fifo_32x512, which is used to loop back data from xillybus_write_32 to xillybus_read_32. It's unsuitable for interfacing with the ICAP port, in particular as it is a single-clock FIFO.

Hence a new FIFO needs to be generated with Vivado's FIFO Generator, having the following attributes:

Named icap_fwft_fifo
Set as First Word Fall Through (FWFT)
Independent clocks (rd_clk, wr_clk)
32 bits wide
With asynchronous reset input port

The implementation type (block RAM, built-in FIFO etc.) doesn't matter. A depth of 512 words is fine, but the depth is of minor significance.

3. Replacing FIFOs

Remove the instantiation of fifo_32x512 (instantiated as fifo_32), and instead add this code:

   wire [31:0] icap_data;
   wire        icap_data_empty;

   icap_fwft_fifo icap_fwft_fifo_ins
     (
      .rst(!icap_clk_locked),
      .wr_clk(bus_clk),
      .rd_clk(icap_clk),
      .din(user_w_write_32_data),
      .wr_en(user_w_write_32_wren),
      .full(user_w_write_32_full),
      .rd_en(!icap_data_empty),
      .dout(icap_data),
      .empty(icap_data_empty)
      );

For those not familiar with Xillybus: The connection of the user_w_write_32_* wires between the Xillybus IP core and icap_fwft_fifo makes sure that data that is written to the xillybus_write_32 device file fills the FIFO reliably. In particular, a data flow mechanism ensures that the FIFO never overflows, and no data is lost.

Note that rd_en is assigned with !icap_data_empty, so the FIFO displays its content as soon as it's available. As icap_clk is always slower than bus_clk, the FIFO can get full nevertheless, which is fine.

4. ICAP instantiation and logic for obtaining status

With all preparations done, the ICAP module can be instantiated. This is the instantiation for series-7 FPGAs:

   wire [31:0] icap_out;

   ICAPE2 #(.ICAP_WIDTH("X32")) icap_ins
     (
      .O(icap_out),
      .CLK(icap_clk),
      .CSIB(icap_data_empty),
      .I(icap_data),
      .RDWRB(1'b0) // Always write
      );

For Ultrascale FPGAs and later, the ICAPE3 primitive is instantiated as follows:

   wire [31:0] icap_out;

   ICAPE3 icap_ins
     (
      .O(icap_out),
      .CLK(icap_clk),
      .CSIB(icap_data_empty),
      .I(icap_data),
      .RDWRB(1'b0), // Always write
      .AVAIL(),
      .PRDONE(),
      .PRERROR()
      );

As evident from comparing the two ICAP primitives, the latter has three additional ports, none of which is essential.

The meaning of the ICAP's ports is the same as that of the physical SelectMAP interface's pins, as described in the relevant FPGA's Configuration Guide. However, the bidirectional data pins appear separately as I and O in the ICAP primitive.

Note that because the FIFO is FWFT, icap_data_empty is '0' when the icap_data contains valid data, and '1' otherwise. CSIB, which is the ICAP's chip select input, active low, is hence asserted when there is valid data on its data input. The FIFO's rd_en connection ensures each word is visible for only one clock cycle.
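
Since the ICAP consumes one 32-bit word on each icap_clk cycle in which the FIFO holds data, a lower bound on the time it takes to push a bitstream through can be estimated as in the Python sketch below. This is only a bound derived from the clock frequency: the actual duration also depends on how fast the host delivers the data, which is not modeled here. The function name is hypothetical.

```python
def min_transfer_seconds(num_bytes, icap_clk_hz):
    """Lower bound on the transfer time: one 32-bit word per icap_clk cycle."""
    words = (num_bytes + 3) // 4  # Round up to whole 32-bit words
    return words / icap_clk_hz


# A 4,000,000 byte partial bitstream with a 100 MHz icap_clk
print(min_transfer_seconds(4_000_000, 100e6), "seconds at least")
```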

The RDWRB input is driven low to request write cycles only (no readbacks). icap_out will be used for obtaining configuration status.

Going this far is enough for successful reconfiguration of the FPGA. The rest of this page is dedicated to ensuring that the reconfiguration was indeed successful, as well as other aspects of carrying out this task reliably.

5. Add status checks

If the Verilog code has been set up as described, and the bitstream files are correctly prepared, there's no reason why the Partial Reconfiguration would fail. The overwhelming majority of failures are the result of human error.

Status checking may be an unnecessary complication for those who are satisfied with a quick solution that just works. Its main advantage is that it makes it easier to recover from mistakes made when creating and using the bitstream files.

The part to add to xillydemo.v for status monitoring is as follows:

   reg [3:0]   dalign_count;
   wire        dalign;
   reg [3:0]   dalign_d;
   reg [3:0]   icap_status_d;
   reg [3:0]   icap_status_sync;

   assign dalign = icap_out[6];
   always @(posedge bus_clk)
     begin
	icap_status_d <= icap_out[7:4];
	icap_status_sync <= icap_status_d;

	dalign_d <= { dalign_d, dalign }; // Crossing clock domains + delay
	if (!dalign_d[3] && dalign_d[2])
	  dalign_count <= dalign_count + 1;
     end

   assign user_r_read_32_empty = 0;
   assign user_r_read_32_eof = 0;
   assign user_r_read_32_data = { dalign_count, icap_status_sync };

This piece of code revolves around icap_out[7:4], which are the ICAP module's status outputs. The idea is that since the data pins of a physical SelectMAP interface are either inputs or outputs at any given time, there is no SelectMAP-related definition for the ICAP primitive's output port during write cycles. Accordingly, these four of the 32 bits are used to report the state of the configuration port.

The meaning of these four bits is given briefly in Table 20 of UG909, but only two of these are of interest:

DALIGN = icap_out[6], copied to icap_status_sync[2]: Is '1' (active high) when the ICAP considers itself synced, i.e. in a configuration session.
CFGERR_B = icap_out[7], copied to icap_status_sync[3]: Is '0' (active low) when the ICAP's internal error flag is asserted.

The next section explains how the ICAP module treats the data stream that is written to it, and further clarifies how these two important flags behave.

For now, observe that user_r_read_32_data is assigned with a copy of the status bits (icap_status_sync, which is synchronized to bus_clk) and dalign_count, which increments each time dalign goes from '0' to '1'. Why the latter is useful is also explained in the next section.

As user_r_read_32_empty and user_r_read_32_eof are tied low, the value of icap_status_sync is available by reading /dev/xillybus_read_32. In fact, it's repeated indefinitely as a 32-bit word. There is however no point reading further than the first byte. As mentioned above, a real-life application is best made with a custom IP core, which would define an 8-bit stream for this purpose, and with better suited parameters.
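
The layout of that first byte follows from the assignment of user_r_read_32_data above: bits [7:4] are dalign_count and bits [3:0] are icap_status_sync. A minimal Python sketch for unpacking it could look like this (the function name and the dictionary keys are made up for this example):

```python
def decode_status(status_byte):
    """Split the first byte read from xillybus_read_32 into its fields."""
    return {
        "dalign_count": (status_byte >> 4) & 0xF,  # Bits [7:4]
        "cfgerr_b": (status_byte >> 3) & 1,        # Bit 3, '0' means error
        "dalign": (status_byte >> 2) & 1,          # Bit 2, '1' means synced
    }


print(decode_status(0x58))
```

For the value 0x58, this yields a counter value of 5, with CFGERR_B at '1' (no error) and DALIGN at '0' (not in a configuration session).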

The procedure for writing the bitstream with status checks is as follows. Why this makes sense is explained in the next section.

Open the xillybus_read_32 device file for read, read a byte, and close the file.
Check bit 2 of this byte (DALIGN). If it's '1', stop and report an error: The configuration port is already in the middle of a session, and therefore it's unclear how the arriving data might be interpreted.
Open the xillybus_write_32 device file for write, write the .bin file into it, and close the file. This is effectively the "cat" command at the top of this page.
Once again, open the xillybus_read_32 device file for read, read a byte, and close the file.
Check bits [7:4] of the recently read byte. Their value should be the value of the same bits in the previously read byte, plus one (or another constant that is typical to the FPGA). In other words, bits [7:4] should have incremented. If this is not the case, report an error: The bitstream was ignored by the configuration port, most likely because it's of the wrong format.
Else, check bit 3 of this byte (CFGERR_B). If it's '0', report a configuration bitstream error.
Else, check bit 2 of this byte (DALIGN). If it's '1', report an error: The bitstream didn't arrive completely (the writing process died in the middle or the bitstream file itself was cut short).
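
The steps above can be sketched in Python as follows. This is a minimal illustration under a few assumptions rather than a reference implementation: the helper names, the ReconfigError exception and the expected_increment parameter are hypothetical, and the device file paths are those of the Linux demo bundle. The status reader and the bitstream writer are passed in as functions, so the checking logic can be tried out without hardware.

```python
import shutil


class ReconfigError(RuntimeError):
    pass


def read_status_byte(path="/dev/xillybus_read_32"):
    """Open the status device file, read one byte, and close it."""
    with open(path, "rb", buffering=0) as f:
        data = f.read(1)
    if len(data) != 1:
        raise ReconfigError("Failed to read a status byte")
    return data[0]


def send_file(bin_path, dev_path="/dev/xillybus_write_32"):
    """Write the .bin file into the device file, and close it."""
    with open(bin_path, "rb") as src, open(dev_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def reconfigure(read_status, send_bitstream, expected_increment=1):
    before = read_status()
    if before & 0x04:  # DALIGN is '1' before writing anything
        raise ReconfigError("Configuration port is in the middle of a session")

    send_bitstream()
    after = read_status()

    # dalign_count is 4 bits wide, so the difference is taken modulo 16
    if ((after >> 4) - (before >> 4)) & 0xF != expected_increment:
        raise ReconfigError("Bitstream was ignored (wrong format?)")
    if not after & 0x08:  # CFGERR_B is '0'
        raise ReconfigError("Configuration bitstream error")
    if after & 0x04:  # DALIGN is still '1'
        raise ReconfigError("Bitstream didn't arrive completely")
```

With the hardware in place, a call such as reconfigure(read_status_byte, lambda: send_file("partial.bin")) would carry out the whole sequence, raising ReconfigError if any of the checks fails.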

Note that this sequence changes slightly if an ABORT condition is added after each configuration, as suggested below.

The status check sequence explained

In order to understand the status checks along the bitstream programming sequence, a look at the bitstream's format is required. Recall that ICAP is an internal SelectMAP port, so the same bitstream file can be used for either.

A configuration .bin bitstream file consists of a series of 32-bit words, which can be broken into the following segments (for the purpose of understanding the status signals):

Several dummy pad words (0xffffffff).
Two words that allow SelectMAP's bus word width autodetect (0x000000BB and 0x11220044).
Two more dummy pad words (0xffffffff).
The Sync Word (0xaa995566). This sets DALIGN to '1', meaning that the data words from now on are treated as configuration commands by the ICAP port.
Some general configuration commands, most notably an RCRC = Reset CRC register. This deasserts CFGERR_B, i.e. turns it to '1'.
The body of programming commands. This is the part that most of the .bin file consists of. Commands that request CRC checks appear in this section several times.
The command before the last is START, which triggers the startup sequence for waking up the logic that has been reconfigured.
The last command is DESYNC, which resets DALIGN to '0'. Command words coming after this are ignored (except for the next Sync Word).
Several NOP commands (0x20000000) may follow in order to pad the bitstream.

The most important thing to note is that the ICAP ignores the data written to it unless it begins with the synchronization sequence shown above. This means that if some junk data is accidentally written to it before the actual bitstream, odds are that this will have no effect. It also means that if the bitstream is ill-formed, it's likely to be completely ignored. Most notable is a bitstream that was produced directly by write_bitstream with the -bin_file flag: The output is indeed a legal .bin file, but with the bits appearing in the wrong order. As a result, the Sync Word appears incorrectly, is ignored, and nothing happens when such a bitstream is applied. The correct command for generating a .bin file is given at the beginning of this page.

Another thing to note is that the bitstream contains the Sync Word, which asserts DALIGN at the beginning, and the DESYNC command at the end, which deasserts it. DALIGN is hence asserted only during the configuration process, and not after it. However if the data flow stops in the middle of configuration, DALIGN will remain '1', and any data that arrives afterwards is considered configuration data.

This is why the logic has the dalign_count register, which increments when DALIGN goes from '0' to '1': A successful bitstream programming session starts with DALIGN at '0' and ends with DALIGN at '0'. The counter allows verifying that it was '1' as the configuration data was written to the ICAP, or else the data was completely ignored.

There are two other ways that DALIGN can turn '0':

An ABORT sequence, which is discussed further below.
An error during configuration, in particular a CRC error.

Note that some partial bitstreams have more than one occurrence of a Sync Word and DESYNC pair, so the counter will increment more than once during a successful reconfiguration session. The software checking this counter should hence verify the correct difference in the counter's value.
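
One possible way to find out what difference to expect is to count the Sync Words in the .bin file itself. The Python sketch below is a diagnostic aid only, resting on assumptions that should be checked against real files: it doesn't presume how the Sync Word (0xaa995566) is laid out in the file, but counts word-aligned matches for four candidate layouts (with or without reversing the bits of each byte, in either byte order). A chance match inside the configuration data is also possible, so the result is an estimate. Recall that dalign_count is 4 bits wide, so the expected difference is the count modulo 16.

```python
SYNC_WORD = 0xAA995566


def bitswap_each_byte(word):
    """Reverse the bit order within each of the four bytes of a 32-bit word."""
    out = 0
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        out |= int(f"{byte:08b}"[::-1], 2) << (8 * i)
    return out


def count_aligned(data, pattern):
    """Count occurrences of a 4-byte pattern at 32-bit aligned offsets."""
    return sum(1 for offset in range(0, len(data) - 3, 4)
               if data[offset:offset + 4] == pattern)


def sync_word_counts(data):
    """Count Sync Word candidates in data, for each candidate layout."""
    counts = {}
    variants = (("plain", SYNC_WORD),
                ("bit-swapped", bitswap_each_byte(SYNC_WORD)))
    for name, word in variants:
        for order in ("big", "little"):
            counts[name + "/" + order] = count_aligned(
                data, word.to_bytes(4, order))
    return counts


# Example: print(sync_word_counts(open("partial.bin", "rb").read()))
```

Comparing the result for a file that is known to work with that of a file that is ignored by the ICAP may also help to spot a bitstream in the wrong format.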

The next issue to look at is errors, as reported with CFGERR_B due to a failed CRC check or some other configuration error. The FPGA's INIT_B is also pulled low when CFGERR_B is asserted, so there is often a visible indication for this by virtue of an LED on the PCB.

The CRC checks are made by the FPGA in response to explicit commands in the bitstream. Even though it's possible to generate a bitstream without CRC checks, the default and common case is that there's at least a CRC check towards the end of the bitstream, covering all configuration data.

It's also possible to request a CRC check after each partial reconfiguration frame by setting the PerFrameCRC property for the bitstream (See UG909).

As the RCRC at the beginning of the bitstream deasserts CFGERR_B, the only reason it might be asserted (that is, '0') after writing a bitstream is an error in that bitstream. That's why the CFGERR_B bit is checked only after the attempted configuration: If it was asserted before the sequence, it's due to a previous error.

Note that the configuration port responds to other errors (e.g. mismatching device ID code) in the same way as a CRC error: CFGERR_B is asserted, INIT_B pulled low, DALIGN is deasserted, and RCRC clears the error condition.

Motivation for adding an ABORT condition

The status check sequence outlined above has a fundamental problem: If DALIGN is '1' to begin with, it does nothing. This will be the case if a previous bitstream file was written partially, leaving the configuration port waiting for more data. The brute force way to recover from this state is to write a chunk of just anything to xillybus_write_32 with the purpose of causing a configuration error, and hence DALIGN goes back to '0'.

This is a safe method for recovering from a failed full configuration through the external SelectMAP interface, because the FPGA is kept inactive during the configuration process, and brought active only by an explicit START command. Since any sane bitstream has a CRC check command before the START command, the FPGA will never get active accidentally.

This is however irrelevant for ICAP configuration, which can only be used for partial reconfiguration. In this case, the static logic of the FPGA is active as the logic elements are reprogrammed. Hence if junk data is treated as configuration commands, the effect can be visible and possibly damaging, in particular if the uncontrolled outputs from the FPGA can cause harm.

The DALIGN status check before writing bitstream data is hence a safety precaution, but it doesn't offer a way to recover from this problematic state.

Note that the ICAP primitive has no reset input. Instead, it offers a simple pattern for aborting an operation: If its RDWRB input changes value from one CLK cycle to another while CSIB is asserted (i.e. '0'), the ICAP enters a four-clock abort sequence that ends with DALIGN at '0', so a fresh bitstream sequence can start.

The next section shows how to use this abort condition to bring the ICAP block to a safe mode after each bitstream session.

Implementing ABORT after a bitstream write

The Verilog code presented next generates an ABORT condition when the xillybus_write_32 device file is closed and the FIFO is empty. Or, in simpler words, after each session of writing data to the ICAP. As a result, DALIGN is guaranteed to be '0' after the session is complete, and it's therefore always safe to start a new one.

   reg [1:0]   open_d, icap_clk_locked_d;
   reg [1:0]   abort_state;
   reg 	       abort_cs, abort_rd;

   always @(posedge icap_clk)
     begin
	open_d <= { open_d, user_w_write_32_open }; // Crossing clock domains
	icap_clk_locked_d <= { icap_clk_locked_d, icap_clk_locked };

	abort_cs <= (abort_state == 1) || (abort_state == 2);
	abort_rd <= (abort_state == 1);

	if (!icap_clk_locked_d[1])
	  abort_state <= 3;
	else if (open_d[1])
	  abort_state <= 0;
	else if ( (abort_state != 3) &&
		  (icap_data_empty || (abort_state != 0)) )
	  abort_state <= abort_state + 1;
     end

And then the instantiation of the ICAP primitive changes to

   ICAPE2 #(.ICAP_WIDTH("X32")) icap_ins
     (
      .O(icap_out),
      .CLK(icap_clk),
      .CSIB(icap_data_empty && !abort_cs),
      .I(icap_data),
      .RDWRB(abort_rd)
      );

(the change for ICAPE3 is similar)

The new piece of logic controls the ICAP module with abort_cs and abort_rd: When the former is '1', CSIB is asserted regardless of icap_data_empty, and abort_rd is used to toggle RDWRB in order to create an ABORT condition.

The abort_state register is held at the value 3 as long as icap_clk is not locked (i.e. at reset), and is held at 0 while the xillybus_write_32 device file is open. After the device file is closed, abort_state waits at 0 until the FIFO is empty, then increments until it reaches 3, and remains there.

The ABORT pattern is generated as abort_cs becomes '1' when abort_state travels through the values 1 and 2. Likewise, abort_rd becomes '1' following abort_state == 1.

All in all, the incrementing sequence of abort_state creates two icap_clk cycles where CSIB is asserted artificially: The first is a read cycle, and the second a write cycle. As the second cycle constitutes the condition for an ABORT, no write data cycle occurs on its behalf, so this method has no side effects.
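
To see this sequence cycle by cycle, the always block can be replayed with a small software model. The Python sketch below is such a model, written for illustration only: it assumes that all registers start at zero, that icap_clk_locked is high throughout and that the FIFO is empty all along. Each call to step() stands for one rising edge of icap_clk, with all new register values computed from the old ones, like Verilog's non-blocking assignments.

```python
def step(regs, open_, locked, empty):
    """One rising edge of icap_clk. Returns the new register values."""
    state = regs["abort_state"]
    if not (regs["locked_d"] >> 1) & 1:
        next_state = 3
    elif (regs["open_d"] >> 1) & 1:
        next_state = 0
    elif state != 3 and (empty or state != 0):
        next_state = (state + 1) & 3
    else:
        next_state = state
    return {
        "open_d": ((regs["open_d"] << 1) | open_) & 3,
        "locked_d": ((regs["locked_d"] << 1) | locked) & 3,
        "abort_cs": int(state in (1, 2)),
        "abort_rd": int(state == 1),
        "abort_state": next_state,
    }


def icap_pins(regs, empty):
    """The (CSIB, RDWRB) pair as connected to the ICAP primitive."""
    return int(empty and not regs["abort_cs"]), regs["abort_rd"]


def simulate(opens, empty=1):
    """Apply a sequence of user_w_write_32_open values, one per clock."""
    regs = dict(open_d=0, locked_d=0, abort_cs=0, abort_rd=0, abort_state=0)
    trace = []
    for open_ in opens:
        regs = step(regs, open_, 1, empty)
        trace.append(icap_pins(regs, empty))
    return trace


# Device file closed for 4 clocks, open for 4 clocks, then closed again
for csib, rdwrb in simulate([0] * 4 + [1] * 4 + [0] * 10):
    print("CSIB =", csib, " RDWRB =", rdwrb)
```

In this model, CSIB is low on exactly two consecutive cycles after the device file is closed: RDWRB is '1' on the first and '0' on the second, which matches the description above.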

Note however that because DALIGN is brought to '0' like this as xillybus_write_32 closes, the procedure for writing a bitstream, as outlined above, will not detect the case where DALIGN remained '1' at the end of the bitstream, i.e. the bitstream was too short (or more precisely, a DESYNC command was not reached). This is easily resolved by changing the order of closing the device files: xillybus_write_32 should be closed only after a byte has been read from xillybus_read_32. By doing so, the obtained status reflects the state before the ABORT is issued.
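
The changed order of operations can be sketched in Python as follows. The function names are hypothetical, and the sketch only obtains the two status bytes; the checks on them are the same as in the procedure outlined earlier. Note the assumption involved: flush() only empties Python's own buffer. Whether all data has actually reached the ICAP at the moment of the second status read depends on the buffering along the way to the FPGA, which is not covered here.

```python
import shutil


def read_status_byte(read_dev):
    """Open the status device file, read one byte, and close it."""
    with open(read_dev, "rb", buffering=0) as f:
        data = f.read(1)
    if len(data) != 1:
        raise OSError("Failed to read a status byte from " + read_dev)
    return data[0]


def write_then_sample_status(bin_path, write_dev, read_dev):
    """Return the status bytes from before and after writing the bitstream."""
    before = read_status_byte(read_dev)
    with open(write_dev, "wb") as dst:
        with open(bin_path, "rb") as src:
            shutil.copyfileobj(src, dst)
        dst.flush()  # Empties Python's buffer only
        after = read_status_byte(read_dev)  # write_dev is still open here
    # write_dev is closed now, which allows the logic to issue the ABORT
    return before, after


# Example (Linux):
# before, after = write_then_sample_status(
#     "partial.bin", "/dev/xillybus_write_32", "/dev/xillybus_read_32")
```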

It's worth mentioning that this implementation is just one possible choice on when to issue an ABORT. It can be called for explicitly by the host as necessary (e.g. if DALIGN is '1' to begin with) or always before starting to write a bitstream.

Full listings

For convenience, this is the Verilog code given in pieces above. To adopt either of the following two versions, remove the instantiation of fifo_32x512 and add the code shown below (for one of the versions).

Note that ICAPE2 is instantiated in these listings. For Ultrascale devices, replace this with the instantiation of ICAPE3 as shown above.

Either way, add this to xillydemo.xdc:

set_clock_groups -asynchronous \
    -group [get_clocks -include_generated_clocks -of_objects [get_pins -hier -filter {name=~*icap_clk_gen_ins/clk_in1}]] \
    -group [get_clocks -include_generated_clocks -of_objects [get_pins -hier -filter {name=~*icap_clk_gen_ins/clk_out1}]]

The version without ABORT implementation:

   wire        icap_clk;
   wire        icap_clk_locked;
   wire [31:0] icap_data;
   wire [31:0] icap_out;
   wire        icap_data_empty;
   reg [3:0]   dalign_count;
   wire        dalign;
   reg [3:0]   dalign_d;
   reg [3:0]   icap_status_d;
   reg [3:0]   icap_status_sync;

   assign user_r_read_32_empty = 0;
   assign user_r_read_32_eof = 0;
   assign user_r_read_32_data = { dalign_count, icap_status_sync };

   icap_fwft_fifo icap_fwft_fifo_ins
     (
      .rst(!icap_clk_locked),
      .wr_clk(bus_clk),
      .rd_clk(icap_clk),
      .din(user_w_write_32_data),
      .wr_en(user_w_write_32_wren),
      .rd_en(!icap_data_empty),
      .dout(icap_data),
      .full(user_w_write_32_full),
      .empty(icap_data_empty),
      .wr_rst_busy(),
      .rd_rst_busy()
      );

   ICAPE2 #(.ICAP_WIDTH("X32")) icap_ins
     (
      .O(icap_out),
      .CLK(icap_clk),
      .CSIB(icap_data_empty),
      .I(icap_data),
      .RDWRB(1'b0) // Always write
      );

   assign dalign = icap_out[6];
   always @(posedge bus_clk)
     begin
	icap_status_d <= icap_out[7:4];
	icap_status_sync <= icap_status_d;

	dalign_d <= { dalign_d, dalign }; // Crossing clock domains + delay
	if (!dalign_d[3] && dalign_d[2])
	  dalign_count <= dalign_count + 1;
     end

   icap_clk_gen icap_clk_gen_ins
     (
      .clk_out1(icap_clk),
      .reset(quiesce),
      .locked(icap_clk_locked),
      .clk_in1(bus_clk));

And the version implementing ABORT on closing xillybus_write_32:

   wire        icap_clk;
   wire        icap_clk_locked;
   wire [31:0] icap_data;
   wire [31:0] icap_out;
   wire        icap_data_empty;
   reg [3:0]   dalign_count;
   wire        dalign;
   reg [3:0]   dalign_d;
   reg [3:0]   icap_status_d;
   reg [3:0]   icap_status_sync;

   reg [1:0]   open_d, icap_clk_locked_d;
   reg [1:0]   abort_state;
   reg 	       abort_cs, abort_rd;

   assign user_r_read_32_empty = 0;
   assign user_r_read_32_eof = 0;
   assign user_r_read_32_data = { dalign_count, icap_status_sync };

   icap_fwft_fifo icap_fwft_fifo_ins
     (
      .rst(!icap_clk_locked),
      .wr_clk(bus_clk),
      .rd_clk(icap_clk),
      .din(user_w_write_32_data),
      .wr_en(user_w_write_32_wren),
      .rd_en(!icap_data_empty),
      .dout(icap_data),
      .full(user_w_write_32_full),
      .empty(icap_data_empty),
      .wr_rst_busy(),
      .rd_rst_busy()
      );

   ICAPE2 #(.ICAP_WIDTH("X32")) icap_ins
     (
      .O(icap_out),
      .CLK(icap_clk),
      .CSIB(icap_data_empty && !abort_cs),
      .I(icap_data),
      .RDWRB(abort_rd)
      );

   assign dalign = icap_out[6];
   always @(posedge bus_clk)
     begin
	icap_status_d <= icap_out[7:4];
	icap_status_sync <= icap_status_d;

	dalign_d <= { dalign_d, dalign }; // Crossing clock domains + delay
	if (!dalign_d[3] && dalign_d[2])
	  dalign_count <= dalign_count + 1;
     end

   always @(posedge icap_clk)
     begin
	open_d <= { open_d, user_w_write_32_open }; // Crossing clock domains
	icap_clk_locked_d <= { icap_clk_locked_d, icap_clk_locked };

	abort_cs <= (abort_state == 1) || (abort_state == 2);
	abort_rd <= (abort_state == 1);

	if (!icap_clk_locked_d[1])
	  abort_state <= 3;
	else if (open_d[1])
	  abort_state <= 0;
	else if ( (abort_state != 3) &&
		  (icap_data_empty || (abort_state != 0)) )
	  abort_state <= abort_state + 1;
     end

   icap_clk_gen icap_clk_gen_ins
     (
      .clk_out1(icap_clk),
      .reset(quiesce),
      .locked(icap_clk_locked),
      .clk_in1(bus_clk));