3 I/O programming practices
3.1 Overview
Xillybus works properly with any programming language which is able to access files, and any API for accessing files is suitable.
In this guide there’s an emphasis on the low-level API set, based upon functions such as open(), read(), write() and close(). This set is chosen over other well-known sets (e.g. fopen(), fwrite(), fprintf() etc.) because the low-level API’s functions have no extra layer of buffers. These buffers can have a positive effect on performance, but with them there’s no control over the actual I/O operations.
This is less important when data is transmitted constantly and no direct relation is expected between software operations and the I/O with the hardware.
An extra buffer layer can also cause confusion, making it look like there’s a software bug where there isn’t. For example, a function call to fwrite() can merely store the data in a RAM buffer without performing any I/O operation until the file is closed. A developer who is unaware of this may be misled into thinking that fwrite() failed because nothing happened on the FPGA side, when in fact the data is waiting in the buffer.
This section describes the recommended UNIX programming practices, using the low-level C run-time library functions. This elaboration is given here for the sake of completeness, as there is nothing specific to Xillybus about any of these practices.
The code snippets are taken from the demo applications described in Getting started with Xillybus on a Linux host. The device file names in these examples are those of the Xillybus IP core for PCIe / AXI. For XillyUSB, the prefix is xillyusb_00_* instead of xillybus_*.
3.2 Guidelines for reading data
Assuming that the variables have been declared as follows:
int fd, rc;
unsigned char *buf;
The device file is opened with the low-level open() function (the file descriptor is in integer format):
fd = open("/dev/xillybus_ourdevice", O_RDONLY);
if (fd < 0) {
perror("Failed to open devfile");
exit(1);
}
A “Device or resource busy” (errno = EBUSY) error will be issued if the device file is already opened for read by another process (non-exclusive file opening is available on request). If “No such device” (errno = ENODEV) occurs, it’s most likely an attempt to open a write-only stream.
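The two error codes above can be turned into hints for the user. This is only a sketch: the helper’s name, open_stream, and the wording of the hints are assumptions made for this illustration, and are not part of the demo applications.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: opens a device file and prints a hint for the
 * two error codes discussed above. Returns the file descriptor, or -1
 * on failure (errno is preserved for the caller).
 */
int open_stream(const char *path, int flags)
{
  int fd = open(path, flags);
  int err = errno;  /* Saved, because perror() may modify errno */

  if (fd >= 0)
    return fd;

  errno = err;
  perror("Failed to open devfile");

  if (err == EBUSY)
    fprintf(stderr, "Hint: is another process holding %s open?\n", path);
  else if (err == ENODEV)
    fprintf(stderr, "Hint: is %s a stream in the opposite direction?\n", path);

  errno = err;
  return -1;
}
```

A function call such as open_stream("/dev/xillybus_ourdevice", O_RDONLY) can then replace the direct call to open() shown above.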
With the file opened successfully and buf pointing at an allocated buffer in memory, data is read with:
while (1) {
rc = read(fd, buf, numbytes);
numbytes is the maximal number of bytes to read.
The returned value, rc, contains the number of bytes actually read (or a negative value if the function call completed abnormally).
Note that read() always returns immediately if the amount of data that was requested in numbytes is available. Otherwise, it will return after about 10 ms if there is any data available. If no data at all is available, read() sleeps until it can return with data.
The driver checks the availability of data in the sense that the IP core has received that data from the application logic in the FPGA. The mechanism of DMA buffers is transparent to the caller of the function read(), and never delays the delivery of data to the read() function call because a DMA buffer isn’t full, as explained in section A.3.5 of the Appendix.
IMPORTANT:
There is no guarantee that all requested bytes were read from the file, even on a successful return of read(). It’s the caller’s responsibility to make another function call to read(), if the completed amount of data was unsatisfactory.
The function call to read() should be followed by checking its return value as shown below (“continue” and “break” statements assume a while-loop context):
if ((rc < 0) && (errno == EINTR))
continue;
if (rc < 0) {
perror("read() failed");
break;
}
if (rc == 0) {
fprintf(stderr, "Reached read EOF.\n");
break;
}
// do something with "rc" bytes of data
}
The first if-statement checks if read() returned prematurely because the process received a signal from the operating system.
This is not really an error, but a condition that forces the driver to return control to the application immediately. The use of the EINTR error number is just a way to tell the function’s caller that no data was read. The program responds with a “continue” statement, resulting in a renewed attempt to call the function read() with the same parameters.
If there is some data in the buffer when the signal arrives, the driver will return the number of bytes already read in rc. The application will not know that a signal has arrived, and according to UNIX programming convention, it has no reason to care: If the signal requires action (e.g. SIGINT resulting from a CTRL-C on the keyboard), the responsibility for this action lies either with the operating system or with a registered signal handler.
Note that some signals shouldn’t have any effect on the execution flow, so if signals aren’t detected as shown above, the program may suddenly report an error for no apparent reason.
Handling the EINTR scenario is also necessary to allow the process to be stopped (as with CTRL-Z) and resumed properly.
If a real error has occurred, the second if-statement reports a user-readable error message and terminates the loop.
The third if-statement detects if end of file has been reached, which is indicated by a return value of zero. When reading from a Xillybus device file, the only reason for this to happen is that the application logic has raised the stream’s _eof pin (which is part of the IP core’s interface on the FPGA).
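The checks above can be combined into a helper that keeps calling read() until the requested number of bytes has arrived. This is a minimal sketch along the lines of the loop in this section; the name read_fully is an assumption made for this illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reads until "count" bytes have arrived, or until
 * EOF or a real error. Returns the number of bytes actually read (less
 * than "count" only if EOF was reached), or -1 on error.
 */
ssize_t read_fully(int fd, void *buf, size_t count)
{
  unsigned char *p = buf;
  size_t done = 0;

  while (done < count) {
    ssize_t rc = read(fd, p + done, count - done);

    if ((rc < 0) && (errno == EINTR))
      continue;   /* Interrupted by a signal. Try again. */

    if (rc < 0)
      return -1;  /* Real error, errno tells which */

    if (rc == 0)
      break;      /* EOF */

    done += (size_t) rc;
  }

  return (ssize_t) done;
}
```

Note that this helper returns only when all requested data has arrived, so it is suitable when the application has nothing to do with a partial chunk of data.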
3.3 Guidelines for writing data
Assuming that the variables have been declared as follows:
int fd, rc;
unsigned char *buf;
The device file is opened with the low-level open() function (the file descriptor is in integer format):
fd = open("/dev/xillybus_ourdevice", O_WRONLY);
if (fd < 0) {
perror("Failed to open devfile");
exit(1);
}
A “Device or resource busy” (errno = EBUSY) error will be issued if the device file is already opened for write by another process (non-exclusive file opening is available on request). If “No such device” (errno = ENODEV) occurs, it’s most likely an attempt to open a read-only stream.
With the file opened successfully and buf pointing at an allocated buffer in memory, data is written with:
while (1) {
rc = write(fd, buf, numbytes);
numbytes is the maximal number of bytes to be written.
The returned value, rc, contains the number of bytes actually written (or a negative value if the function call completed abnormally).
IMPORTANT:
There is no guarantee that all requested bytes were written to the file, even on a successful return of write(). It’s the caller’s responsibility to make another function call to write(), if the completed amount of data was unsatisfactory.
The function call to write() should be followed by checking its return value as shown below (“continue” and “break” statements assume a while-loop context):
if ((rc < 0) && (errno == EINTR))
continue;
if (rc < 0) {
perror("write() failed");
break;
}
if (rc == 0) {
fprintf(stderr, "Reached write EOF (?!)\n");
break;
}
// "rc" bytes of data were written; continue with the rest, if any
}
The first if-statement checks if write() returned prematurely because the process received a signal from the operating system.
This is not really an error, but a condition that forces the driver to return control to the application immediately. The use of the EINTR error number is just a way to tell the function’s caller that no data was written. The program responds with a “continue” statement, resulting in a renewed attempt to call the function write() with the same parameters.
If some data was written before the signal arrived, the driver will return the number of bytes already written in rc. The application will not know that a signal has arrived, and according to UNIX programming convention, it has no reason to care: If the signal requires action (e.g. SIGINT resulting from a CTRL-C on the keyboard), the responsibility for this action lies either with the operating system or with a registered signal handler.
Note that some signals shouldn’t have any effect on the execution flow, so if signals aren’t detected as shown above, the program may suddenly report an error for no apparent reason.
Handling the EINTR scenario is also necessary to allow the process to be stopped (as with CTRL-Z) and resumed properly.
If a real error has occurred, the second if-statement reports a user-readable error message and terminates the loop.
The third if-statement detects if the end of file has been reached, which is indicated by a return value of zero. When writing to a Xillybus device file, this should never happen.
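As with reading, the checks can be wrapped in a helper that keeps calling write() until all data has been accepted. This is a minimal sketch; the name write_fully is an assumption made for this illustration, and so is the choice to treat a return value of zero as a failure in order to avoid an endless loop.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes until all "count" bytes have been
 * accepted. Returns "count" on success, or -1 on failure.
 */
ssize_t write_fully(int fd, const void *buf, size_t count)
{
  const unsigned char *p = buf;
  size_t done = 0;

  while (done < count) {
    ssize_t rc = write(fd, p + done, count - done);

    if ((rc < 0) && (errno == EINTR))
      continue;   /* Interrupted by a signal. Try again. */

    if (rc < 0)
      return -1;  /* Real error, errno tells which */

    if (rc == 0)
      return -1;  /* Should never happen. Give up rather than spin. */

    done += (size_t) rc;
  }

  return (ssize_t) done;
}
```

Each renewed call to write() starts at the first byte that hasn’t been written yet, so no data is sent twice.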
3.4 Performing flush on asynchronous downstreams
As mentioned in paragraph 2.4, data written to an asynchronous stream on a PCIe / AXI IP core is not necessarily sent immediately to the FPGA, unless a DMA buffer is full (there are several DMA buffers). This behavior improves performance by making sure that the allocated buffer space is utilized. This also improves the efficiency of the packets sent on the PCIe / AXI bus.
As also mentioned already, XillyUSB IP cores send the data virtually right away, even when the stream is asynchronous, as there’s an efficient arrangement for that with the USB interface. A flush is therefore significant with XillyUSB IP cores only when it involves waiting for the transmission to complete.
Streams to the FPGA undergo a flush automatically when the file descriptor is closed. However, this is a best-effort mechanism that can’t be relied upon. The function call to close() is delayed until all data has arrived at the FPGA, in a manner similar to the way write() function calls are delayed on synchronous streams. The significant difference is that close() waits up to one second for the flush to complete. If the flush isn’t completed by then, close() returns anyhow, and issues a warning message in the system log. Note however that in some rare scenarios, the last few words of remaining data may be lost without any warning while closing a file descriptor.
It’s also possible to request a flush of an asynchronous stream explicitly, by calling the function write() with a buffer that has a length of zero, i.e.
while (1) {
rc = write(fd, NULL, 0);
if ((rc < 0) && (errno == EINTR))
continue; // Interrupted. Try again.
if (rc < 0) {
perror("flushing failed");
break;
}
break; // Flush successful
}
Please note the following:
- The manual page for UNIX doesn’t define what a write() function call should do when the count is zero, leaving the choice to each device driver. This method for flushing is specific to Xillybus.
- Unlike close(), a write() as shown above returns immediately, regardless of when the data is consumed on the FPGA.
- Because of this, this kind of write() is pointless with XillyUSB. It has nothing to do, and indeed does nothing: The data is sent virtually immediately anyhow, and the write() function call wouldn’t wait in any case.
- Since no data is read from the buffer, the buffer argument in the write() function call can take any value, including NULL, as demonstrated above.
- Using a higher-level API with a buffer of zero length may not have any effect at all. For example, calling the function fwrite() to write zero bytes may simply return with nothing done, since what this function usually does is add the data to a buffer created by the C run-time library.
- fflush() is irrelevant: It performs a flush of the higher-level buffer, but doesn’t send a flush command to the low-level driver.
- There is no need to perform a flush on streams in the other direction (from FPGA to host), and there’s no way to do so. This is because a flush of such streams is automatically performed when a host’s attempt to read data is about to put the process to sleep (i.e. block).
3.5 select() and nonblocking I/O
Even though not recommended, the Xillybus driver for Linux supports nonblocking calls and the select() function. Note that the driver for Windows doesn’t support anything similar, so using this functionality makes the application harder to port if necessary. The recommended way to handle multiple sources is with multiple threads (and preferably RAM FIFOs) as demonstrated in the fifo.c example program, discussed in paragraph 4.4.
Function calls to select(), pselect() and poll() can be used as with any UNIX file descriptor, for read and write alike.
The nonblocking calls and select() features are not enabled in Xillybus IP cores that have been set up as “Windows only” in the IP Core Factory.
For the sake of completeness, we shall revisit the code outline for reading data in paragraph 3.2, using nonblocking reads. This code merely demonstrates the conventional method for any nonblocking read from a file in UNIX.
The file is opened with the O_NONBLOCK flag:
fd = open("/dev/xillybus_ourdevice", O_RDONLY | O_NONBLOCK);
if (fd < 0) {
perror("Failed to open devfile");
exit(1);
}
There is no difference in how the file is read, the arguments or the meaning of the return value:
while (1) {
rc = read(fd, buf, numbytes);
But there is now another check on the return values: If rc is negative and EAGAIN is given as the error code, this means there was nothing to read. More precisely, there is no data in the driver’s buffers, and the FIFO in the FPGA is empty.
if ((rc < 0) && (errno == EINTR))
continue;
if ((rc < 0) && (errno == EAGAIN)) {
// do something else
continue;
}
if (rc < 0) {
perror("read() failed");
break;
}
if (rc == 0) {
fprintf(stderr, "Reached read EOF.\n");
break;
}
// do something with "rc" bytes of data
}
Note that the code above doesn’t make sense unless something meaningful is done when the function call returns with an EAGAIN. Otherwise it just wastes CPU time by spinning in the while loop, instead of sleeping when there is no data to read.
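One conventional way to avoid such spinning is to let poll() put the process to sleep until the file descriptor becomes readable, or until a timeout expires. The sketch below shows this for a single file descriptor; the function name read_or_timeout and the use of -2 as the return value for a timeout are assumptions made for this illustration. The same pattern extends to several file descriptors by passing a larger array to poll().

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: waits up to timeout_ms milliseconds for data,
 * and then reads it. Returns the number of bytes read (> 0), 0 on EOF,
 * -1 on a real error, or -2 if nothing arrived before the timeout.
 * Note that the timeout starts over if a signal interrupts the wait.
 */
ssize_t read_or_timeout(int fd, void *buf, size_t count, int timeout_ms)
{
  struct pollfd pfd;
  ssize_t rc;
  int ready;

  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;

  while (1) {
    ready = poll(&pfd, 1, timeout_ms);  /* Sleeps instead of spinning */

    if ((ready < 0) && (errno == EINTR))
      continue;   /* Interrupted by a signal. Try again. */

    if (ready < 0)
      return -1;  /* poll() failed */

    if (ready == 0)
      return -2;  /* Timeout: the caller can do something else */

    rc = read(fd, buf, count);

    if ((rc < 0) && ((errno == EINTR) || (errno == EAGAIN)))
      continue;   /* Nothing was read after all. Wait again. */

    return rc;    /* Data, EOF (0) or a real error (-1) */
  }
}
```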
For nonblocking writing, make the respective changes in the example in paragraph 3.3.
3.6 Monitoring the amount of data in driver’s buffers
This topic is discussed in Xillybus FPGA designer’s guide, in the section named “Monitoring the amount of buffered data”.
3.7 XillyUSB: The need to monitor the quality of the physical data link
Unlike PCIe, the physical data link that is used with USB 3.0 has been observed to generate bit errors. This is uncommon, and indicates a problem with one of the involved components, most likely the host’s USB port or the cable.
The USB protocol provides a variety of mechanisms for overcoming bit errors when they occur. However, the random nature of these errors puts the link protocol in states that are rarely reached. As a result, this may reveal bugs in the host’s USB controller. Such bugs, to the extent that they exist, are normally hidden, and cause a variety of weird behaviors.
Hence if the physical data link suffers from frequent bit errors, there’s a significant risk that the USB connection will become stuck, spontaneously disconnected, or in rare cases, even cause errors in the application data.
XillyUSB provides a means for monitoring the health of the physical data link, by virtue of a dedicated device file, /dev/xillyusb_NN_diagnostics. The showdiagnostics utility (explained on this web page) exposes the information collected on this matter.
It’s highly recommended that applications based upon XillyUSB continuously monitor the first five counters that are displayed by the showdiagnostics utility (relating to bad packets, errors detected and Recovery requests), and ensure that they don’t increase. If they do so, and in particular if they increase repeatedly, the application software should suggest corrective actions, possibly one of the following:
- Disconnect and reconnect the USB plug to another port. This may help, because some motherboards have different ports connected to different brands of USB host controllers (usually to support later versions of the USB 3.x protocol).
- Disconnect and reconnect the USB plug on the same port. This might help if the analog signal equalizer (which cancels attenuations and reflections caused by the physical signal path) ends up in a suboptimal state.
- Attempt using a different USB cable.
It’s quite likely that an application continues to work flawlessly even in the presence of bit errors. The suggestion for corrective actions is therefore best done while taking into account that the user probably doesn’t experience any visible problem.
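The monitoring logic itself is simple: keep the previous values of the five counters, and compare them with the current ones from time to time. The sketch below covers only this comparison. How the counters are obtained from the diagnostics device file is not shown here, and the 32-bit counter type, the function name and the macro are assumptions made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_LINK_COUNTERS 5  /* The first five counters, as recommended */

/*
 * Hypothetical helper: compares two snapshots of the link counters.
 * Returns the number of counters that have increased since "prev" was
 * taken, and then copies "cur" into "prev" for the next comparison.
 * The counter width (32 bits) is an assumption.
 */
int link_counters_increased(uint32_t prev[NUM_LINK_COUNTERS],
                            const uint32_t cur[NUM_LINK_COUNTERS])
{
  int increased = 0;
  size_t i;

  for (i = 0; i < NUM_LINK_COUNTERS; i++) {
    if (cur[i] > prev[i])
      increased++;

    prev[i] = cur[i];  /* Becomes the reference for the next call */
  }

  return increased;
}
```

An application may choose to suggest corrective actions only after several consecutive calls have returned a nonzero value, since repeated increases are the more significant indication.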
The showdiagnostics.pl utility is a Perl script which can be used as reference code. Alternatively, the diagnostic utility for Windows, which is given as C source code, can serve as a reference.
Note that none of these problems is specific to XillyUSB. Rather, these issues are as likely to affect any USB 3.0 device; the difference is that XillyUSB offers means to detect them. Also, it’s worth reiterating that PCIe links are not known to suffer from any similar issues, most likely due to the better controlled physical connections and signal routings.
