Introduction

This is a walkthrough of the initial events as they typically appear on a USB 3.0 PHY when a USB device is attached to a host. The different link establishment stages are shown, with the typical signals and data detailed: the LFPS pulses at the very beginning, and later on the data on the 5 GT/s MGT (Multi-Gigabit Transceiver) link stream until the beginning of packet transmission (with a brief glimpse of link packets).

The USB 3.0 specification defines these stages as Polling substates. However, as the spec details all possible corner cases, it may be a bit difficult to envision what a typical bringup really looks like.

Therefore it's probably a good idea to first look at this page for a better understanding of the involved topics.

The data samples shown below were captured on the host-to-device link, but they apply to upstream and downstream ports alike, as the handshake is symmetric during the initial stages of a typical link bringup.

The 10 Gb/s link introduced with USB 3.1 (SuperSpeedPlus) is quite different in terms of the signaling and data sequences (it uses 128b/132b encoding rather than 8b/10b, among other things), and is not covered here.

How data is displayed here

For simplicity, data is shown in 8-bit hex dump format. As 8b/10b encoding involves K symbols, these are shown in their 8-bit hex representation in bold font. The 8b/10b symbol encodings are listed in Appendix A of the USB spec (and in several other documents available).

In particular, note that K28.5 (Comma) appears as 0xbc in bold.

Also for simplicity, the different patterns are shown in hex dumps starting at offset 0. This alignment is artificial -- on a real data link there is no guarantee of 16-bit or 32-bit alignment, or of any alignment at all. The granularity of the USB 3.0 stream is a single link symbol.

Also omitted are SKP ordered sets, which consist of two consecutive K28.1 symbols (decoded as 0x3c). These can be inserted by the transmitting side in certain places (see section 6.4.3.1 in the spec) or virtually anywhere by the receiver's PHY to maintain its elasticity buffer. The examples shown below are free of SKP symbols.
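
As an illustration of what "free of SKP symbols" means in practice, here is a minimal sketch of how a capture tool might filter them out. The representation of a symbol as a (value, is_k) pair and the function name are assumptions made for this example only; they are not taken from any particular PHY interface.

```python
# Assumed representation: a received symbol is a (value, is_k) pair,
# where is_k is True for K symbols.
SKP = (0x3C, True)  # K28.1

def strip_skp(symbols):
    """Return the symbol list with every SKP (K28.1) symbol removed."""
    return [s for s in symbols if s != SKP]

# A COMMA, one SKP ordered set (two K28.1), and two data symbols.
stream = [(0xBC, True), SKP, SKP, (0xFF, False), (0x3C, False)]
clean = strip_skp(stream)
print(clean)
```

Note that the last symbol, a data byte that happens to be 0x3c, is kept: only the K flag distinguishes it from K28.1.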

Stage I: Polling LFPS pulses

These pulses are the first thing seen after the physical connection of a USB 3.0 plug. On the wires, they are preceded by receiver detection pulses, which are visible on an oscilloscope but produce no output from the PHY's receiver interface.

It's worth noting that no reset signal is issued, in particular no LFPS reset. Those familiar with USB 1.1 / 2.0 might expect one or several resets, but this is not the case with USB 3.0.

The Polling LFPS bursts are detected by the PHY's Electrical Idle indication being deasserted for about 1 us each time, with about 9 us between bursts (in other words, a period of 10 us with a duty cycle of 10%).

Once both sides have seen the other side's Polling LFPS pulses, they stop generating those, and turn on their MGT transmitters. According to the spec, each side should transmit at least 16 such pulses, but sometimes fewer are detected. Transmitting fewer than the 16 required pulses is a breach of the spec, but a minor one in practice, since as few as two pulses are enough for reception on the other side.
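
The timing figures above can be sketched in a few lines. This is only a rough illustration: the constant names, the helper function and in particular the 40% tolerance are assumptions made for this example, not values taken from the spec's LFPS timing table.

```python
BURST_US = 1.0    # nominal burst length, as described above
GAP_US = 9.0      # nominal silence between bursts

PERIOD_US = BURST_US + GAP_US
DUTY_CYCLE = BURST_US / PERIOD_US

def looks_like_polling_lfps(burst_us, period_us, tolerance=0.4):
    """Crude check of one measured burst against the nominal timing.

    The 40% tolerance is an arbitrary choice for this sketch, not a spec value.
    """
    return (abs(burst_us - BURST_US) <= tolerance * BURST_US
            and abs(period_us - PERIOD_US) <= tolerance * PERIOD_US)

print(PERIOD_US, DUTY_CYCLE)
print(looks_like_polling_lfps(1.1, 9.5), looks_like_polling_lfps(100.0, 200.0))
```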

Stage II: TSEQ ordered sets

The first thing transmitted on the MGT link is a sequence of TSEQ ordered sets. Following the USB spec's section 6.4.1.1.3, the TSEQ ordered set is transmitted 65536 times (three sets are shown below).

00000000  bc ff 17 c0 14 b2 e7 02  82 72 6e 28 a6 be 6d bf  |.........rn(..m.|
00000010  4a 4a 4a 4a 4a 4a 4a 4a  4a 4a 4a 4a 4a 4a 4a 4a  |JJJJJJJJJJJJJJJJ|
00000020  bc ff 17 c0 14 b2 e7 02  82 72 6e 28 a6 be 6d bf  |.........rn(..m.|
00000030  4a 4a 4a 4a 4a 4a 4a 4a  4a 4a 4a 4a 4a 4a 4a 4a  |JJJJJJJJJJJJJJJJ|
00000040  bc ff 17 c0 14 b2 e7 02  82 72 6e 28 a6 be 6d bf  |.........rn(..m.|
00000050  4a 4a 4a 4a 4a 4a 4a 4a  4a 4a 4a 4a 4a 4a 4a 4a  |JJJJJJJJJJJJJJJJ|
[ ... ]

As each set is 32 symbols long, this sequence is 2097152 (2^21) symbols long, and lasts about 4.2 ms (at 2 ns per symbol). The number of symbols received is usually slightly smaller, as some symbols are lost until the receiver's MGT is locked.
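
The arithmetic behind these figures, spelled out as a short sketch (the variable names are made up for this example):

```python
SYMBOL_RATE = 5e9 / 10      # 5 Gbit/s on the wire, 10 bits per 8b/10b symbol
TSEQ_SETS = 65536
SYMBOLS_PER_TSEQ = 32

total_symbols = TSEQ_SETS * SYMBOLS_PER_TSEQ
duration_ms = total_symbols / SYMBOL_RATE * 1e3

print(total_symbols, round(duration_ms, 3))
```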

Note that symbols 1-15 in the TSEQ ordered set are the scrambler's first symbols after its reset. Compare with the last table in Appendix B in the USB 3.x spec.
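
To make this comparison concrete, below is a minimal sketch of the scrambler's LFSR (polynomial x^16 + x^5 + x^4 + x^3 + 1, reset value 0xffff). It is a plain bit-serial rewrite for illustration, not the C program from the spec's Appendix B, and the function name is made up for this example.

```python
def scrambler_bytes(count, seed=0xFFFF):
    """Return the first `count` output bytes of the scrambler after a reset."""
    lfsr = seed
    out = []
    for _ in range(count):
        byte = 0
        for bit in range(8):
            msb = (lfsr >> 15) & 1
            byte |= msb << bit              # output bits are produced LSB first
            lfsr = (lfsr << 1) & 0xFFFF
            if msb:
                lfsr ^= 0x0039              # feedback into bits 0, 3, 4 and 5
        out.append(byte)
    return bytes(out)

# Symbols 1-15 of the TSEQ ordered set, copied from the dump above.
tseq_symbols_1_to_15 = bytes.fromhex("ff17c014b2e70282726e28a6be6dbf")
print(scrambler_bytes(15) == tseq_symbols_1_to_15)
```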

Stage III: TS1 ordered sets

After the TSEQ sequence, TS1 ordered sets are transmitted. There is no condition for switching from TSEQ to TS1 -- it's just that the required number of TSEQ ordered sets have been sent.

The TS1 ordered sets are 16 symbols each (four sets shown below). All symbols are fixed except the sixth, which is the Link Functionality field, marked red below. The meaning of this field is given in Table 6-6 of the spec. A Hot Reset is signaled by setting bit 0 of this symbol to '1'. The other bits are used to request loopback, turning the scrambler off, etc., and are rarely used outside the testing lab.

The 0x4a symbols identify this ordered set as a TS1 (and not a TS2).

00000000  bc bc bc bc 00 00 4a 4a  4a 4a 4a 4a 4a 4a 4a 4a  |......JJJJJJJJJJ|
00000010  bc bc bc bc 00 00 4a 4a  4a 4a 4a 4a 4a 4a 4a 4a  |......JJJJJJJJJJ|
00000020  bc bc bc bc 00 00 4a 4a  4a 4a 4a 4a 4a 4a 4a 4a  |......JJJJJJJJJJ|
00000030  bc bc bc bc 00 00 4a 4a  4a 4a 4a 4a 4a 4a 4a 4a  |......JJJJJJJJJJ|
[ ... ]

Stage IV: TS2 ordered sets

The TS2 ordered set is sent to acknowledge the reception of eight consecutive and identical TS1 or TS2 ordered sets from the link partner. The repeated 0x45 symbol indicates it's a TS2.

00000000  bc bc bc bc 00 00 45 45  45 45 45 45 45 45 45 45  |......EEEEEEEEEE|
00000010  bc bc bc bc 00 00 45 45  45 45 45 45 45 45 45 45  |......EEEEEEEEEE|
00000020  bc bc bc bc 00 00 45 45  45 45 45 45 45 45 45 45  |......EEEEEEEEEE|
00000030  bc bc bc bc 00 00 45 45  45 45 45 45 45 45 45 45  |......EEEEEEEEEE|
[ ... ]
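
The TS1 and TS2 layouts shown above are simple enough to tell apart with a few comparisons. The following is only a sketch: it works on plain byte values and therefore ignores the K flag of the COMMA symbols, and the function name and return format are assumptions made for this example.

```python
COMMA = 0xBC                   # K28.5 (the K flag is not checked in this sketch)
TS1_ID, TS2_ID = 0x4A, 0x45    # repeated in symbols 6-15

def classify_ts(symbols):
    """Return (kind, link_functionality) for a 16-symbol TS1/TS2, else None."""
    if len(symbols) != 16 or bytes(symbols[:4]) != bytes([COMMA] * 4):
        return None
    ident = symbols[6]
    if any(s != ident for s in symbols[6:]):
        return None
    kind = {TS1_ID: "TS1", TS2_ID: "TS2"}.get(ident)
    if kind is None:
        return None
    return kind, symbols[5]    # the sixth symbol is the Link Functionality field

ts1 = bytes.fromhex("bcbcbcbc00004a4a4a4a4a4a4a4a4a4a")
ts2 = bytes.fromhex("bcbcbcbc00004545454545454545" + "4545")
print(classify_ts(ts1), classify_ts(ts2))
```

With the returned Link Functionality value, a Hot Reset request can be tested as `value & 1`.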

Stage V: Idle handshake

This is the final stage of the physical link bringup: No more ordered sets, but the data channel itself, with (scrambled) idle symbols (0x00) to begin with.

The condition for moving to this stage is the reception of eight consecutive and identical TS2 ordered sets, having transmitted TS2 while those were received, and eight more TS2s after that. In other words, the TS1 handshake stage is surely behind us, and TS2 transmission has been up and running long enough for both sides to move on.

The idle stage is shown below starting at offset 0x10, after a single TS2 ordered set. The data bytes immediately after the TS2 ordered set are always as shown: Since the idle (0x00) symbol is transmitted at this stage, the data consists of the scrambler's output, beginning from its 13th octet after a reset. For reference, the scrambler's first outputs after reset are given in the last table of the spec's Appendix B (along with a C program implementing it). It's the same scrambler used for PCIe, DisplayPort and several other standards.

The reason the scrambler's 13th octet is the first visible, is that section 6.3.1.3 of the spec requires that the scrambler should advance on each symbol (except for SKP) and reset on each COMMA symbol. Consequently, the scrambler is reset by each of the four COMMA symbols at the beginning of the TS2 ordered set, and then advances on the 12 remaining symbols of it. When switching the ordered set off after the last symbol of TS2, we have the scrambler at the 13th position after reset.
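
The reasoning above can be replayed with a small transmitter model. This is a sketch under stated assumptions: symbols are (value, is_k) pairs, the `training` flag stands for "inside an ordered set, so the data is not scrambled", and the class and method names are invented for this example.

```python
COMMA = (0xBC, True)    # K28.5
SKP = (0x3C, True)      # K28.1

class Scrambler:
    def __init__(self):
        self.lfsr = 0xFFFF

    def next_byte(self):
        """Advance the LFSR by 8 bits and return the output byte."""
        byte = 0
        for bit in range(8):
            msb = (self.lfsr >> 15) & 1
            byte |= msb << bit
            self.lfsr = ((self.lfsr << 1) & 0xFFFF) ^ (0x0039 if msb else 0)
        return byte

    def transmit(self, symbol, training=False):
        """Return the byte value put on the link for a (value, is_k) symbol."""
        value, is_k = symbol
        if symbol == COMMA:
            self.lfsr = 0xFFFF          # reset on each COMMA
            return value
        if symbol == SKP:
            return value                # SKP does not advance the scrambler
        mask = self.next_byte()         # every other symbol advances it
        if is_k or training:
            return value                # K symbols and ordered sets go out as is
        return value ^ mask

tx = Scrambler()
ts2 = [COMMA] * 4 + [(0x00, False)] * 2 + [(0x45, False)] * 10
sent = [tx.transmit(s, training=True) for s in ts2]
sent += [tx.transmit((0x00, False)) for _ in range(4)]   # four idle symbols
print(bytes(sent).hex(" "))
```

The last four bytes printed are the idle symbols, which should match the beginning of offset 0x10 in the dump below.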

00000000  bc bc bc bc 00 00 45 45  45 45 45 45 45 45 45 45  |......EEEEEEEEEE|
00000010  be 6d bf 8d be 40 a7 e6  2c d3 e2 b2 07 02 77 2a  |.m...@..,.....w*|
00000020  cd 34 be e0 a7 5d 24 b1  9b a1 bd 22 d4 45 1d d3  |.4...]$....".E..|
00000030  d7 ea 76 ee 2c da 1a fa  28 2d 36 3b 3a 0e 6f 67  |..v.,...(-6;:.og|
[ ... ]

An interesting point is that the scrambler is never reset again: There are no COMMA symbols in the following states, unless the link is retrained (e.g. brought into a lower power state and back into Recovery, or just into Recovery because the link didn't function properly). So the last COMMA in the last TS2 is the one that determines the scrambler's state as long as the link remains established.

In the example above, all bytes shown are idle symbols. The spec requires each side to transmit idle symbols (i.e. the scrambler's output) until it has received eight consecutive idle symbols, and then continue to send another eight idle symbols. In effect, this ensures that the receiving and transmitting scramblers are in sync. It will usually take more than 16 symbols, as there are usually some buffers on an MGT link transmission path. The timeout for completing this stage is 2 ms, which is 1 Msymbols (at 2 ns per symbol), but it usually ends far more quickly.

Link is up (the U0 state)

After typically less than a microsecond, link layer packets will begin to appear. Packets start (and sometimes end) with K symbol sequences, consisting of symbols listed in table 6-2 in the spec:

  • SHP = K27.7, coded as 0xfb with K flag set
  • SDP = K28.2, coded as 0x5c with K flag set
  • END = K29.7, coded as 0xfd with K flag set
  • EPF = K23.7, coded as 0xf7 with K flag set
  • SLC = K30.7, coded as 0xfe with K flag set

Spotting these packets is easy. For example, the header packet begins with four K-symbols: SHP, SHP, SHP and EPF (that is, hex codes fb fb fb f7), and is followed by 16 non-K symbols. When data follows that packet, it follows the header packet immediately, and begins with SDP, SDP, SDP and EPF (5c 5c 5c f7) and ends with END, END, END and EPF (fd fd fd f7).

There are also Link Commands, which appear as SLC, SLC, SLC, EPF (fe fe fe f7) followed by four non-K symbols.
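
A minimal sketch of how these framing sequences might be located in a captured symbol stream. It assumes that each symbol is available as a (value, is_k) pair; the table, labels and function name are made up for this example.

```python
def _framing(first):
    """Three identical K symbols followed by EPF (K23.7)."""
    return ((first, True),) * 3 + ((0xF7, True),)

FRAMING = {
    _framing(0xFB): "header packet",          # SHP SHP SHP EPF
    _framing(0x5C): "data payload start",     # SDP SDP SDP EPF
    _framing(0xFD): "data payload end",       # END END END EPF
    _framing(0xFE): "link command",           # SLC SLC SLC EPF
}

def find_framing(symbols):
    """Return a list of (index, label) for each framing sequence found."""
    hits = []
    for i in range(len(symbols) - 3):
        label = FRAMING.get(tuple(symbols[i:i + 4]))
        if label:
            hits.append((i, label))
    return hits

# Two scrambled idle bytes, a link command and its four data symbols.
stream = [(0x8E, False), (0x5C, False),
          (0xFE, True), (0xFE, True), (0xFE, True), (0xF7, True),
          (0x2A, False), (0xC4, False), (0x0C, False), (0x56, False)]
print(find_framing(stream))
```

The 0x5c data byte in this stream is not mistaken for SDP, because its K flag is not set. This is why the K flag must be carried along with each symbol.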

K-symbols are not scrambled, but the non-K (data) symbols are, so while it's easy to find the packets' boundaries, decoding their content requires descrambling, which is essentially XORing with the scrambler's output.

For example, a header packet:

00000000  fb fb fb f7 14 60 fb 63  a0 8a 0c ff e6 78 b3 bb  |.....`.c.....x..|
00000010  0e 62 08 2e

The content of this packet can't be obtained without knowing the scrambler's state.

And this is an example of four link commands, followed by several idle symbols (one can tell they're idle because there's no packet header):

00000000  fe fe fe f7 f6 0f 94 75  fe fe fe f7 f8 8e e3 ef  |.......u........|
00000010  fe fe fe f7 c2 34 ec 62  fe fe fe f7 e9 77 c9 d8  |.....4.b.....w..|
00000020  05 e5 dd 68 0d 78 4c 53  8b d6 86 57 b2 aa 1a 80  |...h.xLS...W....|

Once again, the content of these link commands can't be obtained without descrambling.

From TS2 to decoding a packet

Taking a bit broader look, let's go back to the last TS2 ordered set, through the idle handshake, and the first packets transmitted:

00000000  bc bc bc bc 00 00 45 45  45 45 45 45 45 45 45 45  |......EEEEEEEEEE|
00000010  be 6d bf 8d be 40 a7 e6  2c d3 e2 b2 07 02 77 2a  |.m...@..,.....w*|
00000020  cd 34 be e0 a7 5d 24 b1  9b a1 bd 22 d4 45 1d d3  |.4...]$....".E..|
00000030  d7 ea 76 ee 2c da 1a fa  28 2d 36 3b 3a 0e 6f 67  |..v.,...(-6;:.og|
00000040  cf 06 4c 26 d3 e9 3a cd  27 76 30 fc 94 8b 03 de  |..L&..:.'v0.....|
00000050  d3 06 52 f6 4f 88 80 95  c4 6a 66 f2 9f 0c a1 35  |..R.O....jf....5|
00000060  e2 41 cf 27 74 40 7e 9e  a5 58 fe 84 09 60 08 a9  |.A.'t@~..X...`..|
00000070  f1 0b 6f 62 17 43 5c ed  48 39 3f d4 5a f5 0e b3  |..ob.C\.H9?.Z...|
00000080  c7 03 9d 9b 8b 0d 8e 5c  fe fe fe f7 2a c4 0c 56  |.......\....*..V|
00000090  da 0b 42 7a 7c d1 cf a8  1c 12 ee 41 c2 3f 38 7a  |..Bz|......A.?8z|
000000a0  0d 69 f4 01 da 31 72 c5  a0 d7 93 0e dc af a4 55  |.i...1r........U|
000000b0  e7 f0 72 16 68 d5 38 84  dd 00 cd 18 9e ca 30 59  |..r.h.8.......0Y|
000000c0  4c 75 1b 77 31 c5 ed cf  91 64 6e 3d fe e8 29 04  |Lu.w1....dn=..).|
000000d0  cf 6c fc c4 0b 5e da 62  ba 5b ab df 59 b7 7d 37  |.l...^.b.[..Y.}7|
000000e0  5e e3 1a c6 88 14 f5 4f  8b c8 56 cb d3 10 42 63  |^......O..V...Bc|
000000f0  04 8a b4 f7 84 01 a0 01  83 49 67 ee fe fe fe f7  |.........Ig.....|
00000100  f6 0f 94 75 fe fe fe f7  f8 8e e3 ef fe fe fe f7  |...u............|
00000110  c2 34 ec 62 fe fe fe f7  e9 77 c9 d8 05 e5 dd 68  |.4.b.....w.....h|

Packet transmissions are marked in red above. In this example, the first packet appears 120 symbols after the end of the last TS2 ordered set. Note that even though all packets start on a 32-bit word aligned position, this should be considered a coincidence (albeit a frequent one).

Now let's decode the first packet, which is a link command. Its four data bytes, 2a c4 0c 56, begin at position 0x8c of the hex dump above. Recall that the scrambler was reset by the COMMA symbols of the last TS2 ordered set, so its first output byte corresponds to offset 0x4 of the dump. The relevant scrambler offset is hence 0x8c - 0x04 = 0x88. Looking at the initial scrambler outputs in the USB spec's Appendix B, the scrambler produces 2d ac 0b 3e at this offset.

The packet's content is the XOR of the data bytes above and the scrambler's output, i.e. { 2a c4 0c 56 } XOR { 2d ac 0b 3e } = { 07 68 07 68 }. Link commands consist of one 16-bit word repeated twice, so it's no surprise that we got this repeated sequence. The 16-bit word, 0x6807 (the least significant byte arrives first), consists of a payload in bits 10:0 and a CRC5 in bits 15:11. The payload is hence 0x6807 AND 0x7ff = 0x007. According to Table 7-4 in the USB 3.0 spec, this is an LGOOD_7, which at this point of the bringup is a Header Sequence Number Advertisement (see section 7.2.4.1.1 in the spec).
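
The decoding steps above can be reproduced with a short script. This is a sketch for illustration: the LFSR is a bit-serial rewrite of the scrambler (polynomial x^16 + x^5 + x^4 + x^3 + 1, reset to 0xffff) rather than the spec's C program, and all names are made up for this example. The CRC5 field is extracted but not verified.

```python
def scrambler_bytes(count, seed=0xFFFF):
    """Return the first `count` output bytes of the scrambler after a reset."""
    lfsr, out = seed, []
    for _ in range(count):
        byte = 0
        for bit in range(8):
            msb = (lfsr >> 15) & 1
            byte |= msb << bit
            lfsr = ((lfsr << 1) & 0xFFFF) ^ (0x0039 if msb else 0)
        out.append(byte)
    return bytes(out)

captured = bytes.fromhex("2ac40c56")    # the link command's data bytes
offset = 0x8C - 0x04                    # scrambler offset, as computed above
mask = scrambler_bytes(offset + 4)[offset:]
plain = bytes(c ^ m for c, m in zip(captured, mask))

word0 = int.from_bytes(plain[0:2], "little")
word1 = int.from_bytes(plain[2:4], "little")
payload = word0 & 0x7FF                 # bits 10:0
crc5 = word0 >> 11                      # bits 15:11 (not checked here)

print(mask.hex(" "), plain.hex(" "), hex(word0), hex(payload))
```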

Lane polarity reversal

The USB 3.x spec requires the receiver to be tolerant to lane polarity reversal, i.e. to the P and N differential wires being swapped. If this happens, all K codes in use are received without change (the 8b/10b code makes them appear to be the same), but D10.2 (0x4a) arrives as D21.5 (0xb5), and D5.2 (0x45) arrives as D26.5 (0xba).

As the TSEQ, TS1 and TS2 ordered sets contain repeated 0x4a / 0x45 symbols, it's easy to spot their polarity reversed counterparts, in particular as the comma (K28.5) arrives intact.
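
The following sketch shows why these particular symbols swap: the 10-bit line codes of each pair are bitwise complements of each other. It also includes a simple check for the reversed pattern in a decoded TS1 / TS2 ordered set. The code table is limited to the four symbols involved, and the function names are assumptions made for this example.

```python
# 10-bit line codes written as "abcdei" + "fghj" bit strings. These four
# codes have neutral disparity, so there is only one form of each.
CODES = {
    "D10.2": "010101" + "0101",   # 0x4a
    "D21.5": "101010" + "1010",   # 0xb5
    "D5.2":  "101001" + "0101",   # 0x45
    "D26.5": "010110" + "1010",   # 0xba
}

def invert(code):
    """Swap the P and N wires: every bit on the line is complemented."""
    return "".join("1" if b == "0" else "0" for b in code)

def polarity_reversed(ordered_set):
    """True if symbols 6-15 of a decoded TS1/TS2 are all 0xb5 or all 0xba."""
    ident = set(ordered_set[6:16])
    return ident in ({0xB5}, {0xBA})

print(invert(CODES["D10.2"]) == CODES["D21.5"],
      invert(CODES["D5.2"]) == CODES["D26.5"])
print(polarity_reversed(bytes.fromhex("bcbcbcbc0000" + "b5" * 10)))
```

Only symbols 6-15 are examined by `polarity_reversed`; a receiver that finds the reversed pattern would typically invert the incoming bits in the PHY rather than translate decoded bytes.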

Conclusion

Even though the USB 3.0 spec is not the simplest to read because of its detailed nature, the link establishment sequence it describes is quite simple to follow. Having said that, there is no substitute for knowing the fine details when implementing a USB 3.0 component, as proper handling of the unexpected cases is mandatory for reliable operation.