Friday 12 February 2021

How to fix intermittent errors when booting from a USB drive

Around 1998 we were introduced to USB 1.0. The raw data bit rate could reach an amazing 12Mbits/s (approx 1MByte/s in real life). Then, at the start of the new Millennium, we got USB 2, which gave us up to 60MBytes/s (much faster than slow-spinning CDs!).

Later still (after 2008) we got USB 3, 3.1 and 3.2 with up to 2.4GBytes/s, and very recently we have USB 4 (based on Thunderbolt 3) with promised speeds of up to a staggering 40Gbits/s (about 5GBytes/s).

The names which were given to these different technologies (and which seem to have been randomly assigned with little forethought) are:

  • USB 1 - Low Speed
  • USB 1 - Full Speed
  • USB 2 - High Speed
  • USB 3 - Super Speed
  • USB 3.1 Gen 1 Super Speed
  • USB 3.1 Gen 2 Super Speed+
  • USB 3.2 Gen 1x1
  • USB 3.2 Gen 2x1
  • USB 3.2 Gen 2x2
  • USB 4 Gen 2x2
  • USB 4 Gen 3x2 - USB4 routing for tunnelling of USB3.x, DisplayPort 1.4a and PCI Express traffic and host-to-host transfers, based on the Thunderbolt 3 protocol

So what started out as a 1MByte/s wired protocol has ended up as a 40Gbits/s (roughly 5GBytes/s) Input/Output transfer protocol zipping along at radio frequencies!
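To put the headline figures on a common footing, here is a minimal sketch that converts each raw signalling rate from bits to bytes and, for USB 3.x, also allows for the line-coding overhead (8b/10b on 5 Gbit/s lanes, 128b/132b on 10 Gbit/s lanes). The table and the helper name `payload_mbytes_per_s` are illustrative only and not part of any USB tool. USB 1, USB 2 and USB4 are shown as raw figures, and real-life throughput will be lower again because of protocol overhead.

```python
# Rough conversion of USB signalling rates into byte rates.
# Each entry: name -> (raw rate in Mbit/s, payload bits, coded bits).
# A 1/1 ratio means "raw figure only" (a simplification, see the text above).
USB_RATES = {
    "USB 1 Full Speed": (12, 1, 1),
    "USB 2 High Speed": (480, 1, 1),
    "USB 3.2 Gen 1x1": (5_000, 8, 10),
    "USB 3.2 Gen 2x1": (10_000, 128, 132),
    "USB 3.2 Gen 2x2": (20_000, 128, 132),
    "USB4 Gen 3x2": (40_000, 1, 1),
}


def payload_mbytes_per_s(raw_mbit, payload_bits, coded_bits):
    """Scale the raw bit rate by the line-code ratio, then divide by 8 bits per byte."""
    return raw_mbit * payload_bits / coded_bits / 8


for name, (raw, payload, coded) in USB_RATES.items():
    rate = payload_mbytes_per_s(raw, payload, coded)
    print(f"{name:18} {raw:>6} Mbit/s  ~{rate:7.1f} MByte/s")
```

The Gen 2x2 line works out at roughly 2.4GBytes/s, which is where the USB 3.2 figure quoted earlier comes from.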

Now when considering booting a PC or Notebook from a USB drive, the BIOS/Firmware driver is responsible for USB I/O communication. Most computers which have USB 3 ports have both USB 3 and USB 2 drivers embedded in the firmware. However, these BIOS drivers will probably not be as fully featured as full-blown Windows or Linux drivers. Hence, they may not fully implement the Super Speed protocols, and error handling may be limited.

The transfer protocols used by the driver will determine how fast I/O access is and thus how fast we can boot from the USB drive. Moving large blocks of data using a high-speed USB block-transfer protocol under Windows can give us those high speeds, but the BIOS firmware is probably using a more primitive and slower protocol to access the USB drive.

Another point to keep in mind is what BIOS code is used. If we Legacy\MBR-boot from the USB drive it will use the real-mode IBM-compatible BIOS USB driver, but if we UEFI-boot then a completely different UEFI USB driver will be used by the BIOS.

Because the BIOS\Firmware knows what USB chipset is present, it is possible for the BIOS developers to make sure that the USB speeds are optimized. However, most BIOS developers just use generic USB 2/3 drivers to add into their Firmware whilst others may spend some time and effort tweaking the USB BIOS code (e.g. Apple).

Gotchas

USB 2 uses a completely different set of 'contacts' from USB 3. 

USB 3+ uses the five-contact red set shown below whereas USB 1&2 uses only the four-contact blue set. When you plug a USB 3 device into a USB 3 port, typically both sets are connected, but the firmware/driver will detect that USB 3 is available and use the USB 3 pins and USB 3 protocols. If you connect a USB 2 device to a USB 3 port (or vice versa), then it will use the USB 2 connections.



The USB 3 cable and connector specification is very specific and goes into great detail because there are lots of ways to 'get it wrong'! Even when directly plugging a USB 3 Flash drive into a computer's USB 3 port, the designers need to pay special attention to signal integrity and ensure that there are no ringing or ground issues, etc.
 

1. Physical issues

Metal shield
Most USB devices and computers use a metal shield or shroud around the contacts. This not only physically protects the connector and provides EMI/RFI shielding and robustness, but it also ensures that a high degree of physical pressure can be exerted so that the two sets of spring contacts are pushed together firmly and thus make good connections.

You should be aware that some USB sockets used in laptops, etc. do not have this metal shield. Also, some USB devices use a thin plastic shroud instead of a metal one. These types of connectors will cause problems and I highly recommend that you never buy any computer or device unless it has good quality USB connectors with sturdy metal shields (I know this from bitter experience!). The USB connector specification has tolerances of better than 0.05mm - there is no way a flimsy plastic USB connector can be kept within that specification!

The plastic USB connector on the right does not provide enough force on the contacts and will cause you problems - never buy USB devices without metal connectors!

A good quality USB 3 PCB-mounted socket has a fully enclosed metal shield which also helps to reduce EMI

If good contact is not made as the BIOS 'powers-up', a USB 3 device may be detected as a USB 2 device and from then on the BIOS will use USB 2 to 'talk' to the device (until you boot to an OS). Sometimes a USB 2 device may drop down to USB 1 speeds if the BIOS detects I/O problems.


2. Electrical issues

Electro-magnetic Interference (EMI) and Radio Frequency Interference (RFI)

Super Speed USB signals contain frequencies ranging from around 200MHz up to 5GHz. At these radio frequencies, USB devices and cables can not only pick up external radio signals (which can cause signal integrity issues), but USB 3 devices can also radiate at these frequencies and interfere with other devices!

In fact, USB 3 devices can cause both conducted and radiated interference to WiFi and Bluetooth devices. A typical scenario is when you have a USB Bluetooth wireless mouse dongle plugged into your PC or laptop and it is very near to your USB 3 Flash drive. You may find that your wireless mouse becomes erratic or unresponsive whenever you are copying large files to your USB 3 drive (see here for details of a real life case that I discovered).
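A rough way to see why USB 3 clashes with 2.4GHz radios while USB 2 does not: the main spectral lobe of a random NRZ bit stream runs from 0Hz up to the bit rate. The sketch below applies only that rule of thumb. It is a simplified model that ignores scrambling, spread-spectrum clocking, harmonics and shielding, and the function names are illustrative.

```python
# Approximate 2.4 GHz ISM band used by Bluetooth and 2.4 GHz WiFi, in MHz.
ISM_24_BAND_MHZ = (2400.0, 2483.5)


def main_lobe_mhz(bit_rate_mbit):
    """Main spectral lobe of a random NRZ bit stream: 0 Hz up to the bit rate."""
    return (0.0, float(bit_rate_mbit))


def overlaps(a, b):
    """True if two (low, high) frequency ranges share any frequencies."""
    return a[0] < b[1] and b[0] < a[1]


for name, rate in [("USB 2 High Speed", 480), ("USB 3 Super Speed", 5000)]:
    lobe = main_lobe_mhz(rate)
    hit = overlaps(lobe, ISM_24_BAND_MHZ)
    print(f"{name}: main lobe {lobe[0]:.0f}-{lobe[1]:.0f} MHz, "
          f"overlaps 2.4 GHz band: {hit}")
```

On this simple model the 480Mbits/s USB 2 signal stays well below the band, whereas the 5Gbits/s Super Speed signal spans it.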

Another real-life incident occurred when I kept downloading a known-good .zip file from the internet one evening, only to find that the file was nearly always corrupt. However, when I downloaded the same file the next morning, it was not corrupted. Later that night, I downloaded the same .zip file again and found that it was again corrupt! Eventually I traced the cause. The USB WiFi dongle that I was using was connected via a USB 2 cable to my PC. This dongle was stuck to my wall (to get better reception from my router) and was near the room's dimmer light switch. As soon as I moved the USB WiFi dongle away from the dimmer switch (which of course was only on at night) or switched off the room light, I stopped getting corrupted files on download! The dimmer unit was interfering with the dongle and apparently there was no error detection/correction mechanism employed. Cheap mains-powered LED light bulbs are also well known as sources of radio interference to Bluetooth and WiFi signals due to their poorly designed AC-DC power circuits and lack of RFI shielding - if you notice more WiFi problems during the evening, then try turning off your lights and see if things get better!

Front Panel/PCB traces

When you connect a USB Flash drive to your PC's front USB port, the USB signals need to travel between the chip inside the USB Flash drive and the USB chip on the PC's mainboard - like this:

USB flash chip - USB flash drive contacts - PC front panel USB contacts - front panel PCB - front panel wires - mainboard USB connector - mainboard PCB traces - mainboard USB chip

If you have also used a USB 3 cable to connect the USB device, this adds another source of variability into the mix!

High speed signals must have the correct series termination in order to avoid signal integrity problems. Assuming that the PCB designer has made a reasonable attempt to get the series termination resistors right, the most common problems I come across are due to poorly designed front USB cable assemblies and front panel PCBs in PC cases. The case's internal USB cables should be shielded and contain twisted pair wires of the correct specification for USB 3 SS. The front panel PCB should also be well grounded and shielded.

The USB front PCB should be at least a 3 layer board with a ground plane (for GHz high frequency noise suppression), and should contain noise suppression, smoothing and decoupling capacitors. The PCB needs to suppress the wide spectrum of noise that may be injected by RFI or by any other USB devices that are connected into the other front USB ports.

I have also found brand new notebooks (e.g. an Acer Aspire R3-131T) that showed data problems when booting from their USB 3 port using a high speed USB 3 flash drive (the USB drives gave me no problems on any other device). When I connected the drive to the problem notebook via a USB 2 cable, the problem went away. I could even get I/O errors from USB 3 devices connected to that same USB port when testing under Windows. The new notebook was returned - as far as I could tell, it had a design fault!

Voltages

Another common problem is that either the internal or external USB cables can cause a voltage drop due to the resistance of the internal wires used. USB devices may not work as well if they are running at too low a voltage.

You can buy cheap USB cables, but they may not be suitable for charging your devices or running Super Speed USB devices (or USB hubs) which require larger currents (500mA+ say). A YouTube video shows how much different brands of USB cables can vary and obviously, the shorter the USB cable, the better!
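As a rough illustration of why thin or long cables matter, the sketch below applies Ohm's law to the power pair of a cable. The resistance figures are approximate values for copper wire at room temperature, the 0.9A load is just an example, and contact resistance in the connectors (which makes things worse) is ignored. The function name is illustrative.

```python
# Approximate resistance of copper wire at room temperature, in ohms per metre.
OHMS_PER_METRE = {20: 0.0333, 24: 0.0842, 28: 0.2129}


def voltage_at_device(supply_v, current_a, length_m, awg):
    """Voltage left at the device after the drop in both the 5V and 0V wires."""
    loop_resistance = 2 * length_m * OHMS_PER_METRE[awg]  # out and back
    return supply_v - current_a * loop_resistance


for awg in (20, 24, 28):
    for length in (0.3, 1.0, 2.0):
        volts = voltage_at_device(5.0, 0.9, length, awg)
        print(f"AWG {awg}, {length:.1f} m, 0.9 A -> {volts:.2f} V at the device")
```

With these assumed figures, a 2m cable with thin AWG 28 power wires loses roughly three quarters of a volt at 0.9A, whereas a short cable with thicker wires loses only a few hundredths of a volt.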


A good USB tester is useful to measure voltage - e.g. the UM34C by Ruideng (#Amazon ad). PDF for UM34C here.

Ground\Earth

As well as a good 5V supply, the device also needs a good ground and 0V connection. Some PC front panel PCBs fall down in this area. The USB metal shield is often not connected to I/O ground and this can cause EMI problems.

Check the connectivity using a good Ohm meter: measure the resistance between the metal shield and the chassis, as well as between the shield and the 0V and 5V rails on the USB port itself.

Braiding/shielding/twisted pair

USB Cables must contain a braided outer shield. USB 3 Cables must contain a power pair, a UTP D+/D- twisted data-pair and two shielded SuperSpeed (SS) data pairs. Note that even for USB 2, the D+/D- green\white unshielded pair should be twisted inside the cable.

Good quality USB 3 cables should have shielded twisted-pair wires inside.

It is easy for some manufacturers to skimp on this when making USB 3 cables, but due to the high frequencies involved, good-quality cables are essential. Note that just because a USB 3 cable has a nylon cloth outer braiding, it does not necessarily mean that the connectors, internal metal braiding and wires are of good quality!

If you experience USB 3 I/O issues, try another cable!

3. Software issues

BIOS POST detection and enumeration
Try this - first disconnect your USB Flash drive from the computer and then switch it on and go into the BIOS Menu settings screen. Of course, your USB Flash drive is not connected, so it won't be listed as a connected device. Now plug in your USB Flash drive and you will see that the USB drive is still not detected by the BIOS. So this proves that the BIOS must detect and enumerate USB devices at some early stage following power-on or reset/reboot.

Not only does the BIOS have to detect if a USB device is connected, but it also needs to determine what type of device it is (e.g. USB CD, USB DVD, USB Floppy drive, USB Flash (Removable) drive, USB HDD, USB keyboard, USB Bluetooth KBD dongle, USB Bluetooth mouse dongle, etc.) and also which of the two sets of contacts to use to talk to it (i.e. the USB 1&2 or the USB 3 contacts).

You should therefore be aware that some BIOSes detect the type and speed of a USB device on power-up only, whilst others re-detect the type/speed on a 'warm reset'. The safest thing to do is to switch off - connect the USB device - switch on.

Note that unplugging and re-connecting the USB device each time you boot up may also help, because many USB ports still have 5V power even when the PC/Notebook is 'off'. By disconnecting the USB device we can be sure it has been 'reset' and not stuck in a strange state.

USB transfers can take place using a variety of transfer protocols. For instance, large amounts of data can be transferred more efficiently using a block transfer protocol, but this protocol is generally slower when used to transfer smaller blocks such as single sectors. So the BIOS USB driver software also has an effect on USB speeds, depending on how the driver code has been written.
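The effect of request size can be sketched with a very simple model in which every read request costs a fixed overhead plus the time needed to move the data. The 0.5ms overhead and 30MBytes/s link rate below are hypothetical numbers chosen only to show the shape of the curve; they are not measurements of any real BIOS driver.

```python
def effective_mbytes_per_s(block_bytes, overhead_s, link_mbytes_per_s):
    """Throughput when every request costs a fixed overhead plus transfer time."""
    transfer_s = block_bytes / (link_mbytes_per_s * 1_000_000)
    return block_bytes / (overhead_s + transfer_s) / 1_000_000


OVERHEAD_S = 0.0005  # hypothetical fixed cost per request (0.5 ms)
LINK_MB_S = 30.0     # hypothetical sustained link rate in MByte/s

for sectors in (1, 8, 64, 512):
    size = sectors * 512  # 512-byte sectors
    rate = effective_mbytes_per_s(size, OVERHEAD_S, LINK_MB_S)
    print(f"{sectors:4} sectors per request -> {rate:6.2f} MByte/s")
```

Under these assumptions, single-sector requests achieve only a small fraction of the link rate because the fixed overhead dominates, which is one reason why a driver that reads in small pieces can be so much slower than one that moves large blocks.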

Of course, unlike Windows/Linux USB drivers, retries, error detection and correction may or may not be supported by the relatively simple BIOS code used by the BIOS developers...

Testing

The Easy2Boot utilities menu has a 'Measure BIOS USB performance' menu entry.

Using this, you can compare the performance of different ports, computers and USB drives.

You can also try loading the grub4dos USB 2 driver from the UTILITIES menu. This supports some USB 2 controllers but not the more modern ones or USB 3 devices. On older systems, this grub4dos driver is often faster than the BIOS driver. 

Tip: hold down the SHIFT key whilst E2B starts to boot and it will automatically load the grub4dos USB 2 driver. I don't recommend always loading the grub4dos USB 2 driver because it can cause issues on some systems.

Since the performance test uses grub4dos, it only tests the relative speed of the MBR\Legacy BIOS driver. However, we can write a simple lua script which will run under agFM, which can be booted in both MBR\Legacy and UEFI modes.

Running the lua script (using agFM)


File access lua script

-- Time 1000 file-existence checks (500 loops x 2 files)
x = os.clock()
for i=1,500 do
  -- alternate between two partitions so that cached data cannot be reused
  grub.file_exist ("(hd0,1)/menu.lst")
  grub.file_exist ("(hd0,2)/efi/boot/bootx64.efi")
end
print("Time for 1000 file accesses: ", (os.clock() - x) / 1000)
print("Press 1 and ENTER to run 10,000 file accesses")
print("or ENTER to finish")
line = input.read ()
-- Optional longer run: 10,000 checks (5000 loops x 2 files)
if ( line == "1" ) then
  x = os.clock()
  for i=1,5000 do
    grub.file_exist ("(hd0,1)/menu.lst")
    grub.file_exist ("(hd0,2)/efi/boot/bootx64.efi")
  end
  print("Time for 10,000 file accesses: ", (os.clock() - x) / 1000)
end


These tests should never 'hang' or produce an error, so you can also use them to check for I/O problems. The results should be consistent and reproducible.
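If you note down the time reported by several runs on the same port, a quick way to judge 'consistent and reproducible' is to look at the spread relative to the average. The sketch below is run on another computer, not under agFM. The readings are made-up examples, the 10% threshold is an arbitrary choice, and the function names are illustrative.

```python
from statistics import mean, pstdev


def timing_spread(times):
    """Return (mean, relative spread) for a list of repeated timings (any unit)."""
    avg = mean(times)
    return avg, pstdev(times) / avg


def looks_consistent(times, max_spread=0.10):
    """True if the runs vary by less than max_spread (10% is an arbitrary threshold)."""
    return timing_spread(times)[1] < max_spread


good_port = [4.1, 4.0, 4.2, 4.1]     # made-up example readings
flaky_port = [4.1, 9.7, 4.0, 15.3]   # made-up example readings

print("good port consistent: ", looks_consistent(good_port))
print("flaky port consistent:", looks_consistent(flaky_port))
```

A port whose timings vary wildly between runs is worth testing again with a different cable or drive, even if the test never actually hangs.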

For a longer test, change the 5000 counter to a larger number. Accessing two different partitions forces grub2 to flush its buffers/cache each time.

This script can be found in the UTILITIES menu folder in E2B v2.08e Beta and later versions (Alternate Downloads Area).

Note that because the UEFI USB2 and 3 drivers will be completely different from the BIOS Legacy USB 2 and 3 drivers, the Legacy and UEFI timings will not necessarily be the same.

Tips

  1. If you need to use a USB cable, then use a good quality USB 3 cable.
  2. Try all the USB ports on the computer (front/side/rear).
  3. Try different computers (to see if the fault follows the USB device).
  4. Remove as many unwanted USB devices from the computer as possible - they may be interfering with your device. This especially applies to USB 3 devices.
  5. Check the USB port voltage (with the USB device connected).
  6. Check that good contact is being made (don't use plastic USB connectors!) - sometimes using a USB 3 extension cable can cure a contact problem because the cable has better USB connectors than the device or computer.
  7. When testing, always switch off the computer - unplug the USB device - reconnect the USB device - switch on the computer. This sequence ensures that the USB device is reset and the BIOS POST will detect the USB device correctly on power-up. Do NOT rely on a warm reboot (CTRL-ALT-DEL) - the BIOS may not re-detect your USB device correctly!
  8. You can force a USB 3 port to detect a USB 3 device as a USB 2 device by connecting it to the PC with a USB 2 cable. Booting may be more reliable at USB 2 speeds.
  9. If there are USB 3 shielding or ground issues within the USB 3 socket or USB 3 device, sometimes connecting the USB 3 device via a good quality USB 3 cable can solve intermittent I/O problems. There are typically three reasons for this: 1) you are moving the USB 3 device away from the computer and other USB 3 devices which may be a source of EMI\RFI, 2) the shielding in the cable helps to absorb high-frequency noise on the signal wires, and 3) better contact is made by the cable's USB connectors.

4 comments:

  1. I appreciate the time you take to share your findings. Very useful.

    Thanks.

  2. @Steve Si please, I have a different issue regarding USB drives. When I connect my USB 3 pen drive to a PC with an old BIOS and USB 2.0 ports, I get very low read speeds from USB. I have to navigate to and install the custom USB driver included in E2B every time to get USB working at full speed. How can I set the USB driver to be installed automatically each time I boot from my E2B disk?

    Replies
    1. Just hold down the SHIFT key as E2B boots to load the grub4dos USB 2 driver.
