USB Reverse Engineering Without a Logic Analyzer: The usbmon Approach

Most people associate protocol reverse engineering with hardware: a logic analyzer clipped onto data lines, a USB sniffer dongle sitting between host and device, a scope probe and a soldering iron. The assumption is that you need to intercept signals before the OS layers abstract them away. For USB, that assumption is wrong.

The Linux kernel has shipped a USB monitoring subsystem called usbmon since the 2.6 series. It exposes every URB that passes through any host controller on the machine via character devices (/dev/usbmonN) readable from userspace. Wireshark reads them through libpcap. On Windows, USBPcap provides an equivalent capture via a kernel-mode filter driver. The result is that full USB traffic capture, including the enumeration sequence that exposes device descriptors, requires nothing more than loading a kernel module and opening Wireshark, provided the capture is already running when the device is plugged in.

A recent walkthrough at crescentro.se illustrates this clearly with a practical reverse engineering exercise. The article is a good demonstration of what the tools look like in use. What it opens up, though, is worth exploring in more depth: the actual USB protocol model, where the meaningful data lives in a capture, and what you can do once you have it.

Getting the Capture Running

On Linux, the setup is two commands:

sudo modprobe usbmon
lsusb

The first loads the monitoring module. The second tells you which bus your target device is on. If lsusb shows your device on Bus 003, the capture interface in Wireshark is usbmon3. The special usbmon0 interface captures all buses simultaneously.
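The bus-to-interface mapping is mechanical enough to script. The following is a minimal sketch, assuming lsusb's usual "Bus NNN Device NNN: ID vvvv:pppp" line format; the function name usbmon_interface and the sample output (including the abcd:1234 device) are made up for illustration.

```python
import re

# Matches the leading part of a typical lsusb line, e.g.
# "Bus 003 Device 004: ID abcd:1234 Example Corp. Example Tablet"
LSUSB_LINE = re.compile(
    r"^Bus (\d+) Device (\d+): ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})"
)

def usbmon_interface(lsusb_output, vid, pid):
    """Return the usbmon interface name for the first device matching vid:pid."""
    for line in lsusb_output.splitlines():
        m = LSUSB_LINE.match(line.strip())
        if m and int(m.group(3), 16) == vid and int(m.group(4), 16) == pid:
            # Bus numbers are zero-padded in lsusb output; usbmon names are not.
            return f"usbmon{int(m.group(1))}"
    return None

# Hypothetical lsusb output used only to exercise the function.
sample = """\
Bus 003 Device 004: ID abcd:1234 Example Corp. Example Tablet
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
"""
print(usbmon_interface(sample, 0xABCD, 0x1234))  # usbmon3
```

In practice you would feed it the output of subprocess.run(["lsusb"], capture_output=True, text=True).stdout instead of the sample string.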

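The permissions story is slightly awkward. By default, /dev/usbmon* is readable only by root, which means either running Wireshark as root (not recommended) or setting up a udev rule that hands the device nodes to a group your user belongs to (here, wireshark):

echo 'SUBSYSTEM=="usbmon", GROUP="wireshark", MODE="0640"' \
  | sudo tee /etc/udev/rules.d/99-usbmon.rules
sudo udevadm control --reload
sudo udevadm trigger --subsystem-match=usbmon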

On Windows, USBPcap is offered as an optional component of the Wireshark installer. After a reboot, the interfaces appear as USBPcap1, USBPcap2, and so on, each corresponding to a USB root hub (in practice, one per host controller).

The URB Model

Every packet Wireshark shows you from a USB capture is a URB event. A URB (USB Request Block) is the kernel’s internal data structure for a single USB transfer request, and each one normally appears twice in the capture: once when it is submitted and once when it completes. Understanding the URB fields is what separates a confusing wall of packets from a readable protocol trace.

The two most important fields are usb.urb_type and usb.transfer_type. The URB type is either S (Submit, the host handing a request to the host controller) or C (Complete, the controller reporting the result, including any data the device returned). Direction is a separate matter carried in the endpoint address: data for an IN transfer shows up on the Complete event, data for an OUT transfer on the Submit. The transfer type tells you which of the four USB transfer types is in use. The numbers below are the ones usbmon and Wireshark use, which differ from the bmAttributes encoding found in endpoint descriptors:

Control (type 2): Used for enumeration, descriptor fetching, and class-level or vendor commands. Every device has a control endpoint, endpoint 0.
Interrupt (type 1): The transport for HID devices. The host polls the endpoint at a fixed interval; the device returns data when it has something to report. This is where keyboard, mouse, gamepad, and drawing tablet data lives.
Bulk (type 3): Mass storage, USB serial adapters, printers. High throughput, no latency guarantee.
Isochronous (type 0): Audio and video. Guaranteed bandwidth, no retries.

For most USB RE work involving input devices, the interesting traffic is interrupt transfers. The Wireshark display filter to isolate them is:

usb.transfer_type == 0x01 && usb.urb_type == 0x43 && usb.data_len > 0

The 0x01 selects interrupt transfers, and 0x43 is the ASCII code for C, matching Complete URBs. The data_len > 0 clause excludes completions that carried no payload. For a typical input device, what remains is every HID report the device sent to the host.
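The same selection can be applied to records you have already pulled out of a capture. The sketch below assumes a hypothetical Urb record (not a Wireshark or kernel type) and the transfer-type numbering from the usbmon capture header, where 0 is isochronous, 1 is interrupt, 2 is control, and 3 is bulk. Note that this is not the numbering used by bmAttributes in endpoint descriptors.

```python
from dataclasses import dataclass

# Transfer-type codes as they appear in usbmon capture headers, and therefore
# in Wireshark's usb.transfer_type field.
TRANSFER_TYPES = {0: "isochronous", 1: "interrupt", 2: "control", 3: "bulk"}

@dataclass
class Urb:
    """Hypothetical, simplified view of one captured URB event."""
    urb_type: str        # "S" for Submit, "C" for Complete
    transfer_type: int   # key into TRANSFER_TYPES
    data_len: int        # number of payload bytes captured

def is_input_report(urb):
    """True for completed interrupt transfers that actually carried data."""
    return (
        urb.urb_type == "C"
        and TRANSFER_TYPES.get(urb.transfer_type) == "interrupt"
        and urb.data_len > 0
    )

events = [
    Urb("S", 1, 0),   # host queues an interrupt IN request
    Urb("C", 1, 3),   # device answers with a 3-byte report
    Urb("C", 2, 18),  # control transfer completion (a descriptor read)
]
print([is_input_report(e) for e in events])  # [False, True, False]
```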

Enumeration and the HID Report Descriptor

Before the interrupt stream makes sense, you need to understand how the device describes itself. USB enumeration is a control-transfer conversation where the host reads a hierarchy of descriptors from endpoint 0: the device descriptor (VID, PID, device class), the configuration descriptor, interface descriptors, and endpoint descriptors.
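The device descriptor at the top of that hierarchy is a fixed 18-byte little-endian structure, so it is easy to decode by hand once you have copied the bytes out of the Complete URB. A minimal sketch follows; the sample bytes are fabricated for illustration and use the placeholder IDs abcd:1234.

```python
import struct
from collections import namedtuple

# Field names follow the USB specification's standard device descriptor.
DeviceDescriptor = namedtuple(
    "DeviceDescriptor",
    "bLength bDescriptorType bcdUSB bDeviceClass bDeviceSubClass "
    "bDeviceProtocol bMaxPacketSize0 idVendor idProduct bcdDevice "
    "iManufacturer iProduct iSerialNumber bNumConfigurations",
)

def parse_device_descriptor(raw):
    """Decode the 18-byte standard device descriptor (all fields little-endian)."""
    if len(raw) < 18:
        raise ValueError("device descriptor must be at least 18 bytes")
    return DeviceDescriptor(*struct.unpack_from("<BBHBBBBHHHBBBB", raw))

# Fabricated example: USB 2.0 device, class defined per interface, VID:PID abcd:1234.
sample = bytes.fromhex("12 01 00 02 00 00 00 40 cd ab 34 12 00 01 01 02 00 01")
d = parse_device_descriptor(sample)
print(f"{d.idVendor:04x}:{d.idProduct:04x}")  # abcd:1234
```

A bDeviceClass of 0 means the class is declared per interface, which is the usual case for HID devices.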

For HID-class devices, there is one more descriptor that matters above all others: the HID Report Descriptor. This is a variable-length byte array, fetched via GET_DESCRIPTOR with wValue = 0x2200, that encodes exactly how to parse every report the device sends. It uses a compact tag/size/value encoding defined in the HID specification.
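The encoding is simple enough to walk by hand. Each short item starts with a prefix byte: the low two bits give the data size (codes 0 to 3 mean 0, 1, 2, or 4 bytes), the next two bits give the item type (Main, Global, Local), and the high four bits are the tag. The sketch below splits a descriptor into items under those rules. It deliberately skips long items (prefix 0xFE) and does not interpret the tags; the function name is an illustrative choice.

```python
ITEM_TYPES = ("Main", "Global", "Local", "Reserved")
DATA_SIZES = (0, 1, 2, 4)  # size code 3 means four data bytes, not three

def parse_items(desc):
    """Yield (item_type, tag, data) for each short item in a report descriptor."""
    i = 0
    while i < len(desc):
        prefix = desc[i]
        if prefix == 0xFE:
            raise ValueError("long items are not handled in this sketch")
        size = DATA_SIZES[prefix & 0x03]
        item_type = ITEM_TYPES[(prefix >> 2) & 0x03]
        tag = prefix >> 4
        data = bytes(desc[i + 1 : i + 1 + size])
        if len(data) != size:
            raise ValueError("descriptor ends in the middle of an item")
        yield item_type, tag, data
        i += 1 + size

# Usage Page (Generic Desktop), Usage (Mouse), Collection (Application), End Collection
for item in parse_items(bytes.fromhex("05 01 09 02 a1 01 c0")):
    print(item)
```

Reading the first item: 0x05 is tag 0, type Global, one data byte, which the HID specification defines as Usage Page.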

A minimal three-button mouse report descriptor looks like this:

05 01  # Usage Page (Generic Desktop)
09 02  # Usage (Mouse)
A1 01  # Collection (Application)
  05 09  # Usage Page (Buttons)
  19 01  # Usage Minimum (1)
  29 03  # Usage Maximum (3)
  15 00  # Logical Minimum (0)
  25 01  # Logical Maximum (1)
  95 03  # Report Count (3 fields)
  75 01  # Report Size (1 bit each)
  81 02  # Input (Data, Variable, Absolute) -- 3 button bits
  95 01  # Report Count (1)
  75 05  # Report Size (5 bits)
  81 03  # Input (Constant)               -- 5 padding bits
  05 01  # Usage Page (Generic Desktop)
  09 30  # Usage (X)
  09 31  # Usage (Y)
  15 81  # Logical Minimum (-127)
  25 7F  # Logical Maximum (127)
  75 08  # Report Size (8 bits)
  95 02  # Report Count (2)
  81 06  # Input (Data, Variable, Relative) -- X and Y
C0     # End Collection

This tells you the report is 3 bytes: the first byte has button bits in positions 0-2 (with 5 padding bits above), and bytes 1 and 2 are signed X and Y deltas. With this, every subsequent interrupt report is unambiguous.
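As a concrete check of that layout, here is a small decoder for the 3-byte report. It is a sketch for this specific descriptor only; the function name and dictionary keys are illustrative choices, and buttons are numbered as in the descriptor rather than named left/right/middle.

```python
def decode_mouse_report(report):
    """Decode the 3-byte report described by the mouse descriptor above."""
    report = bytes(report)
    if len(report) != 3:
        raise ValueError("expected a 3-byte report")
    buttons = report[0] & 0x07  # bits 0-2; the upper five bits are padding
    return {
        "button1": bool(buttons & 0x01),
        "button2": bool(buttons & 0x02),
        "button3": bool(buttons & 0x04),
        # X and Y are signed 8-bit relative deltas.
        "dx": int.from_bytes(report[1:2], "little", signed=True),
        "dy": int.from_bytes(report[2:3], "little", signed=True),
    }

print(decode_mouse_report(bytes([0x05, 0xFF, 0x10])))
# {'button1': True, 'button2': False, 'button3': True, 'dx': -1, 'dy': 16}
```

The 0xFF in the second byte is the point worth noticing: read as unsigned it is 255, but the descriptor's Logical Minimum of -127 tells you it is a delta of -1.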

You can fetch and decode the report descriptor from a live capture by filtering for usb.setup.bRequest == 0x06 && usb.setup.wValue == 0x2200, then following the corresponding Complete URB. For Linux systems, the hid-tools package provides a cleaner path:

sudo hid-recorder /dev/hidraw0

This prints the decoded report descriptor and streams live reports in the same session, which is often faster than parsing a full Wireshark capture when you just need to understand a HID device.

The Empirical Correlation Workflow

Not every device plays by the HID rulebook. Devices that use bInterfaceClass = 0xFF (vendor-defined) have no report descriptor at all; you are entirely on your own mapping bytes to behavior. Drawing tablets from non-Wacom vendors frequently work this way. So do custom hardware dongles, proprietary input devices, and anything using a vendor-specific protocol over bulk transfers.

The methodology here is systematic and slightly tedious. You start a capture, perform an isolated action (press one button, move the axis to a specific position, activate one feature), stop the capture, and look at which bytes changed. Repeat for each input. The tshark command-line tool extracts the raw payload bytes to a text file:

# usb.transfer_type 1 is interrupt in usbmon/Wireshark numbering
tshark -r capture.pcapng \
  -Y 'usb.transfer_type==1 && usb.urb_type==0x43 && usb.data_len>0' \
  -T fields -e usb.capdata \
  > payloads.txt

A short Python script turns that into a diff feed:

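# Load one report per line; tshark prints hex with or without ':' separators.
reports = []
with open("payloads.txt") as f:
    for line in f:
        hexstr = line.strip().replace(":", "")
        if hexstr:  # skip blank lines from packets without captured data
            reports.append(bytes.fromhex(hexstr))
if not reports:
    raise SystemExit("no payloads found in payloads.txt")

# Compare each report with the one before it and list the offsets that differ.
ref = reports[0]
for i, r in enumerate(reports[1:], 1):
    changed = [j for j in range(min(len(ref), len(r))) if ref[j] != r[j]]
    if changed:
        print(f"Report {i}: offsets changed = {changed}")
    ref = r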
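A per-report diff gets noisy on long captures. A complementary view, sketched below, is to summarize each isolated-action capture as the set of byte offsets that ever vary and the values seen there. The function name and the sample reports are invented for illustration.

```python
def varying_offsets(reports):
    """Map each byte offset that takes more than one value to the sorted values seen."""
    if not reports:
        return {}
    width = min(len(r) for r in reports)  # only compare offsets present in every report
    summary = {}
    for offset in range(width):
        values = {r[offset] for r in reports}
        if len(values) > 1:
            summary[offset] = sorted(values)
    return summary

# Invented capture: one button pressed and released, nothing else touched.
button_capture = [
    bytes.fromhex("000000"),
    bytes.fromhex("010000"),
    bytes.fromhex("000000"),
]
print(varying_offsets(button_capture))  # {0: [0, 1]}
```

Running this once per action and comparing the resulting dictionaries shows which offsets belong to which control, and the value lists hint at whether a field is a bit flag, a counter, or an axis.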

Once you have a byte offset map, you can write a pyusb script to interact with the device directly without needing its original driver:

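import errno
import usb.core

# 0xABCD:0x1234 is a placeholder; substitute your device's VID and PID.
dev = usb.core.find(idVendor=0xABCD, idProduct=0x1234)
if dev is None:
    raise SystemExit("device not found")
if dev.is_kernel_driver_active(0):  # on Linux the kernel HID driver usually owns interface 0
    dev.detach_kernel_driver(0)
dev.set_configuration()
interface = dev[0][(0, 0)]  # first configuration, interface 0, alternate setting 0
ep = interface[0]           # first endpoint, assumed to be the interrupt IN endpoint

def s8(b): return b - 256 if b > 127 else b  # unsigned byte -> signed 8-bit delta

while True:
    try:
        data = dev.read(ep.bEndpointAddress, ep.wMaxPacketSize)
    except usb.core.USBError as e:
        if e.errno == errno.ETIMEDOUT:  # no report arrived within the default timeout
            continue
        raise
    buttons = data[0] & 0x07
    print(f"buttons={buttons:03b} x={s8(data[1])} y={s8(data[2])}")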

When You Need to Send Data Back

Many devices are not just data sources; they accept configuration commands from the host. Illumination control on keyboards, DPI settings on mice, and feature toggles on stream decks commonly travel as HID Feature (or Output) reports sent via SET_REPORT (bRequest = 0x09). These appear in a Wireshark capture as control transfers, often early in the session when the vendor’s driver initializes the device.

Reproducing them with libusb:

libusb_control_transfer(
    handle,
    0x21,    // bmRequestType: host-to-device, class, interface
    0x09,    // bRequest: HID SET_REPORT
    0x0300,  // wValue: Feature report, ID 0
    0x0000,  // wIndex: interface 0
    data, len, 1000
);
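The wValue in that call packs two things: the report type in the high byte (1 = Input, 2 = Output, 3 = Feature) and the report ID in the low byte. The sketch below builds the 8-byte setup packet you should see in the capture for a given SET_REPORT, which is handy for checking a replayed command against the original. The helper name is illustrative.

```python
import struct

# HID report types as encoded in the high byte of wValue.
REPORT_TYPES = {"input": 1, "output": 2, "feature": 3}

def set_report_setup(report_type, report_id, interface, length):
    """Build the 8-byte control setup packet for a HID SET_REPORT request."""
    w_value = (REPORT_TYPES[report_type] << 8) | (report_id & 0xFF)
    return struct.pack(
        "<BBHHH",
        0x21,       # bmRequestType: host-to-device, class, interface
        0x09,       # bRequest: SET_REPORT
        w_value,    # report type and report ID
        interface,  # wIndex: target interface number
        length,     # wLength: size of the report payload that follows
    )

setup = set_report_setup("feature", 0, 0, 8)
print(setup.hex())  # 2109000300000800
```

All multi-byte setup fields are little-endian, which is why wValue 0x0300 appears on the wire as 00 03.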

This is how projects like python-elgato-streamdeck and the various unofficial Wooting keyboard utilities work: capture the initialization sequence from the official driver, replay it from userspace, then send your own commands.

The Broader RE Ecosystem

Once you have a protocol mapped, a Wireshark Lua dissector turns future captures into labeled, human-readable traces instead of raw hex. You register it by vendor/product ID, and Wireshark will automatically apply it:

local proto = Proto("mydevice", "My Device Protocol")
local f_cmd = ProtoField.uint8("mydevice.cmd", "Command")
local f_value = ProtoField.int16("mydevice.value", "Value")
proto.fields = {f_cmd, f_value}

function proto.dissector(buf, pinfo, tree)
    if buf:len() < 3 then return end
    pinfo.cols.protocol = proto.name
    local t = tree:add(proto, buf())
    t:add(f_cmd, buf(0, 1))
    t:add_le(f_value, buf(1, 2))
end

DissectorTable.get("usb.product"):add(0xabcd1234, proto)

For more adversarial work, the Facedancer framework lets you emulate USB devices in Python, running on a GreatFET or similar hardware. This makes it possible to replay captures from a different host, test how drivers respond to malformed descriptors, or fuzz the USB host stack. The USBFuzz research project, which found vulnerabilities across Linux, macOS, Windows, and FreeBSD, applied the same idea in software: it attached an emulated USB device to a virtualized host rather than using physical hardware.

The throughline from a curiosity about what your USB device is doing to serious security research is shorter than it looks. The tools are already on your machine; the kernel is already capturing. All that is needed is knowing where to look.