A nearly year-long debugging journey at Oxide Computer Company uncovered a race condition in the STM32H7 Ethernet DMA driver. The bug caused the management network to lock up during firmware updates. The root cause: the DMA hardware reads descriptor words in order (word 0 first), while user code writes the buffer address to word 0 before setting the OWN bit in word 3. When hardware VLAN tag stripping is enabled, the hardware overwrites word 0 with the VLAN tag (0x301), so the hardware can read the stale VLAN tag from word 0, then see the OWN bit set in word 3, and treat the tag as a buffer address. The fix requires using the hardware's tail pointer register to properly synchronize descriptor ownership. As a bonus, the author demonstrates that with nested VLANs enabled, an attacker can craft packets to write arbitrary data anywhere in RAM. ST's HAL driver and several RTOS Ethernet drivers (FreeRTOS+LwIP, Zephyr) were found to mismanage the tail pointer, though they avoid the vulnerability by not enabling VLAN tag stripping. ST has since published a technical advisory and updated its HAL driver.
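
The interleaving described above can be sketched as a small Python model. This is an illustration only, not the author's code or the real driver: the descriptor is a plain four-element list, the OWN flag position (bit 31 of word 3) and the buffer address are assumptions made for the example, and the "tail" variable is a simplified stand-in for the hardware's tail pointer register.

```python
OWN_BIT = 1 << 31          # assumed position of the OWN flag in word 3
STALE_VLAN_TAG = 0x301     # value the summary says hardware leaves in word 0
BUFFER_ADDR = 0x2400_0000  # hypothetical receive buffer address


def racy_refill(buffer_addr):
    """Hardware samples word 0 before software finishes refilling."""
    desc = [STALE_VLAN_TAG, 0, 0, 0]  # descriptor as left by the last receive
    hw_word0 = desc[0]                # hardware reads word 0 first (stale tag)
    desc[0] = buffer_addr             # software writes the buffer address
    desc[3] = OWN_BIT                 # software hands the descriptor over
    hw_word3 = desc[3]                # hardware reads word 3 last, sees OWN
    return hw_word0 if hw_word3 & OWN_BIT else None


def gated_refill(buffer_addr):
    """Hardware may only fetch descriptors below a software-published tail."""
    desc = [STALE_VLAN_TAG, 0, 0, 0]
    tail = 0                          # nothing published yet, so no fetch
    desc[0] = buffer_addr
    desc[3] = OWN_BIT
    tail = 1                          # published only once the refill is done
    if tail < 1:
        return None
    hw_word0 = desc[0]                # the fetch now sees the completed words
    hw_word3 = desc[3]
    return hw_word0 if hw_word3 & OWN_BIT else None


print(hex(racy_refill(BUFFER_ADDR)))   # 0x301
print(hex(gated_refill(BUFFER_ADDR)))  # 0x24000000
```

In the racy ordering the modeled hardware ends up with 0x301 as its buffer address even though software wrote a valid one; in the gated ordering the fetch cannot begin until the descriptor is complete. Real hardware involves memory barriers and register semantics that this model leaves out.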

15m read time. From mattkeeter.com
Table of contents
Context
"Ethernet's haunted"
Mysteries in the DMA registers
The jumpscare
VLANs in the management network
To the code
A horrifying realization
Fun with VLAN tags
The fix
Communication with the vendor
Further reading
Post-publication updates
