Hacking the Devicetree to Achieve the Linux QSPI Boot Trifecta
This is a tale of pain, grief, and redemption when working through strange Linux behavior and boot failures, and what happens when the devicetree doesn't match the underlying hardware.
I recently programmed my QSPI with a new Linux OS image. It booted from QSPI the first time, but now it won't reboot. Say what? How can that be?
First, let's take a short trip in the PetaLinux BSP time machine. Back in PetaLinux 2019.2 (and previous) we had a simple way to describe the QSPI Flash in the Linux devicetree for the MiniZed, MicroZed, and PicoZed SOMs in our PetaLinux BSPs. This allowed us to boot from the QSPI Flash (MicroZed example):
&qspi {
    flash0: flash@0 {
        compatible = "micron,n25q128a13";
    };
};
Nobody ever noticed (oops, we never tested?!) that although the SBC or SOM could boot from QSPI, the actual Flash partitions weren't visible once the Linux OS booted. One of my FAE colleagues brought this to my attention and this was fixed by changing the devicetree description for the QSPI node to:
&qspi {
    #address-cells = <1>;
    #size-cells = <0>;
    flash0: flash@0 {
        compatible = "micron,m25p80";
        reg = <0x0>;
        #address-cells = <1>;
        #size-cells = <1>;
    };
};
Now we had a solution that worked for booting from QSPI and viewing the Flash partitions once Linux had booted (MicroZed example):
# cat /proc/mtd
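If you would rather script this check than eyeball it, here is a minimal sketch. The function name count_mtd_partitions is my own invention (it is not part of PetaLinux), and it assumes only the usual /proc/mtd layout: one header line followed by one "mtdN:" line per partition.

```shell
# Count the partition entries in /proc/mtd-style text read from stdin.
# Hypothetical helper for illustration; not a PetaLinux tool.
count_mtd_partitions() {
    # Each partition line starts with "mtd<number>:"; the header line does not.
    grep -c '^mtd[0-9][0-9]*:'
}
```

On the target you would run it as "count_mtd_partitions < /proc/mtd". A result of 0 is the symptom described above: Linux booted, but no Flash partitions are visible.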
Once again nobody ever noticed (oops, we never tested?!) that, at least for MicroZed, even though the SOM would boot Linux the first time after the QSPI had been programmed with the OS image, it would fail to reboot from QSPI. Not just an oops-kernel-crash kind of failure. This was a no-lights-what-happened-this-board-is-dead sort of (re)boot failure. No DONE LED turned on, no messages on the UART. Nothing. This clearly meant that something had failed during the Zynq First Stage Boot Loader (FSBL), but what? Why? After much head scratching and Googling and forum digging I stumbled across this Xilinx forum post:
I found this nugget of a clue in that forum post:
"Power-up booting from QSPI worked well, but when trying to reboot it would fail to load FSBL. Turns out Linux changed the SPI flash mode, which the BootROM did not handle."
Hmmm...this sounded like my issue. In my case the FSBL was failing (for the sake of this story let's say this is analogous to the BootROM). Could something be changing the QSPI from 4-bit data mode to 1-bit data mode like the forum post suggested? The forum post goes on to say that this was supposed to have been fixed in the Linux kernel, and since the post is a few years old by now (2018), the kernel is not the likely culprit. That got me thinking, though, that the devicetree has a big influence on what Linux does with devices during boot. Could that be the issue? Could there be something in the QSPI node in the devicetree that was allowing the OS to boot OK the first time from QSPI, but then changing the QSPI so subsequent boots would fail? After more Googling and head scratching I stumbled onto the "U-Boot QSPI Driver" page on the Xilinx wiki:
https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842465/U-Boot+QSPI+Driver
A-ha! Eureka! Here was a known-good example of a QSPI description in a devicetree that should allow for boot and reboot from QSPI:
This devicetree QSPI description was very similar to what I had in my devicetree, with two exceptions: the partition descriptions (those aren't really needed because they are described elsewhere, in the system-conf.dtsi file) and the "compatible =" string, which named a different QSPI device. That was a huge clue. I recalled from previous experience (remember the earlier difference between the devicetree that simply worked for QSPI boot and the devicetree that allowed me to see the QSPI partitions in Linux) that simply changing the device named in the "compatible =" string made a huge difference in how Linux and the QSPI behaved. I decided to change the QSPI description in my devicetree to (mostly) match the example from the Xilinx wiki:
/* QSPI partitions are defined with petalinux-config and described in system-conf.dtsi */
&qspi {
    #address-cells = <1>;
    #size-cells = <0>;
    status = "okay";
    is-dual = <0>;
    num-cs = <1>;
    flash0: flash@0 {
        compatible = "n25q128a11";
        reg = <0x0>;
        spi-tx-bus-width = <1>;
        spi-rx-bus-width = <4>;
        spi-max-frequency = <50000000>;
        #address-cells = <1>;
        #size-cells = <1>;
    };
};
That worked! After rebuilding the OS image and reprogramming the QSPI I was then able to boot and, more importantly, reboot from QSPI and still see the QSPI partitions in Linux. The trifecta achieved! You can also be sure now that all three QSPI uses - boot, partitions, and reboot - will be tested in future PetaLinux BSP releases.
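If you want to confirm that the running kernel really picked up the new flash node, the live devicetree is exposed under /proc/device-tree, where each single-cell property is a 4-byte big-endian file. Here is a minimal sketch for reading one. The helper name dt_cell is hypothetical, and I'm assuming the od on your target supports the -An and -tu1 options (a trimmed-down BusyBox build may not).

```shell
# Print a single-cell (32-bit, big-endian) devicetree property as decimal.
# Hypothetical helper for illustration; $1 is the path to the property file.
dt_cell() {
    # od prints the four bytes as unsigned decimals; split them into $1..$4.
    set -- $(od -An -tu1 "$1")
    # Reassemble the big-endian bytes into one integer.
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}
```

The exact path to the flash node depends on the SoC and kernel version, so find yours first (for example with "find /proc/device-tree/ -name 'flash@0'") and then run dt_cell on the spi-rx-bus-width file inside it. With the devicetree above, it should print 4.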