EXOTIC SILICON
“Putting a BSD system in your pocket”
Investigating errors from sxirsb
WARNING
The information presented on these pages is NOT intended to be followed as a guide to installing OpenBSD on your own Pinephone device, and must not be used for this purpose.
Unlike most SBCs, the Pinephone contains a rechargeable battery intended to power the device. Correct configuration of the charging circuits, including various safety features such as thermal protection, is not performed by the current OpenBSD kernel as of the time of writing.
Series navigation
THIS IS PART FOUR OF SEVEN (4/7) - CHECK OUT THE INDEX
First steps at debugging
If you've read part two of this series, you'll know that two error messages were repeatedly flooding the console when I first booted the Pinephone into the OpenBSD kernel:
Errors from sxirsb
sxirsb0: RD8 failed for run-time address 0x2d
sxirsb0: WR8 failed for run-time address 0x2d
From the manual page for sxirsb we learn (if we didn't already know) that sxirsb is a driver for the Reduced Serial Bus, commonly known as RSB. This is similar in concept to I2C, which is used to connect devices such as sensors to chips that can process their data. In the Pinephone, it's used to connect to the power management chip, or PMIC, which itself appears as an axppmic device to the OpenBSD kernel.
The source code for the sxirsb driver in OpenBSD is in /usr/src/sys/dev/fdt/sxirsb.c. If you didn't know that, you could easily find out by running:
Finding the kernel source code for the sxirsb driver
# grep -R sxirsb /usr/src/sys/
This code has been in the tree for four years, and remained functionally unchanged between revision 1.2, which was included in OpenBSD 6.3, and revision 1.3 in OpenBSD 7.0. The file is only 340 lines long, so even if your C programming ability is limited, it shouldn't be too difficult to follow enough of the logic to see what is going on.
Reading the code, we can immediately see that each of the two error messages above is only generated by a single function. In the case of the RD8 message, it's from rsb_read_1, and the WR8 message is from rsb_write_1. Both functions call sxirsb_do_trans, and if they receive a return value other than zero, then the function prints its corresponding error message and, in the read case, returns a value of 0xFF to its calling function.
So we obviously need to look at the code for sxirsb_do_trans to find out in what circumstances it would return a non-zero value.
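The calling pattern just described can be sketched in ordinary user-space C. This is not the driver code: demo_read_1, demo_do_trans, demo_trans_result and demo_data are hypothetical stand-ins, and the transaction itself is stubbed out. Only the shape of the error handling is meant to match the description above.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: result that the stubbed transaction will report. */
static int demo_trans_result = 0;

/* Hypothetical: byte that a successful read would deliver. */
static uint8_t demo_data = 0x41;

/* Stub standing in for the real transaction routine. */
static int
demo_do_trans(void)
{
	return demo_trans_result;
}

/*
 * Read wrapper: on any non-zero result from the transaction,
 * print a message and hand 0xff back to the caller.
 */
static uint8_t
demo_read_1(uint8_t rta, uint8_t addr)
{
	(void)addr;	/* unused in this stubbed sketch */

	if (demo_do_trans() != 0) {
		printf("demo: RD8 failed for run-time address 0x%x\n", rta);
		return 0xff;
	}
	return demo_data;
}
```

One consequence of this pattern is that a caller cannot tell a failed read apart from a register that genuinely contains 0xFF, so the console message is the only sign that something went wrong.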
Looking at sxirsb_do_trans, which is only 20 lines long, we can see two codepaths which return non-zero. One returns ETIMEDOUT, and the other returns EIO. Even without further understanding the code at this point, we can add two printf calls to identify which path is being followed. I also shortened the original error message to reduce the amount of text being sent to the console. Then we just need to re-compile the kernel, reboot and observe the output.
Since we're only using a single CPU core clocked at 816 MHz, you might expect the kernel compile to take a fair time to complete, and admittedly it isn't exactly fast. However, after modifying the GENERIC kernel config and removing a lot of unnecessary code from the build, the Pinephone compiled the kernel in about 15 minutes, which was tolerable.
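To illustrate the idea of tagging each failure path with its own printf, here is a small user-space simulation. It is a sketch under stated assumptions and not the kernel function: fake_status_reg, fake_read_status and sim_do_trans are hypothetical names, the polling loop is a guess at the general shape, and the hardware register is replaced by a plain variable.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* The article states that the driver defines this value as 1. */
#define RSB_STAT_TRANS_OVER	1

/* Hypothetical stand-in for the hardware status register. */
static uint32_t fake_status_reg = 0;

static uint32_t
fake_read_status(void)
{
	return fake_status_reg;
}

/*
 * Simulated transaction with two failure paths, each marked
 * with its own debugging printf so they can be told apart.
 */
static int
sim_do_trans(void)
{
	uint32_t stat = 0;
	int timo;

	/* Assumed shape: poll until the status becomes non-zero. */
	for (timo = 1000; timo > 0; timo--) {
		stat = fake_read_status();
		if (stat != 0)
			break;
	}

	if (timo == 0) {
		printf("DBG: timeout path\n");
		return ETIMEDOUT;
	}
	if (stat != RSB_STAT_TRANS_OVER) {
		printf("DBG: EIO path\n");
		return EIO;
	}
	return 0;
}
```

Setting fake_status_reg to 0, 1 or some other value before calling sim_do_trans exercises the timeout, success and EIO paths respectively.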
The results showed that the code was in fact returning EIO, and never returning ETIMEDOUT. This can only happen if the value of 'stat' isn't the expected RSB_STAT_TRANS_OVER, which is defined as 1 at the beginning of the file. The value of 'stat' is returned from the HREAD4 macro, also defined near the beginning of the file. That macro calls bus_space_read_4, which is defined separately for each hardware architecture in the architecture-specific bus.h file.
At this point, it's fairly obvious that the value is coming more or less directly from a hardware device, and the name 'stat' suggests that it's some kind of status register. Looking at documentation for the RSB, we can see that there is indeed a status register in the RSB controller, which should be set to 1 after a successful transaction, which matches what the code checks for with RSB_STAT_TRANS_OVER.
In the case of an I/O error, some of the upper bits of this status register are supposed to indicate exactly what failed, although in the case of a single byte transfer this isn't particularly useful information. Adding more debugging code and re-compiling the kernel again, I was able to see the actual values returned:
Additional debugging information from a modified kernel
RD8 failed with: 0x103
WR8 failed with: 0x003
Interestingly, the value for the write command only has the lower two bits set. Looking closely at the code again, we can see that stat is defined as a uint16_t, so bits 16-31 from the status register are being lost.
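The effect of storing a 32-bit register value in a 16-bit variable is easy to reproduce outside the kernel. In this minimal sketch, narrow_status is a hypothetical helper and the input values are purely illustrative.

```c
#include <stdint.h>

/*
 * Store a 32-bit value in a 16-bit variable, the same kind of
 * narrowing that happens when stat is declared as uint16_t.
 */
static uint16_t
narrow_status(uint32_t reg)
{
	uint16_t stat = (uint16_t)reg;	/* bits 16-31 are discarded */

	return stat;
}
```

With an illustrative input that has bit 16 set, such as 0x00010003, the function returns 0x0003. An input with nothing above bit 15, such as 0x00000103, passes through unchanged.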
Changing stat to uint32_t reveals the complete value:
Even more debugging information from a further modified kernel
RD8 failed with: 0x00000103
WR8 failed with: 0x00010003
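As a rough aid to reading these two values, the sketch below splits a status word into named fields. The bit layout is an assumption: it follows the interrupt status definitions in the Linux sunxi-rsb driver (transfer over in bit 0, transfer error in bit 1, load busy in bit 2, error data in bits 8-11, error ack in bit 16) rather than anything in the OpenBSD source. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed bit layout, borrowed from Linux's sunxi-rsb driver. */
#define ST_TRANS_OVER	(1u << 0)
#define ST_TRANS_ERR	(1u << 1)
#define ST_LOAD_BSY	(1u << 2)
#define ST_ERR_DATA(v)	(((v) >> 8) & 0xfu)
#define ST_ERR_ACK	(1u << 16)

/* Hypothetical container for the decoded fields. */
struct rsb_status {
	int over;		/* transfer finished */
	int err;		/* transfer error flagged */
	int busy;		/* controller busy */
	int ack;		/* error-ack bit set */
	unsigned int err_data;	/* error data bits 8-11 */
};

static struct rsb_status
decode_status(uint32_t stat)
{
	struct rsb_status s;

	s.over = (stat & ST_TRANS_OVER) != 0;
	s.err = (stat & ST_TRANS_ERR) != 0;
	s.busy = (stat & ST_LOAD_BSY) != 0;
	s.ack = (stat & ST_ERR_ACK) != 0;
	s.err_data = ST_ERR_DATA(stat);
	return s;
}

/* Print one decoded status word with a caller-supplied label. */
static void
print_status(const char *label, uint32_t stat)
{
	struct rsb_status s = decode_status(stat);

	printf("%s 0x%08x: over=%d err=%d busy=%d err_data=0x%x ack=%d\n",
	    label, (unsigned int)stat, s.over, s.err, s.busy,
	    s.err_data, s.ack);
}
```

Under that assumed layout, both values have the transfer-error bit set alongside transfer-over. The read value additionally has the lowest error-data bit set, and the write value has the error-ack bit set.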
Clearly, either some aspect of the RSB protocol isn't being followed correctly by the RSB controller driver code in the OpenBSD 7.0 kernel, or the hardware in the Pinephone is configured differently to what might be expected. Or both.
Comparison with NetBSD 9.2
NetBSD also has code to access the RSB controller, in sys/arch/arm/sunxi/sunxi_rsb.c. A quick look at the source code for both implementations suggested to me that there was no code-sharing, and both seem to have been developed completely independently. As a result, I decided to install NetBSD 9.2 on the Pinephone and see, among other things, whether communication with the RSB controller was successful or not.
During boot, the NetBSD kernel returned the following output regarding the RSB controller:
Boot-time output from NetBSD 9.2 regarding the RSB controller
[ 1.0000060] sunxirsb0 at simplebus1: RSB
[ 1.0000060] sunxirsb0: interrupting on GIC irq 71
[ 1.0000060] iic0 at sunxirsb0: I2C bus
[ 1.0000060] axppmic0 at iic0 addr 0x3a3: AXP803
[ 1.0000060] sunxirsb0: transfer error, id 0x00
[ 1.0000060] sunxirsb0: SRTA failed, flags = 8, error = 5
[ 1.0000060] axppmic0: couldn't read chipid
Obviously we're seeing a similar error here. Error 5 is EIO, and looking at the source code, this output is what we might expect if the status register returned the same value as it was returning whilst running OpenBSD.
Unfortunately, I wasn't really able to use NetBSD 9.2 on the Pinephone, as there seemed to be a problem with the serial driver code, making the console virtually unusable. Output from the Pinephone was received fine over the serial link, but sending data in the other direction, my keystrokes were often received incorrectly.
Initial conclusions
Before testing the code from NetBSD 9.2, I had wondered if the errors reported by the OpenBSD 7.0 kernel could have been the result of a trivial bug. However, it seems unlikely that the same bug could have been introduced independently in both codebases, and more plausible that the specific hardware configuration in the Pinephone requires strict adherence to a detail in the RSB protocol that many or most other devices do not.
Series navigation
Part 1 - Building the installation media and installing.
Part 2 - Booting the completed installation and initial information gathering.
Part 3 - Starting to debug USB issues.
Part 4 - Investigating errors from sxirsb.
Part 5 - Controlling the LEDs and vibration motor.
Part 6 - PMIC and battery charging.
Part 7 - External keyboard battery.