machine.Pin('P15') crash debugging
I noticed that machine.Pin('P15') causes a crash on LoPy firmware <=0.9.4.b1:
>>> p=machine.Pin('P15')
Guru Meditation Error of type IllegalInstruction occurred on core 0. Exception was unhandled.
Register dump:
PC      : 400d68b8  PS      : 00060930  A0      : 800d6e81  A1      : 3ffdd370
A2      : 3ffd5cd0  A3      : 00000000  A4      : 00000001  A5      : 00000000
A6      : ffffffff  A7      : 3ffdd530  A8      : 800d6e3b  A9      : 3ffdd340
A10     : 00000026  A11     : 00000028  A12     : 00000027  A13     : 00000040
A14     : 00000000  A15     : 00000020  SAR     : 00000007  EXCCAUSE: 00000000
EXCVADDR: 00000000  LBEG    : 4000c46c  LEND    : 4000c477  LCOUNT  : ffffffff
CPU halted.
machine.Pin.module.P15 exists and can be accessed without crashing. So the bug is isolated to some pins when the Pin class is called (which normally would involve creating a new instance). I expect the pin instances are singletons, since id(machine.Pin.module.P19) == id(machine.Pin('P19')). P15 and P16 crash in this manner.
The particular pin I'm debugging, P15, is also known as G0 and GPI38 (P16=GPI39=G3 is similar, but has the Vbat connection on the expansion board). That places it in two unusual groups: above 32 and input only. It's input only because it's one of the dedicated analog inputs, which also means it's an RTC pin.
init() on the pin causes an identically reported crash, suggesting the issue is within pin_obj_init_helper, which is called by both.
objdump -d build/LOPY/release/application.elf indicates that the crash address is in machpin_disable_pull_up. The odd thing is that it gives the precise start of that function, which should be a perfectly ordinary instruction, hardly an invalid one. A0 would point at a return instruction in pin_config, except that it starts with 8, not 4. Not sure if that's at all relevant. d6e3b, found in A8, looks like a jump target in pin_obj_configure. Those are reasonable places to have visited, though the disassembly seems to have failed to realign after a realigning set of jumps. The compiler-generated assembly does not show such confusion.
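On the 8-versus-4 question: as far as I understand the Xtensa windowed calling convention, call4/call8/call12 store the window increment in the top two bits of the return address in A0, and the return takes those two bits from the current PC instead. Under that assumption, the A0 value from the dump decodes to an address in the same 0x4... region as PC. A minimal sketch in plain Python (the helper name is made up for illustration):

```python
def decode_windowed_return(a0, pc):
    """Split a windowed-ABI A0 value into (return address, call size).

    Assumes the top two bits of A0 hold the window increment
    (1, 2 or 3 for call4, call8, call12) and that the real top two
    address bits are the same as those of the current PC.
    """
    window_increment = (a0 >> 30) & 0x3
    return_address = (pc & 0xC0000000) | (a0 & 0x3FFFFFFF)
    return return_address, window_increment * 4


# Values taken from the register dump above.
addr, call_size = decode_windowed_return(0x800D6E81, 0x400D68B8)
print(hex(addr), "call%d" % call_size)
```

If that reading is right, the leading 8 only says the caller used call8, and the address to look up in the disassembly is 400d6e81.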
I'm tired for now. I will probably resume looking at this tomorrow.
@Ralph No worries. It turned out those two were the same subtle bug, and my three-letter fix was applied not only where I found it but in a handful more places. Commit fc4dd58 contains that fix, and re-enables a bit of pin initialization which I'm guessing was commented out because it hit the same bug.
Hi @LoneTech. Thanks for helping out here; the embedded team merged your pull request yesterday.
I hear from @daniel that they are planning to completely re-write the interrupt code from scratch. You might want to consider that before you dive in too deep :)
The bug I'm trying to hunt down right now is a little more distinct. The irq method on pins numbered 32 and higher produces irq objects that link to their lower-numbered counterparts; for instance, trying to set an interrupt handler for P19=GPIO32 instead affects P2=GPIO0. They produce separate irq objects, but both affect the registers for GPIO 0.
@LoneTech No problem. I'll create the GitHub issue anyway, and if it's fixed by the pull request we can close it. Thanks for your contributions.
@sakis Sorry, I'm confusing myself. The pull request was for the non-issue of the bit shifts; I haven't tracked down the crash. I'm still learning my way around the code, and at the moment I'm aiming at the oddity where interrupts seem to map wrong for GPIO>=32.
@LoneTech Then no need to create a new issue, right? Thanks.
Thanks for the detailed report. I confirm that this is a bug and I'll create a GitHub issue.
Minor hint for others who choose to inspect the assembly, since objdump doesn't realign properly (the reason it needs to is perhaps a linker weakness). If you run make with V=1, it prints the exact commands run. You can then pick a particular object file and replace -c -o filename.o with -S -fverbose-asm -o filename.S to generate somewhat more readable assembly.
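If you do this for more than one file, the substitution can be scripted. A rough sketch in Python, assuming the command printed by make V=1 has -c and -o filename.o as separate arguments; the function name and the example command line are invented for illustration and not copied from the real build output:

```python
import shlex


def to_assembly_command(compile_command):
    """Rewrite '... -c -o foo.o ...' into '... -S -fverbose-asm -o foo.S ...'."""
    args = shlex.split(compile_command)
    out = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-c":
            # Stop after compilation proper and annotate the assembly.
            out += ["-S", "-fverbose-asm"]
        elif arg == "-o" and i + 1 < len(args) and args[i + 1].endswith(".o"):
            # Redirect the output from the object file to an assembly file.
            out += ["-o", args[i + 1][:-2] + ".S"]
            i += 1
        else:
            out.append(arg)
        i += 1
    return " ".join(shlex.quote(a) for a in out)


# Hypothetical command line, shortened for the example.
example = "xtensa-esp32-elf-gcc -Os -c -o build/pin.o mods/pin.c"
print(to_assembly_command(example))
```

Paste in the full line that make printed, then run the result from the same directory make ran it in so the include paths still resolve.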
Department of pseudobugs: In pin_set_value, the expression 1 << pin_number is used for pin numbers of 32 and above. This is undefined behaviour in C, but happens to work because the ssl instruction only reads 5 bits. GCC seems unaware of this, since adding a bitmask to make the behaviour defined adds an unnecessary extui instruction. Edit: Apparently that was only because it didn't optimize fully. Applying the mask in the same expression instead of assigning it to a new variable worked better.
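To make the "happens to work" part concrete, here is a small model in Python. Python integers don't overflow, so this only imitates a 32-bit shift that looks at the low 5 bits of the shift amount, as described above; it does not demonstrate the C undefined behaviour itself. Both function names are made up, and it assumes that pins 32 and up want bit (pin - 32) of a second 32-bit register.

```python
def shift_left_5bit(value, amount):
    """Model a 32-bit left shift that only reads the low 5 bits of the amount."""
    return (value << (amount & 0x1F)) & 0xFFFFFFFF


def pin_mask(pin_number):
    """Defined-behaviour equivalent: mask the shift amount explicitly."""
    return 1 << (pin_number & 31)


for pin in (0, 6, 32, 38):
    # The hardware-style shift and the explicitly masked shift agree.
    assert shift_left_5bit(1, pin) == pin_mask(pin)
    print(pin, hex(pin_mask(pin)))
```

In this model pin 38 yields the same mask as pin 6, which is the right bit as long as the result is written to the register for the upper bank.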