CRITICAL: Guru Meditation erases the data from flash
I got the Guru Meditation below after I received a LoRa message. The problem is that after the board restarts, all the .py files are erased, and the boot.py and main.py files are replaced with the defaults. The critical point is that if the board is installed somewhere, I need to go there and recopy the files onto it.
Guru Meditation Error: Core 1 panic'ed (Cache disabled but cached memory region accessed)
Register dump:
PC : 0x400dbe9c PS : 0x00060034 A0 : 0x80084c5c A1 : 0x3ffc1390
A2 : 0x3ffc7678 A3 : 0x00000001 A4 : 0x8008bdcf A5 : 0x3ffc13a0
A6 : 0x00000020 A7 : 0x00000020 A8 : 0x80084a4c A9 : 0x3ffc1370
A10 : 0x00000012 A11 : 0x3ffc1391 A12 : 0x3ffc1391 A13 : 0x3ffc7690
A14 : 0x3ffc7680 A15 : 0x00000401 SAR : 0x0000000e EXCCAUSE: 0x00000007
EXCVADDR: 0x00000000 LBEG : 0x4009a598 LEND : 0x4009a5c6 LCOUNT : 0x00000000
Backtrace: 0x400dbe9c:0x3ffc1390 0x40084c59:0x3ffc13c0 0x40083eff:0x3ffc13f0 0x40083581:0x3ffc1410 0x40062219:0x00000000
================= CORE DUMP START =================
JCgAAA0AAABsAQAA eDb9PxA1/T9wNv0/
We did that on the LoPy, but it is not possible on the L01.
It is possible, but harder ;-)
However, this way the Guru Meditation output is not caught.
Good point, a core crash will not go to the SD card then.
Then only an external device can catch this.
Yes, I suspect that logging to an SD card would be a much better option than corrupting the file system and blowing away everything. However, I moved away from the SD card because it failed to mount half of the time. I have both the Pytrack and the expansion board.
Instead of print()ing, I just wrote a function called debug() which prints the message to serial and also appends it to a log file. I call .flush() afterwards. It does increase the number of write cycles, but it also makes sure the data is actually written in case of a crash or power failure.
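For anyone who wants to try the same approach, here is a minimal sketch of such a debug() helper. It is not the poster's actual code: the LOG_PATH name and the file name are assumptions made for illustration, and os.fsync is only called where the platform provides it (CPython does, MicroPython ports may not).

```python
import os

# Hypothetical log location; on a Pycom board this might be
# something like /flash/debug.log or /sd/debug.log.
LOG_PATH = "debug.log"


def debug(message, path=LOG_PATH):
    """Print a message and append it to a log file, flushing right away."""
    line = "{}\n".format(message)
    print(message)
    # Append mode keeps earlier entries. The file is opened and closed on
    # every call so that little stays buffered if the board crashes.
    with open(path, "a") as log:
        log.write(line)
        log.flush()
        # Extra sync to storage where available (assumption: not all
        # MicroPython ports expose os.fsync, hence the guard).
        if hasattr(os, "fsync"):
            os.fsync(log.fileno())
    return line
```

Opening and closing the file per message costs more writes, which is the trade-off mentioned above: more flash wear in exchange for less lost output.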
I ended up writing a safe shutdown feature in hopes that the corruption issue does not occur.
@livius We did that on the LoPy, but it is not possible on the L01. However, this way the Guru Meditation output is not caught.
It is really better to attach an SD card and do the logging there, without touching the flash.
You can simply wire an SD card to your board, e.g. a LoPy.
@losi How are you doing the serial logging? Is it over the USB cable? We are using an external device, an Arduino Nano with an SD card on it, powered from the main board.
@losi Yes. We are also writing to flash: a small amount of data (100 bytes), but several times, erasing the file before each write.
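A rough sketch of what logging to the SD card could look like, with a fallback for the case where the card does not mount. It assumes the Pycom machine.SD class and os.mount() as described in the Pycom documentation; the function name pick_log_dir and the /sd and /flash paths are illustrative choices, not something taken from this thread.

```python
import os


def pick_log_dir(sd_mount="/sd", fallback="/flash"):
    """Return the SD mount point if the card mounts, else the fallback dir."""
    try:
        # Pycom-specific import (assumption: running on Pycom firmware).
        from machine import SD
        os.mount(SD(), sd_mount)
        return sd_mount
    except Exception:
        # No SD support, no card, or the mount failed: keep the device
        # running and log to the fallback location instead.
        return fallback


log_dir = pick_log_dir()
print("logging to", log_dir)
```

Note that this only moves the Python-level log; as discussed above, the Guru Meditation output itself is printed by the core on the serial line and would still need an external device to capture it.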
Although I did not have serial logging enabled at the time, I am seeing this issue as well. Sadly, this happened during a proof of concept demo for a client. We have seen it across multiple devices now. I am running 1.17.0.b1.
This is starting to make sense, as we log a lot of debugging and diagnostic data directly to the flash while it is operating. There is a slight chance the USB is being unplugged (aka: power cut) during one of these writes. I am also frequently performing flushes so we can catch things in the act.
On random powerups, our LED status indicator goes away and we see the blue pulse of death (and blank scripts).
Perhaps this is a way you could replicate it?
@seb I also tried the code on a second board, a LoPy (instead of the L01), and I managed to get a Guru crash. I decoded the backtrace and the results are below. Looking at both traces (L01 and LoPy), it looks like a concurrency issue between the LoRa IRQ and the RTC time function (a question of whether the IRAM_ATTR attribute is handled properly).
Guru Meditation Error: Core 1 panic'ed (Cache disabled but cached memory region accessed)
Register dump:
PC : 0x400dbec8 PS : 0x00060034 A0 : 0x80084c68 A1 : 0x3ffc1390
A2 : 0x3ffc784c A3 : 0x00000001 A4 : 0x00000000 A5 : 0x3ffd8600
A6 : 0x00000000 A7 : 0x00000005 A8 : 0x80084a58 A9 : 0x3ffc1370
A10 : 0x00000012 A11 : 0x3ffc1391 A12 : 0x3ffc1391 A13 : 0x3ffc7864
A14 : 0x3ffc7854 A15 : 0x700000eb SAR : 0x0000000f EXCCAUSE: 0x00000007
EXCVADDR: 0x00000000 LBEG : 0x40099334 LEND : 0x4009933f LCOUNT : 0x00000008
Backtrace: 0x400dbec8:0x3ffc1390 0x40084c65:0x3ffc13c0 0x40083f0b:0x3ffc13f0 0x40083585:0x3ffc1410 0x40099331:0x00000000
Decoding 8 results
0x400dbec8: system_get_rtc_time at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/newlib/./time.c line 261
0x40099334: esp_rom_spiflash_read_data at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/spi_flash/./spi_flash_rom_patch.c line 225
: (inlined by) esp_rom_spiflash_read at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/spi_flash/./spi_flash_rom_patch.c line 586
0x4009933f: esp_rom_spiflash_read_data at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/spi_flash/./spi_flash_rom_patch.c line 224
: (inlined by) esp_rom_spiflash_read at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/spi_flash/./spi_flash_rom_patch.c line 586
0x400dbec8: system_get_rtc_time at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/newlib/./time.c line 261
0x40084c65: SX1272OnDioIrq at D:\Colateral\Programming\pycom\pycom-micropython-sigfox\esp32/../drivers/sx127x/sx1272/sx1272.c line 1039
0x40083f0b: machpin_intr_process at D:\Colateral\Programming\pycom\pycom-micropython-sigfox\esp32/mods/machpin.c line 187
0x40083585: _xt_lowint1 at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/freertos/./xtensa_vectors.S line 1105
0x40099331: esp_rom_spiflash_read_data at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/spi_flash/./spi_flash_rom_patch.c line 224
: (inlined by) esp_rom_spiflash_read at /Users/danicampora/Code/Espressif/IDF/esp-idf-20180112/components/spi_flash/./spi_flash_rom_patch.c line 586
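For others who want to decode their own dumps, here is a small helper sketch (run on a PC, not on the board) that pulls the program counter addresses out of the Backtrace line. The function name backtrace_pcs is made up for this example. The resulting addresses can then be passed to xtensa-esp32-elf-addr2line together with the ELF file that matches the firmware build; which ELF that is depends on your setup.

```python
import re

# Each backtrace frame is printed as PC:SP, both as 8-digit hex values.
_FRAME = re.compile(r"(0x[0-9a-fA-F]{8}):0x[0-9a-fA-F]{8}")


def backtrace_pcs(dump_text):
    """Return the PC addresses listed after 'Backtrace:' in a panic dump."""
    _, marker, tail = dump_text.partition("Backtrace:")
    if not marker:
        return []  # no backtrace in this text
    return _FRAME.findall(tail)


sample = (
    "Backtrace: 0x400dbec8:0x3ffc1390 0x40084c65:0x3ffc13c0 "
    "0x40083f0b:0x3ffc13f0 0x40083585:0x3ffc1410 0x40099331:0x00000000"
)
print(" ".join(backtrace_pcs(sample)))
```

In the trace above, the first address (system_get_rtc_time) lies in the 0x400d... range while the IRQ handler addresses lie in the 0x4008... range, which is the kind of detail this makes easier to spot.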
@seb Thanks. I am trying to isolate the issue. The problem is that the behaviour is not always reproducible. In my code I also rewrite a file all the time (saving a state that is useful after wakeup), and this might be the issue.
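If the frequent rewrite turns out to be involved, one way to cut down the number of flash writes is to skip the write when the state has not changed. The sketch below only illustrates that idea; the name save_state and the bytes-based interface are assumptions, and it does not address the crash itself.

```python
def save_state(path, data):
    """Write data (bytes) to path only if it differs from what is stored.

    Returns True if the file was written, False if it was left untouched.
    """
    try:
        with open(path, "rb") as f:
            if f.read() == data:
                return False  # unchanged, so no flash write is needed
    except OSError:
        pass  # file missing or unreadable: fall through and write it
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
    return True
```

With roughly 100 bytes of state, the extra read is cheap, and every skipped write is one less moment where a crash or power cut can catch the filesystem mid-update.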
Yes. The issue is persistent. I added here (https://forum.pycom.io/post/16521) the output for 1.17.0.b1:
Can you post the code that causes this? Preferably in a minimal case. I will try to replicate the issue and investigate the cause. Many errors often trash the filesystem (the FAT layer sees a ‘dirty’ filesystem and deletes everything) erasing the Python scripts too in the process. We are looking into alternatives to FAT but this will be a significant task and will take some time to implement.
Maybe this is not so helpful, but try a recent firmware.