Error 11 EAGAIN when transmitting over LoRa



  • Hello there,

    I'm using my LoPy4 + Expansion Board 3 to build a BLE device scanner for a beacon tracking project of mine, which then transmits the beacons it finds via LoRaWAN.

    I'm running into [Error 11] EAGAIN errors when transmitting via the LoRa socket.

    A weird problem I am running into is that my code works when the board is connected to my computer via USB, and when it is powered from a power supply in any office building, but not when it is powered from the wall sockets of the industrial workshop where I plan to use my devices.

    I thought the unreliable power supply was the issue, so I bought an uninterruptible power supply (with an internal battery) to provide steady power. The device no longer crashes completely (it did before), but I still encounter the EAGAIN error every time I transmit. The error occurs even when running solely on the battery of the uninterruptible power supply. I have even tried connecting an external battery to the Ext. Power pins of the Expansion Board 3, with the same result.

    I read a very old topic on this forum (https://forum.pycom.io/topic/2590/oserror-errno-11-eagain) saying that this is caused either when the message is added to the queue (an underlying FreeRTOS function, as I found out) or by "LORA_STATUS_ERROR", and I have no idea what that is or how to avoid it.
    I also calculated the duty cycle for my payloads to make sure I'm not exceeding a limit (in case there is one in the Pycom firmware).

    Is there any way of finding out more about the error? Any help is much appreciated!

    My code: https://pastebin.com/6udVbhJA

    Thank you in advance!



  • Okay, it seems I fixed it, or rather mitigated it.

    After a lot of testing and recording my results, I noticed that the EAGAIN error happened only about 30% of the time, and that I just got unlucky in assuming that it happened more often with a particular power source.

    Secondly, the EAGAIN error only happened when the joining process took a long time. I assume that a slow join does something that leaves the LoPy4 in a faulty state.
    This 30% error rate could be reduced to about 10% by setting the data rate to 5 when sending.

    s.setsockopt(socket.SOL_LORA, socket.SO_DR, 5)
    

    After this I also increased the data rate during joining by using

    lora.join(activation=LoRa.OTAA, auth=(app_eui, app_key), timeout=0, dr=5)
    

    I haven't done extensive testing after this, but the error has not occurred since.
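
    For anyone who wants to guard against the remaining failures, a retry around the send call could look like the minimal sketch below. `send_with_retry` and its parameters are hypothetical names of mine, not part of the Pycom API, and the retry count and delay are untested guesses. It only assumes the socket raises `OSError` with errno 11 as described above. On the Pycom firmware the `errno` module may have to be imported as `uerrno`.

```python
import errno
import time


def send_with_retry(sock, payload, retries=3, delay_s=5.0, sleep=time.sleep):
    """Call sock.send(payload); on EAGAIN wait and try again.

    Any other OSError, or EAGAIN on the last attempt, is re-raised
    so the caller still sees the failure.
    """
    for attempt in range(retries + 1):
        try:
            return sock.send(payload)
        except OSError as exc:
            # args[0] holds the error number on both CPython and MicroPython
            if exc.args[0] != errno.EAGAIN or attempt == retries:
                raise
            sleep(delay_s)
```

    With the LoRa socket `s` from the snippet above, this would be called as `send_with_retry(s, payload)` in place of `s.send(payload)`.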

    Thank you again @jcaron!



  • @Alfabo Is it the same device that does or doesn't experience the issue depending on the power source, or are those different devices?

    If different, do they run the exact same firmware, and are the board revisions the same?

    Does the LoRaWAN join work?



  • Thank you for your reply @jcaron!

    > Have you checked that the payload does not indeed exceed the max size?

    I wrote a segmenting function that splits the data into 49-byte payloads. I chose 49 because it is a multiple of my data packet size (7 bytes) and it is laid out for the worst case (DR0, with a max payload of 51 bytes).
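
    The idea behind the segmenting, reduced to a minimal sketch (this is not the code from the pastebin; `segment`, `RECORD_SIZE` and `MAX_PAYLOAD` are illustrative names, and only the 7-byte and 49-byte figures come from my setup):

```python
RECORD_SIZE = 7    # bytes per beacon record
MAX_PAYLOAD = 49   # 7 records per frame, below the 51-byte DR0 limit


def segment(data, record_size=RECORD_SIZE, max_payload=MAX_PAYLOAD):
    """Split data into payloads that each hold a whole number of records."""
    if len(data) % record_size:
        raise ValueError("data length must be a multiple of the record size")
    # Round the chunk size down so no record is cut in half
    chunk = (max_payload // record_size) * record_size
    if chunk == 0:
        raise ValueError("max_payload is smaller than one record")
    return [data[i:i + chunk] for i in range(0, len(data), chunk)]
```

    Checking `len(p) <= 51` for every payload right before sending is a cheap way to rule out the payload size as the cause.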

    > Are you in the same location when it works and when it doesn’t?

    Yes, I made sure that I am at the exact same location. That was an early concern, since there is heavy machinery where I plan to deploy the solution (electromagnetic interference).

    > Does every single frame (even the first one) result in the error?

    At the moment yes, even the first message results in the EAGAIN error. In an early version, and without the uninterruptible power supply, about 3-7 messages could be sent before the device would inevitably crash. Sadly, I do not know whether the crash was caused by the EAGAIN error back then.

    > If you have any logs/traces, especially with time stamps, that could be useful. Also logs from the gateway or network may shed some light on the situation.

    When the error occurs, I don't see any logs from my network server. When I connect the device to my laptop, to any office power supply outside the workshop, or to the uninterruptible power supply, I don't see any abnormalities in the network logs.

    I will try to reduce the spreading factor for starters, and then use some spare batteries I have left to narrow down the source of the problem. I will also try another LoPy4 + Expansion Board 3 device, in case of electrostatic damage. Overall, it seems to be closely related to the source of power, since the exact same code runs perfectly fine when using the right power supply.

    Anyway, I will report back once I have found out more.



  • @Alfabo One of the possible reasons for EAGAIN is exceeding the max payload size, which can be as low as 51 bytes. From a quick scan through your code it seems you try not to exceed 49 bytes, though I couldn’t swear it. Have you checked that the payload does not indeed exceed the max size?

    Are you in the same location when it works and when it doesn’t, or different locations? Different locations could mean different gateways and different LoRaWAN settings, but also a different number of BLE devices captured and thus different payload sizes.

    At SF12 (slowest data rate) a full payload will take nearly 3 seconds to send, so the stack may delay sending the next one for nearly 300 seconds (depending on which exact channels are in use). I don’t remember exactly what happens in that case, but I would expect that with a blocking socket, as I believe you use, this would mean a delay rather than an error, though I may be wrong on that one.
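
    Those figures can be reproduced with the time-on-air formula from the Semtech SX127x datasheet. The sketch below is only an estimate under assumptions of mine: 125 kHz bandwidth, coding rate 4/5, 8 preamble symbols, explicit header with CRC, 13 bytes of LoRaWAN overhead (no FOpts), and a 1% duty cycle on the sub-band. The function names are illustrative, and the Pycom stack may account for duty cycle differently.

```python
import math

LORAWAN_OVERHEAD = 13  # MHDR + FHDR (no FOpts) + FPort + MIC, in bytes


def time_on_air(app_payload_len, sf, bw_hz=125000, coding_rate=1,
                preamble_symbols=8):
    """Estimated airtime in seconds of one LoRaWAN uplink.

    coding_rate=1 means 4/5. Explicit header and CRC are assumed.
    """
    t_sym = (2 ** sf) / bw_hz
    # Low data rate optimisation is used for SF11/SF12 at 125 kHz
    low_dr_opt = 1 if (bw_hz == 125000 and sf >= 11) else 0
    phy_len = app_payload_len + LORAWAN_OVERHEAD
    numerator = 8 * phy_len - 4 * sf + 28 + 16
    payload_symbols = 8 + max(
        math.ceil(numerator / (4 * (sf - 2 * low_dr_opt))) * (coding_rate + 4),
        0,
    )
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym


def min_off_time(airtime_s, duty_cycle=0.01):
    """Silence required after a frame to stay within the duty cycle."""
    return airtime_s * (1 / duty_cycle - 1)
```

    Under these assumptions `time_on_air(51, 12)` comes out at roughly 2.8 s and `min_off_time` of that at roughly 277 s, while a 49-byte payload at SF7 (DR5) needs only about 0.12 s of airtime.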

    Does every single frame (even the first one) result in the error?

    If you have any logs/traces, especially with time stamps, that could be useful. Also logs from the gateway or network may shed some light on the situation.

