LoPy L01 LoRaWAN blocking socket.send() hangs permanently

mechatron

Using firmware v1.10.2.b1 on LoPy L01 OEM module

With the code shown below, the socket.send() call periodically hangs forever. The only way to recover is to enable the watchdog timer.

Code:

import pycom
import network
from network import LoRa
import socket
import machine
import time
import binascii

reset_cause = machine.reset_cause()

def select_subband(lora, subband):
    if (type(subband) is int):
        if ((subband<1) or (subband>8)):
            raise ValueError("subband out of range (1-8)")
    else:
        raise TypeError("subband must be 1-8")

    for channel in range(0, 72):
        lora.remove_channel(channel)

    channel_idx = 0
    for channel in range((subband-1)*8, ((subband-1)*8)+8):
        lora.add_channel(channel_idx, frequency=902300000+channel*200000, dr_min=1, dr_max=3)
        channel_idx += 1

# Initialize LoRa in LORAWAN mode.
print("LoRa init")
lora = LoRa(mode=LoRa.LORAWAN, device_class=LoRa.CLASS_A)

if reset_cause == machine.DEEPSLEEP_RESET:
    lora.nvram_restore()

if reset_cause == machine.DEEPSLEEP_RESET and lora.has_joined():
    print('Skipping LoRaWAN join, previously joined')
else:
    select_subband(lora, 1)

    # create the OTAA authentication parameters
    app_eui = binascii.unhexlify('00 00 00 00 00 00 00 01'.replace(' ',''))
    app_key = binascii.unhexlify('00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01'.replace(' ',''))

    # join a network using OTAA (Over the Air Activation)
    lora.join(activation=LoRa.OTAA, auth=(app_eui, app_key), timeout=0)

    print('Joining LoRa network')

    # wait until the module has joined the network
    while not lora.has_joined():
        time.sleep(2.5)

    print('Joined!!!')

# create a LoRa socket
print("LoRa socket setup")
s = socket.socket(socket.AF_LORA, socket.SOCK_RAW)
s.setsockopt(socket.SOL_LORA, socket.SO_DR, 1)
s.setblocking(False)

print("Init complete, running main loop")

# simulate time delay to detect sensors, etc
time.sleep(10)

while True:
    was_lora_tx_error = False
    try:
        s.setblocking(True)
        print('Trying to send...')
        s.send(bytes(21))
        print('info sent')
    except Exception as e:
        was_lora_tx_error = True
        if e.args[0] == 11:
            print('cannot send just yet, waiting...')
            time.sleep_ms(30)
        else:
            print('error: ')
            print(e.args[0])
            raise    # raise the exception again

    s.setblocking(False)

    if not was_lora_tx_error:
        print('Going to deep sleep')
        time.sleep_ms(10)
        lora.nvram_save()
        lora.power_mode(LoRa.SLEEP)
        machine.deepsleep(10000)
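
As a check on the channel arithmetic in select_subband, here is a minimal sketch in plain Python (no LoRa hardware needed) that lists the uplink frequencies a given sub-band maps to. The helper name subband_frequencies is hypothetical; the base frequency and spacing are copied from the code above.

```python
BASE_HZ = 902300000   # first uplink channel, as used in select_subband
STEP_HZ = 200000      # channel spacing, as used in select_subband


def subband_frequencies(subband):
    """Return the 8 uplink frequencies (Hz) that select_subband would add."""
    if not isinstance(subband, int) or not 1 <= subband <= 8:
        raise ValueError("subband must be an int in the range 1-8")
    first = (subband - 1) * 8
    return [BASE_HZ + channel * STEP_HZ for channel in range(first, first + 8)]


for freq in subband_frequencies(1):
    print(freq / 1e6, "MHz")
```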

PaulM

@crumble Thanks for the info.

I've subsequently acquired the Particle Boron LTE boards, which can actually accomplish my live streaming requirement over LTE with MQTT. They use far less power than the Pycom boards, take only 12 seconds to connect to the network, and are cheaper. It currently looks like I'll be switching to Particle because of the LTE deficiencies with Pycom, unfortunately. The Cortex M4 power consumption (down to 17 mA at 5 V WHILE connected to LTE, idle, with no sleep modes active) blows the Pycom ESP32 out of the water anyway.

Yes, the ESP32 runs at 240 MHz and the Particle Boron at 64 MHz. But I will never in a million years need more than even 24 MHz, and Pycom doesn't give me a way to underclock and match the current consumption.

Kind of frustrating/surprising that I can accomplish my task by radioing to a cell tower and back with Particle, but with Pycom short-range LoRa, no can do.

crumble

@paulm

What would you use in my case?

wires ;)

There will never be a fire-and-forget solution. 250m will be much too far if you have a railroad embankment in between. If you plan to install some electronic flags showing wind conditions on a shooting range, you may not have to deal with this problem. WiFi with directional antennas may be fine, so you can keep most of your hardware and software setup.

Setting up a fixed network isn't that complicated.

But it will still be better to avoid collisions by reducing the load and sending only changes. There is even a cool buzzword for this: "edge computing". As a target archer I can't see where you need this level of detail, besides some strange scientific stuff. And in both cases I would like to have more devices. If each pulse counts, the grid is much too wide.

Hey pycom: send me a pytrack, if paul now places a sensor with a WiPy in 5m steps all over a 100x500x30m cuboid ;)

PaulM

@jcaron said in LoPy L01 LoRaWAN blocking socket.send() hangs permanently:

How many devices? What distances? What environment? Indoors or outdoors? Line of sight? What antennas? Fixed or mobile?

I appreciate your response, especially for my edification on the proper comparison to cellular telephone audio.

Given the following requirements of my application:

2 or 3 devices max
Guaranteed no conflict with other devices: deployed for temporary mobile usage (remote wilderness locations) during certain events
Maximum 18-24 hours of battery-powered usage. Power consumption is not an issue. Even 250 mA is acceptable.
Line of sight to vehicle, 500m+ is preferred, 250m acceptable
Live streaming 2-bytes minimum 50Hz, 100-150Hz theoretical capability preferred
Outdoor environment using regular small antennae shipped with Pycom LoPy on the website
Acceptable to miss 1 in 1000 packets, but not 1 in 100

Given these parameters, would you rule out raw LoRa as a viable solution? WiFi could theoretically suffice, but requires a discomforting level of complication with having a router in my vehicle and making certain connections to fixed IP addresses - lots of room for error in my opinion. These need to be "set it and forget it" after deployment, and thus the attractiveness of a raw radio outward beaming protocol (raw LoRa) with no connection formalities (Bluetooth, WiFi, LoRaWAN).

What would you use in my case?

jcaron

@paulm Sorry if I was unclear... There are two issues here:

One issue is that whatever happens, if set to non-blocking, send should not hang. It should either queue up the packet, or return an error, but definitely not hang. I don't know the details about that, nor whether it's actually related to hitting any transmit limits, that's just a pointer. I'll let Pycom or others answer on that part.
The other issue is that LoRa has inherent limits. Some are due to regulations (e.g. the 1% duty cycle on most of the sub-bands in ETSI-land, or the 400 ms dwell time in FCC-land, or the power limits in both cases, or the requirements for frequency hopping or spread spectrum, etc.), some are due to the actual technology itself. Sending at SF12 on a 125 kHz channel, the raw bit rate is so low that even the smallest LoRaWAN packet takes over a second to transmit once you factor in all the overhead (preamble, header, etc.)

You can probably get slightly better results with raw LoRa (but you usually end up replacing the LoRaWAN headers with your own to manage identification, lost frames, etc.). You of course get better results if you go for faster data rates or a higher-bandwidth channel (where available), but even an SF7 frame on a 500 kHz channel takes over 10 ms to send, so 50+ such packets per second will use over 50% of a channel, which once you factor in multiple devices and the fact that the band is shared with other users means you will most certainly have very high packet loss (remember, there is nothing to try to prevent or manage collisions in LoRaWAN). And higher data rates usually mean a lower range.
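
To put rough numbers on the airtime figures above, here is a sketch of the time-on-air formula published in the Semtech SX127x datasheets. The function name and defaults are illustrative, and the 13-byte LoRaWAN overhead (MHDR, DevAddr, FCtrl, FCnt, FPort, MIC) is an assumption about the smallest realistic uplink, so treat the results as estimates rather than measurements.

```python
import math


def lora_time_on_air(payload_len, sf, bw_hz, coding_rate=1,
                     preamble_syms=8, explicit_header=True, crc=True):
    """Estimate LoRa time on air in seconds (Semtech SX127x datasheet formula).

    coding_rate is 1..4, meaning 4/5..4/8.
    """
    t_sym = (2 ** sf) / bw_hz
    # Low data rate optimisation is normally enabled once a symbol exceeds 16 ms.
    low_dr_opt = 1 if t_sym > 0.016 else 0
    implicit = 0 if explicit_header else 1
    numerator = (8 * payload_len - 4 * sf + 28
                 + 16 * (1 if crc else 0) - 20 * implicit)
    denominator = 4 * (sf - 2 * low_dr_opt)
    payload_syms = 8 + max(math.ceil(numerator / denominator) * (coding_rate + 4), 0)
    return (preamble_syms + 4.25 + payload_syms) * t_sym


LORAWAN_OVERHEAD = 13  # assumed minimum header + MIC bytes for an uplink

slow = lora_time_on_air(LORAWAN_OVERHEAD, sf=12, bw_hz=125000)
fast = lora_time_on_air(LORAWAN_OVERHEAD + 2, sf=7, bw_hz=500000)
print("SF12/125 kHz, empty LoRaWAN frame: %.3f s" % slow)
print("SF7/500 kHz, 2-byte LoRaWAN frame: %.2f ms" % (fast * 1000))
print("Channel share at 50 packets/s: %.0f%%" % (fast * 50 * 100))
```

Under these assumptions the results agree with the estimates in this post: more than a second at SF12 on 125 kHz, and more than half of a 500 kHz channel at 50 packets per second.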

Your comparison with the phone is not relevant, and wrong. The standard sample rate for fixed line phone service is 8 kHz x 8 bits, not 44.1 kHz x 16 bits (that's the sample rate for a CD). The sample rate for cellular audio is smaller (don't remember the exact figure, and it probably depends on the exact technology used), and the audio is highly compressed (using lossy compression), so the final bit rate is under 10 kbit/s (which is still higher than what you need, but is far from the several hundred kbit/s you were thinking of). But that uses a lot of power, usually over 250 mA for the whole duration of the call, whether you are actually talking or not.
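
The raw bit rates being compared are simple products of sample rate and sample size. A quick sketch (the 50 Hz x 16 bit line is the anemometer stream requested in this thread, payload only, with no packet overhead):

```python
def bit_rate(samples_per_second, bits_per_sample):
    """Uncompressed bit rate in bit/s."""
    return samples_per_second * bits_per_sample


landline = bit_rate(8000, 8)     # fixed-line telephony
cd_audio = bit_rate(44100, 16)   # CD audio, one channel
anemometer = bit_rate(50, 16)    # 2 bytes, 50 times per second

print("landline:   %d bit/s" % landline)
print("CD audio:   %d bit/s" % cd_audio)
print("anemometer: %d bit/s" % anemometer)
```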

LoRa (and LoRaWAN) is not designed for the same scenario at all. It's a bit like wondering why you can't send a video stream using a fax machine. Different applications, different technologies. The goal of LoRa[WAN] is to enable devices that send very little data, not frequently, can accept packet loss, and need to work on battery with a lifetime measured in months or years. The most common applications are meters (gas/water/electricity...) and all kinds of sensors reporting infrequently. Those usually send a few bytes somewhere between once an hour and once a day, and if you miss a frame here or there, it's usually not too much of an issue (especially for meters, as it's incremental anyway). But that enables them to draw power only a fraction of the time, and sleep the rest of the time, resulting in very low average power draw, and a long battery lifetime. We're talking about sub-mA average power draw.

Now, we don't know the exact parameters of your application (How many devices? What distances? What environment? Indoors or outdoors? Line of sight? What antennas? Fixed or mobile?), so it's difficult to rule out LoRa entirely or to recommend another technology (Wi-Fi, Bluetooth, 802.15.4, raw 900 MHz radios, LTE Cat-M1, NB-IoT...). Even if you are not in the target use case for LoRa, it may work. Or not, if you need to change your data rate or if you change the number of devices, or if you are in a "noisy" environment, which can be the case in an "event".

PaulM

@timh Thanks.

At the very least, if the infinite-hang behavior on s.send() is indeed within spec or even intentional, it should NOT hang, but return with a proper exception, saying something like "packet limit exceeded".
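
For illustration, this is roughly what "return with a proper exception" could look like at the application level: a retry loop that gives up after a deadline instead of spinning forever. It is only a sketch. SendTimeout, send_with_deadline and FakeSocket are hypothetical names, it is written for desktop Python so it can be run without a board (on a device, time.ticks_ms would replace time.monotonic), and it cannot rescue a call that hangs inside send() itself. It only bounds the "error 11, try again" case from the original post.

```python
import time

EAGAIN = 11  # the error code the original loop checks for


class SendTimeout(Exception):
    """Raised when the socket keeps refusing the packet until the deadline."""


def send_with_deadline(sock, payload, timeout_s=1.0, poll_s=0.03):
    """Call a non-blocking sock.send(), retrying on EAGAIN until timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            return sock.send(payload)
        except OSError as exc:
            if not exc.args or exc.args[0] != EAGAIN:
                raise  # anything other than "try again" is a real error
        if time.monotonic() >= deadline:
            raise SendTimeout("radio did not accept the packet in time")
        time.sleep(poll_s)


class FakeSocket:
    """Stand-in for a LoRa socket: refuses the first `busy` sends."""

    def __init__(self, busy):
        self.busy = busy

    def send(self, payload):
        if self.busy > 0:
            self.busy -= 1
            raise OSError(EAGAIN)
        return len(payload)


print(send_with_deadline(FakeSocket(busy=2), b"\x01\x02", poll_s=0.01))
```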

timh

@paulm I think you need to research the limitations of LoRa - I don't see any real misdirection by Pycom concerning LoRa.

Some fundamental limitations on LoRa are set by the regulations on frequency use, in Europe for instance.

For instance

https://www.disk91.com/2017/technology/internet-of-things-technology/all-what-you-need-to-know-about-regulation-on-rf-868mhz-for-lpwan/

Excerpt below.

the 868Mhz band is going, from 865MHz to 870MHz and split in 6 different sub-bands where different rules applies.

The rules are based on 2 restrictions:

Transmission power – it is the maximum power an emitter can use on the channel when it is communicating. 25mW (eq 14dB) is the usual power the lpwan uses for communicating.
The duty cycle – it is defined as the maximum ratio of time on the air per hour. Basically, 1% means you can speak 36s per hour, not more. Duty Cycle is applicable for the sub-band.

So in Europe you could not possibly achieve what you want using Lora irrespective of the hardware platform.
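
The duty-cycle arithmetic can be sketched as follows. The 10 ms per-packet airtime is an assumed round figure, in the range quoted elsewhere in this thread for the fastest settings; slower settings make the picture worse.

```python
def airtime_budget_ms(duty_cycle_percent, window_s=3600):
    """Milliseconds of transmit time allowed per window at a given duty cycle."""
    return window_s * 1000 * duty_cycle_percent // 100


def max_packets(budget_ms, airtime_ms):
    """How many packets of a given airtime fit into the budget."""
    return budget_ms // airtime_ms


AIRTIME_MS = 10                       # assumed airtime per packet
budget = airtime_budget_ms(1)         # 1% duty cycle, per hour
allowed = max_packets(budget, AIRTIME_MS)
wanted = 50 * 3600                    # 50 packets per second for an hour

print("budget: %d ms/hour (%d s)" % (budget, budget // 1000))
print("allowed: %d packets/hour, wanted: %d" % (allowed, wanted))
```

With these assumptions the budget works out to about one packet per second on average, against a requirement of fifty.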

I see you're using 900 MHz, so obviously the utilisation restrictions of the European 868 MHz band don't apply, but even then I fear LoRa/LPWAN is not suitable for your duty.

I wouldn't even consider using MQTT over LTE to stream the data at the frequency you are talking about.

PaulM

@crumble You are probably disappointingly right about there being some sort of intentional break-down limitation on the send rate. I don't know why I didn't ever see it last time I experimented with this in the summer, but it makes sense.

If this is the sad case, then Pycom should publish exactly what such limitations are. How many LoRa send messages can I do before the thing will forcibly hang up on me?

Things like this make Pycom boards feel like they are toys for kids learning how to program. I am in need of industrial grade hardware for serious applications. If LoRa is simply incapable of streaming 2 bytes 50 times per second, I will need to ditch it. But that would be indicative of a misleading picture of what LoRa can indeed offer by the way it is specified in the Pycom website materials, I would say. Feel free to pull out some obscure "max X messages per second" documentation and I will eat my words. I never saw it.

PaulM

@jcaron There is a philosophical objection to your post, and a practical one.

The practical one is that I was doing this in the summer for my initial basic tests, and it worked, or else I would not have bought more Pycom boards. Now it magically doesn't.

It is battery powered, for temporary usage during events, 24 hours max. The power consumption is NOT a problem.

The philosophical objection is even more compelling. Even if most applications of LoRa are different, the Pycom boards SHOULD NOT irreversibly hang up and freeze just by using the API. The fact is it is what I need to do for my application.

I am not willing to buy the following: a telephone call, which has been a wireless capability in everyday life for 20+ years, is UNDOUBTEDLY more than "50 per second" samples, of the same 16-bit resolution.

Actually, the standard telephone sample rate is 44,100Hz.

I would like to understand why ancient telephone/walkie talkie tech could transmit a 16-bit audio signal to my car from ~1km range FORTY+ THOUSAND TIMES per second, whereas my Pycom LoRa board can't do the same fifty times per second.

Would you say Bluetooth is a better choice? Perhaps, but totally defeats my purpose in using LoRa. The range is WAY lower, and I have to deal with bothersome "connection" protocols/procedures between my sender and receiver. I don't want that. I liked what Pycom advertised to me: LoRa raw radio beaming, which could live stream pulses to a vehicle or WiFi node from about 1KM.

Let me know what you think in response to this. Either way, there is an incapacitating bug if s.send() over LoRa irreversibly hangs.

jcaron

@paulm 50 packets per second? You have completely missed the point of LoRa. This won’t even be possible at most data rates (sending even a 2-byte packet will take a lot longer than 20 ms, sometimes closer to a second). And even at data rates where this could be possible, you’ll probably end up with so many collisions it isn’t even funny.

LoRa is designed for (very) low bandwidth applications. You are way, way over what it’s designed for (and capable of in many cases).

I hope you are not planning to have this battery powered?

crumble

@paulm

The LoRa modem will keep you from flooding the air with messages. There is a limit per hour of air time you can use. So you cannot stream your RPM pulses.

You have to reduce the amount of data. Don't send pulses. Calculate the speed on the LoPy and send only changed values. At best calculate a curve and send the formula of the curve.
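
A minimal sketch of the "send only changed values" idea, in plain Python so it can be tried off-device. ChangeFilter and the threshold of 5 are hypothetical; the right threshold depends on the sensor and on how much detail the receiver really needs.

```python
class ChangeFilter:
    """Pass a reading through only if it moved at least `threshold`
    away from the last value that was actually sent."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.last_sent = None

    def should_send(self, value):
        if self.last_sent is None or abs(value - self.last_sent) >= self.threshold:
            self.last_sent = value
            return True
        return False


readings = [100, 101, 100, 110, 111, 125, 125]
change_filter = ChangeFilter(threshold=5)
sent = [r for r in readings if change_filter.should_send(r)]
print(sent)  # [100, 110, 125]
```

Seven readings turn into three transmissions here; on the device the call to s.send() would sit behind should_send().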

You may have tested it only for a short moment, so that you never reached your message per hour limit.

PaulM

@jcaron Thank you for the response. I need to live-stream anemometer pulses over LoRa. (2 byte frequency level mapped to 0-65535) Compressing multiples into 1 second update packets containing multiple pulse data defeats the entire purpose of using LoRa. My specification needs to be able to, theoretically, send at least 50 2-byte packets per second. And, I think even 1 per second will also crash it (need to definitively test that, but it's not my use case)

I swear I tested this in the summer, when I first purchased Pycom boards for this purpose, and it worked without any hangs.

It now consistently hangs after sending approx 15 - 30 packets even at low send rates e.g. 10 per second will easily induce the crash.

Reproduced both on LoPy and FiPy, newest firmware.

What is going on?

jcaron

@paulm How often do you send? How many packets do you send? Does the hang happen right from the first packet, or after several packets have been sent?

It might be related to some form of duty cycle/dwell time limitation, though I don’t know what the rules are for US915 and how they are implemented.

PaulM

I currently need to build an incredibly simple Pycom application (when a pin changes, send 2 bytes over raw LoRa).

I am encountering this insane, inexcusable hangs-forever problem, over a year after it was reported, simply by calling socket.send on LoPy4.

I need to deploy in a few hours, and what should have taken 30 minutes has - as usual with Python based boards - taken hours.

I have been so disillusioned with Pycom products - in which I am heavily invested - since they always seem to undergo inexplicable phases of "stuff works properly" and "critical failures no matter what you try".

I will paste my code hereunder. I have confirmed by testing with LED color that it is the socket.send which hangs, notwithstanding the non-blocking call.

Minutes ago I updated to latest firmware. Same.

Very frustrating.

Has this been addressed?

from network import LoRa
import socket
import pycom
import ustruct
import time, utime
from machine import Pin

new_time = 1000
old_time = 0
def pin_handler(arg):
    global new_time
    new_time = utime.ticks_us()

rPin = Pin(Pin('P11'), mode=Pin.IN)
rPin.callback(trigger = Pin.IRQ_RISING, handler = pin_handler)

lora = LoRa(mode=LoRa.LORA, region=LoRa.US915, frequency=928000000, tx_power=20, bandwidth=LoRa.BW_500KHZ, sf=7, coding_rate=LoRa.CODING_4_5)

s = socket.socket(socket.AF_LORA, socket.SOCK_RAW)
s.setblocking(False)
pycom.heartbeat(False)

while True:

    if (new_time != old_time):
        comp_old_time = old_time
        old_time = new_time
        if (comp_old_time != 0) and ((new_time-comp_old_time) > 0):

            freq = 1000000.0/(new_time-comp_old_time)
            packet = ustruct.pack("<BHB", 35, int(freq/(500.0/65535.0)), 36)
            pycom.rgbled(0x030400)

            s.send(packet)
            print(packet)
            pycom.rgbled(0x000000)

    time.sleep(0.001)
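
Separately from the hang, the 4-byte packet format above can be exercised on a desktop with the standard struct module (the CPython counterpart of ustruct). This is a sketch: encode and decode are hypothetical helper names, and the clamp is an addition, since without it a pulse rate above 500 Hz would scale past 65535 and struct.pack would raise an error in CPython.

```python
import struct

START, END = 35, 36      # framing bytes from the posted code ('#' and '$')
FULL_SCALE_HZ = 500.0    # frequency that maps to 65535 in the posted code


def encode(freq_hz):
    """Pack a frequency as start byte, little-endian uint16, end byte."""
    scaled = int(freq_hz / (FULL_SCALE_HZ / 65535.0))
    scaled = max(0, min(65535, scaled))  # clamp (not in the posted code)
    return struct.pack("<BHB", START, scaled, END)


def decode(packet):
    """Receiver side: check framing and convert back to Hz."""
    start, scaled, end = struct.unpack("<BHB", packet)
    if start != START or end != END:
        raise ValueError("bad framing bytes")
    return scaled * FULL_SCALE_HZ / 65535.0


packet = encode(50.0)
print(len(packet), decode(packet))
```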

jmarcelino

@mechatron

Thanks for the report. I recorded it below and we’ll look into it

https://github.com/pycom/pycom-micropython-sigfox/issues/105
