Yes, I have noticed that some messages occasionally arrive malformed or with bytes missing. This could be due to LoRa message collisions between devices: it can happen if two devices send at the same time in the same band.
There is a new blog post here in the forum with a new code suggestion that includes:
- Message length check on the nano-gateway
- Message retry on the nodes
- Max timeout in the nodes
Number 3 will help with the fact that you have to have the nano-gateway on before you start the nodes. In this code, after sending a message the node waits in an infinite loop for the ack. The problem with this approach is that if no ack is received (because the nano-gateway was not connected yet), the node will stay in that loop until it is reset.
Keep in mind that both this and the new code are just examples for specific usages, and they require adaptation depending on your needs.
The link for the new post is LoPy Nano-Gateway Extended (Timeout and Retry)
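For illustration, here is a minimal sketch of how a retry and a maximum timeout could replace the infinite wait on the node. This is not the code from the linked post: send_with_retry, its parameters and the is_ack callback are hypothetical names, send and recv stand for thin wrappers around the LoRa socket, and recv is assumed to return b'' when nothing has arrived yet.

```python
import time

def send_with_retry(send, recv, pkg, is_ack, retries=3, ack_timeout_s=5.0,
                    poll_s=0.1, clock=time.time, sleep=time.sleep):
    """Send pkg and wait for an ACK, giving up after `retries` attempts.

    send/recv are callables wrapping the LoRa socket (hypothetical names).
    Returns the attempt number that was acknowledged, or 0 if none was.
    """
    for attempt in range(1, retries + 1):
        send(pkg)
        deadline = clock() + ack_timeout_s
        while clock() < deadline:
            data = recv()
            if data and is_ack(data):
                return attempt
            sleep(poll_s)   # do not spin; give the radio time to receive
    return 0                # no ACK after all retries; caller decides what to do
```

Because the wait is bounded, a node started before the nano-gateway simply reports a failed send instead of hanging until it is reset.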
@Roberto We set up a test bench with 3 LoPys following the idea of your code. We ran the test with blocking sockets and threads, and with non-blocking sockets. The result is the same.
Two are nodes configured with tx_iq and one is the GW (rx_iq).
The script is simple: each device (node and GW) sends a 29-byte message (_MSG_FORMAT = "!HHLBB%ds") every 2 seconds. In the message data we are printing a counter (which differs per device). That counter is also in the header (see above).
We noticed that sometimes, if you start the GW after the nodes, some of the nodes randomly do not receive the data, even though the GW is receiving the data from the nodes. We also observed that sometimes, when we start a node after the GW, the GW does not receive the data from that node.
You need to reset the node or GW in order to get the broadcast data through to the others, and this is hazardous and not at all good in a real deployment.
BUT THE MOST PROBLEMATIC part is that the 29 bytes are sometimes only partially received: you get 15 bytes, or some other number less than 29.
Have you experienced this connection issue related to the node/GW starting order?
Does the LoPy stack guarantee that a message is received in one chunk (if you send 29 bytes, you get 29 bytes)?
Here are some samples:
good msg receive
Payload: b'3058- 1234567890 '
<<< beat msgId 3066
remaining buffer 0
bad msg receive
Payload: b'3059- 1234567890 '
buffer too small <<< ustruct exception
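One way to avoid the ustruct exception on a short packet is to check the received length before unpacking. The sketch below assumes the "!HHLBB%ds" format quoted above, where the fixed header is 10 bytes; unpack_message and expected_size are hypothetical names, and the meaning of the individual header fields is not known from the post, so the fields are returned as a plain tuple.

```python
import struct  # on MicroPython this module is called ustruct

HEADER_FORMAT = "!HHLBB"                      # fixed part of "!HHLBB%ds"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 2 + 2 + 4 + 1 + 1 = 10 bytes

def unpack_message(data, expected_size=None):
    """Return (header fields..., payload), or None if the packet is too short
    or does not have the expected total size."""
    if len(data) < HEADER_SIZE:
        return None                           # not even a full header
    if expected_size is not None and len(data) != expected_size:
        return None                           # truncated or oversized packet
    payload_len = len(data) - HEADER_SIZE
    return struct.unpack(HEADER_FORMAT + "%ds" % payload_len, data)
```

With expected_size=29, a 15-byte fragment is rejected and can be counted or logged instead of crashing the receive loop. This only hides the symptom; it does not explain why the packet arrived short.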
@gertjanvanhethof I'm also interested in using the LoPy as a LoRaWAN gateway. Looking forward to a solution.
On the "Nano-Gateway" with the latest firmware 1.3.0.b1 I'm still getting "ValueError: buffer too small" after an hour or two of running the posted sample code. Is this a problem with the sample code or a firmware problem?
Just run the Nano-Gateway with the latest firmware 1.3.0.b1. All problems are resolved :D. Thanks Pycom team.
The distance and the power do not seem to be the problem. I tried running the test over 100 meters. If I put a time.sleep(0.2) after lora_sock.recv(512) and lora_sock.send(pkg) on both the server side and the client side, the code runs. With the previous firmware the code ran straight out of the box. I think the OSError: [Errno 11] is caused by a queue in the send buffer. Sometimes there are still received messages left in the buffer, even after a soft reset. I use the code
recv_dumpbuffer = lora_sock.recv(512)
time.sleep(0.2)
to make sure the receive buffer is empty.
The 200 ms waits in between LoRa commands are hardly ideal. On previous firmware versions the code ran straight out of the box.
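A single recv(512) only removes one stale packet. A small loop that keeps reading until nothing is left might be more robust; this is a sketch only, drain_rx is a hypothetical helper, and it assumes the socket is non-blocking and that recv returns b'' when the buffer is empty.

```python
def drain_rx(sock, max_reads=16, chunk=512):
    """Read and discard stale packets. Returns how many were dropped.

    Assumes a non-blocking socket whose recv() returns b'' when empty.
    max_reads bounds the loop in case packets keep arriving.
    """
    dropped = 0
    for _ in range(max_reads):
        if not sock.recv(chunk):
            break           # buffer is empty
        dropped += 1
    return dropped
```

Calling drain_rx(lora_sock) once at start-up would replace the recv_dumpbuffer line without needing a fixed sleep.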
@Feiko Are both devices close? Try lowering the transmission power to the minimum (2 dBm) and move them several meters apart (or use an attenuator).
I have been doing tests between a LoPy and a Raspberry with a Dragino Lora/GPS hat (SX1276 based) and with just 2 dBm I can receive packets reliably with the RPi and the LoPy in different flats and at the opposite sides of the building. Depending on the antenna positioning I even registered an average SNR of -8 dB, which is amazing.
@Roberto Thanks for your reply.
I've tried to give the program enough time to receive. According to the documentation, time.time() should be in seconds, not milliseconds. However:
import time
from machine import Timer

chrono = Timer.Chrono()
chrono.start()
timeout = time.time() + 20
while time.time() < timeout:
    pass
chrono.stop()
print(chrono.read())
This outputs 3.062622 so 3 seconds.
I've even tried increasing this to 10 seconds, no luck.
I used the deviceId as a package identifier. Bad practice, but it made for an easy test scenario in which I could identify the package being acknowledged.
Using time.sleep(0.2) before sending the ACK package fixed the nano-gateway crash. However, the client still receives the wrong response message (DEVICE_ID - 1).
You said you used the new firmware: is this 1.2.2.b1 or the upcoming release?
Could you look into reproducing the problem?
Regarding your problem with the NanoGateway code, here are a few remarks from your post:
In your code you are sending a message and changing the ID every time you do it. The original code used this Device ID as an ID of the device, not as a message ID, which is the way you are using it here. It was meant as a way of identifying 2 or more LoPys connected to the same gateway. I actually tested this same code with 10 devices some days after the original post.
Even with point 1 in mind, you are only giving the device time.time() + 20 (that is, 20 ms) to:
a. Pack the message
b. Send the message using LoRa
c. Have the 2nd LoPy receive the message using LoRa
d. Unpack and process the message
e. Send the ACK over LoRa
The problem with this approach is that LoRa packages (especially ones several bytes long) can take hundreds of milliseconds to transmit.
Since you are checking almost immediately after you send the message, in a while loop with no delays, you get the "no ACK received" output.
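To put a rough number on the transmit time, the time on air can be estimated with the formula from the Semtech SX1276 datasheet. The sketch below is only an estimate: lora_time_on_air_ms is a hypothetical helper, and it assumes an explicit header, CRC enabled, and low data rate optimisation switched on whenever the symbol time exceeds 16 ms. Whether the LoPy raw LoRa mode uses exactly these settings is an assumption.

```python
import math

def lora_time_on_air_ms(payload_len, sf=7, bw_hz=125000, preamble=8,
                        coding_rate=1, crc=True, explicit_header=True):
    """Estimate LoRa time on air in ms (Semtech SX1276 datasheet formula).

    coding_rate: 1 means 4/5, 2 means 4/6, and so on up to 4 (4/8).
    """
    t_sym = (2 ** sf) / bw_hz * 1000.0       # symbol duration in ms
    de = 1 if t_sym > 16.0 else 0            # low data rate optimisation
    ih = 0 if explicit_header else 1
    num = 8 * payload_len - 4 * sf + 28 + (16 if crc else 0) - 20 * ih
    n_payload = 8 + max(math.ceil(num / (4 * (sf - 2 * de))) * (coding_rate + 4), 0)
    return (preamble + 4.25 + n_payload) * t_sym
```

Under these assumptions a 29-byte packet at SF7, 125 kHz takes about 67 ms one way, and the same packet at SF12 takes about 1.6 seconds, before any processing time and before the ACK makes the return trip.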
My recommendation would be to:
a. Extend the amount of time you are going to wait for a message (this might change depending on your application). Let's say 20 seconds for testing.
b. Make the while(waiting_ack and timeout > time.time()) loop delay at least 100 ms before checking for messages again.
Regarding the "Ack with wrong device_id: 1" message: this is more of the same as point 2. To illustrate, here is a short timeline of your problem:
a. Message ID 1 is created and sent
b. Node waits 20 ms for the ACK
c. Message arrives at the gateway at ms 25
d. Node has already discarded the package as unanswered and sends message ID 2
e. Node waits 20 ms for the ACK on package 2
f. Gateway responds with the ACK for package 1 at ms 26
g. Node gets to ms 40, discards package 2 as unanswered and sends message ID 3
h. The ACK for message 1 arrives at the node at ms 48
i. At this point, if (device_id == DEVICE_ID) checks whether the ACK that just arrived (1) matches the current package (3). It does not, so the node prints to the screen:
Ack with wrong device_id
Adding extra time (like in point 2) would fix this issue, but it is only a temporary solution, since the error will reappear if the message takes more than 20 seconds (improbable). The full solution would require implementing a message retry process with a queue of messages that were sent and are waiting for an ACK.
There is a lot of documentation about these kinds of algorithms out there. It was not implemented in the original NanoGateway post since it is beyond the scope of the demonstration code.
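As a rough idea of what such a queue could look like, here is a minimal sketch that tracks sent messages by message ID, so a late ACK for message 1 is still matched even though message 3 has already gone out. PendingAcks and its method names are hypothetical and not part of the NanoGateway code; the caller supplies the current time and does the actual sending.

```python
class PendingAcks:
    """Tracks messages that were sent and are still waiting for an ACK."""

    def __init__(self, ack_timeout_s=20, max_retries=3):
        self.ack_timeout_s = ack_timeout_s
        self.max_retries = max_retries
        self._pending = {}   # msg_id -> [pkg, deadline, attempts]

    def sent(self, msg_id, pkg, now):
        """Record that pkg was just sent for the first time."""
        self._pending[msg_id] = [pkg, now + self.ack_timeout_s, 1]

    def acked(self, msg_id):
        """Return True if the ACK matches a message still waiting."""
        return self._pending.pop(msg_id, None) is not None

    def due(self, now):
        """Return (msg_id, pkg) pairs to resend; drop those out of retries."""
        resend = []
        for msg_id in list(self._pending):
            entry = self._pending[msg_id]
            if now < entry[1]:
                continue                     # still within its ACK window
            if entry[2] >= self.max_retries:
                del self._pending[msg_id]    # give up on this message
                continue
            entry[1] = now + self.ack_timeout_s
            entry[2] += 1
            resend.append((msg_id, entry[0]))
        return resend
```

The node would call sent() after each send, acked() with the ID carried in every received ACK, and due() periodically to find out what needs to be retransmitted.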
Regarding the OSError: [Errno 11] EAGAIN, I'll look into this, but I tried the original code posted in the blog with the new firmware and was not able to reproduce it.
You can see the default values:
lora.init(mode, *, frequency=868000000, tx_power=14, bandwidth=LoRa.BW_125KHZ, sf=7, preamble=8, coding_rate=LoRa.CODING_4_5, power_mode=LoRa.ALWAYS_ON, tx_iq=false, rx_iq=false, adr=false, public=true, tx_retries=1)
I see, this uses some default settings for LoRa (frequency, tx_power, bandwidth, sf, preamble, coding_rate). But what are the default values? Is there a way to get the default values programmatically?
@iber The overhead of SSL (handshake alone can be a few kb) probably makes it a bad candidate considering the duty cycle/time-on-air restrictions of LoRa.
I'd suggest using just a symmetric cipher like AES-128, much as LoRaWAN does.
Would it be possible to use SSL over LoRa, for example with ssl.wrap_socket, to secure it with a self-signed certificate?
It would be great if the RAM is not consumed if we do not use some modules like BT or Wifi or Lora as many applications may only use one of these radios. In my case I only use Lora.
We are checking what's happening with this nano-gateway example and we'll get back to you shortly. The memory error can be explained due to the newly added interrupt handler task which is using some of the GC memory and the Bluetooth system also consuming a big chunk. Espressif is due to provide some good memory optimizations to sort this out and give us back several tens of KBs of precious RAM.
I am getting the same error.
Hi, on the latest firmware version (1.2.0.b1) you get "MemoryError: parser could not allocate enough memory"
Pycom, can you please provide a working sample for the latest firmware version, or is this another bug?
I have been playing with it, actually receiving the packets on a Raspberry Pi with a Dragino Lora+GPS Hat and it works. (The "hat" is based on the SX1276).
There is a possibility of making an ugly hack in order to support multiple channels for a gateway. It wouldn't be a multi-channel gateway, but a channel-hop-capable gateway :)
The trick (I'll try next week) is to use the CAD functionality. Given the length of the prefix it may be possible to do something like this:
if CAD_DONE  // this means timeout
    next frequency in list
else
    receive packet
It may be a bit tight, but it should be possible to monitor three frequencies. I'll get back with the results.