Nanogateway Downlink Problems (TTN and loraserver)
I am using a LoPy as a nanogateway (I have also used a FiPy, with the same results), and I am able to upload messages from a Microchip RN2903 node through the nanogateway to TTN and loraserver. However, I can't receive downlink messages (neither automatic ACKs without payload in response to confirmed uplink messages, nor downlink messages with payload, nor even OTAA join responses), even though those downlink messages have been generated in the network servers.
The problem seems to be related to the nanogateway, because I have tried different network servers (loraserver.io and TTN) and different LANs. I have also tried a different gateway (a Multitech Conduit), and in that case the network works perfectly: I can use any server, any node, any join method, and send and receive uplink and downlink messages. The problem only appears when the nanogateway has to handle downlink messages.
I have seen in some other posts that some of you experienced sync problems with RX windows; however, in my case the downlink message never reaches the LoRa RF channel (I have confirmed this with a spectrum analyzer). It seems that the problem is in the communication between the gateway and the network server. I could not find much log information from TTN, but the logs of loraserver's lora-gateway-bridge suggest the nanogateway can't handle some PullACK messages:
Note: The boxed logs correspond to the transmission of a confirmed uplink message on frequency 902.3 MHz, with one retransmission because the ACK is not received by the RN2903.
Below is the screenshot from the loraserver web GUI for the same experiment (the last downlink is displayed in more detail) and the corresponding JSON of the messages shown on the screen:
With TTN it is the same thing, but I get more information from loraserver.
I would appreciate any help, because although I can use other gateways, the 1-channel nanogateway is also useful in some cases.
Thanks in advance
@gnm If you can build your fipy firmware, then you could make the changes I did to get a reliable timing. The diff is in this thread: https://github.com/pycom/pycom-micropython-sigfox/issues/141
It's just a few lines of code in three (for FiPy two) files. It comes down to calling a different internal function for the µs timer: call mp_hal_ticks_us() instead of system_get_rtc_time() in the files esp32/mods/modutime.c, drivers/sx127x/sx1276/sx1276.c and drivers/sx127x/sx1272/sx1272.c.
I also made small changes to the nanogateway script. There is a PR for that. https://github.com/robert-hh/pycom-libraries/blob/nanogateway/examples/lorawan-nano-gateway/nanogateway.py
Hi @robert-hh, thank you for the observation. You are right.
I noticed that I wasn't upgrading the firmware to the latest version, so I finally upgraded my FiPy to v1.17.0.b1 and changed some minor things. Using TTN, the timestamps are now right and the RF signals come in a reasonable time (~1 s for ACK responses to confirmed messages and ~5 s for OTAA join responses), but I am not using an oscilloscope, so I don't have precise timing. However, the downlink messages still do not line up with the node's Rx1 window (to be more precise, the node never receives or understands the incoming response).
I will try to use a LoPy as the node and try to increase the duration of Rx1 (right now I am using a Microchip node that doesn't allow me to change the Rx1 duration). I hope the Pycom guys release a new and stable version as soon as possible.
Thanks for your help
@gnm The delay should be much shorter. It is 5 seconds for a join request/join response pair, and 1 second for an uplink message/downlink message pair. There is no need to synchronize times. That is all done in the gateway. When the gateway receives a message, it adds a time stamp (let's call it t) from its arbitrary clock and sends it to the TTN server. When the TTN server responds with a downlink message, it takes that time stamp and adds the intended transmission delay to it, which gives t+5 for a join response and t+1 for any other downlink message. The gateway then uses this time, which is based on its own clock, to determine when to forward the downlink message to the node. Thus, the only requirement is that the gateway's clock is precise enough over 1 or 5 seconds to hit the time window that the node opens for receiving. For the Pycom example, this is +/- 10 ms. I could not find any statement on how long this window is to be expected, other than that it must be long enough to receive 5 bytes of the message preamble, which depends on the data rate.
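The t+1 / t+5 scheme described above can be sketched in a few lines. This is only an illustration, not the code of TTN or of the nanogateway: the function name `downlink_tmst` is made up, and the 32-bit microsecond counter is an assumption.

```python
# Delays taken from the explanation above: 1 s for a normal downlink,
# 5 s for a join response. Values are in microseconds.
RX1_DELAY_US = 1000000
JOIN_RESPONSE_DELAY_US = 5000000

# Assumption: the gateway time stamp is a 32-bit microsecond counter.
TMST_MODULO = 2 ** 32


def downlink_tmst(uplink_tmst, is_join):
    """Return the gateway-clock time at which the downlink should be sent.

    uplink_tmst is the time stamp t the gateway attached to the uplink.
    Only the gateway's own clock is involved, so no wall-clock
    synchronization (NTP, GPS) is needed for this step.
    """
    delay = JOIN_RESPONSE_DELAY_US if is_join else RX1_DELAY_US
    return (uplink_tmst + delay) % TMST_MODULO


# Example: an uplink stamped at t = 1000 us.
print(downlink_tmst(1000, is_join=False))  # data downlink at t + 1 s
print(downlink_tmst(1000, is_join=True))   # join response at t + 5 s
```

The point of the sketch is that the server only echoes the gateway's own time stamp plus a fixed delay, so any drift of the gateway clock during that 1 or 5 seconds translates directly into a missed receive window.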
In firmware v1.17.0.b1 (and surely others too) this is not guaranteed, because the timer is derived from an RC-oscillator based clock. The patch I made just changes that to a crystal based oscillator.
The other problem is the timing calculation in the Pycom nanogateway example, which uses simple + and - instead of ticks_add() and ticks_diff(). The latter cope with the overflow condition of the timer, which happens ~35 minutes after booting. A fix is here: https://github.com/pycom/pycom-libraries/pull/54. After booting the gateway, that problem should not occur for 35 minutes.
Now I have tried connecting to the TTN server and I got the following error, which prevents the nanogateway from sending the downlink message:
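To show why plain + and - break at the overflow, here is a small emulation in ordinary Python. It is a sketch, not the firmware's implementation: on the device you would use utime.ticks_add() and utime.ticks_diff(), and the wrap period of 2**31 µs is an assumption inferred from the ~35 minutes mentioned above.

```python
# Assumption: the microsecond counter wraps after 2**31 us (~35.8 minutes).
TICKS_PERIOD = 2 ** 31


def ticks_add(ticks, delta):
    """Add delta to a tick value, wrapping at the counter period."""
    return (ticks + delta) % TICKS_PERIOD


def ticks_diff(end, start):
    """Signed difference end - start that stays correct across one wrap."""
    half = TICKS_PERIOD // 2
    return ((end - start + half) % TICKS_PERIOD) - half


# A downlink is scheduled 1 s ahead, 0.2 s before the counter wraps.
start = TICKS_PERIOD - 200000
deadline = ticks_add(start, 1000000)

naive_wait = deadline - start           # plain subtraction: large negative value
safe_wait = ticks_diff(deadline, start)  # wrap-aware: the intended 1 s

print(naive_wait, safe_wait)
```

With plain subtraction the gateway would conclude that the deadline is long past and drop or mistime the packet, while the wrap-aware difference still yields the intended one second.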
UDP recv Exception: overflow converting long int to machine word
Hi @robert-hh ,
I have upgraded the firmware and forced the nanogateway code to transmit downlink messages even for a bad timestamp, and now I am able to see the response in the spectrum analyzer.
The problem I am facing is that the delay is tremendously high (in the order of tens of seconds, so I don't need the oscilloscope to see it :)). I can change the delay by changing the value in the timestamp (hardcoding it), but it is hard to get it right. I guess this may be related to the fact that I can't connect to the NTP server (blocked ports), while the Multitech gateway (the one working properly) can obtain the time from GPS. I will try to use a Pytrack board to obtain the time reference and see if that improves the performance. Even so, it is possible that I will face the same sync problem you faced, so I will have to learn how to compile the code from the repo and load it into my Pycom device, just the way you did.
@robert-hh I am pretty sure. What is more, I was able to catch the sent messages using different SFs (even SF7), and also the received messages when using my other gateway (Multitech). However, one can never be completely sure, and I will try again if I can get hold of the spectrum analyzer again.
Thanks for your help.
P.S.: Nice trick to use the oscilloscope for debugging the timing issue
@gnm Are you sure that the spectrum analyzer, set to the full bandwidth of the US band, catches the ~50 ms of RF signal when the message is sent?
Last week I used a spectrum analyzer to compare antennas, and for that test I've set it to zero span and triggered it by a GPIO pulse created by the xxPy device, which gave a good steady pulse.
My experience was that if the xxPy device says it's sending, it really does.
For my test on the gateway timing I used a normal low bandwidth oscilloscope and a 1N4148 diode to detect the presence of an RF signal.
Edit: I just tried the spectrum analyzer in frequency domain mode with a trigger signal. It works. You just have to allow for the fact that it takes about 6 ms between a GPIO pulse just before the send command and the actual start of RF. You have to set the start frequency of the sweep sufficiently low, depending on the sweep time and the frequency range. So if, e.g., the sweep time is 20 ms and the range is 20 MHz, i.e. 1 MHz/ms, then the start frequency should be at least 6 MHz lower than the signal you want to catch.
P.S.: Sorry if I'm stating the obvious
@robert-hh Yes, I am sure. That's why I associated the following error message with the nanogateway having a problem understanding the downlink message from the network server:
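The sweep arithmetic above can be written down as a tiny helper. This is just a sketch of the calculation; the function name is made up, and 923.3 MHz is used only as an example signal frequency.

```python
def latest_sweep_start_mhz(signal_mhz, span_mhz, sweep_ms, trigger_to_rf_ms):
    """Highest start frequency at which a triggered sweep still reaches
    the signal no earlier than the RF actually starts.

    The sweep advances span_mhz / sweep_ms MHz per millisecond, so during
    the trigger-to-RF delay it has already moved that many MHz upwards.
    """
    sweep_rate_mhz_per_ms = span_mhz / sweep_ms
    return signal_mhz - sweep_rate_mhz_per_ms * trigger_to_rf_ms


# Numbers from the post: 20 MHz span, 20 ms sweep, ~6 ms trigger-to-RF delay.
start = latest_sweep_start_mhz(923.3, span_mhz=20, sweep_ms=20, trigger_to_rf_ms=6)
print(round(start, 1))
```

Any start frequency at or below the returned value keeps the signal inside the part of the sweep that happens after the RF has started, provided the signal also lasts long enough for the sweep to pass over it.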
lora-gateway-bridge: time="2018-03-09T14:12:50-03:00" level=error msg="gateway: could not handle packet: gateway: unknown packet type: PullACK" addr="192.168.0.8:49153" data_base64="AovCBDCupP/+KkzQeyJ0eHBrX2FjayI6IHsiZXJyb3IiOiAiVE9PX0xBVEUifX0="
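The data_base64 field of that log line can be decoded to see what the nanogateway actually sent. The sketch below assumes the usual layout of the Semtech UDP packet forwarder protocol (1 byte version, 2 bytes token, 1 byte identifier, 8 bytes gateway ID, then JSON); the identifier table reflects my reading of that protocol, and the helper name is made up.

```python
import base64
import json

# Packet identifiers as I understand the Semtech UDP protocol (assumption).
PACKET_TYPES = {
    0: "PUSH_DATA", 1: "PUSH_ACK", 2: "PULL_DATA",
    3: "PULL_RESP", 4: "PULL_ACK", 5: "TX_ACK",
}


def decode_datagram(data_base64):
    """Split a gateway-originated datagram into header fields and JSON body."""
    raw = base64.b64decode(data_base64)
    body = json.loads(raw[12:].decode("utf-8")) if len(raw) > 12 else None
    return {
        "version": raw[0],
        "token": raw[1:3].hex(),
        "identifier": raw[3],
        "type": PACKET_TYPES.get(raw[3], "unknown"),
        "gateway_id": raw[4:12].hex(),
        "body": body,
    }


# Payload copied from the lora-gateway-bridge log line above.
packet = decode_datagram(
    "AovCBDCupP/+KkzQeyJ0eHBrX2FjayI6IHsiZXJyb3IiOiAiVE9PX0xBVEUifX0="
)
print(packet)
```

Decoded this way, the datagram carries identifier 4 together with a txpk_ack body whose error is TOO_LATE. That would explain why the bridge complains about the packet type, and the TOO_LATE error suggests the gateway considered the requested transmit time already past, which points at timing rather than at the network link.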
I will not be able to use the spectrum analyzer again until Monday or Tuesday, but I double and triple checked it, and I was observing the entire bandwidth of the 915 band.
It is good to know that the messages I see in the nanogateway logs mean that the downlink message properly reached the gateway, and that those error logs in the bridge are not very important.
I will try again next week with the spectrum analyzer and I will let you know.
@gnm You should see RF activity for downlink messages. I just noticed the frequency values in the logs. When receiving, the frequency is reported as 902.3 MHz; when sending, it is 923.3 MHz. That's according to the LoRa spec. Are you sure you tuned your spectrum analyzer to the right frequency?
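For reference, the uplink/downlink pairing can be computed. The sketch below follows my understanding of the US902-928 channel plan (64 uplink channels of 125 kHz starting at 902.3 MHz in 200 kHz steps, 8 downlink channels starting at 923.3 MHz in 600 kHz steps, RX1 channel = uplink channel modulo 8); treat those numbers as assumptions and check them against the regional parameters document. The function name is made up.

```python
def us915_rx1_downlink_khz(uplink_khz):
    """RX1 downlink frequency (kHz) for a 125 kHz US915 uplink channel.

    Frequencies are handled as integer kHz to avoid floating point issues.
    """
    offset = uplink_khz - 902300
    if offset < 0 or offset % 200 != 0 or offset // 200 > 63:
        raise ValueError("not a 125 kHz US915 uplink channel")
    channel = offset // 200
    # The 8 downlink channels are reused by every group of 8 uplink channels.
    return 923300 + 600 * (channel % 8)


# The uplink from the logs: 902.3 MHz.
print(us915_rx1_downlink_khz(902300))
```

For the 902.3 MHz uplink in the logs this gives 923.3 MHz, matching the downlink frequency seen there, so that is where the spectrum analyzer would need to look.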
Hi @robert-hh, thanks for your quick response.
I am a bit confused, so let me ask you something:
When you talk about the timing requirements, are you talking about the synchronization of the downlink LoRa packet from the gateway to the node, which can't fit inside the RX1 window? Because I think my problem comes before that, since I don't see any RF activity from the nanogateway (I am using a spectrum analyzer and I can see this activity with the Conduit gateway, for example, but not with the nanogateway). Maybe if I solve this, I will face the timing issue next.
Maybe it helps if I also post the logs from the nanogateway when sending 3 uplink messages (first an unconfirmed "04"-payload message, followed by a confirmed "05"-payload message, and finally a retransmission of the last message because the node never received the ACK):
Thanks a lot again for your help! I will check in https://github.com/pycom/pycom-micropython-sigfox/issues/141
@gnm The xxPy modules with the nanogateway server do not meet the timing requirements of the TTN network. I have modified the firmware to improve that, so in my set-up it is mostly reliable. Pycom should publish the fix soon. See this issue report: https://github.com/pycom/pycom-micropython-sigfox/issues/141