FiPy & Chirpstack advice
Looking for some expert advice here.
First, a little background. We're piloting a smart building setup with a large multisport facility here in the US, and LoRa came up pretty early in discussions about a potential solution for our needs. We will have about 100 nodes at a facility that requires a maximum of about 500 meters line of sight. LoRa thus seems like a good option.
A requirement however is that we run a "private" network. Enter Chirpstack.
We got a whole bunch of different LoRa devboards (FiPy, Waveshare's SX1262, etc.) as well as a RAK7244 developer gateway with a Pi4 and RAK2455.
The Waveshare devices were tested in a point-to-point US915 setup (they don't support LoRaWAN). With some cheap external antennas, I was seeing about 800m pretty reliably through a dense residential area. The range was fine for our needs.
What really impressed me though was that we had them sending 300-350 byte payloads every second for weeks straight. Way more than would be necessary in production and pretty impressive.
Why does LoRaWAN have otherwise pretty strict duty cycle limits when raw LoRa does not? Isn't it exactly the same airwaves?
If we establish our own Chirpstack based gateway in a private network, are we subject to the same duty cycle limits as say, TTN?
Why do I feel like my join & payload reliability between my FiPy and Chirpstack on RAK7244 gateway is SO unreliable and slow?
Am I missing something here in the way we configured the gateway and FiPy, such that I can't reproduce the speed and reliability of the point-to-point Waveshare devices on the same spectrum? Or is this just a global limitation when you jump to LoRaWAN, regardless of whether or not you're using a shared network like TTN?
@jcaron Ha! Glad I asked for more advice. Thank you!
I’ll add one more perplexing thing about the Waveshare experiment.
That’s what we’ve tested with in P2P mode. I didn’t test it myself but apparently they have a mesh method as well. Getting closer to the WAN part.
Here’s the kicker though, they’re running on the Semtech SX1262 which is most definitely a LoRa chip.
@barryjump I'm more used to the constraints of the EU region, but I guess the end-result is somewhat similar: bi-directional and LoRaWAN rarely go hand in hand. The reason for that is that the gateway is subject to the same limits as any other node, so it needs to share what it is allowed to send between all the nodes it talks to. So you can split the 96.8 bytes/s computed above by the number of nodes. If you send a few bytes a day to each node that's fine. If you start talking to them often, not so much.
In the EU region, it is frequent for downlinks to be sent on high SFs (so in theory you get the largest coverage), which limits A LOT the number of downlinks. Not sure what the rules and practices are in the US region.
Also, remember that a LoRaWAN class A device only listens for downlinks (packets from the network to the node) during two very short windows after each transmission (1s and 2s after, by default), and then goes back to sleep and doesn't listen anymore until its next uplink. It is useful to feed back configuration changes and the like, not so much to send real-time traffic to the node. You could switch to class B or C, but that increases power consumption. A lot. We're talking moving from a device which will draw a few tens of µA in deep sleep to a device which will draw over 10 mA listening. That can cut your battery life a thousandfold in the worst case, which is why this is usually only used on externally-powered devices.
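To put rough numbers on that, here is a back-of-the-envelope sketch. The 2400 mAh battery capacity and the exact current figures (20 µA sleeping, 10 mA listening) are assumptions picked for illustration, and it ignores transmit energy and battery self-discharge entirely, so treat the output as an order of magnitude only.

```python
# Rough battery-life comparison: a node that mostly deep-sleeps (class A style)
# versus one that keeps its receiver on all the time (class C style).
# All figures below are illustrative assumptions, not measurements.

HOURS_PER_YEAR = 24 * 365


def battery_life_years(capacity_mah, avg_current_ma):
    """Idealised battery life for a constant average current draw."""
    return capacity_mah / avg_current_ma / HOURS_PER_YEAR


CAPACITY_MAH = 2400.0       # assumed battery capacity
SLEEP_CURRENT_MA = 0.020    # assumed 20 uA ("a few tens of uA" in deep sleep)
LISTEN_CURRENT_MA = 10.0    # assumed 10 mA with the receiver on

class_a_years = battery_life_years(CAPACITY_MAH, SLEEP_CURRENT_MA)
class_c_years = battery_life_years(CAPACITY_MAH, LISTEN_CURRENT_MA)
ratio = class_a_years / class_c_years

print("sleep-dominated:  %.1f years" % class_a_years)
print("always listening: %.1f days" % (class_c_years * 365))
print("ratio:            %.0fx" % ratio)
```

With these particular assumptions the ratio is a few hundred; a lower sleep current pushes it toward the thousandfold figure.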
Also reliable means acknowledgements, which usually means additional airtime (so in practice, less "useful" traffic) and adds latency in case of lost frames.
All of this is the reason behind the very low limits on downlinks and confirmed packets on TTN (or similarly, on Sigfox which is subject to the same rules). Of course, it's a shared network so they are even more conservative, but the reasoning stays the same.
Also, I don't know what "low latency" is for you, but LoRaWAN frames routinely take tens or even hundreds of milliseconds to send, and in case of a dropped packet, a retransmit will occur at best a bit over 2 seconds later.
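For a feel of those airtimes, here is a small sketch of the time-on-air formula from Semtech's SX127x datasheets. The 13 bytes of LoRaWAN overhead, the 8-symbol preamble, coding rate 4/5 and the 10-byte example payload are assumptions; an online airtime calculator is a good cross-check.

```python
import math

LORAWAN_OVERHEAD_BYTES = 13  # assumed: MHDR + DevAddr + FCtrl + FCnt + FPort + MIC


def lora_time_on_air_ms(phy_payload_bytes, sf, bw_hz, preamble_symbols=8,
                        coding_rate=1, crc=True, explicit_header=True,
                        low_dr_optimize=False):
    """Time on air in ms for one LoRa frame.

    coding_rate is 1..4, meaning 4/5..4/8.
    """
    t_sym_ms = (2 ** sf) / bw_hz * 1000.0
    t_preamble_ms = (preamble_symbols + 4.25) * t_sym_ms
    numerator = (8 * phy_payload_bytes - 4 * sf + 28
                 + (16 if crc else 0)
                 - (0 if explicit_header else 20))
    denominator = 4 * (sf - (2 if low_dr_optimize else 0))
    payload_symbols = 8 + max(
        math.ceil(numerator / denominator) * (coding_rate + 4), 0)
    return t_preamble_ms + payload_symbols * t_sym_ms


app_payload = 10  # hypothetical sensor reading size in bytes
toa_sf7 = lora_time_on_air_ms(app_payload + LORAWAN_OVERHEAD_BYTES, 7, 125000)
toa_sf10 = lora_time_on_air_ms(app_payload + LORAWAN_OVERHEAD_BYTES, 10, 125000)

print("SF7/125 kHz:  %.1f ms" % toa_sf7)
print("SF10/125 kHz: %.1f ms" % toa_sf10)
```

Even a tiny payload lands in the tens of milliseconds at SF7 and in the hundreds at SF10, before any retransmission delay.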
If those are not issues for you, having multiple gateways, each on its own set of channels, may be a good option.
In my experience, LoRaWAN has some very interesting use cases, especially in metering from remote outdoor locations, or non time-sensitive environmental monitoring. As soon as you get into time and reliability constraints, that's another story.
I'm not quite sure what protocol would be more adapted here. I would probably go with Zigbee or some other 802.15.4 tech (Thread?), but that will probably require more infrastructure (given the distances), and software-wise it tends to be quite a bit more complex (at least for Zigbee), though you can get quite a few cheap off-the-shelf sensors for many use cases (and some are really, really power efficient — some Zigbee switches, those implementing Zigbee Green Power, do not even need a battery! They get enough power from you pressing the button to send a frame). Not quite sure what the status of BLE mesh is.
In your scenario, it could even be possible to use Wi-Fi for some of the battery-powered nodes. Wake up, connect, send data, wait for ACK, go back to sleep. There are a few gotchas to watch out for to make sure the node stays awake as little as possible, but if you don't send too often it can work, though given the distances you'll probably need quite a few APs for good coverage. But it doesn't solve the issue of bi-directional traffic if you need nodes to be able to receive downlinks at any time.
Maybe others will have better ideas (or different experiences with LoRaWAN).
@jcaron thank you for the exceptional response!
Great observation on the Waveshare devices, I think your guess is correct about their "forgetting" regulatory limits and assuming you will use them properly.
In case you don't mind being even more useful than you've already been:
The initial specifications for the pilot I described above are:
- All stationary nodes
- Lots of environmental nodes
(i.e. temp, humidity, the usual. Expect 48 payloads per day.)
- Lots and lots of open/close nodes
(i.e. doors, windows, gates. Expect <100 payloads per node per day)
- A few energy monitors
(this one is a bit trickier because high resolution / high frequency is desirable)
- A few modbus/lora bridges
(also tricky for the same reason, in addition to the desire for low latency downlinks for remote user control)
I imagine the higher resolution sensors would likely be on mains power so moving them to a wired or even Wi-Fi connection would be okay. The most critical part is reliable bi-directional connections with the few nodes that do need to be wireless. From what I'm beginning to understand, this may be the biggest problem with LoRa. Which is a little disappointing because the Waveshare experience got me very excited. Any thoughts?
@barryjump Don't know what modulations the Waveshare boards use, but LoRa modulations can be extreeeeeeemely slow. So if you have one sending at 100 kbits/s and the other at 980 bits/s, with the same duty cycle/dwell time one can send 100 times more than the other. Since one is not allowed to send for longer than 400 ms per 20 seconds per channel in the US, that dictates how much data you can send.
Parameters which have an effect on the bitrate include the spreading factor (SF) and bandwidth. In the US, LoRaWAN can range from 980 bits/s (SF12/500 kHz or SF10/125 kHz) to 21900 bits/s (SF7/500 kHz). The Waveshare SX1262 hat seems to indicate "airspeeds" of up to 62.5 kbit/s which does not match any LoRa modulation, so they must use different modulations.
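As a quick illustration of how SF and bandwidth drive the bitrate, here is a sketch using the commonly quoted nominal LoRa bit rate formula, assuming coding rate 4/5. The function name is made up for this example, and published regional tables give rounded indicative figures, so not every table entry will match the formula exactly.

```python
def lora_bitrate_bps(sf, bw_hz, coding_rate=4 / 5):
    """Nominal LoRa bit rate: SF bits per symbol, 2**SF chips per symbol."""
    symbol_rate = bw_hz / (2 ** sf)
    return sf * symbol_rate * coding_rate


# A few SF/bandwidth combinations mentioned in this thread.
for sf, bw_hz in [(10, 125000), (7, 125000), (7, 500000)]:
    rate = lora_bitrate_bps(sf, bw_hz)
    print("SF%d / %d kHz: about %.0f bit/s" % (sf, bw_hz // 1000, rate))
```

Each step up in SF roughly halves the bitrate, and each doubling of bandwidth doubles it.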
IIRC there are also different regulations based on the type of modulation or spread spectrum technique. And it's of course possible that some equipment will just "forget" to meet regulatory limits, counting on you to observe them.
So the exact same band can lead to very different results. It's like Wi-Fi: you have devices which only go up to 2 Mbit/s (the original 802.11), others 11 (802.11b), 54 (802.11g), hundreds or even thousands (802.11n, ac, ax...). Not all cars on the same highway need to travel at the same speed.
There are regional limits you have to obey no matter what, but IIRC TTN's limits are lower than the actual limits. If you operate your own network, you need to obey the FCC limits, but not those of TTN.
Make sure you use the fastest modulations you can (lowest SF numbers); given the distances involved, they should be more appropriate. The largest bandwidth should also be good, but the layout of the frequencies and the capabilities of the gateways make it difficult to use several 500 kHz channels, I believe. At SF7/125 kHz you get 5470 bits/s, which means about 400 ms per 242-byte frame including overhead. You can only send one such frame per 20 seconds per channel, so using 8 channels that would get you 96.8 bytes/s.
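The arithmetic behind those figures can be sketched as follows. The 13 bytes of LoRaWAN overhead is an assumption, and preamble time is ignored, so the airtime comes out slightly under the 400 ms limit.

```python
# Per-device throughput ceiling at SF7/125 kHz under the US dwell-time rule
# (one frame of at most 400 ms per 20 s per channel), using the figures above.

BITRATE_BPS = 5470.0            # SF7 / 125 kHz
PAYLOAD_BYTES = 242             # application payload per frame
OVERHEAD_BYTES = 13             # assumed LoRaWAN header + MIC overhead
DWELL_PERIOD_S = 20.0           # one frame per channel per 20 s
CHANNELS = 8                    # channels a typical gateway listens on

airtime_s = (PAYLOAD_BYTES + OVERHEAD_BYTES) * 8 / BITRATE_BPS
per_channel_bytes_per_s = PAYLOAD_BYTES / DWELL_PERIOD_S
total_bytes_per_s = per_channel_bytes_per_s * CHANNELS

print("frame airtime:    %.0f ms" % (airtime_s * 1000))
print("per channel:      %.1f bytes/s" % per_channel_bytes_per_s)
print("over %d channels:  %.1f bytes/s" % (CHANNELS, total_bytes_per_s))
```

Compare that with the 300-350 bytes every second from the point-to-point test to see how far apart the two setups are.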
The US region has many many channels (72). But most gateways will only listen on 8 or 9 of them. You need to make sure your device and gateway are set to use the same channels, otherwise it will indeed give you awful performance with up to nearly 90% of frames being lost. In theory you only need to do that on the LoPy initially to achieve a quick join, and the network should then send the "right" list of channels to the LoPy (don't remember if that happens on join accept or in a subsequent MAC command in a downlink), but it requires your network and gateway to be properly configured for that to happen.
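Here is a sketch of working out which channels to keep on the node. It assumes the gateway listens on one standard US915 sub-band (eight 125 kHz channels plus the matching 500 kHz channel); sub-band 2 is only an example, so check what your gateway and ChirpStack are actually configured for. The helper names are made up for this example. On a Pycom board the resulting list would typically be passed, one index at a time, to `lora.remove_channel()` before joining; that call is mentioned from memory and is not exercised here.

```python
TOTAL_UPLINK_CHANNELS = 72  # 64 x 125 kHz (0-63) + 8 x 500 kHz (64-71)


def sub_band_channels(sub_band):
    """Uplink channel indices for US915 sub-band 1..8."""
    if not 1 <= sub_band <= 8:
        raise ValueError("sub_band must be between 1 and 8")
    first = (sub_band - 1) * 8
    return list(range(first, first + 8)) + [64 + sub_band - 1]


def channels_to_remove(sub_band):
    """Every channel index the node should stop using."""
    keep = set(sub_band_channels(sub_band))
    return [ch for ch in range(TOTAL_UPLINK_CHANNELS) if ch not in keep]


GATEWAY_SUB_BAND = 2  # assumption: adjust to match your gateway

keep = sub_band_channels(GATEWAY_SUB_BAND)
remove = channels_to_remove(GATEWAY_SUB_BAND)

# If the node hops over all 72 channels, only this share is ever heard.
heard_fraction = len(keep) / TOTAL_UPLINK_CHANNELS

print("keep:", keep)
print("remove %d channels" % len(remove))
print("frames lost without a channel mask: %.1f%%" % ((1 - heard_fraction) * 100))
```

That loss figure alone would explain slow, flaky joins if the FiPy is still hopping across every channel.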
Note that having 100 devices each sending at the maximum possible throughput is not quite a scenario LoRaWAN is designed for. If you use a single gateway (and thus a single 8 or 9 channel set), you'll probably be hit with so many collisions it won't be funny.
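To give a rough idea of how bad collisions can get, here is a sketch using the textbook pure ALOHA model, where the chance that a frame sees no overlap is exp(-2G) for an offered load G per channel. This is a deliberate simplification: it ignores the capture effect, differing SFs and anything the network server does, and the "light" scenario (a 60 ms frame every 30 minutes) is a hypothetical for contrast.

```python
import math


def aloha_success_probability(devices, airtime_s, interval_s, channels):
    """Pure ALOHA estimate of the chance a frame is not hit by another.

    interval_s is the time between two frames from the same device,
    with frames spread evenly over the given number of channels.
    """
    offered_load = devices * airtime_s / interval_s / channels
    return math.exp(-2.0 * offered_load)


# Worst case: 100 devices, each sending a 400 ms frame on each of
# 8 channels every 20 s, i.e. one frame every 2.5 s per device.
p_saturated = aloha_success_probability(100, 0.4, 2.5, 8)

# Hypothetical light load: 100 devices, 60 ms frame every 30 minutes.
p_light = aloha_success_probability(100, 0.06, 1800.0, 8)

print("saturated: %.1f%% of frames survive" % (p_saturated * 100))
print("light:     %.2f%% of frames survive" % (p_light * 100))
```

The exact numbers should not be trusted, but the gap between the two cases is the point.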
How much data do you actually need to send, at what frequency? What are your QoS requirements (latency, packet loss...)? What's the geographical distribution of the devices in the facility? What power requirements do you have (battery powered or external power source)? How mobile are the nodes? I'm not sure LoRaWAN is your best bet in this environment (though I'm not quite sure what alternative could be suggested here).