Multiple UDP sockets from multiple Wipy3 units work on legacy 1.20.0r13 and NOT on pybytes 1.20.2.r4
I am using Wipy3 devices to send data wirelessly at 100 Hz over UDP sockets; the receiver and controller routine runs on a host PC. The Wipy3 units join a WLAN, and the PC is connected to the wifi router.
Up to now I was using the legacy (not pybytes) firmware version 1.20.0.r13 (the last legacy release), and since it is no longer supported, I decided to migrate to pybytes 1.20.2.r4.
I needed to make some adjustments to the wipy code, already reported on the forum, such as the problem with main() and _thread, solved either with a 1 ms sleep in the main while loop or by replacing the main with a new "main" _thread that contains the typical while(True) (reported in https://forum.pycom.io/topic/5805/main-thread-blocks-auxiliary-threads-_thread-module/6 ). After all the changes I NOW have partially functional code for both firmware versions.
I don't say fully functional, because on both firmwares the code now loads and data transmission starts, and with just a single wipy node transmitting everything works fine. Both firmwares show the same correct behaviour with one transmitting node.
The problem is with 2 or more nodes.
When I switch on 2 nodes on legacy 1.20.0.r13, everything works correctly, and the PC records the received data for each node (I even tested successfully with up to 8 nodes).
BUT with pybytes 1.20.2.r4, both nodes start transmitting correctly, but after a while (sometimes 5 seconds, sometimes 1 minute) the nodes start to give
[errno 12] ENOMEM
exception messages when trying to send data with the socket sendto function.
sent = socketUDP.sendto(data, addr)
As I said, it only happens with 2 or more nodes, and not always at the same time after start, and it looks to me like an issue with collisions. I am not sure what has changed in the lower-level library routines for UDP sockets, but it is surely something that affects UDP sockets.
To those who would advise me to switch to TCP sockets: my application does not need ordered and acknowledged packets, and I prefer to stay with UDP because of timing constraints. My question is aimed more at the developers, in case a firmware upgrade has introduced a bug in collision avoidance, or at anyone who has experienced the same thing I report.
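While the cause is being investigated, one way to keep a 100 Hz sender running is to treat ENOMEM from sendto as a transient condition and drop or briefly retry the frame instead of letting the exception propagate. This is a minimal sketch, assuming the error arrives as an OSError whose first argument is the error number 12 (as in the message above); `send_frame`, `retries` and `backoff_s` are hypothetical names, not part of the firmware API.

```python
import time

ENOMEM = 12  # errno value reported in the exception above


def send_frame(sock, data, addr, retries=2, backoff_s=0.002):
    """Try to send one UDP frame; return True if sent, False if dropped.

    ENOMEM is treated as a transient condition: wait briefly and retry a
    few times, then drop the frame. Any other error is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            sock.sendto(data, addr)
            return True
        except OSError as exc:
            if exc.args and exc.args[0] == ENOMEM:
                if attempt < retries:
                    time.sleep(backoff_s)  # give the stack time to free buffers
                continue
            raise
    return False
```

Dropping a frame fits an application that does not need confirmed delivery; the return value can be counted to see how often the error occurs.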
Thanks to all!
I found the core of the problem and the solution.
The problem comes from the fact that I was broadcasting the frames from the wipys to the PC. I did so because I can also have 2 PCs, one as a logger and the other as a controller that takes the measurements from the wipys, and the two PCs must be separate due to the project requirements.
Broadcasting to the 255.255.255.255 address works DIRECTLY on legacy 1.20.0r13, BUT not on pybytes 1.20.2.r4
I removed broadcasting, giving as socket_addr just the logger PC address and port:
socket_addr = (PC_IP_Address, Socket_port)
This way, on both firmware versions, the transmission works correctly!!!
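Since the reason for broadcasting was to reach two PCs, the same result can be obtained without broadcast by sending each frame to an explicit list of receivers. This is a minimal sketch; the addresses, the port and the `send_to_all` helper are placeholders for illustration, not values from this set-up.

```python
# Placeholder addresses: replace with the real logger and controller PCs.
RECEIVERS = [
    ("192.168.1.10", 5005),  # logger PC (assumed address)
    ("192.168.1.11", 5005),  # controller PC (assumed address)
]


def send_to_all(sock, data, receivers=RECEIVERS):
    """Send the same UDP frame to every receiver; return how many sends succeeded."""
    sent = 0
    for addr in receivers:
        try:
            sock.sendto(data, addr)
            sent += 1
        except OSError:
            pass  # skip this receiver for this frame, keep the others going
    return sent
```

Each sample is then transmitted once per receiver, so with two PCs every node sends twice as many frames as with a single unicast destination.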
I looked for more information on broadcasting problems with UDP sockets and micropython.
Initially it seems that for python:
socketUDP.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
should enable broadcast if it was disabled. However, this option is NOT implemented for micropython. I found a post where they discussed this problem, and they said that adding
socketUDP.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
solves the problem for broadcasting... and IT WORKS
I also tried this:
socketUDP.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 0)
which should mean "don't reuse" the address, and which I thought was the default when not set, and it also works, even better...
CONCLUSION: For broadcasting with UDP sockets, the socket option SO_REUSEADDR must be set explicitly.
My working socket set-up is:
# Set-up UDP socket
socketUDP = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
socketUDP.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 0)
socketUDP.setblocking(False) # Non-blocking socket
socketUDP.settimeout(1.0) # timeout every 1 second
t0 = chrono.read()
I post this as the solution, since someone else may have the same problem I had.
Thanks to all who replied to me and tried to help!
@jcaron Dear jcaron
Thanks for your help.
I had also been looking into the things you propose. Packet size is small (32 bytes) and constant (a frame with data from 6 sensors and a counter). I was also able to estimate latency on the receiving side, and it is very stable on the legacy version.
The sync routine I employed is based on both Timer and RTC (I added an xtal to the wipy) timings.
I use a Timer Alarm to run the send routine, and since this Timer is not very stable, every 20 seconds I measure its deviation against the RTC count and correct/update the Timer period, in the microseconds range. It works fairly well (100 ms deviation in a 2-hour recording).
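The correction step described above comes down to a simple ratio. This is a minimal sketch of the arithmetic, assuming the number of alarm ticks and the RTC-measured elapsed time over the check window are available; `corrected_period_us` and its parameters are illustrative names, not the actual routine.

```python
def corrected_period_us(set_period_us, ticks, rtc_elapsed_us, target_period_us=10000):
    """Return the Timer period to program so the real period hits the target.

    set_period_us   : period currently programmed into the Timer alarm
    ticks           : alarm callbacks counted during the check window
    rtc_elapsed_us  : window length measured with the RTC (reference clock)
    target_period_us: wanted real period (10000 us for 100 Hz)
    """
    if ticks <= 0 or rtc_elapsed_us <= 0:
        return set_period_us  # nothing measured, keep the current setting
    real_period_us = rtc_elapsed_us / ticks   # what the Timer actually delivered
    scale = real_period_us / set_period_us    # Timer error factor
    return int(round(target_period_us / scale))  # compensate for the error


# Example: 2000 ticks took 20.004 s by the RTC, so the Timer runs slow.
new_period = corrected_period_us(10000, 2000, 20004000)  # 9998
```

In the example the Timer delivered 10002 us per tick instead of 10000 us, so the programmed period is shortened by about 2 us.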
Regarding other questions:
- I receive all packets before the errors
- Errors are not permanent; they appear and disappear
@alvaroav Retransmits do not happen at the UDP level, but much lower at the Wi-Fi level.
Maybe there's a limited size buffer for retransmits and you exceed its capacity?
When the errors start, are they permanent (i.e. each and every packet from that point will result in an error), or do they come and go (i.e. some packets go through and others not)?
Do you receive all the packets just before the error? Do you have a way to measure latency on packets received?
Can you ping the devices during that time, and make a note of when they start producing the errors compared to the pings? WiFi issues are usually visible by the increased latency due to the retransmits.
What size are your packets? Are they a constant size? Is the rate of the packets constant?
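On the receiving PC, gaps in the frame counter and the spacing between arrivals give a rough picture of loss and latency jitter. This is a minimal sketch, assuming each frame carries an incrementing counter that the receiver has already extracted (the frame layout is not specified in this thread); `LinkStats` is a hypothetical helper, and counter wrap-around is not handled.

```python
class LinkStats:
    """Track lost frames and inter-arrival spacing for one sending node."""

    def __init__(self):
        self.last_counter = None
        self.last_arrival = None
        self.received = 0
        self.lost = 0
        self.max_gap_s = 0.0

    def update(self, counter, arrival_s):
        """Record one frame: its counter value and its arrival time in seconds."""
        if self.last_counter is not None:
            missing = counter - self.last_counter - 1
            if missing > 0:
                self.lost += missing  # counters skipped, frames never arrived
            gap = arrival_s - self.last_arrival
            if gap > self.max_gap_s:
                self.max_gap_s = gap  # worst spacing seen so far
        self.last_counter = counter
        self.last_arrival = arrival_s
        self.received += 1
```

With one `LinkStats` per node, calling `update(counter, time.monotonic())` for every received frame shows whether the spacing grows beyond the nominal 10 ms of a 100 Hz stream before the errors start.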
@livius Thanks for your reply
I confirm that I have disabled pybytes for firmware pybytes 1.20.2.r4
Before asking on the forum I also looked into the causes of errno 12 ENOMEM: it means "out of memory", but I also found many posts linking this error with UDP sockets.
I perform a gc.collect once a second, since performing it on every data transmission strongly reduces the throughput.
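The once-a-second collection can be kept out of the per-frame path with a small rate limiter. This is a minimal sketch, not the code used on the nodes; `PeriodicGC` and its injectable `clock` are hypothetical, and on the device a finer-grained clock than `time.time` may be preferable.

```python
import gc
import time


class PeriodicGC:
    """Run gc.collect() at most once per interval, from inside the send loop."""

    def __init__(self, interval_s=1.0, clock=time.time):
        self.interval_s = interval_s
        self.clock = clock          # function returning the current time in seconds
        self.last_run = clock()
        self.runs = 0

    def poll(self):
        """Call once per transmitted frame; collects only when the interval has elapsed."""
        now = self.clock()
        if now - self.last_run >= self.interval_s:
            gc.collect()
            self.last_run = now
            self.runs += 1
            return True
        return False
```

Calling `poll()` after each sendto costs only a time read and a comparison on most iterations, so the throughput penalty of collecting on every transmission is avoided.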
However, if it is related to being out of memory, I cannot reconcile that with the fact that it works with a single node and fails with 2 or more. Moreover, when the error starts to appear, it appears on both nodes almost simultaneously, which is why I was thinking of some collision (network) problem.
The error means out of memory.
Some code is consuming memory and not releasing it.
But it is strange that it happens only with more nodes: since UDP has no acknowledgement, all nodes should be independent of each other.
So the problem should show up even with one node...
Did you disable pybytes?
import pycom
pycom.pybytes_on_boot(False)
Maybe some code is still doing something in the background? But this is only a guess...