Threads generate MemoryError when there's an empty while cycle (not necessarily infinite)



  • Greetings!

    I am not sure if this is a programming error on my part, or if this is a symptom of a frailty in the firmware/micropython implementation... Please help!

    I have a class which works with hardware buttons. Each object of this class has a pin with an assigned callback function "press" which deals with the button press: debounces, measures press duration to determine if it is a long press or short press, and executes new threads depending on the case.
    However, if I press the button whilst I am running any type of waiting cycle (an empty cycle where no code is executed) I get the following error.

    Unhandled exception in callback handler
    Traceback (most recent call last):
    File "<stdin>", line 55, in press
    MemoryError: memory allocation failed, allocating 5120 bytes
    

    The version of LoPy and Firmware I'm using is:

    >>> os.uname()
    (sysname='LoPy', nodename='LoPy', release='1.13.0.b1', version='v1.8.6-849-656704e on 2018-01-10', machine='LoPy with ESP32', lorawan='1.0.0')
    

    I have written a self-contained script which exemplifies the issue. If you run it as is, within 8 button presses it runs into a MemoryError. Here is the example output:

    ############# test_case.py Script #############
    Short button press
         Free memory:  43312
         Free memory after GC collect:  43312
    Short button press
         Free memory:  37760
         Free memory after GC collect:  37760
    Short button press
         Free memory:  32208
         Free memory after GC collect:  32208
    Short button press
         Free memory:  26656
         Free memory after GC collect:  26656
    Short button press
         Free memory:  21104
         Free memory after GC collect:  21104
    Short button press
         Free memory:  15552
         Free memory after GC collect:  15552
    Short button press
         Free memory:  10000
         Free memory after GC collect:  10000
    Short button press
         Free memory:  4448
         Free memory after GC collect:  4448
    Unhandled exception in callback handler
    Traceback (most recent call last):
      File "<stdin>", line 55, in press
    MemoryError: memory allocation failed, allocating 5120 bytes
    ###############   END  SCRIPT   ###############
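
A quick check of the numbers in the log above makes the leak easier to see. This sketch runs on desktop Python; the variable names are illustrative, and the readings are copied from the output, not measured again.

```python
# Free-memory readings copied from the test_case.py output above.
free_bytes = [43312, 37760, 32208, 26656, 21104, 15552, 10000, 4448]

# Bytes lost between consecutive button presses.
lost_per_press = [a - b for a, b in zip(free_bytes, free_bytes[1:])]
print(lost_per_press)

# Size of the allocation that finally fails, taken from the traceback.
failed_request = 5120

# True once the remaining heap is smaller than the failing request.
print(free_bytes[-1] < failed_request)
```

Every press costs the same 5552 bytes, and the error appears as soon as the remaining 4448 bytes cannot satisfy the 5120-byte request.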
    

    The source-code for the script and Button class is the following:

    #test_case.py - Systematically shows the MemoryError issue raised by threads
    
    print("############# test_case.py Script #############")
    
    import gc
    import pycom
    from time import sleep_ms, ticks_ms, ticks_diff
    from machine import Pin
    from _thread import start_new_thread
    
    class Button:
    
     def __init__(self, pid='P10', longms=1000,pull_ud = None):
       self.pressms = 0
       self.press_dur_ms = 0
       self.longms = longms
       self.pull_ud = pull_ud
       self.pin = Pin(pid, mode=Pin.IN, pull= pull_ud)#, pull=Pin.PULL_UP)
       self.pin.callback(Pin.IRQ_FALLING | Pin.IRQ_RISING, self.press)
    
     def long(self):
       pass
    
     def short(self):
       pass
    
     def press(self, pin):
       gc.collect()
       # If never pressed, store press time
       if self.pressms == 0: self.pressms = ticks_ms()
       else:
         # If pressed within 300 ms of first press, discard (button bounce)
         if ticks_diff(self.pressms, ticks_ms()) < 300: return
    
       self.press_dur_ms = ticks_ms()
       # Wait for value to stabilize for 40 ms
       while ticks_ms() < self.press_dur_ms + 40:
          #sleep_ms(1)
          if self.pin() == 1:
              return
    
       # Measure button press duration
       while self.pin() == 0:
           if(self.press_dur_ms + self.longms < ticks_ms() ):
               # Trigger long press
               gc.collect()##################
               start_new_thread(self.long, ())
               break
    
       # Trigger short press
       if(self.press_dur_ms + self.longms > ticks_ms() ):
           gc.collect()##################
    
           start_new_thread(self.short, ())
    
    ########## workarounds THAT WORK #########################################
           #self.short() # NOT USING THREADS. No memory build up
    
    # # # Directly REPLACE THE THREAD FOR THIS CODE
           # print('Short button press FROM INSIDE THE CLASS')
           # print("     Free memory: ", gc.mem_free() )
           # gc.collect()
           # print("     Free memory after GC collect: ", gc.mem_free() )
    # # # ################################################################
    
       while self.pin() == 0: pass
    
       self.pressms = 0
       gc.collect()
    
    ##################################################################
    
    gc.enable()
    
    exitMain = False
    
    def short_button():
       print('Short button press')
       print("     Free memory: ", gc.mem_free() )
       gc.collect()
       print("     Free memory after GC collect: ", gc.mem_free() )
       #machine.info()
       #info()
       return 0
    
    def long_button():
       print('Long button press')
       print("     Free memory: ", gc.mem_free() )
       gc.collect()
       print("     Free memory after GC collect: ", gc.mem_free() )
       return 0
    
    but = Button(pid='P15',longms=500)
    but.short = short_button
    but.long = long_button
    
    tempo = ticks_ms()
    time_inc = 2000
    
    # It makes no difference whether the loop is infinite or not!
    # could be "while True:" , it makes no difference
    while ticks_ms() < tempo + 10000: #runs for 10 seconds
    #########################################
       # SCENARIO 1 : No memory error
       #
       # sleep_ms(time_inc)
       #print ("TIME INSIDE WHILE: ", (ticks_ms()-tempo)/1000,"s")
    
    #########################################
       # SCENARIO 2 : Memory Error occurs
       #
       pass
    
    print("###############   END  SCRIPT   ###############")
    

    Please notice the while loop at the end of the code. That is where the action starts:

    • If the while loop is empty (pass), we get the MemoryError after X button presses, apparently because the threads started by the button presses never exit.
    • However, if the loop has something inside, even a simple print(), the program runs normally and there is no build-up of used memory.

    I tried two workarounds in the class Button implementation, which circumvent the MemoryError issue:

    1. Not using threads and calling the method directly, which is OK in this case where the routine is fast.
    2. Running the code I want to execute inline instead of calling the method.

    Both of these workarounds are OK in this simple case, but not in the grand scheme of things, where modularity and robustness are desired, especially if the ISR invokes a longer process that would be better run in a different thread.
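
One way to keep the callback short while staying modular is to have it only record the event and let the main loop run the handlers. This is a common pattern, not something proposed in this thread. The class `PressEvents` and its methods are hypothetical names, and the sketch uses no locking, so it assumes list appends and pops are safe between the callback and the main loop on the target.

```python
class PressEvents:
    """Collects press events from a callback and dispatches them later."""

    def __init__(self):
        self._pending = []   # events recorded by the callback
        self.handlers = {}   # maps an event kind to a function

    def post(self, kind):
        # Meant to be called from the pin callback: only records the event.
        self._pending.append(kind)

    def dispatch(self):
        # Meant to be called from the main loop: runs the handlers.
        handled = 0
        while self._pending:
            kind = self._pending.pop(0)
            handler = self.handlers.get(kind)
            if handler is not None:
                handler()
            handled += 1
        return handled
```

In the Button class, `press` would call `post('short')` or `post('long')` instead of `start_new_thread`, and the main loop would call `dispatch()` on each pass.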

    By the way, I noticed this issue when I was trying to use this Button class with BlynkLib, to allow data to be sent and controlled with button presses...

    Thanks in advance for all pointers to a possible solution.



  • @ea I've actually managed to get it working. However, if you check out the thread linked below, I've just run into a new issue: when using multiple pins, each with its own callback, only the callback of the first pin pressed ever runs, and only when that pin is pressed. Quite frustrating, and I'm unsure what the issue is.
    https://forum.pycom.io/topic/3173/multiple-pins-multiple-callbacks-not-working-urgent



  • @kbman99

    Hi! It's been a while since I solved it, so not all the details are still clear. But for the version I was using back then, the workaround was either not using threads in the IRQ routine or adding a short sleep pause within the thread that takes care of the button press. It's more or less what I outlined in the previous post.
    I have no idea whether the devs have sorted this issue out in the meantime. I haven't been developing with Pycom for a few months now. I'll get back to it later. Perhaps if you could provide more information, more people would be able to help.
    Cheers!



  • @ea Have you found any solution to this issue? I'm having a similar issue myself: when multiple threads are running and the button I have set up in main.py is pressed, I sometimes receive a memory error from within my BUTTON class.



  • @seb
    Thank you for being so helpful. I will continue developing and making do with the available solutions. I just hope that in a couple of months most firmware issues will be solved so we can move to production.
    Thanks for relaying this to the team.
    Regards,
    EA



  • @ea

    Hi,

    I just tested your code and was able to reproduce the bug. It seems that if the button is pressed while the main file is still executing, we get a memory leak just as you have shown. If it is pressed after main.py finishes execution and returns to the REPL, the memory usage does not increase. I will pass this on to the firmware team and report back with the result of their investigation.

    Regarding code protection, this is a feature we are actively developing and it is almost complete. Although I cannot provide an exact date, it will be coming soon.



  • @seb never mind my previous rant. I will just add the time.sleep_ms(1) and be done with it. Still, I would like to know why this is the behavior. Engineering shouldn't have mysteries...



  • Hi @seb

    Thank you for trying to help. I did a flash wipe like you recommended. My version is

    >>> os.uname()
    (sysname='LoPy', nodename='LoPy', release='1.15.0.b1', version='v1.8.6-849-baa8c33 on 2018-01-29', machine='LoPy with ESP32', lorawan='1.0.0')
    

    And the code I am running is, word for word, the following:

    from machine import Pin
    import _thread
    import gc
    import time
    
    class Button:
        def __init__(self,
                     pid,
                     pull_dir,
                     debounce_ms=300,
                     longpress_ms=1000,
                     threaded=False):
            self._start_time = None
            self.pull_dir = pull_dir
            self.debounce_dur = debounce_ms
            self.long_dur = longpress_ms
            self.threaded = threaded
    
            # Configure pin as input and set callback
            self.pin = Pin(pid, mode=Pin.IN, pull=pull_dir)
            self.pin.callback(Pin.IRQ_FALLING | Pin.IRQ_RISING, self._pin_handler)
    
        def _pin_handler(self, arg):
            # Get initial conditions
            self._start_time = time.ticks_ms()
            pin_state = self.pin.value()
            #print("SEB's pin handler")
            # Button is not being pressed, ignore it
            if ((self.pull_dir == Pin.PULL_UP and pin_state == 1) or
               (self.pull_dir == Pin.PULL_DOWN and pin_state == 0)):
                return
    
            # Time how long the pulse is
            duration = 0
            while self.pin.value() == pin_state:
                duration = time.ticks_ms() - self._start_time
    
            if duration < self.debounce_dur:
                return  # Pin value has not settled
            elif duration < self.long_dur:
                if not self.threaded:
                    self.short_press()
                else:
                    _thread.start_new_thread(self.short_press, ())
            else:
                if not self.threaded:
                    self.long_press()
                else:
                    _thread.start_new_thread(self.long_press, ())
    
        def short_press(self):
            raise NotImplementedError
    
        def long_press(self):
            raise NotImplementedError
    
    
    gc.enable()
    
    
    def short_button():
        print('Short button press')
        print("     Free memory: ", gc.mem_free())
        gc.collect()
        print("     Free memory after GC collect: ", gc.mem_free())
        return 0
    
    
    def long_button():
        print('Long button press')
        print("     Free memory: ", gc.mem_free())
        gc.collect()
        print("     Free memory after GC collect: ", gc.mem_free())
        return 0
    
    #'P21' is button D
    but = Button(pid='P10', pull_dir = Pin.PULL_UP, threaded=True)
    but.short_press = short_button
    but.long_press = long_button
    
    tempo = time.ticks_ms()
    time_inc = 2000
    
    # It makes no difference whether the loop is infinite or not!
    # could be "while True:" , it makes no difference
    while time.ticks_ms() < tempo + 10000: #runs for 10 seconds
    #########################################
        # SCENARIO 1 : No memory error
        #
        #time.sleep_ms(time_inc)
        #print ("TIME INSIDE WHILE: ", (time.ticks_ms()-tempo)/1000,"s")
    
    #########################################
        # SCENARIO 2 : Memory Error occurs
        #
        pass
    
    print("###############   END  SCRIPT   ###############")
    

    I am running nothing more but that.
    I get the following output

    >>>
    Short button press
         Free memory:  46592
         Free memory after GC collect:  46736
    Short button press
         Free memory:  41184
         Free memory after GC collect:  41200
    Short button press
         Free memory:  35648
         Free memory after GC collect:  35664
    Short button press
         Free memory:  30112
         Free memory after GC collect:  30112
    Short button press
         Free memory:  24560
         Free memory after GC collect:  24560
    Short button press
         Free memory:  19008
         Free memory after GC collect:  19008
    Short button press
         Free memory:  13456
         Free memory after GC collect:  13456
    Short button press
         Free memory:  7904
         Free memory after GC collect:  7904
    Short button press
         Free memory:  2352
         Free memory after GC collect:  2352
    Unhandled exception in callback handler
    Traceback (most recent call last):
      File "<stdin>", line 45, in _pin_handler
    MemoryError: memory allocation failed, allocating 5120 bytes
    ###############   END  SCRIPT   ###############
    >
    

    Of course, once the empty while loop has finished (after 10 seconds of execution), I get no issues:

    >>> Short button press
         Free memory:  46528
         Free memory after GC collect:  47472
    Short button press
         Free memory:  47440
         Free memory after GC collect:  47440
    Short button press
         Free memory:  47408
         Free memory after GC collect:  47440
    Short button press
         Free memory:  47408
         Free memory after GC collect:  47440
    Short button press
         Free memory:  47408
         Free memory after GC collect:  47440
    

    The only way to circumvent this issue is by passing threaded=True when creating an instance of Button, or putting something inside the while loop, not necessarily a sleep instruction... a simple print is enough, or a calculation...

    Are you sure you are pressing the button within the 10 seconds? Try with a longer empty while loop (or infinite) to see if it all pans out.

    Thank you for your assistance. This is such a small part of the whole project. I am thinking of ditching this platform as it is proving very glitchy and apparently buggy. I am starting to think I'd be better off using more standard C/C++ libs with MCUs well known and tested in the industry... I just can't afford to waste so much time solving small issues like this. I really hope it's just something on my part and not on firmware. Pycom promises a lot, and the concept is great, but for a commercial product where protecting the source-code and robustness of the system/code is important I'm starting to have doubts this is the right solution...



  • When I ran your version of the code, I was able to replicate your memory errors. But when I add your first block of code to the end of my version of the script, I get the following output (tested on multiple LoPys):

    MicroPython v1.8.6-849-baa8c33 on 2018-01-29; LoPy with ESP32
    Type "help()" for more information.
    >>> Short button press
         Free memory:  49744
         Free memory after GC collect:  49904
    Short button press
         Free memory:  49872
         Free memory after GC collect:  49888
    Short button press
         Free memory:  49856
         Free memory after GC collect:  49888
    

    Did you add any extra code, or was it just the code I posted above? Just to be certain, could you run the following to clear the flash of your device as well as get the firmware version, then reflash the device with the above code?

    import os
    print(os.uname())
    os.mkfs('/flash')
    

    As @incognico suggested, if you still get errors, try adding a loop with a sleep_ms(1) in it. It could be that going to sleep for a short period of time allows the main thread to yield and the garbage collector to run.
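
A minimal sketch of such a waiting loop follows. The helper names `now_ms`, `sleep_ms` and `wait_with_yield` are hypothetical. The fallbacks exist only so the loop can also be tried on desktop Python, and the plain subtraction ignores tick wraparound, as the loops in this thread do.

```python
import time


def now_ms():
    """Current time in milliseconds on MicroPython or desktop Python."""
    if hasattr(time, "ticks_ms"):
        return time.ticks_ms()              # MicroPython
    return int(time.monotonic() * 1000)     # desktop fallback


def sleep_ms(ms):
    """Sleep for ms milliseconds on MicroPython or desktop Python."""
    if hasattr(time, "sleep_ms"):
        time.sleep_ms(ms)                   # MicroPython
    else:
        time.sleep(ms / 1000)               # desktop fallback


def wait_with_yield(duration_ms, pause_ms=1):
    """Wait about duration_ms, pausing briefly on every pass.

    Returns the number of passes through the loop.
    """
    start = now_ms()
    passes = 0
    while now_ms() - start < duration_ms:
        sleep_ms(pause_ms)  # the short pause replaces the bare `pass`
        passes += 1
    return passes
```

Calling `wait_with_yield(10000)` at the end of the script would replace the empty 10-second loop.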



  • Hi @seb,

    I'm sorry to report that your code executes with the same behavior:

    • If I add the following code to the end of your script, it doesn't matter whether I pass threaded=True or threaded=False when I create the Button instance: we get a MemoryError.
    tempo = time.ticks_ms()
    
    while time.ticks_ms() < tempo + 10000: #runs for 10 seconds
        pass
    
    • However, if I add the following code to the end of your script, it works with both threaded=True and threaded=False when I create the Button instance. No MemoryError.
    tempo = time.ticks_ms()
    time_inc = 2000
    
    while time.ticks_ms() < tempo + 10000: #runs for 10 seconds
        time.sleep_ms(time_inc)
        print ("TIME INSIDE WHILE: ", (time.ticks_ms()-tempo)/1000,"s")
    


  • @ea

    Hi,

    I had a look over your code and have reworked some debounce code I wrote previously to include the same functionality. The way we handle interrupts in our firmware is as follows: when a pin change interrupt is triggered as you have configured it, it schedules a call to the callback function in a queue. While this callback executes, more calls to the same callback can be scheduled if the pin changes state again, as happens with button bounce. This can lead to an ever-increasing queue.
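
The queue growth described here can be illustrated with a toy model. It is not the firmware's actual scheduler: the function `max_backlog` and its parameters are hypothetical, and it assumes a single consumer that runs one callback at a time.

```python
from collections import deque


def max_backlog(edges_ms, handler_busy_ms):
    """Largest number of callbacks waiting while another one is running.

    edges_ms: sorted times (ms) at which the pin changes state.
    handler_busy_ms: how long each callback runs before it returns.
    """
    queue = deque()   # scheduled callbacks that have not started yet
    busy_until = 0    # time at which the running callback returns
    worst = 0
    for t in edges_ms:
        queue.append(t)  # every edge schedules one more callback
        # Start queued callbacks for as long as the consumer is free by time t.
        while queue and busy_until <= t:
            start = max(busy_until, queue.popleft())
            busy_until = start + handler_busy_ms
        worst = max(worst, len(queue))
    return worst


# Five bounce edges, 1 ms apart, with a callback that busy-waits for 50 ms.
print(max_backlog([0, 1, 2, 3, 4], 50))
# The same edges with a callback that returns immediately.
print(max_backlog([0, 1, 2, 3, 4], 0))
```

In this model a callback that blocks for longer than the bounce interval lets every further edge pile up behind it, while a callback that returns at once never builds a backlog.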

    With the code I have shared below and the latest firmware, I do not see the memory usage increasing with each thread. There is a slight increase the first time it fires, but it remains constant after that. Could you test it in your setup to see if you get the same results?

    from machine import Pin
    import _thread
    import gc
    import time
    
    
    class Button:
        def __init__(self,
                     pid,
                     pull_dir,
                     debounce_ms=300,
                     longpress_ms=1000,
                     threaded=False):
            self._start_time = None
            self.pull_dir = pull_dir
            self.debounce_dur = debounce_ms
            self.long_dur = longpress_ms
            self.threaded = threaded
    
            # Configure pin as input and set callback
            self.pin = Pin(pid, mode=Pin.IN, pull=pull_dir)
            self.pin.callback(Pin.IRQ_FALLING | Pin.IRQ_RISING, self._pin_handler)
    
        def _pin_handler(self, arg):
            # Get initial conditions
            self._start_time = time.ticks_ms()
            pin_state = self.pin.value()
    
            # Button is not being pressed, ignore it
            if ((self.pull_dir == Pin.PULL_UP and pin_state == 1) or
               (self.pull_dir == Pin.PULL_DOWN and pin_state == 0)):
                return
    
            # Time how long the pulse is
            duration = 0
            while self.pin.value() == pin_state:
                duration = time.ticks_ms() - self._start_time
    
            if duration < self.debounce_dur:
                return  # Pin value has not settled
            elif duration < self.long_dur:
                if not self.threaded:
                    self.short_press()
                else:
                    _thread.start_new_thread(self.short_press, ())
            else:
                if not self.threaded:
                    self.long_press()
                else:
                    _thread.start_new_thread(self.long_press, ())
    
        def short_press(self):
            raise NotImplementedError
    
        def long_press(self):
            raise NotImplementedError
    
    
    gc.enable()
    
    
    def short_button():
        print('Short button press')
        print("     Free memory: ", gc.mem_free())
        gc.collect()
        print("     Free memory after GC collect: ", gc.mem_free())
        return 0
    
    
    def long_button():
        print('Long button press')
        print("     Free memory: ", gc.mem_free())
        gc.collect()
        print("     Free memory after GC collect: ", gc.mem_free())
        return 0
    
    
    but = Button('P10', Pin.PULL_UP, threaded=True)
    but.short_press = short_button
    but.long_press = long_button
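
The duration checks in `_pin_handler` above can be pulled out into a pure function, which makes the thresholds easy to test off-device. The function name `classify_press` and its string return values are hypothetical. The defaults mirror `debounce_ms` and `longpress_ms` from the constructor.

```python
def classify_press(duration_ms, debounce_ms=300, longpress_ms=1000):
    """Mirror the duration checks in _pin_handler.

    Returns None for a bounce, 'short' for a short press, 'long' otherwise.
    """
    if duration_ms < debounce_ms:
        return None       # pin value has not settled, treated as bounce
    if duration_ms < longpress_ms:
        return "short"
    return "long"


print(classify_press(100))
print(classify_press(450))
print(classify_press(1500))
```

Note that with these defaults any press shorter than 300 ms is discarded as bounce, so a quick tap on the button would not trigger `short_press`.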
    


  • @seb I ran this code on my FiPy with the latest FW and saw the same symptoms. The main thread needs at least a sleep_ms(1) to release the memory of exited threads.



  • @seb I've updated the firmware to

    (sysname='LoPy', nodename='LoPy', release='1.15.0.b1', version='v1.8.6-849-baa8c33 on 2018-01-29', machine='LoPy with ESP32', lorawan='1.0.0')
    

    and the result is the same. Same error. And some things apparently started showing more instabilities with other code. I'm starting to regret having chosen a product in its infancy for a mission-critical development...



  • Hi,

    I notice you are running release='1.13.0.b1', which is not the latest release. Could you try updating to the latest firmware version (1.15.0.b1) and report whether you still have this issue?


 
