Regex ure library is broken?



  • These three lines crash my WiPy, possibly because of a bug in the ure module. The WiPy prints an enormous stack trace / memory dump. I managed to hit reset quickly enough to see the output, and after a long time mucking around with my code, I'm pretty sure the problem is the built-in "re" or "ure" module.

    [The string below is one line of an RSS feed for Car Talk, by the way]

    import re

    line = b' <enclosure url="https://play.podtrac.com/npr-510208/npr.mc.tritondigital.com/NPR_510208/media/anon.npr-podcasts/podcast/510208/515997156/npr_515997156.mp3?orgId=1&d=3276&p=510208&story=515997156&t=podcast&e=515997156&ft=pod&f=510208" length="26203572" type="audio/mpeg"/>\n'

    re.search(r'"http(s)?://.*.mp3"', line)

    +++++
    /Users/danicampora/Code/Pycom/esp-idf/components/freertos/./heap_regions.c:368 (vPortFreeTagged)- assert failed!
    abort() was called at PC 0x400841ef

    Guru Meditation Error of type LoadProhibited occurred on core 0. Exception was unhandled.
    Register dump:
    .....



  • @paul12345: I do not know why the implementation is recursive. Maybe it always was, or maybe it is the most elegant solution.
    About the double quote: no, it does not work on the WiPy even if you drop the double quote at the end of the search string. On machines with more memory, however, dropping it does produce a match, simply because the sample string (line) you are searching contains mp3? rather than mp3".
    BTW: shortening the string in line makes the search work on the WiPy.
    Regards, Robert



  • Haha, that's unfortunate. Why is there a recursive implementation for a memory-limited platform? I suppose the answer to that is too involved for this forum.

    Are you suggesting that if I omit the final double quote it will work? It seems like even that might fail for some strings or runtime conditions if less free memory is available. Is that a correct understanding? I'll probably just drop the re module entirely and write a workaround with more basic string operations.

    Thanks,
    Paul
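
A minimal sketch of the string-operation workaround mentioned above, assuming the URL is always enclosed in double quotes and starts with http. The helper name find_mp3_url is hypothetical. It relies only on bytes.find, bytes.split and bytes.endswith, which should be available in MicroPython, and it does not recurse, so its stack use does not grow with the length of the line. It has not been tested on a WiPy.

```python
def find_mp3_url(line):
    """Return the first quoted http(s) URL whose path ends in .mp3, or None.

    Hypothetical helper: assumes the URL is wrapped in double quotes.
    """
    start = line.find(b'"http')
    if start < 0:
        return None
    start += 1                      # skip the opening double quote
    end = line.find(b'"', start)    # closing double quote
    if end < 0:
        return None
    url = line[start:end]
    path = url.split(b'?')[0]       # ignore the query string, if any
    if not path.endswith(b'.mp3'):
        return None
    return url


# The sample line from the first post.
line = b' <enclosure url="https://play.podtrac.com/npr-510208/npr.mc.tritondigital.com/NPR_510208/media/anon.npr-podcasts/podcast/510208/515997156/npr_515997156.mp3?orgId=1&d=3276&p=510208&story=515997156&t=podcast&e=515997156&ft=pod&f=510208" length="26203572" type="audio/mpeg"/>\n'

print(find_mp3_url(line))
```

Only the first quoted URL on the line is checked, which is enough for one enclosure tag per line; a line with several quoted URLs would need a loop around the two find calls.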



  • Hello @paul12345. Your example does indeed fail, and the reason is most likely a stack overflow. The string is too long for that kind of search expression. If you search for this topic, you'll find it has been reported several times. Running that example on Linux MicroPython returns a match, at least if you change your search expression to:

    re.search(r'"http(s)?://.*.mp3', line)

    The stack on these machines is limited, and the ure lib turns out to be recursive. I agree that it should fail with a proper error instead of crashing.
    See also: https://github.com/micropython/micropython/issues/2451
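
To illustrate why a recursive matcher can exhaust a small stack, here is a toy backtracking matcher. It is not the ure implementation, only a sketch under the assumption that a greedy `.*` costs one nested call per character it consumes. The names match_here and stats are made up for this example, and it supports only literals, `.` and `x*`.

```python
def match_here(pat, text, depth=1, stats=None):
    """Toy recursive matcher: does pat match at the start of text?

    Illustration only, not the ure code. Records the deepest call
    level reached in stats["max_depth"] when stats is given.
    """
    if stats is not None and depth > stats["max_depth"]:
        stats["max_depth"] = depth
    if not pat:
        return True
    if len(pat) >= 2 and pat[1] == "*":
        # Greedy star: first try to consume one more character (one more
        # nested call), and only then fall back to the rest of the pattern.
        if text and (pat[0] == "." or pat[0] == text[0]):
            if match_here(pat, text[1:], depth + 1, stats):
                return True
        return match_here(pat[2:], text, depth + 1, stats)
    if text and (pat[0] == "." or pat[0] == text[0]):
        return match_here(pat[1:], text[1:], depth + 1, stats)
    return False


def max_depth_for(pat, text):
    """Return (matched, deepest call level) for one match attempt."""
    stats = {"max_depth": 0}
    matched = match_here(pat, text, 1, stats)
    return matched, stats["max_depth"]


short = max_depth_for(".*mp3", "a" * 10 + "mp3")
longer = max_depth_for(".*mp3", "a" * 100 + "mp3")
print(short, longer)
```

In this sketch the deepest call level grows roughly in step with the length of the text, which is consistent with the observation in this thread that shortening the line makes the search work.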


