Pybytes Google integration - removing additional characters?



  • I am trying to use the Pybytes-Google integration. Setup went smoothly overall, except that the payloads arriving on my Pub/Sub subscription have extra characters at the beginning and end.

    I first tried connecting directly to Google IoT, but that proved problematic and difficult to debug.

    My publishing code looks like this:

    from time import sleep
    import utime
    import ujson
    
    # CONFIG is a settings dict defined elsewhere in the project.
    payload = {
        "timestamp": utime.time(),
        "device_id": CONFIG.get("device_id"),
        "measurements": [{"test1": 123, "test2": 456}]
    }
    
    while True:
        payload["timestamp"] = utime.time()
        data = ujson.dumps(payload)
        pybytes.send_signal(0, data)
        sleep(10)  # arbitrary pause between messages; adjust as needed
    

    The messages show up correctly on Pybytes:

    {"timestamp": 1639061966, "measurements": [{"test1": 123, "test2": 456}], "device_id": "device_1"}
    

    But on my Google Pub/Sub subscription there is an additional > at the start and a 3 at the end, like this:

    >{"timestamp": 1639061937, "measurements": [{"test1": 123, "test2": 456}], "device_id": "device_1"}3
    

    This makes the payload invalid JSON and crashes my downstream parsing. Any idea why this happens, and whether or how it can be avoided?
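
One way to narrow this down is to look at the raw bytes on the subscriber side instead of the decoded text. The sketch below is only a diagnostic aid: `describe_edges` and the sample payload are hypothetical, and it assumes the message data arrives base64-encoded, as it does in the Cloud Function event further down. If the extra characters are binary framing that merely happens to render as > and 3, their values would be expected to change between messages.

```python
import base64


def describe_edges(data_b64, width=4):
    """Return the first and last `width` raw bytes of a base64 payload as hex strings."""
    raw = base64.b64decode(data_b64)

    def to_hex(chunk):
        # Format each byte as two hex digits, separated by spaces.
        return " ".join("{:02x}".format(b) for b in chunk)

    return to_hex(raw[:width]), to_hex(raw[-width:])


# Hypothetical sample that reproduces the symptom: '>' in front, '3' at the end.
sample = base64.b64encode(b'>{"test1": 123}3')
head, tail = describe_edges(sample)
print("head:", head)
print("tail:", tail)
```

Comparing the head and tail of several messages of different lengths shows whether the extra bytes are constant or depend on the payload.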



  • If anyone else has the same issue, my current workaround is to set up a Cloud Function that removes the extra characters and republishes the cleaned message to another topic, using the following code:

    import base64
    import json
    from json import JSONDecodeError
    # import os
    
    from google.cloud import pubsub_v1
    
    def main(event, context):
        # Get decoded message
        decoded_message = decode_message(event, context)
        
        # Instantiates a Pub/Sub client
        publisher = pubsub_v1.PublisherClient()
        # PROJECT_ID = os.getenv('GOOGLE_CLOUD_PROJECT') #Returns None!
        PROJECT_ID = "YOUR-PROJECT-ID-HERE" #TODO supply as function parameter, or get the currently running project
    
        topic_name = "pybytes" #This is the topic the clean message would be published to
        topic_path = publisher.topic_path(PROJECT_ID, topic_name)
    
        # Strip the extra leading characters and the single trailing character.
        # The number of leading characters varies, so try several offsets in turn.
        # TODO: Make more robust (drop everything before the first '{' and after the last '}')
        msg_dict = None
        for start in (3, 2, 4):
            try:
                # Parse to a dict; it is re-serialized below so the output is valid JSON
                msg_dict = json.loads(decoded_message[start:-1])
                break
            except JSONDecodeError:
                print("offset [{}:] failed".format(start))
        if msg_dict is None:
            raise ValueError("Could not extract a JSON object from the message")
        message = json.dumps(msg_dict)
    
        # Encode the message as bytes
        message_bytes = message.encode('utf-8')
    
        # Publish the clean message
        try:
            publish_future = publisher.publish(topic_path, data=message_bytes)
            publish_future.result()  # Verify the publish succeeded
            return 'Message published.'
        except Exception as e:
            print(e)
            return (e, 500)
    
    
    def decode_message(event, context):
        """Triggered from a message on a Cloud Pub/Sub topic.
        Args:
             event (dict): Event payload.
             context (google.cloud.functions.Context): Metadata for the event.
        """
        pubsub_message = base64.b64decode(event['data']).decode('utf-8')
        # print(pubsub_message)
        return pubsub_message
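
The TODO in the function above (drop everything around the JSON object rather than guessing fixed offsets) could be sketched as follows. This is a minimal sketch, not part of the original workaround: `extract_json_object` is a hypothetical helper name, and it assumes the payload contains exactly one JSON object surrounded by a few stray characters. It relies on `json.JSONDecoder.raw_decode`, which parses one JSON document starting at a given index and ignores whatever follows it.

```python
import json


def extract_json_object(text):
    """Return the first JSON object embedded in `text`, ignoring surrounding characters."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON document starting at `pos` and
            # returns (object, end_index); trailing characters are ignored.
            obj, _end = decoder.raw_decode(text, pos)
            return obj
        except json.JSONDecodeError:
            # This '{' did not start a valid object; try the next one.
            pos = text.find("{", pos + 1)
    raise ValueError("no JSON object found in message")


# The noisy payload from the first post.
noisy = '>{"timestamp": 1639061937, "measurements": [{"test1": 123, "test2": 456}], "device_id": "device_1"}3'
clean = extract_json_object(noisy)
print(json.dumps(clean))
```

In the Cloud Function, the result of `extract_json_object(decoded_message)` would take the place of `msg_dict`, so the number of stray leading characters no longer matters.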
    
