Pybytes Google integration - removing additional characters?

Roetz Bikes

I am trying to use the Pybytes-Google integration. The process went pretty smoothly, all in all, except that the payloads that end up on my Pub/Sub subscription have a couple of extra characters at the beginning and end.

I've also tried connecting directly to Google IoT, but that proved problematic and difficult to debug.

My publishing code looks like this:

from time import sleep
import utime
import micropython
from os import urandom
import gc
import pycom
import machine
import ujson

payload = {
    "timestamp": utime.time(),
    "device_id": CONFIG.get("device_id"),
    "measurements": [{"test1": 123}, {"test2": 456}]
}

while True:
    payload["timestamp"] = utime.time()
    data = ujson.dumps(payload)
    pybytes.send_signal(0, data)

The messages show up nicely on Pybytes:

{"timestamp": 1639061966, "measurements": [{"test1": 123, "test2": 456}], "device_id": "device_1"}

But on my Google subscription there's an additional > at the start and a 3 at the end, like this:

>{"timestamp": 1639061937, "measurements": [{"test1": 123, "test2": 456}], "device_id": "device_1"}3

This makes it an invalid JSON object and crashes my downstream parsing. Any idea why this happens, and whether/how it can be avoided?

Roetz Bikes

If anyone else has the same issue, my current workaround is to set up a Cloud Function that removes the extra characters and republishes the message to another topic, using the following code:

import base64
import json
from json import JSONDecodeError
# import os

from google.cloud import pubsub_v1

def main(event, context):
    # Get decoded message
    decoded_message = decode_message(event, context)
    
    # Instantiates a Pub/Sub client
    publisher = pubsub_v1.PublisherClient()
    # PROJECT_ID = os.getenv('GOOGLE_CLOUD_PROJECT') #Returns None!
    PROJECT_ID = "YOUR-PROJECT-ID-HERE" #TODO supply as function parameter, or get the currently running project

    topic_name = "pybytes" #This is the topic the clean message would be published to
    topic_path = publisher.topic_path(PROJECT_ID, topic_name)

    # Strip the extra leading characters and the single trailing character.
    # Try dropping 3 leading characters first, then fall back to 2, then 4.
    # TODO: Make more robust (clean all until first '{' and all after last '}' - probably use regex?)
    msg_dict = None
    for offset in (3, 2, 4):
        try:
            msg_dict = json.loads(decoded_message[offset:-1])
            break
        except JSONDecodeError:
            print("parsing with [{}:] failed".format(offset))
    if msg_dict is None:
        raise ValueError("No valid JSON object found in message")

    # Convert string to dict and back - otherwise problems with recognizing as json object occur
    message = json.dumps(msg_dict)

    # Encode the message as bytes
    message_bytes = message.encode('utf-8')

    # Publish the clean message
    try:
        publish_future = publisher.publish(topic_path, data=message_bytes)
        publish_future.result()  # Verify the publish succeeded
        return 'Message published.'
    except Exception as e:
        print(e)
        return (e, 500)


def decode_message(event, context):
    """Triggered from a message on a Cloud Pub/Sub topic.
    Args:
         event (dict): Event payload.
         context (google.cloud.functions.Context): Metadata for the event.
    """
    pubsub_message = base64.b64decode(event['data']).decode('utf-8')
    # print(pubsub_message)
    return pubsub_message
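
The TODO in the function above could probably be handled without a regex or fixed offsets, by keeping only the text between the first '{' and the last '}'. Below is a minimal sketch of that idea. It assumes each message carries exactly one JSON object and that the stray characters never contain braces themselves; `extract_json` is a hypothetical helper name, not part of Pybytes or the Pub/Sub client.

```python
import json


def extract_json(raw):
    """Parse the outermost {...} span of raw and return it as a dict.

    Hypothetical helper: assumes one JSON object per message and that
    the extra leading/trailing characters contain no braces.
    """
    start = raw.find("{")   # index of the first opening brace
    end = raw.rfind("}")    # index of the last closing brace
    if start == -1 or end < start:
        raise ValueError("no JSON object found in payload")
    return json.loads(raw[start:end + 1])


# The noisy payload from the first post in this thread
noisy = '>{"timestamp": 1639061937, "measurements": [{"test1": 123, "test2": 456}], "device_id": "device_1"}3'
clean = json.dumps(extract_json(noisy))
print(clean)
```

In the Cloud Function, `extract_json(decoded_message)` would then stand in for the slicing with fixed offsets, so it should not matter how many extra characters arrive.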
