Python UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0


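Looking for expert help.

I want to skip a message when it contains a \xe0s character or something similar. My current skip routine works fine for standard JSON messages, but in a special case like this \xe0s one the program fails.

What has to change to avoid the failure? My goal is to process only messages that contain durationfld and simply skip all the others.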

else:
    logging.info('Ignoring event in different format')
    return
Body of the message being read when the program fails and restarts:

body=b'{"sender":"4603","message":"y se tykaji zpusobu, jak pozadat o preneseni \xe0s"}'
Part of the program:

.............
def convert_to_influx_format(message):
    time = message.annotations["iothub-enqueuedtime"]
    name = message.annotations["iothub-connection-device-id"]

    try:
        json_input = json.loads(message.body)
    except json.decoder.JSONDecodeError:
        return

    if 'durationfld' in json_input:
        yellow = json_input["yellow"]
        msgid = json_input["msgid"]
        trigger = json_input["trigger"]
        durationfld = json_input["durationfld"]
        json_body = [
             {
               'measurement': name,
               'time': time,
               'fields': { "durationfld": durationfld, "yellow": yellow },
               'tags': { "trigger": trigger, "msgid": msgid }
             }
        ]
    else:
        logging.info('Ignoring event in different format')
        return

    return json_body

..................
class Receiver(MessagingHandler):
    def __init__(self):
        super(Receiver, self).__init__()

    def on_start(self, event):
        connect_influxdb()
        connect_iothub(event)
        logging.info("Setup complete")

    def on_message(self, event_received):
        logging.info("Event received: '{0}'".format(event_received.message))

        payload = convert_to_influx_format(event_received.message)

        if payload is not None:
            logging.info("Write points: {0}".format(payload))
            write_influxdb(payload)

    def on_connection_closing(self, event):
        logging.error("Connection closing - trying to reestablish connection")
        connect_iothub(event)

....

def main():
    try:
        Container(Receiver()).run()
    except KeyboardInterrupt:
        pass


if __name__ == "__main__":
    main()
Error log:

amqp         | 2020-02-10T13:05:37.013805723Z Traceback (most recent call last):
amqp         | 2020-02-10T13:05:37.013872623Z   File "./readIotHubAmqpClient.py", line 178, in <module>
amqp         | 2020-02-10T13:05:37.013966323Z     main()
amqp         | 2020-02-10T13:05:37.013982623Z   File "./readIotHubAmqpClient.py", line 172, in main
amqp         | 2020-02-10T13:05:37.014014222Z     Container(Receiver()).run()
amqp         | 2020-02-10T13:05:37.014027022Z   File "/usr/local/lib/python3.8/site-packages/proton/_reactor.py", line 184, in run
amqp         | 2020-02-10T13:05:37.014074722Z     while self.process(): pass
amqp         | 2020-02-10T13:05:37.014089822Z   File "/usr/local/lib/python3.8/site-packages/proton/_reactor.py", line 241, in process
amqp         | 2020-02-10T13:05:37.014120422Z     event.dispatch(handler)
amqp         | 2020-02-10T13:05:37.014132922Z   File "/usr/local/lib/python3.8/site-packages/proton/_events.py", line 165, in dispatch
amqp         | 2020-02-10T13:05:37.014176722Z     self.dispatch(h, type)
amqp         | 2020-02-10T13:05:37.014191222Z   File "/usr/local/lib/python3.8/site-packages/proton/_events.py", line 165, in dispatch
amqp         | 2020-02-10T13:05:37.014379122Z     self.dispatch(h, type)
amqp         | 2020-02-10T13:05:37.014402722Z   File "/usr/local/lib/python3.8/site-packages/proton/_events.py", line 162, in dispatch
amqp         | 2020-02-10T13:05:37.014588722Z     _dispatch(handler, type.method, self)
amqp         | 2020-02-10T13:05:37.014610322Z   File "/usr/local/lib/python3.8/site-packages/proton/_events.py", line 123, in _dispatch
amqp         | 2020-02-10T13:05:37.014652222Z     m(*args)
amqp         | 2020-02-10T13:05:37.014713222Z   File "/usr/local/lib/python3.8/site-packages/proton/_handlers.py", line 260, in on_delivery
amqp         | 2020-02-10T13:05:37.014841522Z     self.on_message(event)
amqp         | 2020-02-10T13:05:37.014862822Z   File "/usr/local/lib/python3.8/site-packages/proton/_handlers.py", line 286, in on_message
amqp         | 2020-02-10T13:05:37.014896422Z     _dispatch(self.delegate, 'on_message', event)
amqp         | 2020-02-10T13:05:37.014937122Z   File "/usr/local/lib/python3.8/site-packages/proton/_events.py", line 123, in _dispatch
amqp         | 2020-02-10T13:05:37.014970122Z     m(*args)
amqp         | 2020-02-10T13:05:37.014982422Z   File "./readIotHubAmqpClient.py", line 139, in on_message
amqp         | 2020-02-10T13:05:37.015060022Z     payload = convert_to_influx_format(event_received.message)
amqp         | 2020-02-10T13:05:37.015077922Z   File "./readIotHubAmqpClient.py", line 100, in convert_to_influx_format
amqp         | 2020-02-10T13:05:37.015144321Z     json_input = json.loads(message.body)
amqp         | 2020-02-10T13:05:37.015160421Z   File "/usr/local/lib/python3.8/json/__init__.py", line 343, in loads
amqp         | 2020-02-10T13:05:37.015208121Z     s = s.decode(detect_encoding(s), 'surrogatepass')
amqp         | 2020-02-10T13:05:37.015233721Z UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 73: invalid continuation byte
You can try

json.loads(message.body.decode("utf-8"))
If that doesn't work, you can do this instead

json.loads(message.body.decode("utf-8","ignore"))
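Since the stated goal is to skip such messages rather than repair them, another option is to catch the decoding error next to the JSON error. The sketch below is only an illustration: `parse_body` is a hypothetical helper, not part of the original program, and it assumes the body arrives as `bytes` or `str`. Note that a strict `decode("utf-8")` raises the same `UnicodeDecodeError` on this input, while `"ignore"` silently drops the offending byte.

```python
import json
import logging


def parse_body(body):
    """Return the parsed JSON object, or None if the message should be skipped.

    Hypothetical helper: assumes `body` is bytes or str.
    """
    try:
        # Decode strictly so that bytes which are not valid UTF-8 raise
        # UnicodeDecodeError instead of being silently discarded.
        text = body.decode("utf-8") if isinstance(body, bytes) else body
        return json.loads(text)
    except (UnicodeDecodeError, json.JSONDecodeError, TypeError) as exc:
        # TypeError covers bodies that are neither str nor bytes.
        logging.info("Ignoring event that cannot be decoded: %s", exc)
        return None


bad = b'{"sender":"4603","message":"y se tykaji zpusobu, jak pozadat o preneseni \xe0s"}'
good = b'{"durationfld": 12, "yellow": 1}'

print(parse_body(bad))   # the undecodable message is skipped
print(parse_body(good))  # a well-formed message is parsed as usual
```

In `convert_to_influx_format`, the same effect could be had by widening the existing `except json.decoder.JSONDecodeError:` clause to also catch `UnicodeDecodeError`.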

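Your input is ASCII-compatible but not UTF-8 encoded. As long as everything is ASCII, things work out, but as soon as the input goes beyond ASCII it blows up violently, and for good reason: you are feeding non-UTF-8 data to a UTF-8 decoder.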
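To see this on the message body from the question, the short sketch below decodes the ASCII prefix successfully and then reproduces the failure on the full body. In UTF-8, 0xe0 announces a multi-byte sequence, but the next byte is the plain ASCII letter `s`, which is why the decoder reports an invalid continuation byte.

```python
body = b'{"sender":"4603","message":"y se tykaji zpusobu, jak pozadat o preneseni \xe0s"}'

# Pure-ASCII bytes are also valid UTF-8, so everything before the bad byte decodes.
prefix = body[:73].decode("utf-8")
print(prefix)

try:
    body.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc
    # Report where decoding stopped, which byte sits there, and why it failed.
    print(exc.start, hex(body[exc.start]), exc.reason)
```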


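Find out what the actual encoding of the input is, and either decode it correctly or fix whatever is producing the data. I'd suggest checking which ISO-8859 encoding, or which Windows encoding/code page, is used for the language the source produces. Both are common culprits.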
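One way to narrow down the candidates is to look at what the offending byte becomes under each plausible code page and check which result makes sense in context. The sketch below does that, then shows how an encoding name is passed to `bytes.decode`. The candidate list and the `decode_body` helper with its `latin-1` fallback are assumptions for illustration, not a statement about what the sender actually uses.

```python
import json

# What does byte 0xe0 mean in a few single-byte encodings?
candidates = ["latin-1", "cp1250", "cp1251", "cp1257"]
for name in candidates:
    print(name, repr(b"\xe0".decode(name)))


def decode_body(body, fallback="latin-1"):
    """Decode as UTF-8 first, then fall back to an assumed single-byte encoding.

    Hypothetical helper. latin-1 maps every byte value, so this fallback
    never raises, but the text is only right if the guess is right.
    """
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError:
        return body.decode(fallback)


body = b'{"sender":"4603","message":"y se tykaji zpusobu, jak pozadat o preneseni \xe0s"}'
message = json.loads(decode_body(body))
print(message["message"])
```

A fallback like this keeps the program running, but a wrong guess yields wrong characters without any error, so skipping the message may be the safer choice until the real encoding is known.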

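It looks like the JSON you are receiving is not encoded correctly.
If I'm not mistaken, a byte like 0xe0 can be handled by decoding as latin1, but I don't know how to pass that as an argument.
UTF-8 is being used to decode a string that is not UTF-8 encoded; that is the problem.
The message should be encoded and the invalid characters ignored, the
…o preneseni\xe0s
-> looks like your regular bog-standard default. Could be, but I can't identify the language, so it might be any other ASCII-compatible code page, e.g. Windows-1251 or 1257.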