How do I decode Avro messages in Python?

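I'm having trouble decoding an Avro message in Python (3.6.11). I have tried both the avro and fastavro packages and get an error with each, so I think the problem may be that the bytes I'm providing are incorrect.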

Using avro:

from avro.io import DatumReader, BinaryDecoder
import avro.schema
from io import BytesIO

schema = avro.schema.parse("""
    {
        "type": "record",
        "name": "User",
        "namespace": "example.avro",
        "fields": [
            {
                "name": "name",
                "type": "string"
            },
            {
                "name": "favorite_number",
                "type": [
                    "int",
                    "null"
                ]
            },
            {
                "name": "favorite_color",
                "type": [
                    "string",
                    "null"
                ]
            }
        ]
    }
""")

rb = BytesIO(b'{"name": "Alyssa", "favorite_number": 256}')
decoder = BinaryDecoder(rb)
reader = DatumReader(schema)
msg = reader.read(decoder)
print(msg)

Traceback (most recent call last):
  File "main.py", line 36, in <module>
    msg = reader.read(decoder)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/avro/io.py", line 626, in read
    return self.read_data(self.writers_schema, self.readers_schema, decoder)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/avro/io.py", line 698, in read_data
    return self.read_record(writers_schema, readers_schema, decoder)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/avro/io.py", line 898, in read_record
    field_val = self.read_data(field.type, readers_field.type, decoder)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/avro/io.py", line 638, in read_data
    return self.read_union(writers_schema, readers_schema, decoder)
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/avro/io.py", line 854, in read_union
    index_of_schema = int(decoder.read_long())
  File "/opt/virtualenvs/python3/lib/python3.8/site-packages/avro/io.py", line 240, in read_long
    b = ord(self.read(1))
TypeError: ord() expected a character, but string of length 0 found

Using fastavro:

from fastavro import schemaless_reader, parse_schema
from io import BytesIO

schema = parse_schema(
    {
        "type": "record",
        "name": "User",
        "namespace": "example.avro",
        "fields": [
            {
                "name": "name",
                "type": "string"
            },
            {
                "name": "favorite_number",
                "type": [
                    "int",
                    "null"
                ]
            },
            {
                "name": "favorite_color",
                "type": [
                    "string",
                    "null"
                ]
            }
        ]
    }
)

rb = BytesIO(b'{"name": "Alyssa", "favorite_number": 256}')
msg = schemaless_reader(rb, schema)
print(msg)

Traceback (most recent call last):
  File "main.py", line 33, in <module>
    msg = schemaless_reader(rb, schema)
  File "fastavro/_read.pyx", line 969, in fastavro._read.schemaless_reader
  File "fastavro/_read.pyx", line 981, in fastavro._read.schemaless_reader
  File "fastavro/_read.pyx", line 652, in fastavro._read._read_data
  File "fastavro/_read.pyx", line 510, in fastavro._read.read_record
  File "fastavro/_read.pyx", line 644, in fastavro._read._read_data
  File "fastavro/_read.pyx", line 429, in fastavro._read.read_union
  File "fastavro/_read.pyx", line 200, in fastavro._read.read_long
StopIteration

I'm not sure whether the message I'm encoding is malformed or whether something is wrong with the encoding itself. Any suggestions?

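I'll speak to fastavro, since that's the one I know best.

Your rb variable should hold the Avro binary you are trying to read, not the data itself (here it holds JSON text). To get an example of that binary, you can do the following: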

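from fastavro import schemaless_writer

# Encode the record to Avro binary using the schema parsed above.
rb = BytesIO()
schemaless_writer(rb, schema, {"name": "Alyssa", "favorite_number": 256})
rb.getvalue()  # b'\x0cAlyssa\x00\x80\x04\x02'
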
Then you can do what you were trying to do and read the resulting binary:

rb = BytesIO(b'\x0cAlyssa\x00\x80\x04\x02')
data = schemaless_reader(rb, schema)
# {'name': 'Alyssa', 'favorite_number': 256, 'favorite_color': None}
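
To see why JSON text cannot be read as Avro binary, it may help to decode the bytes by hand. The following is a minimal standard-library sketch for this one User schema only. The helpers read_long, read_string and read_user are hypothetical names written for illustration and are not the avro or fastavro implementation.

```python
from io import BytesIO


def read_long(buf):
    """Read one Avro int/long: a variable-length, zigzag-encoded integer."""
    shift = 0
    acc = 0
    while True:
        chunk = buf.read(1)
        if not chunk:
            raise EOFError("ran out of bytes while reading a long")
        byte = chunk[0]
        acc |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)  # undo the zigzag encoding


def read_string(buf):
    """Read an Avro string: a long length, then that many UTF-8 bytes."""
    length = read_long(buf)
    # Deliberately unvalidated: BytesIO.read() with a negative size reads to EOF.
    return buf.read(length).decode("utf-8")


def read_user(buf):
    """Decode one User record field by field, in schema order."""
    name = read_string(buf)
    # Union ["int", "null"]: a branch index, then the value for that branch.
    branch = read_long(buf)
    favorite_number = read_long(buf) if branch == 0 else None
    # Union ["string", "null"] works the same way.
    branch = read_long(buf)
    favorite_color = read_string(buf) if branch == 0 else None
    return {
        "name": name,
        "favorite_number": favorite_number,
        "favorite_color": favorite_color,
    }


print(read_user(BytesIO(b'\x0cAlyssa\x00\x80\x04\x02')))

# With JSON text, the first byte '{' (0x7b) is taken as a string length.
print(read_long(BytesIO(b'{')))  # -62
```

In this sketch, passing the JSON bytes to read_user makes the name field swallow the rest of the buffer (a read with size -62 reads to the end), so the next read_long finds nothing and raises EOFError. That appears consistent with both tracebacks in the question, which fail in read_union and read_long on the second field.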

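Shouldn't you be reading an Avro file? You are currently trying to read data that is not in Avro format. You could write the content to an Avro file and then read it.

I'm getting the content from a topic using kafka-python, but I keep running into problems decoding the messages. I tried to isolate the decoding problem, but you may be right that I can't do it this way. When I write to a buffer with either library, I am able to decode it. The problem may actually be in my kafka-python consumer, or even in the messages being produced to my topic. I'm not sure where my head is at.

I was using AKHQ to produce messages to the Kafka topic. I wrongly assumed that AKHQ used the schema from the registry to encode the messages to Avro. Reading the comments on my question, I realized that wasn't happening. I added a producer that does the job with schemaless_writer as you described, and that solved the problem.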
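
When it is unclear what a producer actually put on the topic, a quick check of the raw message value can narrow things down before any Avro decoding is attempted. The sketch below is a heuristic only. describe_payload is a hypothetical helper, and the "registry-framed" branch assumes the Confluent Schema Registry wire format (a zero magic byte followed by a 4-byte big-endian schema ID), which may not apply to your setup.

```python
import json
import struct


def describe_payload(raw):
    """Guess how a message value was encoded (heuristic, not authoritative)."""
    # JSON text decodes as UTF-8 and parses to an object or array.
    try:
        parsed = json.loads(raw.decode("utf-8"))
        if isinstance(parsed, (dict, list)):
            return "json-text"
    except (UnicodeDecodeError, ValueError):
        pass
    # Assumed Confluent framing: magic byte 0, then a 4-byte schema ID.
    if len(raw) > 5 and raw[0] == 0:
        (schema_id,) = struct.unpack(">I", raw[1:5])
        return "maybe-registry-framed (schema id %d)" % schema_id
    # Anything else could be bare Avro binary, or something unrelated.
    return "maybe-raw-avro"


print(describe_payload(b'{"name": "Alyssa", "favorite_number": 256}'))
print(describe_payload(b'\x0cAlyssa\x00\x80\x04\x02'))
```

With kafka-python, the bytes to inspect would be the value attribute of each consumed record, assuming no value_deserializer has been configured on the consumer.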