How to write to Kafka using the ABRiS library in PySpark?

Has anyone managed to write to Kafka from PySpark using the ABRiS library?

I have been able to read successfully using the code from the README:

import logging, traceback
import requests
from pyspark.sql import Column
from pyspark.sql.column import *

# Reach into the JVM through the Py4J gateway to access the ABRiS Scala API
jvm_gateway = spark_context._gateway.jvm
abris_avro  = jvm_gateway.za.co.absa.abris.avro
naming_strategy = getattr(getattr(abris_avro.read.confluent.SchemaManager, "SchemaStorageNamingStrategies$"), "MODULE$").TOPIC_NAME()

schema_registry_config_dict = {"schema.registry.url": schema_registry_url,
                               "schema.registry.topic": topic,
                               "value.schema.id": "latest",
                               "value.schema.naming.strategy": naming_strategy}

# Convert the Python dict into the immutable Scala Map that ABRiS expects
conf_map = getattr(getattr(jvm_gateway.scala.collection.immutable.Map, "EmptyMap$"), "MODULE$")
for k, v in schema_registry_config_dict.items():
    conf_map = getattr(conf_map, "$plus")(jvm_gateway.scala.Tuple2(k, v))

# Deserialize the Kafka "value" column from Confluent Avro and flatten the struct
deserialized_df = data_frame.select(Column(abris_avro.functions.from_confluent_avro(data_frame._jdf.col("value"), conf_map))
                  .alias("data")).select("data.*")
However, I am struggling to extend this behavior and write to a topic via the to_confluent_avro function.
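
For reference, this is a minimal sketch of the write path I would expect, mirroring the read path above. It assumes the Scala-side to_confluent_avro(column, config) accepts the same kind of Scala Map as from_confluent_avro; the kafka_servers variable and the exact set of config keys are my assumptions, not verified against the library:

# Untested sketch: assumes to_confluent_avro(column, config) takes a Scala Map
# like from_confluent_avro does, and that `kafka_servers` is a placeholder for
# the broker list.
from pyspark.sql import functions as F

write_config_dict = {"schema.registry.url": schema_registry_url,
                     "schema.registry.topic": topic,
                     "value.schema.naming.strategy": naming_strategy}

# Same dict-to-Scala-Map conversion as on the read side
write_conf_map = getattr(getattr(jvm_gateway.scala.collection.immutable.Map, "EmptyMap$"), "MODULE$")
for k, v in write_config_dict.items():
    write_conf_map = getattr(write_conf_map, "$plus")(jvm_gateway.scala.Tuple2(k, v))

# Pack the payload columns into one struct, serialize it to Confluent Avro,
# and alias the result "value" so the Kafka sink picks it up
payload = F.struct(*[F.col(c) for c in data_frame.columns])
serialized_df = data_frame.select(
    Column(abris_avro.functions.to_confluent_avro(payload._jc, write_conf_map)).alias("value"))

serialized_df.write \
    .format("kafka") \
    .option("kafka.bootstrap.servers", kafka_servers) \
    .option("topic", topic) \
    .save()

Is this roughly the right shape, or does the write side require a different config (e.g. explicit schema registration) that the Python gateway approach cannot reach?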