Apache Spark: generating Spark Avro record namespaces for nested structures


I want to write Avro records with Spark 2.2.0, where the schema has a namespace and contains some nested records:

{
    "type": "record",
    "name": "userInfo",
    "namespace": "my.example",
    "fields": [
        {
            "name": "username",
            "type": "string"
        },
        {
            "name": "address",
            "type": [
                "null",
                {
                    "type": "record",
                    "name": "address",
                    "fields": [
                        {
                            "name": "street",
                            "type": [
                                "null",
                                "string"
                            ],
                            "default": null
                        },
                        {
                            "name": "box",
                            "type": [
                                "null",
                                {
                                    "type": "record",
                                    "name": "box",
                                    "fields": [
                                        {
                                            "name": "id",
                                            "type": "string"
                                        }
                                    ]
                                }
                            ],
                            "default": null
                        }
                    ]
                }
            ],
            "default": null
        }
    ]
}
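For reference, the Avro specification says that a nested named type with no `namespace` attribute of its own inherits the namespace of the nearest enclosing named type. The following is a minimal sketch in plain Python (standard library only) that walks the schema above and lists the full names a reader of this schema would expect. `full_names` is a hypothetical helper written for this illustration, not part of any Avro or Spark library.

```python
import json

# The schema from the question, compacted.
SCHEMA = json.loads("""
{"type": "record", "name": "userInfo", "namespace": "my.example", "fields": [
  {"name": "username", "type": "string"},
  {"name": "address", "default": null, "type": ["null",
    {"type": "record", "name": "address", "fields": [
      {"name": "street", "type": ["null", "string"], "default": null},
      {"name": "box", "default": null, "type": ["null",
        {"type": "record", "name": "box", "fields": [
          {"name": "id", "type": "string"}]}]}]}]}
]}
""")


def full_names(schema, enclosing_ns=None):
    """Collect the full names of all record types, applying namespace inheritance."""
    names = []
    if isinstance(schema, list):  # union: visit every branch
        for branch in schema:
            names += full_names(branch, enclosing_ns)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            name = schema["name"]
            if "." in name:  # a dotted name is already a full name
                ns, full = name.rsplit(".", 1)[0], name
            else:  # otherwise use own namespace, else inherit the enclosing one
                ns = schema.get("namespace", enclosing_ns)
                full = f"{ns}.{name}" if ns else name
            names.append(full)
            for field in schema["fields"]:
                names += full_names(field["type"], ns)
        elif kind == "array":
            names += full_names(schema["items"], enclosing_ns)
        elif kind == "map":
            names += full_names(schema["values"], enclosing_ns)
        elif isinstance(kind, (dict, list)):  # {"type": {...}} wrapper
            names += full_names(kind, enclosing_ns)
    return names


print(full_names(SCHEMA))
```

Avro's JSON encoding labels a union branch that holds a record with that record's full name, which is why these names matter when the data is read back.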

I need to write out records like the following:

{
    "username": "tom taylor",
    "address": {
        "my.example.address": {
            "street": {
                "string": "unknown"
            },
            "box": {
                "my.example.box": {
                    "id": "id1"
                }
            }
        }
    }
}

However, when I read some Avro GenericRecords with spark-avro (4.0.0), apply some transformations (for example, adding the namespace), and then write the output like this:

df.foreach {
    ...
    .write
    .option("recordName", "userInfo")
    .option("recordNamespace", "my.example")
    ...
}

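then, in the resulting GenericRecord, the namespace of each nested record contains the "full path" from the parent record down to that element. That is, I get my.example.address.box instead of my.example.box. When I try to read the record back with the schema, there is of course a mismatch.

What is the correct way to define the namespace for the writer?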
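To make the mismatch concrete, here is a small Python sketch that models the naming described in the question and compares it with the names the reader schema expects. It rests on two assumptions: that the dotted string quoted above is the nested record's full name, and that the writer builds each nested namespace from the top-level namespace plus the enclosing field names. `path_based_names` is a hypothetical helper that imitates only the reported behaviour; it does not reproduce spark-avro's actual implementation.

```python
# Record nesting from the question's schema, reduced to what matters for naming.
SCHEMA = {
    "type": "record", "name": "userInfo", "fields": [
        {"name": "username", "type": "string"},
        {"name": "address", "type": ["null", {
            "type": "record", "name": "address", "fields": [
                {"name": "street", "type": ["null", "string"]},
                {"name": "box", "type": ["null", {
                    "type": "record", "name": "box", "fields": [
                        {"name": "id", "type": "string"}]}]}]}]},
    ],
}


def path_based_names(schema, root_ns, path=()):
    """Full names when nested namespaces follow the field path (assumed behaviour)."""
    names = []
    if isinstance(schema, list):  # union: visit every branch
        for branch in schema:
            names += path_based_names(branch, root_ns, path)
    elif isinstance(schema, dict) and schema.get("type") == "record":
        # Namespace = root namespace + names of the enclosing fields.
        ns = ".".join((root_ns,) + path[:-1])
        names.append(f"{ns}.{schema['name']}")
        for field in schema["fields"]:
            names += path_based_names(field["type"], root_ns, path + (field["name"],))
    return names


# Names the reader schema expects (every nested record inherits "my.example").
expected = ["my.example.userInfo", "my.example.address", "my.example.box"]
produced = path_based_names(SCHEMA, "my.example")

mismatches = [(e, p) for e, p in zip(expected, produced) if e != p]
print(mismatches)
```

Under these assumptions only the innermost record differs, which matches the symptom reported: the first level of nesting lines up by coincidence, and every deeper level picks up extra path segments.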