How do I split the items of an array into separate columns in Spark?

Tags: arrays, apache-spark, multiple-columns

How can I split one column into 4 columns when its values are held in an array?

Expected output:

+---------------------------+
|address                    |
+---------------------------+
|[San Jone, 19422, CA, 126] |
|[Queens, 11372, NY, 5543]  |
+---------------------------+
Edit:

This is my .json file. Once I have created a DataFrame from it, I need to split address into 4 columns:

[
    {
        "firstName": "Rack",
        "lastName": "Jackon",
        "gender": "man",
        "age": 24,
        "address": {
            "streetAddress": "126",
            "city": "San Jone",
            "state": "CA",
            "postalCode": "394221"
        }
    },
    {
        "firstName": "Apache",
        "lastName": "Spark",
        "gender": "Woman",
        "age": 24,
        "address": {
            "streetAddress": "5543",
            "city": "Queens",
            "state": "NY",
            "postalCode": "11372"
        }
    }
]

Comments:

I'm using Databricks; I think it's Spark 3.0.

You said address is an array, but it looks like a struct type. You can use address.* and it will create the new columns you need.

It gives the following error: AnalysisException: Field name should be String Literal, but it's 0;

Can you post your code?

val columns = Seq("city","zip","state","street").zipWithIndex
test.select(columns.map(c => col(s"address")(c._2).as(c._1)):_*).show
I just copy-pasted your code.

I've edited the question in case that makes it clearer. If you can, please answer. Thanks a lot.

Answer:

Try the code below. It assumes address is an array column:

scala> df.show(false)
+--------------------------+
|address                   |
+--------------------------+
|[San Jone, 19422, CA, 126]|
|[Queens, 11372, NY, 5543] |
+--------------------------+
scala> val columns = Seq("city","zip","state","street").zipWithIndex
scala> df.select(columns.map(c => col(s"address")(c._2).as(c._1)):_*).show(false)
+--------+-----+-----+------+
|city    |zip  |state|street|
+--------+-----+-----+------+
|San Jone|19422|CA   |126   |
|Queens  |11372|NY   |5543  |
+--------+-----+-----+------+
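
The answer above indexes address by position, which works only when the column is an array. As the comments point out, Spark infers a nested JSON object such as address as a struct, and indexing a struct with an integer is what raises "Field name should be String Literal, but it's 0". A minimal sketch of the struct case follows, assuming a local SparkSession and the two records from the question inlined as JSON strings; the object name SplitAddressStruct and the helper splitAddress are made up for illustration, and the target names city, zip, state, street are borrowed from the answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SplitAddressStruct {

  // Hypothetical helper: select the struct fields by name and rename them
  // to the column names used in the answer.
  def splitAddress(df: DataFrame): DataFrame =
    df.select(
      col("address.city").as("city"),
      col("address.postalCode").as("zip"),
      col("address.state").as("state"),
      col("address.streetAddress").as("street")
    )

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration. On Databricks a
    // session named `spark` already exists and can be used instead.
    val spark = SparkSession.builder()
      .appName("split-address-struct")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // The two records from the question, one JSON object per string.
    val records = Seq(
      """{"firstName":"Rack","lastName":"Jackon","gender":"man","age":24,"address":{"streetAddress":"126","city":"San Jone","state":"CA","postalCode":"394221"}}""",
      """{"firstName":"Apache","lastName":"Spark","gender":"Woman","age":24,"address":{"streetAddress":"5543","city":"Queens","state":"NY","postalCode":"11372"}}"""
    ).toDS()

    // Schema inference turns the nested "address" object into a struct column.
    val df = spark.read.json(records)
    df.printSchema()

    // Option 1: expand every struct field, keeping the original field names.
    df.select("address.*").show(false)

    // Option 2: pick fields by name and rename them.
    splitAddress(df).show(false)

    spark.stop()
  }
}
```

To read the file itself rather than inline strings, something like spark.read.option("multiLine", true).json(path) should work, since the file is a single JSON array spread over several lines; the path is whatever location the file was uploaded to. Selecting by field name also removes the dependence on field order that positional indexing has.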