How to get nested columns from a JSON file using Apache Spark in Java

Tags: java, apache-spark, apache-spark-sql, spark-dataframe, business-intelligence

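I have multiple JSON files that I have to parse with Apache Spark. They contain nested keys, and I have to print all columns, including the nested ones.

I want to get all of the column names, including the names of the nested columns. How can I get them?

I tried this: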

String jsonFilePath = "/home/vipin/workspace/Smarten/jsonParsing/Employee/Employee-01.json,/home/vipin/workspace/Smarten/jsonParsing/Employee/Employee-02.json";

String[] jsonFiles = jsonFilePath.split(",");

Dataset<Row> people = sparkSession.read().json(jsonFiles);
The result I get is:

people.show(50, false);

Age | Designation | Email            | Name       | Location
------------------------------------------------------------
22  |Programmer   |vpn2330@gmail.com | Vipin Suman|[Ahmedabad,Gujarat]
I need the data in the following form:

Age | Designation | Email            | Name       | City      | State
------------------------------------------------------------
22  |Programmer   |vpn2330@gmail.com | Vipin Suman| Ahmedabad |Gujarat
Or alternatively:

Age | Designation | Email            | Name       | Location
---------------------------------------------------------------
22  |Programmer   |vpn2330@gmail.com | Vipin Suman| Ahmedabad,Gujarat
Suppose the schema looks like this:

root
 |-- Age: long (nullable = true)
 |-- Company: struct (nullable = true)
 |    |-- Company Name: string (nullable = true)
 |    |-- Domain: string (nullable = true)
 |-- Designation: string (nullable = true)
 |-- Email: string (nullable = true)
 |-- Name: string (nullable = true)
 |-- Test: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- location: struct (nullable = true)
 |    |-- City: struct (nullable = true)
 |    |    |-- City Name: string (nullable = true)
 |    |    |-- Pin: long (nullable = true)
 |    |-- State: string (nullable = true)  
and the JSON structure is:

{ 
  "Name":"Vipin Suman",
  "Email":"vpn2330@gmail.com",
 "Designation":"Trainee Programmer",
 "Age":22 ,
 "location":
    {"City":
           {
            "Pin":324009,
            "City Name":"Ahmedabad"
           },
    "State":"Gujarat"
   },
 "Company":
          {
           "Company Name":"Elegant",
           "Domain":"Java"
          }, 
 "Test":["Test1","Test2"]

}

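So how can I find the nested keys and display the table in a proper format?

To display the data in the expected format shown above, you can use the following code: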

people.select("*", "location.*").drop("location").show();
It will give the following output:

+---+-----------+-----------------+----------+---------+-------+
|Age|Designation|            Email|      Name|     City|  State|
+---+-----------+-----------------+----------+---------+-------+
| 22| Programmer|vpn2330@gmail.com|VipinSuman|Ahmedabad|Gujarat|
+---+-----------+-----------------+----------+---------+-------+

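Please provide: sample input data, what you have tried, and what the problem is.

Thank you very much for the reply, @himanshuIIITian. Can I ask you one more question? If I don't know which keys are nested, how can I find them? Or, if I have multiple nested columns, how can I detect and handle that case?

That is not possible, because if we don't know the schema of the DataFrame, we cannot know whether it is nested.

@Vpn_talent Did this answer solve your problem?

Your answer solved my problem, but I have one more question, so I edited the question. I am having trouble displaying the table in a proper form. Can you help me?

I am a beginner in Spark. Is there a way to get the "location" columns named like "location.city" and "location.state"? I know we can do it with the "withColumnRenamed" method, but what if there are 100 columns? Any suggestion would be helpful.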
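On the follow-up about not knowing in advance which keys are nested: Spark infers a schema when it reads the JSON, so `people.schema()` can be walked recursively, descending whenever a field's `dataType()` is a `StructType`. The sketch below shows only that recursion in plain Java, with no Spark dependency. The nested `Map` is a hypothetical stand-in for the schema printed in the question, and `NestedColumns`, `leafPaths`, `quote` and `exampleSchema` are illustrative names, not Spark APIs. Arrays such as `Test` are treated as leaves here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedColumns {

    // Collects the dotted path of every leaf column.
    // Assumption: a Map value stands in for a struct; any other value
    // (here just a type name) is treated as a leaf column.
    public static List<String> leafPaths(Map<String, Object> schema, String prefix) {
        List<String> paths = new ArrayList<>();
        for (Map.Entry<String, Object> field : schema.entrySet()) {
            String name = quote(field.getKey());
            String path = prefix.isEmpty() ? name : prefix + "." + name;
            if (field.getValue() instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) field.getValue();
                paths.addAll(leafPaths(nested, path)); // descend into the struct
            } else {
                paths.add(path);
            }
        }
        return paths;
    }

    // Wraps names that are not plain identifiers (e.g. "City Name") in backticks,
    // which is how Spark SQL expects such names to be written in a column path.
    public static String quote(String name) {
        return name.matches("[A-Za-z_][A-Za-z0-9_]*") ? name : "`" + name + "`";
    }

    // Hypothetical stand-in mirroring the schema shown in the question.
    public static Map<String, Object> exampleSchema() {
        Map<String, Object> city = new LinkedHashMap<>();
        city.put("City Name", "string");
        city.put("Pin", "long");

        Map<String, Object> location = new LinkedHashMap<>();
        location.put("City", city);
        location.put("State", "string");

        Map<String, Object> company = new LinkedHashMap<>();
        company.put("Company Name", "string");
        company.put("Domain", "string");

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("Age", "long");
        root.put("Company", company);
        root.put("Designation", "string");
        root.put("Email", "string");
        root.put("Name", "string");
        root.put("Test", "array<string>");
        root.put("location", location);
        return root;
    }

    public static void main(String[] args) {
        for (String path : leafPaths(exampleSchema(), "")) {
            System.out.println(path);
        }
    }
}
```

Running `main` prints one dotted path per leaf column, for example `location.City.Pin`. With a real `Dataset<Row>`, the same paths could be passed to `selectExpr` to produce a flat table; the backticks are there because names such as `City Name` contain a space.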