
Java: creating a nested schema in Apache Spark SQL

Tags: java, json, apache-spark, apache-spark-sql, apache-spark-dataset

I want to load a simple JSON document into my SparkSession. It describes an employee with an array of addresses. Here is the sample JSON:

{"firstName":"Neil","lastName":"Irani", "addresses" : [ {  "city" : "Brindavan", "state" : "NJ"  }, {  "city" : "Subala", "state" : "DT"  }]}
I am trying to create the schema for loading this JSON, and I believe there is a problem with the way I build it below. Please advise. The code is in Java; I could not find a reasonable sample.

    List<StructField> employeeFields = new ArrayList<>();
    employeeFields.add(DataTypes.createStructField("firstName", DataTypes.StringType, true));
    employeeFields.add(DataTypes.createStructField("lastName", DataTypes.StringType, true));
    employeeFields.add(DataTypes.createStructField("email", DataTypes.StringType, true));

    List<StructField> addressFields = new ArrayList<>();
    addressFields.add(DataTypes.createStructField("city", DataTypes.StringType, true));
    addressFields.add(DataTypes.createStructField("state", DataTypes.StringType, true));
    addressFields.add(DataTypes.createStructField("zip", DataTypes.StringType, true));

    // Problem line: "addresses" is declared as a plain struct here,
    // but the JSON actually holds an *array* of structs.
    employeeFields.add(DataTypes.createStructField("addresses", DataTypes.createStructType(addressFields), true));

    StructType employeeSchema = DataTypes.createStructType(employeeFields);


    Dataset<Employee>  rowDataset = sparkSession.read()
            .option("inferSchema", "false")
            .schema(employeeSchema)
            .json("simple_employees.json").as(employeeEncoder);
Update

I was not creating an ArrayType for the addresses field. The code below works correctly:

List<StructField> employeeFields = new ArrayList<>();
employeeFields.add(DataTypes.createStructField("firstName", DataTypes.StringType, true));
employeeFields.add(DataTypes.createStructField("lastName", DataTypes.StringType, true));
employeeFields.add(DataTypes.createStructField("email", DataTypes.StringType, true));

List<StructField> addressFields = new ArrayList<>();
addressFields.add(DataTypes.createStructField("city", DataTypes.StringType, true));
addressFields.add(DataTypes.createStructField("state", DataTypes.StringType, true));
addressFields.add(DataTypes.createStructField("zip", DataTypes.StringType, true));
ArrayType addressStruct = DataTypes.createArrayType(DataTypes.createStructType(addressFields));

employeeFields.add(DataTypes.createStructField("addresses", addressStruct, true));
StructType employeeSchema = DataTypes.createStructType(employeeFields);
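As a side note, the same nested schema can also be written as a Spark SQL DDL string, which `DataFrameReader.schema(String)` accepts in Spark 2.3 and later. This is a sketch of that alternative, not part of the original answer; the reader call at the end is shown only as usage:

```java
// The same employee schema expressed as a DDL string: three top-level
// STRING fields plus an ARRAY of STRUCTs for the addresses.
String employeeDDL =
        "firstName STRING, lastName STRING, email STRING, "
      + "addresses ARRAY<STRUCT<city: STRING, state: STRING, zip: STRING>>";

// Usage sketch (Spark 2.3+):
// Dataset<Row> df = sparkSession.read().schema(employeeDDL).json("simple_employees.json");
```

This avoids the `DataTypes` builder calls entirely and keeps the nesting visible at a glance.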