UnsupportedOperationException when manually creating a Dataset with a Java SparkSession

Tags: java, dataframe, apache-spark

I am trying to create a Dataset from strings like this in a JUnit test:

SparkSession sparkSession = SparkSession.builder().appName("Job Test").master("local[*]")
        .getOrCreate();
String some1_json = readFileAsString("some1.json");
String some2_json = readFileAsString("some2.json");
String id = "some_id";

// One row holding the id plus the two raw JSON strings
List<String[]> rowStrs = new ArrayList<>();
rowStrs.add(new String[] {id, some1_json, some2_json});

// Parallelize the raw values and turn each String[] into a Row
JavaSparkContext javaSparkContext = new JavaSparkContext(sparkSession.sparkContext());
JavaRDD<Row> rowRDD = javaSparkContext.parallelize(rowStrs).map(RowFactory::create);
StructType schema = new StructType(new StructField[]{
        DataTypes.createStructField("id", DataTypes.StringType, false),
        DataTypes.createStructField("some1_json", DataTypes.StringType, false),
        DataTypes.createStructField("some2_json", DataTypes.StringType, false)});

Dataset<Row> datasetUnderTest = sparkSession.sqlContext().createDataFrame(rowRDD, schema);
datasetUnderTest.show();

What am I missing? The same code runs fine from my main method, but this test fails. It looks as if something is not being read correctly from the classpath.
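As an aside, the same Dataset can also be built without the JavaSparkContext/JavaRDD detour, using SparkSession#createDataFrame(List&lt;Row&gt;, StructType). A minimal, self-contained sketch; the class name and the literal JSON strings standing in for the readFileAsString helper are illustrative only:

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class CreateDatasetSketch {
    public static void main(String[] args) {
        // Local session, same settings as in the test above
        SparkSession spark = SparkSession.builder()
                .appName("Job Test")
                .master("local[*]")
                .getOrCreate();

        StructType schema = new StructType(new StructField[]{
                DataTypes.createStructField("id", DataTypes.StringType, false),
                DataTypes.createStructField("some1_json", DataTypes.StringType, false),
                DataTypes.createStructField("some2_json", DataTypes.StringType, false)});

        // createDataFrame(List<Row>, StructType) takes the rows directly,
        // so no JavaSparkContext or JavaRDD is needed
        List<Row> rows = Arrays.asList(
                RowFactory.create("some_id", "{\"a\": 1}", "{\"b\": 2}"));
        Dataset<Row> datasetUnderTest = spark.createDataFrame(rows, schema);
        datasetUnderTest.show();

        spark.stop();
    }
}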

This was fixed by excluding the dependency below from all of the Spark-related dependencies:

            <exclusions>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-core</artifactId>
                </exclusion>
            </exclusions>
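For example, applied to one of the Spark dependencies in the pom.xml; the artifact id and version property here are placeholders, and the same exclusion goes into every Spark-related dependency the project declares:

            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.11</artifactId>
                <version>${spark.version}</version>
                <exclusions>
                    <exclusion>
                        <groupId>org.apache.hadoop</groupId>
                        <artifactId>hadoop-core</artifactId>
                    </exclusion>
                </exclusions>
            </dependency>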
