UnsupportedOperationException when manually creating a Dataset with a Java SparkSession
I'm trying to create a Dataset in a JUnit test from strings, like this:
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

SparkSession sparkSession = SparkSession.builder().appName("Job Test").master("local[*]")
        .getOrCreate();
String some1_json = readFileAsString("some1.json");
String some2_json = readFileAsString("some2.json");
String id = "some_id";
List<String[]> rowStrs = new ArrayList<>();
rowStrs.add(new String[] {id, some1_json, some2_json});
JavaSparkContext javaSparkContext = new JavaSparkContext(sparkSession.sparkContext());
JavaRDD<Row> rowRDD = javaSparkContext.parallelize(rowStrs).map(RowFactory::create);
StructType schema = new StructType(new StructField[]{
DataTypes.createStructField("id", DataTypes.StringType, false),
DataTypes.createStructField("some1_json", DataTypes.StringType, false),
DataTypes.createStructField("some2_json", DataTypes.StringType, false)});
Dataset<Row> datasetUnderTest = sparkSession.sqlContext().createDataFrame(rowRDD, schema);
datasetUnderTest.show();
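As an aside, the `JavaSparkContext`/`parallelize` round-trip is not required for a small local test: `SparkSession.createDataFrame` accepts a plain `List<Row>` directly. A minimal sketch (the class name and the literal row values here are placeholders, not from the original test):

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class DatasetSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("Job Test").master("local[*]").getOrCreate();

        StructType schema = new StructType(new StructField[]{
                DataTypes.createStructField("id", DataTypes.StringType, false),
                DataTypes.createStructField("some1_json", DataTypes.StringType, false),
                DataTypes.createStructField("some2_json", DataTypes.StringType, false)});

        // createDataFrame takes a List<Row> directly; no JavaSparkContext or RDD needed
        List<Row> rows = Arrays.asList(RowFactory.create("some_id", "{}", "{}"));
        Dataset<Row> datasetUnderTest = spark.createDataFrame(rows, schema);
        datasetUnderTest.show();

        spark.stop();
    }
}
```

This only simplifies the test setup; it does not by itself address the classpath problem described below.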
What am I missing? My main method runs fine, but this test fails. It seems something is not being read correctly from the classpath.

The problem was fixed by excluding the dependency below from all Spark-related dependencies:
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
</exclusion>
</exclusions>
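For context, the exclusion block goes inside each Spark dependency in the `pom.xml`. A sketch assuming `spark-core` as the dependency; your artifact id, Scala suffix, and version property will likely differ:

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.12</artifactId>
    <version>${spark.version}</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```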