Apache Spark: problem writing Spark Streaming results to HDFS or the local file system
Using the Java API, I wrote a Spark Streaming application that processes and prints its results correctly. Now I want to write the results to HDFS. The versions are as follows:
Hadoop 2.7.3
Spark 2.2.0
Java 1.8

The code is as follows:
import java.util.*;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka010.*;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class Spark {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("Spark Streaming").setMaster("local[*]");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, new Duration(1000));

        // Kafka consumer configuration for the direct stream
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "kafka1:9092,kafka2:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", ByteArrayDeserializer.class);
        kafkaParams.put("group.id", "use");
        kafkaParams.put("auto.offset.reset", "earliest");
        kafkaParams.put("enable.auto.commit", false);

        Collection<String> topics = Arrays.asList("testStr");

        JavaInputDStream<ConsumerRecord<String, byte[]>> stream =
            KafkaUtils.createDirectStream(
                ssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, byte[]>Subscribe(topics, kafkaParams)
            );

        // Convert each record's value, stringify it, and save each batch
        // as text files with prefix "spark" and suffix "txt"
        stream.map(record -> finall(record.value()))
              .map(record -> Arrays.deepToString(record))
              .dstream()
              .saveAsTextFiles("spark", "txt");

        ssc.start();
        ssc.awaitTermination();
    }

    public static String[][] finall(byte[] record) {
        String[][] result = new String[4][];
        result[0] = javaTest.bytePrintable(record);
        result[1] = javaTest.hexTodecimal(record);
        result[2] = javaTest.hexToOctal(record);
        result[3] = javaTest.hexTobin(record);
        return result;
    }
}
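One thing worth checking is the prefix passed to saveAsTextFiles: with a bare prefix like "spark", Spark resolves the output location against fs.defaultFS and the current working directory, so the files may land somewhere other than the HDFS directory you expect. A minimal sketch of building a fully-qualified prefix instead (the namenode host/port and directory below are placeholder assumptions, not values from the question):

```java
// Hypothetical helper: build a fully-qualified HDFS output prefix.
// Spark appends "-<batch time in ms>.<suffix>" to this prefix for each batch,
// e.g. hdfs://namenode:9000/user/spark/output/spark-1500000000000.txt
public class OutputPrefix {
    static String hdfsPrefix(String nameNodeHostPort, String dir, String base) {
        return String.format("hdfs://%s/%s/%s", nameNodeHostPort, dir, base);
    }

    public static void main(String[] args) {
        // Pass the result as the first argument to dstream().saveAsTextFiles(prefix, "txt")
        System.out.println(hdfsPrefix("namenode:9000", "user/spark/output", "spark"));
    }
}
```

With a prefix like this, each micro-batch is written as a directory of part files under the given HDFS path rather than into the driver's working directory.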
Which of these is incompatible? Or is something missing? The Maven dependencies are:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
    <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.2.0</version>
</dependency>