Apache Spark: parsing JSON messages from a Kafka stream in Spark
I need to parse a stream of events (format below) with Spark (Java). I can read the stream, but I have not found an example of converting the messages into a JavaBean.
{
user_id : string,
session_id : string,
event : string,
page : string,
timestamp : timestamp
}
JavaBean:
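import java.io.Serializable;
import java.sql.Timestamp;

public class Event implements Serializable {
    private String user_id;
    private String session_id;
    private String page;
    private String event;
    private Timestamp timestamp;
    // Public getters and setters are not shown here; Encoders.bean and
    // Jackson's default configuration both rely on them.
}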
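Since the bean above is shown without accessors, here is a minimal sketch of the same class with getters and setters written out. As far as I know, `Encoders.bean` derives its columns from JavaBean properties, so fields without accessors would not show up. The `BeanCheck` helper is hypothetical (it is not part of Spark or Jackson); it only uses the JDK's `java.beans.Introspector` to list which properties have both a getter and a setter. The classes are package-private so that the sketch fits in one file; for use with Spark, keep `Event` public in its own file as in the original.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Same fields as the Event bean in the question, with accessors added.
class Event implements Serializable {
    private String user_id;
    private String session_id;
    private String page;
    private String event;
    private Timestamp timestamp;

    public String getUser_id() { return user_id; }
    public void setUser_id(String user_id) { this.user_id = user_id; }

    public String getSession_id() { return session_id; }
    public void setSession_id(String session_id) { this.session_id = session_id; }

    public String getPage() { return page; }
    public void setPage(String page) { this.page = page; }

    public String getEvent() { return event; }
    public void setEvent(String event) { this.event = event; }

    public Timestamp getTimestamp() { return timestamp; }
    public void setTimestamp(Timestamp timestamp) { this.timestamp = timestamp; }
}

// Hypothetical helper, for illustration only.
class BeanCheck {
    // Returns the sorted names of all properties that have both a getter and a setter.
    static List<String> readWriteProperties(Class<?> beanClass) {
        List<String> names = new ArrayList<>();
        try {
            // Stop at Object.class so the synthetic "class" property is excluded.
            PropertyDescriptor[] descriptors =
                    Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors();
            for (PropertyDescriptor pd : descriptors) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
        Collections.sort(names);
        return names;
    }
}
```

Calling `BeanCheck.readWriteProperties(Event.class)` is a quick way to confirm that the property names line up with the JSON keys (`user_id`, `session_id`, and so on) before wiring the bean into a stream.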
Code that reads the messages as strings:
Dataset<String> lines = spark
.readStream()
.format("kafka")
.option("kafka.bootstrap.servers", "localhost:9092")
.option("subscribe", topics)
.load()
.selectExpr("CAST(value AS STRING)")
.as(Encoders.STRING());
I achieved this with the following approach:
FlatMapFunction<String, Event> linesToEvents = new FlatMapFunction<String, Event>() {
@Override
public Iterator<Event> call(String line) throws JsonMappingException, JsonProcessingException {
ObjectMapper mapper = new ObjectMapper();
ArrayList<Event> eventList = new ArrayList<>();
eventList.add(mapper.readValue(line, Event.class));
return eventList.iterator();
}
};
Dataset<Event> lines = spark
.readStream()
.format("kafka")
.option("kafka.bootstrap.servers", "localhost:9092")
.option("subscribe", topics)
.load()
.selectExpr("CAST(value AS STRING)")
.as(Encoders.STRING())
.flatMap(linesToEvents, Encoders.bean(Event.class));