Parsing JSON messages from a Kafka stream in Spark (Java)

Tags: apache-spark, apache-spark-sql, spark-streaming

I need to parse a stream of events (format below) with Spark (Java). I can read the stream, but I have not found an example of converting the messages into a JavaBean.

{
    user_id  : string,
    session_id : string,
    event : string,
    page : string,
    timestamp : timestamp
}
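For reference, a concrete message on the topic might look like this (the values are hypothetical, and the timestamp is shown in the JDBC escape format discussed below):

```json
{
    "user_id": "u-123",
    "session_id": "s-456",
    "event": "click",
    "page": "/home",
    "timestamp": "2021-06-01 12:00:00"
}
```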
The JavaBean:

import java.io.Serializable;
import java.sql.Timestamp;

public class Event implements Serializable {

    private String user_id;

    private String session_id;
    private String page;
    private String event;
    private Timestamp timestamp;

    // Encoders.bean requires a public no-arg constructor plus public
    // getters and setters for every field, e.g.:
    public String getUser_id() { return user_id; }
    public void setUser_id(String user_id) { this.user_id = user_id; }
    // ... and likewise for session_id, page, event and timestamp
}
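One gotcha worth noting: the JSON timestamp values must be in a format the deserializer understands. With Jackson's defaults that is ISO-8601 or epoch milliseconds, whereas `java.sql.Timestamp`'s own `valueOf` uses the JDBC escape format. A JDK-only illustration of the latter (the value is hypothetical):

```java
import java.sql.Timestamp;

public class TimestampFormatCheck {
    public static void main(String[] args) {
        // Timestamp.valueOf expects "yyyy-[m]m-[d]d hh:mm:ss[.f...]"
        Timestamp ts = Timestamp.valueOf("2021-06-01 12:00:00");
        System.out.println(ts); // prints 2021-06-01 12:00:00.0
    }
}
```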
Code that reads the messages as strings:

Dataset<String> lines = spark
                        .readStream()
                        .format("kafka")
                        .option("kafka.bootstrap.servers", "localhost:9092")
                        .option("subscribe", topics)
                        .load()
                        .selectExpr("CAST(value AS STRING)")
                        .as(Encoders.STRING());     
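As a side note, Spark can also parse the JSON column directly with the built-in `from_json` function, without going through a JavaBean at all. A sketch, assuming the schema below matches the JSON keys exactly:

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.from_json;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

// Schema matching the JSON messages above
StructType schema = new StructType()
        .add("user_id", DataTypes.StringType)
        .add("session_id", DataTypes.StringType)
        .add("event", DataTypes.StringType)
        .add("page", DataTypes.StringType)
        .add("timestamp", DataTypes.TimestampType);

// Parse the "value" string column and flatten the struct into columns
Dataset<Row> events = lines
        .select(from_json(col("value"), schema).as("data"))
        .select("data.*");
```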

I got this working with the following approach:

        // Requires Jackson (com.fasterxml.jackson.databind) on the classpath:
        // import com.fasterxml.jackson.core.JsonProcessingException;
        // import com.fasterxml.jackson.databind.JsonMappingException;
        // import com.fasterxml.jackson.databind.ObjectMapper;
        FlatMapFunction<String, Event> linesToEvents = new FlatMapFunction<String, Event>() {
            @Override
            public Iterator<Event> call(String line) throws JsonMappingException, JsonProcessingException {
                ObjectMapper mapper = new ObjectMapper();
                ArrayList<Event> eventList = new ArrayList<>();
                eventList.add(mapper.readValue(line, Event.class));
                return eventList.iterator();
            }
        };
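Since each line yields exactly one `Event`, a `MapFunction` (here as a lambda, whose `call` signature already allows checked exceptions) would be a slightly simpler fit than `FlatMapFunction`. A sketch:

```java
import org.apache.spark.api.java.function.MapFunction;
import com.fasterxml.jackson.databind.ObjectMapper;

// One input line maps to exactly one Event, so no iterator is needed
MapFunction<String, Event> lineToEvent =
        line -> new ObjectMapper().readValue(line, Event.class);

// then replace the flatMap call with:
// .map(lineToEvent, Encoders.bean(Event.class))
```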

        Dataset<Event> lines = spark
                                .readStream()
                                .format("kafka")
                                .option("kafka.bootstrap.servers", "localhost:9092")
                                .option("subscribe", topics)
                                .load()
                                .selectExpr("CAST(value AS STRING)")
                                .as(Encoders.STRING())
                                .flatMap(linesToEvents, Encoders.bean(Event.class));
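To actually run the pipeline, the streaming query still has to be started against a sink. A minimal sketch using the console sink (the output mode and sink choice are assumptions for illustration, not from the original answer):

```java
import org.apache.spark.sql.streaming.StreamingQuery;

// "lines" is the Dataset<Event> built above
StreamingQuery query = lines
        .writeStream()
        .outputMode("append")
        .format("console")
        .start();

// Block until the query is stopped or fails
query.awaitTermination();
```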