Apache Flink ParquetAvroWriters gives a cast error
After reading GenericRecords from Kafka, I wrote some sample code that writes the stream in Parquet format:
Properties config = new Properties();
config.setProperty("bootstrap.servers", "localhost:9092");
config.setProperty("group.id", "1");
config.setProperty("zookeeper.connect", "localhost:2181");
String schemaRegistryUrl = "http://127.0.0.1:8081";

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Parse the Avro schema shipped with the application.
File file = new File(EventProcessor.class.getClassLoader().getResource("event.avsc").getFile());
Schema schema = new Schema.Parser().parse(file);

DataStreamSource<GenericRecord> input = env
        .addSource(new FlinkKafkaConsumer010<GenericRecord>("event_new",
                new KafkaGenericAvroDeserializationSchema(schemaRegistryUrl),
                config).setStartFromEarliest());

Path path = new Path("/tmp");
final StreamingFileSink<GenericRecord> sink = StreamingFileSink
        .forBulkFormat(path, ParquetAvroWriters.forGenericRecord(schema))
        .build();
input.addSink(sink);

// The pipeline only runs once the job is executed.
env.execute("parquet-writer");
Running this job fails with the ClassCastException shown in the stack trace at the bottom. I don't understand what is going wrong; please help me understand and resolve this issue.

The most likely cause is that your event.avsc does not match the records actually stored in Kafka: the writer found a string where it expected a record.
If you add the schema and a sample record from Kafka (e.g., printed with the console consumer), I can help further.

I have solved it. It was simply due to a wrong schema.
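For illustration, here is the kind of mismatch that produces this exact error. Both schemas below are made-up examples, not the asker's actual event.avsc. Suppose the records in Kafka were written with a schema whose `user` field is a plain string:

```json
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "user", "type": "string"}
  ]
}
```

while the local event.avsc wrongly declares `user` as a nested record:

```json
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "user", "type": {
      "type": "record",
      "name": "User",
      "fields": [{"name": "name", "type": "string"}]
    }}
  ]
}
```

When the Parquet writer walks the record using the wrong schema, it expects an IndexedRecord at `user` but finds an Avro Utf8 string, which is exactly the ClassCastException in the stack trace.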
Caused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
Caused by: java.lang.ClassCastException: org.apache.avro.util.Utf8 cannot be cast to org.apache.avro.generic.IndexedRecord
at org.apache.avro.generic.GenericData.getField(GenericData.java:697)
at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:188)
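Since the root cause was a schema mismatch, one way to catch it early is to validate each GenericRecord against the schema before it reaches the sink. The standalone sketch below uses only the Avro library; the Event/Payload schema is invented for illustration. It reproduces the shape of the failure: a string stored where the schema declares a nested record.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;

public class SchemaCheck {

    // Hypothetical schema for illustration: "payload" must be a nested record.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
        + "{\"name\":\"payload\",\"type\":{\"type\":\"record\",\"name\":\"Payload\","
        + "\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}}]}");

    // A record carrying a plain string where the schema expects a nested
    // record -- the same mismatch behind "Utf8 cannot be cast to IndexedRecord".
    static boolean badRecordValidates() {
        GenericData.Record bad = new GenericData.Record(SCHEMA);
        bad.put("payload", "just-a-string");
        return GenericData.get().validate(SCHEMA, bad);
    }

    // The same field populated with the nested record the schema declares.
    static boolean goodRecordValidates() {
        GenericData.Record payload =
            new GenericData.Record(SCHEMA.getField("payload").schema());
        payload.put("id", "42");
        GenericData.Record good = new GenericData.Record(SCHEMA);
        good.put("payload", payload);
        return GenericData.get().validate(SCHEMA, good);
    }

    public static void main(String[] args) {
        System.out.println(badRecordValidates());  // false
        System.out.println(goodRecordValidates()); // true
    }
}
```

Running such a check (for example in the deserialization schema) turns the late ClassCastException inside AvroWriteSupport into an early, explicit validation failure.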