
Java Kinesis Analytics Flink writing to Parquet files

Tags: java, amazon-s3, apache-flink, parquet, amazon-kinesis-analytics

Using Amazon Kinesis Analytics with a Java Flink application, I am pulling data from a Firehose and trying to write it out to an S3 bucket as a series of Parquet files. I am getting the exception below in my CloudWatch logs, and it is the only error I can see that looks relevant.

I have enabled checkpointing as the documentation specifies and included the flink/avro dependencies. Running the application locally works: the Parquet files are written to the local disk when a checkpoint is reached.

Exception:

"message": "Exception type is USER from filter results [UserClassLoaderExceptionFilter -> USER, UserAPIExceptionFilter -> SKIPPED, UserSerializationExceptionFilter -> SKIPPED, UserFunctionExceptionFilter -> SKIPPED, OutOfMemoryExceptionFilter -> NONE, TooManyOpenFilesExceptionFilter -> NONE, KinesisServiceExceptionFilter -> NONE].",
"throwableInformation": [
    "java.lang.Exception: Error while triggering checkpoint 1360 for Source: Custom Source -> Map -> Sink: HelloS3 (1/1)",
    "org.apache.flink.runtime.taskmanager.Task$1.run(Task.java:1201)",
    "java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)",
    "java.util.concurrent.FutureTask.run(FutureTask.java:266)",
    "java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)",
    "java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)",
    "java.lang.Thread.run(Thread.java:748)",
    "Caused by: java.lang.AbstractMethodError: org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.writePage(Lorg/apache/parquet/bytes/BytesInput;IILorg/apache/parquet/column/statistics/Statistics;Lorg/apache/parquet/column/Encoding;Lorg/apache/parquet/column/Encoding;Lorg/apache/parquet/column/Encoding;)V",
    "org.apache.parquet.column.impl.ColumnWriterV1.writePage(ColumnWriterV1.java:53)",
    "org.apache.parquet.column.impl.ColumnWriterBase.writePage(ColumnWriterBase.java:315)",
    "org.apache.parquet.column.impl.ColumnWriteStoreBase.flush(ColumnWriteStoreBase.java:152)",
    "org.apache.parquet.column.impl.ColumnWriteStoreV1.flush(ColumnWriteStoreV1.java:27)",
    "org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:172)",
    "org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:114)",
    "org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:308)",
    "org.apache.flink.formats.parquet.ParquetBulkWriter.finish(ParquetBulkWriter.java:62)",
    "org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:62)",
    "org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:235)",
    "org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.prepareBucketForCheckpointing(Bucket.java:276)",
    "org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onReceptionOfCheckpoint(Bucket.java:249)",
    "org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.snapshotActiveBuckets(Buckets.java:244)",
    "org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.snapshotState(Buckets.java:235)",
    "org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.snapshotState(StreamingFileSink.java:347)",
    "org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:118)",
    "org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:99)",
    "org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:90)",
    "org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:395)",
    "org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1138)",
    "org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1080)",
    "org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:754)",
    "org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:666)",
    "org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:584)",
    "org.apache.flink.streaming.runtime.tasks.SourceStreamTask.triggerCheckpoint(SourceStreamTask.java:114)",
    "org.apache.flink.runtime.taskmanager.Task$1.run(Task.java:1190)",
    "\t... 5 more"
Below is a snippet of my code. I do see logs while events are processed, and even logs from the BucketAssigner.

env.setStateBackend(new FsStateBackend("s3a:///checkpoint"));
env.setParallelism(1);
env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);

StreamingFileSink<Metric> sink = StreamingFileSink
    .forBulkFormat(new Path("s3a:///raw"), ParquetAvroWriters.forReflectRecord(Metric.class))
    .withBucketAssigner(new EventTimeBucketAssigner())
    .build();
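For completeness, this is roughly how the sink is attached to the stream; a minimal sketch continuing the snippet above, in which createFirehoseSource() and parseMetric() are placeholders for the actual consumer and mapping, not the real application code:

// Hypothetical wiring; the topology mirrors the "Source: Custom Source -> Map -> Sink: HelloS3" seen in the stack trace.
DataStream<Metric> metrics = env
        .addSource(createFirehoseSource())    // placeholder for the Firehose-backed source
        .map(record -> parseMetric(record));  // placeholder mapping into Metric

metrics.addSink(sink).name("HelloS3");

env.execute();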
My pom:


<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-parquet_2.11</artifactId>
    <version>1.11-1</version>
</dependency>
<dependency>
    <groupId>org.apache.parquet</groupId>
    <artifactId>parquet-avro</artifactId>
    <version>1.11.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.2.1</version>
</dependency>
My AWS configuration has "Snapshots" enabled. Write permissions to the bucket are fine: the writes succeed when I use row-format writes instead of bulk writes.
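(For reference, a row-format sink along these lines is the kind of thing that does write successfully; this is only an illustrative sketch, and the String element type, encoder, and path are assumptions rather than the code actually used:

StreamingFileSink<String> rowSink = StreamingFileSink
    .forRowFormat(new Path("s3a:///raw"), new SimpleStringEncoder<String>("UTF-8"))
    .build();
)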


At this point I'm really not sure what to look for to get this working.

Check your classpath and try to make sure flink-parquet isn't pulling in transitive hadoop/parquet deps that could conflict with what is specified in the POM.

Thanks for the suggestion. I suspected it might be related to this, but since checkpointing to S3 works I assumed that was all I needed. In the end it wasn't the two S3 dependencies required to get checkpointing and bulk writes working. Got there eventually, thanks for the tip.

I seem to be facing the same issue; the only difference is that I can't write to local disk either. What changes did you make?

I found the problem. The encoding format was effectively being pulled in from two places, parquet-avro and flink-parquet. I excluded parquet-hadoop from flink-parquet and it started working.
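Based on that last comment, the fix amounts to keeping only one copy of the Parquet writer classes on the classpath by excluding the transitive parquet artifact that flink-parquet drags in. A sketch of what that exclusion could look like in the POM, mirroring the "excluded parquet-hadoop from flink-parquet" change described above (the exact set of artifacts to exclude may differ in your build):

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-parquet_2.11</artifactId>
    <version>1.11-1</version>
    <exclusions>
        <!-- let the parquet version declared directly in the POM win -->
        <exclusion>
            <groupId>org.apache.parquet</groupId>
            <artifactId>parquet-hadoop</artifactId>
        </exclusion>
    </exclusions>
</dependency>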