Apache Flink: integrating BucketingSink with S3


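Is it possible to write data to S3 using the BucketingSink that ships with Apache Flink?

I have tried several URL combinations, but I cannot get S3 to work,

e.g. s3://bucket/path/to/folder

When deployed to EMR 5.4.0, I can write to HDFS but not to S3.

The documentation does not mention S3 as a possible integration, but I assumed it was supported natively.

If I use the s3a:// URL format, I get the following error: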

java.lang.NoSuchMethodError: org.apache.http.params.HttpConnectionParams.setSoKeepalive(Lorg/apache/http/params/HttpParams;Z)V
at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:96)
at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:187)
at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:136)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:394)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:374)
at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:356)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:235)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2717)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2751)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2733)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:377)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:417)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:351)
at org.apache.flink.streaming.api.functions.util.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
at org.apache.flink.streaming.api.functions.util.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:106)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:225)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:666)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:654)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:257)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:655)
at java.lang.Thread.run(Thread.java:745)

EMR does things that are not fully compatible with the ASF Hadoop S3A client. Stick with Amazon's s3:// URLs: that is what they support.
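As a rough illustration of that advice, the sketch below rewrites an `s3a://` or `s3n://` path to the `s3://` scheme before it is handed to the sink. The class `EmrS3Paths` and its method `toEmrScheme` are hypothetical names made up for this example; they are not part of Flink, Hadoop, or EMR, and the bucket path is the placeholder from the question.

```java
import java.util.Locale;

/** Hypothetical helper (not a Flink or EMR API): normalizes S3 path schemes. */
final class EmrS3Paths {
    private EmrS3Paths() {}

    /**
     * Returns the path with an "s3a" or "s3n" scheme replaced by "s3".
     * Paths with any other scheme, or with no scheme, are returned unchanged.
     */
    static String toEmrScheme(String path) {
        int idx = path.indexOf("://");
        if (idx < 0) {
            return path; // no scheme present, nothing to rewrite
        }
        String scheme = path.substring(0, idx).toLowerCase(Locale.ROOT);
        if (scheme.equals("s3a") || scheme.equals("s3n")) {
            // Keep "://" and everything after it, swap only the scheme.
            return "s3" + path.substring(idx);
        }
        return path;
    }

    public static void main(String[] args) {
        // Placeholder path taken from the question.
        System.out.println(toEmrScheme("s3a://bucket/path/to/folder"));
    }
}
```

In a Flink job on EMR, the result would then be passed as the base path, for example `new BucketingSink<String>(EmrS3Paths.toEmrScheme(basePath))`, assuming the cluster's own `s3://` file system is on the job's classpath.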