Scala Spark 2.4: writing a DataFrame to an S3 bucket
From my local PC, I am trying to write my DataFrame to S3. Here is my code snippet:
sparkContext.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", Util.AWS_ACCESS_KEY)
sparkContext.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", Util.AWS_SECRET_ACCESS_KEY)
sparkContext.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
empTableDF.coalesce(1).write
  .format("csv")
  .option("header", "true")
  .mode(SaveMode.Overwrite)
  .save("s3a://welpocstg/")
At runtime I get the following exception:
com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
My pom.xml:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.7.7</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.7.7</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-aws</artifactId>
<version>2.7.7</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient -->
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.5.6</version>
</dependency>
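The s3a connector reads its credentials from fs.s3a.access.key and fs.s3a.secret.key. The names fs.s3a.awsAccessKeyId and fs.s3a.awsSecretAccessKey are not s3a properties, so the keys you set are never picked up. You can try the following change: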
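import org.apache.spark.sql.SaveMode

// s3a looks up credentials under these property names
sparkContext.hadoopConfiguration.set("fs.s3a.access.key", Util.AWS_ACCESS_KEY)
sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", Util.AWS_SECRET_ACCESS_KEY)

// toDF needs the session implicits in scope
// (import spark.implicits._, assuming the session is named spark)
Seq("1","2","3").toDF("id")
  .coalesce(1)
  .write
  .format("csv")
  .option("header", "true")
  .mode(SaveMode.Overwrite)
  .save("s3a://welpocstg/")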
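An alternative to setting the keys on sparkContext.hadoopConfiguration is to pass them when the session is built: Spark copies any property prefixed with spark.hadoop. into the Hadoop configuration. The following is only a minimal sketch under a few assumptions: the object name S3aWriteSketch and the helper s3aSettings are made up for illustration, the credentials are read from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables rather than from the Util class in the question, and the job runs with a local master.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object S3aWriteSketch {

  // Hypothetical helper: builds the s3a settings as Spark properties.
  // The "spark.hadoop." prefix makes Spark copy each entry into the
  // Hadoop configuration used by the s3a connector.
  def s3aSettings(accessKey: String, secretKey: String): Map[String, String] = Map(
    "spark.hadoop.fs.s3a.access.key" -> accessKey,
    "spark.hadoop.fs.s3a.secret.key" -> secretKey,
    "spark.hadoop.fs.s3a.impl"       -> "org.apache.hadoop.fs.s3a.S3AFileSystem"
  )

  def main(args: Array[String]): Unit = {
    // Assumption: credentials are supplied through environment variables.
    val accessKey = sys.env.getOrElse("AWS_ACCESS_KEY_ID",
      sys.error("AWS_ACCESS_KEY_ID is not set"))
    val secretKey = sys.env.getOrElse("AWS_SECRET_ACCESS_KEY",
      sys.error("AWS_SECRET_ACCESS_KEY is not set"))

    // Assumption: local master, as the question runs from a local PC.
    val builder = SparkSession.builder()
      .appName("s3a-write-sketch")
      .master("local[*]")

    // Apply every s3a setting to the builder before creating the session.
    val spark = s3aSettings(accessKey, secretKey)
      .foldLeft(builder) { case (b, (key, value)) => b.config(key, value) }
      .getOrCreate()

    import spark.implicits._

    // Same write as in the answer above, against the bucket from the question.
    Seq("1", "2", "3").toDF("id")
      .coalesce(1)
      .write
      .format("csv")
      .option("header", "true")
      .mode(SaveMode.Overwrite)
      .save("s3a://welpocstg/")

    spark.stop()
  }
}
```

Keeping the keys out of the source this way also avoids hard-coding them in a class like Util, though whether that suits your setup is a judgment call.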
Any solution, team?
Well, there is an entire document dedicated to troubleshooting the s3a connector.