Apache spark Spark S3写入-写入存储桶时出现访问被拒绝错误
我正在尝试从S3存储桶读取和写入文件。我在AWS门户中创建了一个IAM用户。我已经在我的EMR实例中使用相同的密钥配置了aws cli,并且可以从cli将文件读写到特定的S3存储桶中 但是当我从spark shell内部尝试相同的方法时,我能够从bucket中读取文件,但是当我尝试将相同的文件写入同一bucket中的不同路径时,我会得到Apache spark Spark S3写入-写入存储桶时出现访问被拒绝错误,apache-spark,amazon-s3,Apache Spark,Amazon S3,我正在尝试从S3存储桶读取和写入文件。我在AWS门户中创建了一个IAM用户。我已经在我的EMR实例中使用相同的密钥配置了aws cli,并且可以从cli将文件读写到特定的S3存储桶中 但是当我从spark shell内部尝试相同的方法时,我能够从bucket中读取文件,但是当我尝试将相同的文件写入同一bucket中的不同路径时,我会得到AccessDenied错误。这是我执行的一组命令: sc.hadoopConfiguration.set("fs.s3.awsAccessKeyId", "aw
AccessDenied
错误。这是我执行的一组命令:
sc.hadoopConfiguration.set("fs.s3.awsAccessKeyId", "awsAccessKeyId")
sc.hadoopConfiguration.set("fs.s3.awsSecretAccessKey", "awsSecretAccessKey")
val a = spark.read.parquet("s3://path.parquet")
a.write.parquet("s3://path.parquet")
这是错误消息
Caused by: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: ; S3 Extended Request ID: , S3 Extended Request ID:
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4914)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4860)
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.initiateMultipartUpload(AmazonS3Client.java:3552)
at com.amazon.ws.emr.hadoop.fs.s3.lite.call.InitiateMultipartUploadCall.perform(InitiateMultipartUploadCall.java:22)
at com.amazon.ws.emr.hadoop.fs.s3.lite.call.InitiateMultipartUploadCall.perform(InitiateMultipartUploadCall.java:8)
at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor.execute(GlobalS3Executor.java:91)
at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:184)
at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.initiateMultipartUpload(AmazonS3LiteClient.java:145)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346)
at com.sun.proxy.$Proxy32.initiateMultipartUpload(Unknown Source)
at com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.ensureMultipartUploadIsInitiated(MultipartUploadOutputStream.java:541)
at com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.uploadSinglePartWithMultipartUpload(MultipartUploadOutputStream.java:399)
at com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.doMultiPartUpload(MultipartUploadOutputStream.java:436)
... 24 more
提前感谢。检查您的IAM权限。 如果您有一个自定义命名的IAM角色,请确保它使用了
IAM:PassRole
,并检查您的角色名称是否有拼写错误<代码>arn:aws:iam::123456789012:role/YourName
请参阅:您解决了这个问题吗?我现在也面临同样的问题。如果您解决了,请帮助我