S3 bucket policy that denies access to all principals except an IAM role and instance profile
apache-spark, amazon-s3, amazon-iam, amazon-emr

I have an EMR cluster whose steps write and delete objects in an S3 bucket. I have been trying to create a bucket policy on that bucket that denies delete access to every principal except the EMR role and the instance profile. Here is my policy. The three aws:userId entries are, in order, the EMR role ID, the instance profile role ID, and the instance profile ID:

{
    "Version": "2008-10-17",
    "Id": "ExamplePolicyId123458",
    "Statement": [
        {
            "Sid": "ExampleStmtSid12345678",
            "Effect": "Deny",
            "Principal": "*",
            "Action": [
                "s3:DeleteBucket",
                "s3:DeleteObject*"
            ],
            "Resource": [
                "arn:aws:s3:::bucket-name",
                "arn:aws:s3:::bucket-name/*"
            ],
            "Condition": {
                "StringNotLike": {
                    "aws:userId": [
                        "AROAI3FK4OGNWXLHB7IXM:*",
                        "AROAISVF3UYNPH33RYIZ6:*",
                        "AIPAIDBGE7J475ON6BAEU"
                    ]
                }
            }
        }
    ]
}
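
As I read somewhere, it is not possible to use a wildcard entry in the NotPrincipal element to cover every role session, so I used a condition on aws:userId to match them instead.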
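
To make the intent of that condition concrete, here is a rough Python model of how a `StringNotLike` test with several values behaves: the condition holds (so the Deny applies) only when the caller's `aws:userId` matches none of the listed patterns. For an assumed role, `aws:userId` has the form `role-id:session-name`, which is why the role entries end in `:*`. The helper names `iam_like` and `deny_applies`, the session name, and the `AIDAEXAMPLEUSERID` value are assumptions made up for illustration; this is a sketch of the matching rule, not AWS's evaluation engine.

```python
import re
from typing import Iterable


def iam_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' matches any run of characters, '?' exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def deny_applies(user_id: str, exempt_patterns: Iterable[str]) -> bool:
    """StringNotLike with several values is true only if the key matches none of them."""
    return not any(iam_like(p, user_id) for p in exempt_patterns)


# Patterns copied from the policy in the question.
EXEMPT = [
    "AROAI3FK4OGNWXLHB7IXM:*",
    "AROAISVF3UYNPH33RYIZ6:*",
    "AIPAIDBGE7J475ON6BAEU",
]

# Hypothetical callers: a session of the EMR role, and some unrelated IAM user.
print(deny_applies("AROAI3FK4OGNWXLHB7IXM:my-session", EXEMPT))  # False
print(deny_applies("AIDAEXAMPLEUSERID", EXEMPT))                 # True
```

If the IDs in the policy do not match the `aws:userId` that the cluster actually presents, the second case is the one the EMR step would hit.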

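Whenever I run the EMR step without the bucket policy, the step completes successfully. But when I add the policy to the bucket and rerun it, the step fails with the following error: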

diagnostics: User class threw exception:
org.apache.hadoop.fs.s3a.AWSS3IOException: delete on s3://vr-dump/metadata/test:
com.amazonaws.services.s3.model.MultiObjectDeleteException: One or more objects could not be deleted 
(Service: null; Status Code: 200; Error Code: null; Request ID: 9FC4797479021CEE; S3 Extended Request ID: QWit1wER1s70BJb90H/0zLu4yW5oI5M4Je5aK8STjCYkkhZNVWDAyUlS4uHW5uXYIdWo27nHTak=), S3 Extended Request ID: QWit1wER1s70BJb90H/0zLu4yW5oI5M4Je5aK8STjCYkkhZNVWDAyUlS4uHW5uXYIdWo27nHTak=: One or more objects could not be deleted (Service: null; Status Code: 200; Error Code: null; Request ID: 9FC4797479021CEE; S3 Extended Request ID: QWit1wER1s70BJb90H/0zLu4yW5oI5M4Je5aK8STjCYkkhZNVWDAyUlS4uHW5uXYIdWo27nHTak=)

What is the problem here? Is it related to the EMR Spark configuration or to the bucket policy?
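
A note on reading the trace: `Status Code: 200` does not mean the delete worked. S3's multi-object delete returns HTTP 200 for the request as a whole and reports failures per key in the response body, which is what `MultiObjectDeleteException` wraps. One way to see the per-key reason is to issue the same kind of bulk delete yourself and inspect the `Errors` list. The sketch below assumes a response shaped like the dict that boto3's S3 client returns from `delete_objects`; the keys and error codes in `sample_response` are invented for illustration, and `summarize_delete_errors` is a hypothetical helper.

```python
from collections import Counter


def summarize_delete_errors(response: dict) -> Counter:
    """Count per-object failures by error code in a delete_objects-style response."""
    return Counter(err.get("Code", "Unknown") for err in response.get("Errors", []))


# Hypothetical response; real ones come from s3_client.delete_objects(...).
sample_response = {
    "Deleted": [{"Key": "metadata/test/_SUCCESS"}],
    "Errors": [
        {"Key": "metadata/test/part-00000", "Code": "AccessDenied", "Message": "Access Denied"},
        {"Key": "metadata/test/part-00001", "Code": "AccessDenied", "Message": "Access Denied"},
    ],
}

print(summarize_delete_errors(sample_response))  # Counter({'AccessDenied': 2})
```

If the codes do come back as `AccessDenied`, that would point at the bucket policy rather than at the Spark configuration.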

Assuming those role IDs are correct (they start with AROA, so their format is valid), I believe you also need your AWS account number in the policy. For example, with the aws:userId entries being, in order, the EMR role ID, the instance profile role ID, the instance profile ID, and your AWS account number:

{
    "Version": "2008-10-17",
    "Id": "ExamplePolicyId123458",
    "Statement": [
        {
            "Sid": "ExampleStmtSid12345678",
            "Effect": "Deny",
            "Principal": "*",
            "Action": [
                "s3:DeleteBucket",
                "s3:DeleteObject*"
            ],
            "Resource": [
                "arn:aws:s3:::vr-dump",
                "arn:aws:s3:::vr-dump/*"
            ],
            "Condition": {
                "StringNotLike": {
                    "aws:userId": [
                        "AROAI3FK4OGNWXLHB7IXM:*",
                        "AROAISVF3UYNPH33RYIZ6:*",
                        "AIPAIDBGE7J475ON6BAEU",
                        "1234567890"
                    ]
                }
            }
        }
    ]
}
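
JSON has no comment syntax, so labels for the individual IDs cannot live inside the policy document itself. One option, sketched here under that assumption, is to keep the labelled IDs in a small Python script and generate the policy from it. `build_delete_deny_policy` is a hypothetical helper, not part of any AWS SDK; the IDs are the ones from this thread, and `1234567890` is the answer's placeholder for your own account number (real AWS account IDs have 12 digits).

```python
import json


def build_delete_deny_policy(bucket, exempt_user_ids):
    """Build a bucket policy that denies deletes unless aws:userId matches an exempt entry."""
    return {
        "Version": "2008-10-17",
        "Id": "ExamplePolicyId123458",
        "Statement": [
            {
                "Sid": "ExampleStmtSid12345678",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["s3:DeleteBucket", "s3:DeleteObject*"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {
                    "StringNotLike": {"aws:userId": list(exempt_user_ids)}
                },
            }
        ],
    }


policy = build_delete_deny_policy(
    "vr-dump",
    [
        "AROAI3FK4OGNWXLHB7IXM:*",  # EMR role ID
        "AROAISVF3UYNPH33RYIZ6:*",  # instance profile role ID
        "AIPAIDBGE7J475ON6BAEU",    # instance profile ID
        "1234567890",               # placeholder for your AWS account number
    ],
)

# Prints valid JSON that can be pasted into the bucket policy editor.
print(json.dumps(policy, indent=4))
```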

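Why create a bucket policy with a Deny? The logical approach is to give the EMR cluster a role that allows it to write to the bucket. That requires no bucket policy and no Deny at all, unless you have other policies that grant broad access. Do you have other, non-EMR policies whose access you are trying to deny?

@JohnRotenstein Yes, I have multiple IAM identities with broad S3 permissions, so I thought a bucket policy would be easier. I just don't know whether the error comes from the policy itself or from the Spark configuration. If I remove the policy, the EMR step completes successfully.

I'm confused. By default, no user or role has access to S3. You want EMR to have access, so grant it through the IAM role assigned to the cluster. That should cover what EMR needs. If you also want no other user or role to be able to delete the bucket or its objects, the better approach is not to assign those permissions in the first place, rather than adding a Deny policy afterwards. However, if you cannot control those Allow grants, I wonder whether the problem is caused by the :* at the end of the role IDs. Take a look at other questions on this topic and try the syntax they use. I'm also assuming you have already read the relevant article and the documentation.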