AWS Glue crawler gets Access Denied even with AmazonS3FullAccess attached
I just set up an AWS Glue crawler to crawl an S3 bucket. I created an IAM role for the crawler and attached the managed policies "AWSGlueServiceRole" and "AmazonS3FullAccess" to it. I have verified that the crawler is using this role. However, every time I run the crawler, I get error messages like the following in the logs:
ERROR: Error Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: ; S3 Extended Request ID: ) retrieving file at s3://my-bucket/snapshots/snapshot-1/mydb/mydb.mytable/11/part-00000-ffffffffff-ffff-ffff-ffff-c000.gz.parquet. Tables created did not infer schemas from this file.
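I have confirmed that a Lambda with "AmazonS3ReadOnlyAccess" attached to its execution role can access the bucket. What am I doing wrong?
EDIT: Enabling or disabling "Block all public access" has no noticeable effect.
EDIT2: The managed policy documents attached to the IAM role are shown below. There are no inline policies.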
AWSGlueServiceRole:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:*",
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:ListAllMyBuckets",
                "s3:GetBucketAcl",
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeRouteTables",
                "ec2:CreateNetworkInterface",
                "ec2:DeleteNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcAttribute",
                "iam:ListRolePolicies",
                "iam:GetRole",
                "iam:GetRolePolicy",
                "cloudwatch:PutMetricData"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:CreateBucket"
            ],
            "Resource": [
                "arn:aws:s3:::aws-glue-*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::aws-glue-*/*",
                "arn:aws:s3:::*/*aws-glue-*/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::crawler-public*",
                "arn:aws:s3:::aws-glue-*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:*:*:/aws-glue/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags",
                "ec2:DeleteTags"
            ],
            "Condition": {
                "ForAllValues:StringEquals": {
                    "aws:TagKeys": [
                        "aws-glue-service-resource"
                    ]
                }
            },
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*:*:security-group/*",
                "arn:aws:ec2:*:*:instance/*"
            ]
        }
    ]
}
AmazonS3FullAccess:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "*"
        }
    ]
}
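It turned out the problem was KMS. The bucket contained an export of an Aurora RDS snapshot, and such exports are evidently written encrypted. Reading an object encrypted with a KMS key requires permission to decrypt with that key in addition to the S3 permissions, so once I added the following policy to the role, I was all set: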
{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": [
            "kms:Decrypt"
        ],
        "Resource": [
            "arn:aws:kms:<region>:<my account id>:key/<my key id>"
        ]
    }
}
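For anyone who prefers to script this, here is a minimal sketch of building the same kind of policy document in Python and attaching it to the crawler's role as an inline policy. The function names, role name, policy name, and key ARN are hypothetical placeholders, not values from the question; `put_role_policy` is the standard boto3 IAM client call, and boto3 is imported lazily so the policy builder can be used without it.

```python
import json


def build_kms_decrypt_policy(key_arn: str) -> dict:
    """Return an IAM policy document that allows kms:Decrypt on one key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": [key_arn],
            }
        ],
    }


def attach_inline_policy(role_name: str, policy_name: str, policy: dict) -> None:
    """Attach the policy document to an IAM role as an inline policy."""
    import boto3  # lazy import: only needed when actually calling AWS

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy),
    )


if __name__ == "__main__":
    # Placeholder ARN (assumption): substitute your region, account and key id.
    example_arn = "arn:aws:kms:us-east-1:111122223333:key/example-key-id"
    print(json.dumps(build_kms_decrypt_policy(example_arn), indent=2))
    # attach_inline_policy("my-glue-crawler-role", "AllowKmsDecrypt",
    #                      build_kms_decrypt_policy(example_arn))
```

The call to `attach_inline_policy` is left commented out because it needs AWS credentials with permission to modify the role.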
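Can you update your question with the IAM policy of the role being used, and confirm who wrote this file to the S3 bucket?
The file was written by the Aurora RDS snapshot export process.
There was nothing "hidden" about it. The encryption key is right there on the export screen. It just didn't register in my head.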
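One way to check how an object was written is to look at its metadata: the S3 `HeadObject` response reports the server-side encryption mode and, for KMS-encrypted objects, the key that was used. The sketch below is an illustration under assumptions, not part of the original thread: the helper names are hypothetical, and it has to be run with credentials that are allowed to read the object's metadata.

```python
from typing import Optional


def kms_key_for_object(head_response: dict) -> Optional[str]:
    """Return the KMS key id if the response indicates SSE-KMS, else None."""
    sse = head_response.get("ServerSideEncryption", "")
    if str(sse).startswith("aws:kms"):
        return head_response.get("SSEKMSKeyId")
    return None


def fetch_head(bucket: str, key: str) -> dict:
    """Call S3 HeadObject for one object (requires boto3 and credentials)."""
    import boto3  # lazy import: only needed when actually calling AWS

    return boto3.client("s3").head_object(Bucket=bucket, Key=key)


if __name__ == "__main__":
    # Hand-written sample response (assumption), shaped like boto3's output.
    sample = {
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": "arn:aws:kms:us-east-1:111122223333:key/example-key-id",
    }
    print(kms_key_for_object(sample))
```

If the function returns a key ARN, the role reading the object also needs `kms:Decrypt` on that key.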
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket/snapshots*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt"
            ],
            "Resource": [
                "arn:aws:kms:<region>:<my account id>:key/<my key id>"
            ]
        }
    ]
}
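To see why "AmazonS3FullAccess" alone was not enough, it can help to check which action and resource pairs a policy document covers. The sketch below is a deliberately simplified matcher, not how AWS evaluates policies: it only looks at Allow statements and wildcard matching, and ignores explicit Deny, conditions, bucket policies, and key policies. The function name and the key ARN are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single string or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Simplified check: does any Allow statement match action and resource?"""
    for stmt in _as_list(policy["Statement"]):
        if stmt.get("Effect") != "Allow":
            continue
        # Action names are matched case-insensitively, resources as written.
        action_ok = any(
            fnmatchcase(action.lower(), pattern.lower())
            for pattern in _as_list(stmt["Action"])
        )
        resource_ok = any(
            fnmatchcase(resource, pattern)
            for pattern in _as_list(stmt["Resource"])
        )
        if action_ok and resource_ok:
            return True
    return False


# Placeholder key ARN (assumption) standing in for the real key.
KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/example-key-id"

S3_FULL_ACCESS = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}

S3_PLUS_KMS = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": ["arn:aws:s3:::my-bucket/snapshots*"],
        },
        {"Effect": "Allow", "Action": ["kms:Decrypt"], "Resource": [KEY_ARN]},
    ],
}

if __name__ == "__main__":
    obj = "arn:aws:s3:::my-bucket/snapshots/snapshot-1/example.parquet"
    print(is_allowed(S3_FULL_ACCESS, "s3:GetObject", obj))
    print(is_allowed(S3_FULL_ACCESS, "kms:Decrypt", KEY_ARN))
    print(is_allowed(S3_PLUS_KMS, "kms:Decrypt", KEY_ARN))
```

In this simplified model, `s3:*` covers the read of the object but says nothing about `kms:Decrypt`, which is the gap the policy above closes.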