Amazon Web Services: unable to retrieve the processed file from an S3 bucket


amazon-web-services amazon-s3


I'm new to AWS and am trying out their OCR service through the Textract API. As I understand it, I need to upload the file to an S3 bucket and then run Textract on it.

I created the bucket, and the file is in it:

I have the following permissions:

But when I run my code, it fails with an error:

        import boto3
        import trp

        # Document
        s3BucketName = "textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420"
        documentName = "ok0001_prioridade01_x45f3.pdf"
        # Amazon Textract client
        textract = boto3.client('textract',region_name="us-east-1",aws_access_key_id="xxxxxx",
                                aws_secret_access_key="xxxxxxxxx")

        # Call Amazon Textract
        response = textract.analyze_document(
            Document={
                'S3Object': {
                    'Bucket': s3BucketName,
                    'Name': documentName
                }
            },
            FeatureTypes=["TABLES"])

Here is the error I get:

botocore.errorfactory.InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the AnalyzeDocument operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.

What am I missing? How can I fix this?

Amazon Textract currently supports the PNG, JPEG, and PDF formats. It looks like you are using a PDF.

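Once you have a file in a valid format, you can use the Python S3 API to read the object's data from S3. After reading the object, you can pass the byte array to the analyze_document method. For a complete example of using the AWS SDK for Python (Boto3) with Amazon Textract, see the example that detects text, form, and table elements in document images.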
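The read-then-pass-bytes approach described above could look roughly like the sketch below. It is a minimal illustration, not the linked AWS example: the helper names make_clients and analyze_s3_document are made up for this answer, and it assumes credentials are available from the environment and that the document is small enough for a synchronous call.

```python
def make_clients(region_name="us-east-1"):
    """Create S3 and Textract clients in the same region.

    Hypothetical helper: credentials are assumed to come from the
    environment or a configured profile rather than being hard-coded.
    """
    import boto3  # imported lazily so the function below can be tried with stub clients

    s3_client = boto3.client("s3", region_name=region_name)
    textract_client = boto3.client("textract", region_name=region_name)
    return s3_client, textract_client


def analyze_s3_document(s3_client, textract_client, bucket, key):
    """Download an object from S3 and send its raw bytes to Textract."""
    # get_object returns a streaming body; read() yields the whole object as bytes.
    s3_object = s3_client.get_object(Bucket=bucket, Key=key)
    document_bytes = s3_object["Body"].read()

    # Pass the bytes directly instead of an S3Object reference.
    return textract_client.analyze_document(
        Document={"Bytes": document_bytes},
        FeatureTypes=["TABLES"],
    )
```

To use it, call make_clients() once and then analyze_s3_document(s3_client, textract_client, s3BucketName, documentName). Because the bytes are downloaded with your own S3 client, a failure at the get_object step points to a key or permission problem on the S3 side rather than in Textract.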

Try the code example below and see whether it resolves your problem.

"Could you clarify which parameters to use?"

I just ran the Java V2 example and it works fine. In this example, I am using a PNG file located in a specific Amazon S3 bucket.

Here are the parameters you need:

Make sure you set the same parameters when you implement this in Python.
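The error message asks you to check the object key, the region, and the access permissions, so it may help to verify those parameters from Python before calling Textract. The following is a rough diagnostic sketch: the function names are hypothetical, it assumes the caller is allowed to call GetBucketLocation and HeadObject, and it catches exceptions broadly for brevity.

```python
def bucket_region(s3_client, bucket):
    """Return the region a bucket lives in.

    S3 reports buckets in us-east-1 with an empty LocationConstraint,
    so that case is mapped back to the region name.
    """
    response = s3_client.get_bucket_location(Bucket=bucket)
    return response.get("LocationConstraint") or "us-east-1"


def check_textract_inputs(s3_client, bucket, key, textract_region):
    """Collect likely causes of InvalidS3ObjectException as a list of strings."""
    problems = []

    # Check 1: the bucket and the Textract client should use the same region.
    region = bucket_region(s3_client, bucket)
    if region != textract_region:
        problems.append(
            f"bucket is in {region} but Textract client uses {textract_region}"
        )

    # Check 2: the key must exist and be readable with the current credentials.
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:  # broad on purpose: this is only a diagnostic
        problems.append(f"cannot read s3://{bucket}/{key}: {exc}")

    return problems
```

An empty list means the key, the region, and read access all look consistent for the credentials behind s3_client. Note that Textract uses the credentials of the Textract client, so both clients should be created with the same credentials for the check to be meaningful.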

If the S3 access policy is missing, you can attach the AmazonS3ReadOnlyAccess policy as a quick way to resolve the problem.

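A good practice is to apply the principle of least privilege and grant further access only when it is needed. I therefore suggest creating a dedicated policy that allows access only to the S3 bucket textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420, and only in the us-east-1 region.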
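As an illustration of such a bucket-specific policy, the sketch below builds a policy document in Python and prints it as JSON. It is an assumption about what a minimal policy might contain, not a policy taken from the original post: it grants only s3:GetObject on the objects in the bucket and uses the aws:RequestedRegion condition key to limit requests to us-east-1. Review it against your own requirements before attaching it.

```python
import json


def build_read_only_policy(bucket, region):
    """Build a least-privilege IAM policy document for reading one bucket.

    Hypothetical helper: the statement layout is an illustrative choice.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Reading objects is enough for Textract to fetch the document.
                "Action": ["s3:GetObject"],
                # Only objects inside this one bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Only requests made to the given region.
                "Condition": {"StringEquals": {"aws:RequestedRegion": region}},
            }
        ],
    }


policy = build_read_only_policy(
    "textract-console-us-east-1-057eddde-3f44-45c5-9208-fec27f9f6420",
    "us-east-1",
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console as a customer-managed policy and attached to the user whose access keys the script uses.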

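Could you clarify which parameters to use? TypeError: __init__() missing 3 required positional arguments: 'textract_client', 's3_resource', and 'sqs_resource'

See the picture above, which shows a successful call and the required parameters.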