Reading a JSON file from S3 using Python boto3
I have the following JSON in the S3 bucket "test":
{
'Details' : "Something"
}
I am using the following code to read this JSON and print the key "Details":
s3 = boto3.resource('s3',
aws_access_key_id=<access_key>,
aws_secret_access_key=<secret_key>
)
content_object = s3.Object('test', 'sample_json.txt')
file_content = content_object.get()['Body'].read().decode('utf-8')
json_content = json.loads(repr(file_content))
print(json_content['Details'])
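The error I get is "string indices must be integers".
I don't want to download the file from S3 and then read it.
As mentioned in the comments above, repr has to be removed, and the JSON file has to use double quotes for attribute names. Use this file on aws/s3: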
{
"Details" : "Something"
}
Here is the Python code that works:
import boto3
import json
s3 = boto3.resource('s3')
content_object = s3.Object('test', 'sample_json.txt')
file_content = content_object.get()['Body'].read().decode('utf-8')
json_content = json.loads(file_content)
print(json_content['Details'])
# >> Something
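To see why both changes matter, here is a small local sketch that needs no S3 access. The sample strings are made up for illustration; it shows one way the reported error can arise, not necessarily the exact content of the original file.

```python
import json

valid_text = '{"Details": "Something"}'          # double-quoted attribute names
single_quoted_text = "{'Details': 'Something'}"  # not valid JSON

# Valid JSON text parses to a dict.
parsed = json.loads(valid_text)
print(type(parsed).__name__, parsed['Details'])

# Single-quoted attribute names are rejected by the parser.
try:
    json.loads(single_quoted_text)
    single_quote_error = None
except json.JSONDecodeError as exc:
    single_quote_error = exc.msg
print(single_quote_error)

# repr() wraps the text in quotes, so the parser sees one JSON string
# and returns a str; indexing that str with 'Details' raises TypeError.
wrapped = json.loads(repr(single_quoted_text))
print(type(wrapped).__name__)
```

So repr hides the quoting problem by turning the whole document into a string, and removing repr exposes the real issue, which is the single quotes.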
I was stuck for a while because decoding didn't work for me (the S3 object was gzipped). I found a discussion that helped me:
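The code from that discussion is not reproduced here. As a minimal sketch, assuming the object body is gzip-compressed UTF-8 JSON, reading it might look like the following. The function names and the bucket and key parameters are hypothetical; only gzip.decompress, json.loads, and the boto3 get_object call are existing APIs.

```python
import gzip
import json


def parse_gzipped_json(raw_bytes):
    """Decompress gzip bytes and parse the result as JSON."""
    return json.loads(gzip.decompress(raw_bytes).decode('utf-8'))


def read_gzipped_json_from_s3(bucket, key):
    """Fetch an object with boto3 and parse it as gzipped JSON.

    bucket and key are placeholders for your own bucket name and object key.
    """
    import boto3  # imported here so parse_gzipped_json can be tried without boto3

    body = boto3.client('s3').get_object(Bucket=bucket, Key=key)['Body']
    return parse_gzipped_json(body.read())


# Local check that needs no AWS access: compress a small JSON document in memory.
sample = gzip.compress(json.dumps({'Details': 'Something'}).encode('utf-8'))
jsonData = parse_gzipped_json(sample)
print(jsonData['Details'])
```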
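If you print jsonData, you will see the desired JSON file! If you run the test in AWS itself, make sure to check the CloudWatch logs, because in Lambda the full JSON file is not output if it is too long.
The following worked for me: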
# read_s3.py
import json
import boto3
BUCKET = 'MY_S3_BUCKET_NAME'
FILE_TO_READ = 'FOLDER_PATH/my_file.json'
client = boto3.client('s3',
    aws_access_key_id='MY_AWS_KEY_ID',
    aws_secret_access_key='MY_AWS_SECRET_ACCESS_KEY'
)
result = client.get_object(Bucket=BUCKET, Key=FILE_TO_READ)
text = result["Body"].read().decode()
# text is a str, so parse it before looking up a key
data = json.loads(text)
print(data['Details'])  # Use your desired JSON key for your value
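Hard-coding the AWS ID and secret key directly is not a good idea. As a best practice, consider either of the following:
(1) Read the AWS credentials from a JSON file stored locally: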
import json
import boto3
# aws_cred.json is a local file holding the two keys used below
with open('aws_cred.json') as f:
    credentials = json.load(f)
client = boto3.client('s3',
    aws_access_key_id=credentials['MY_AWS_KEY_ID'],
    aws_secret_access_key=credentials['MY_AWS_SECRET_ACCESS_KEY']
)
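(2) Read from environment variables (my preferred option for deployment):
Let's prepare a shell script (set_env.sh) that sets the environment variables and runs the Python script (read_s3.py), as follows: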
# set_env.sh
export MY_AWS_KEY_ID='YOUR_AWS_ACCESS_KEY_ID'
export MY_AWS_SECRET_ACCESS_KEY='YOUR_AWS_SECRET_ACCESS_KEY'
# execute the python file containing your code as stated above that reads from s3
python read_s3.py # will execute the python script to read from s3
Now execute the shell script in the terminal as follows:
sh set_env.sh
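The answer does not show how read_s3.py picks the variables up. One possible sketch, assuming the variable names exported by set_env.sh, reads them from os.environ. The helper names credentials_from_env and make_client are hypothetical.

```python
import os


def credentials_from_env(environ=os.environ):
    """Build boto3 credential keyword arguments from environment variables.

    MY_AWS_KEY_ID and MY_AWS_SECRET_ACCESS_KEY are the names exported by
    set_env.sh; a missing variable raises KeyError with the variable name.
    """
    return {
        'aws_access_key_id': environ['MY_AWS_KEY_ID'],
        'aws_secret_access_key': environ['MY_AWS_SECRET_ACCESS_KEY'],
    }


def make_client():
    """Create an S3 client from the environment-supplied credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client('s3', **credentials_from_env())


# Example with a stand-in mapping instead of the real environment.
fake_env = {'MY_AWS_KEY_ID': 'id', 'MY_AWS_SECRET_ACCESS_KEY': 'secret'}
print(sorted(credentials_from_env(fake_env)))
```

Because the script runs python read_s3.py after the export lines, the variables are visible to the Python process it starts.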
I'd like to add that botocore.response.StreamingBody works well with json.load:
import json
import boto3
s3 = boto3.resource('s3')
# bucket and key stand for your bucket name and object key
obj = s3.Object(bucket, key)
data = json.load(obj.get()['Body'])
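This works because json.load only needs an object with a read() method. As a small local illustration, an io.BytesIO buffer stands in for the S3 body here; that substitution is an assumption made for the example, not the real StreamingBody class.

```python
import io
import json

# Stand-in for obj.get()['Body']: any object with a read() method will do.
fake_body = io.BytesIO(b'{"Details": "Something"}')

data = json.load(fake_body)
print(data['Details'])
```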
You can use the following code in AWS Lambda to read a JSON file from an S3 bucket and process it with Python:
import json
import boto3
import sys
import logging
# logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
VERSION = 1.0
s3 = boto3.client('s3')
def lambda_handler(event, context):
    bucket = 'my_project_bucket'
    key = 'sample_payload.json'
    response = s3.get_object(Bucket=bucket, Key=key)
    content = response['Body']
    jsonObject = json.loads(content.read())
    print(jsonObject)
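Remove repr.
@AlexHall Initially I tried removing repr, but it didn't work; it gave ValueError: Expecting property name enclosed in double quotes.
I solved the problem. JSON attribute names have to be enclosed in double quotes. I changed the JSON format.
On which line do you get the error? Split that line up. file_content = content_object... is a single line containing 4 steps. Split it into 4 lines with 4 intermediate variables, then see which line fails.
All I needed for my problem was ".read().decode('utf-8')", so thanks for the question (;
Note: s3.Object('bucketName', 'keyName')
So an example for getting the file s3://foobarBucketName/folderA/folderB/myFile.json is s3.Object('foobarBucketName', 'folderA/folderB/myFile.json')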
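If you start from a full s3:// URI, the split into bucket name and key can be done with the standard library. This is a minimal sketch; split_s3_uri is a hypothetical helper, not part of boto3.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError(f'not an S3 URI: {uri!r}')
    return parsed.netloc, parsed.path.lstrip('/')


bucket_name, key_name = split_s3_uri(
    's3://foobarBucketName/folderA/folderB/myFile.json'
)
print(bucket_name)
print(key_name)
```

The two values can then be passed to s3.Object(bucket_name, key_name).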