Problem parsing JSON data in Python because of a random value
I am scraping a product review website. I can fetch the JSON data successfully, but I am having trouble parsing it. The levels of the data look like this:

payload -> reviews -> 22Y6N61W6TO2 -> customerReviews

The data I want is at the "customerReviews" level. However, for other items this ID-like key (for example "6IYETQATGRMP") will be different, and I don't want a separate Python script for each item just to account for that value. How can I use something like a wildcard to reach the data I am looking for?

I am using Python 3, requests, and json in my script. My script looks like this:
import json
import pandas as pd
with open('data.json', 'r') as f:
    data = json.load(f)
df = pd.json_normalize(data['payload']['reviews']['22Y6N61W6TO2']['customerReviews'])
print(df)
Here is part of the JSON I am working with:
"payload": {
"products": {},
"offers": {},
"idmlMap": {},
"reviews": {
"22Y6N61W6TO2": {
"averageOverallRating": 4.4783,
"roundedAverageOverallRating": 4.5,
"overallRatingRange": 5.0,
"totalReviewCount": 759,
"recommendedPercentage": 89,
"ratingValueOneCount": 35,
"ratingValueTwoCount": 27,
"ratingValueThreeCount": 30,
"ratingValueFourCount": 115,
"ratingValueFiveCount": 552,
"percentageOneCount": 4,
"percentageTwoCount": 3,
"percentageThreeCount": 3,
"percentageFourCount": 15,
"percentageFiveCount": 72,
"activeSort": "relevancy",
"pagination": {
"total": 759,
"pages": [
{
"num": 1,
"gap": false,
"active": true,
"url": "sort=relevancy&page=1"
},
{
"num": 2,
"gap": false,
"active": false,
"url": "sort=relevancy&page=2"
},
{
"num": 3,
"gap": false,
"active": false,
"url": "sort=relevancy&page=3"
},
{
"num": 4,
"gap": false,
"active": false,
"url": "sort=relevancy&page=4"
},
{
"num": 5,
"gap": false,
"active": false,
"url": "sort=relevancy&page=5"
},
{
"num": 6,
"gap": false,
"active": false,
"url": "sort=relevancy&page=6"
},
{
"num": 0,
"gap": true,
"active": false
},
{
"num": 38,
"gap": false,
"active": false,
"url": "sort=relevancy&page=38"
}
],
"next": {
"num": 0,
"gap": false,
"active": false,
"url": "sort=relevancy&page=2"
},
"currentSpan": "1-20"
},
"customerReviews": [
{
"reviewId": "248695872",
"authorId": "13b0b650b7694a54267279bf80e0fdfa99cc7c3c5150d32aff7db274e74c07f5f6e7f7b6c4fe8cb64a007c9e3c0f0c04",
"negativeFeedback": 0,
"positiveFeedback": 0,
"rating": 5.0,
"reviewTitle": "Amazing",
"reviewText": "This thing is amazing. I cooked bbq ribs in 30 mins. Then caramelized for 6 mins in my oven. They was awesome. Best kitchen appliance of 2020. Wish i had bought it before dec 31st. Buy one folks. You'll love it.",
"reviewSubmissionTime": "1/1/2021",
"userNickname": "Keith",
"badges": [
{
"badgeType": "Custom",
"id": "VerifiedPurchaser",
"contentType": "REVIEW"
}
],
"userAttributes": {},
"photos": [
{
"Id": "e917ed53-cf49-48af-b454-42f3fd87536a",
"Sizes": {
"normal": {
"Id": "normal",
"Url": "https://i5.walmartimages.com/dfw/6e29e393-988c/k2-_d716ba9d-2c5b-4f82-b9a6-588575975fe6.v1.bin"
},
"thumbnail": {
"Id": "thumbnail",
"Url": "https://i5.walmartimages.com/dfw/6e29e393-988c/k2-_d716ba9d-2c5b-4f82-b9a6-588575975fe6.v1.bin?odnWidth=150&odnHeight=150&odnBg=ffffff"
}
},
"SizesOrder": [
"normal",
"thumbnail"
]
}
],
"videos": [],
"externalSource": "bazaarvoice"
}
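You just need to put the ID in an ordinary variable and use standard dictionary indexing:
import json
import random

id_1 = '6IYETQATGRMP'
id_2 = '7GAADHOOLWCT'
id_3 = '8WWBHOLWQNNZ'

_json_data = '''{
    "level_1": {
        "level_2": {
            "6IYETQATGRMP": {
                "level_4": "your first level 4 data"
            },
            "7GAADHOOLWCT": {
                "level_4": "your second level 4 data"
            },
            "8WWBHOLWQNNZ": {
                "level_4": "your third level 4 data"
            }
        }
    }
}'''

# Pick one of the IDs; in real use this would be the ID of the current item.
your_var = random.choice([id_1, id_2, id_3])
# json.loads parses a string (json.load expects a file object).
data = json.loads(_json_data)
print(data['level_1']['level_2'][your_var]['level_4'])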
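This way you can set the ID you need through `your_var`, and as long as that ID exists in the data the lookup works as expected.

I believe you can get the key first, like this:
key = list(data['payload']['reviews'].keys())[0]
df = pd.json_normalize(data['payload']['reviews'][key]['customerReviews'])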
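Taking the first key works when "reviews" holds exactly one ID. As a minimal sketch of a variant that does not depend on how many IDs there are, the loop below walks every entry under "reviews" and gathers its "customerReviews" list. The trimmed payload, the helper name `collect_customer_reviews`, and the added `productId` field are assumptions made for illustration and are not part of the original data or script.

```python
import json

# Hypothetical, trimmed-down payload with the same shape as the question's JSON.
raw = '''
{
  "payload": {
    "reviews": {
      "22Y6N61W6TO2": {
        "totalReviewCount": 759,
        "customerReviews": [
          {"reviewId": "248695872", "rating": 5.0, "reviewTitle": "Amazing"}
        ]
      }
    }
  }
}
'''


def collect_customer_reviews(data):
    """Return every customerReviews entry, whatever the ID keys happen to be."""
    rows = []
    for product_id, block in data["payload"]["reviews"].items():
        # .get() tolerates an entry that has no customerReviews list.
        for review in block.get("customerReviews", []):
            # Keep the ID next to each review so rows stay traceable (assumed field name).
            rows.append({"productId": product_id, **review})
    return rows


data = json.loads(raw)
rows = collect_customer_reviews(data)
print(rows)
```

The resulting list of dictionaries should be usable in place of the hard-coded lookup, for example by passing `rows` to `pd.json_normalize` as in the question's script.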
Please treat this as an example. If you include a sample of the JSON response, you should be able to get this working. Thanks.
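For something closer to the "wildcard" the question asks about, one possible approach is to search the parsed structure recursively for a key name, so the path above it does not need to be known. This is only a sketch: the generator name `find_key` and the two-ID sample dictionary are hypothetical, and it does not look for further matches nested inside a value it has already matched.

```python
def find_key(node, target):
    """Yield every value stored under `target` anywhere in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == target:
                yield value
            else:
                # Only descend into values that were not themselves a match.
                yield from find_key(value, target)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, target)


# Hypothetical sample with two different ID keys under "reviews".
sample = {
    "payload": {
        "products": {},
        "reviews": {
            "22Y6N61W6TO2": {"customerReviews": [{"reviewId": "1"}]},
            "6IYETQATGRMP": {"customerReviews": [{"reviewId": "2"}, {"reviewId": "3"}]},
        },
    }
}

found = list(find_key(sample, "customerReviews"))
# Flatten the per-ID lists into a single list of review dicts.
all_reviews = [review for reviews in found for review in reviews]
print(all_reviews)
```

The trade-off is that a recursive search ignores where the key sits, so it may pick up unrelated "customerReviews" keys if the real response contains any elsewhere.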