How do I parse nested JSON fields in a list into a DataFrame in Python?

python json pandas

I am making API calls that return a nested JSON response for each ID.

If I run the API call for a single ID, the JSON looks like this:

u'{"id":26509,"name":"ORD.00001","order_type":"sales","consumer_id":415372,"order_source":"in_store","is_submitted":0,"fulfillment_method":"in_store","order_total":150,"balance_due":150,"tax_total":0,"coupon_total":0,"order_status":"cancelled","payment_complete":null,"created_at":"2017-12-02 19:49:15","updated_at":"2017-12-02 20:07:25","products":[{"id":48479,"item_master_id":239687,"name":"QA_FacewreckHaze","quantity":1,"pricing_weight_id":null,"category_id":1,"subcategory_id":8,"unit_price":"150.00","original_unit_price":"150.00","discount_total":"0.00","created_at":"2017-12-02 19:49:45","sold_weight":10,"sold_weight_uom":"GR"}],"payments":[],"coupons":[],"taxes":[],"order_subtotal":150}'

I can successfully parse this single JSON string into a DataFrame with the following lines:

order_detail_staging = json.loads(r.text)  # parse the response body into a dict
order_detail = json_normalize(order_detail_staging)  # flatten the dict into a one-row DataFrame

I can iterate over all the IDs through the API with the following code:

lists = []

for id in df.id:
    r = requests.get("URL/v1/orders/{id}".format(id=id), headers=headers_order)
    lists.append(r.text)

Now all of my JSON responses are stored in a list. How do I write all the elements of the list into a DataFrame?

The code I have been trying is:

for x in lists:
    order_detail = json.loads(x)
    order_detail = json_normalize(x)
    print(order_detail)

I get an error:

AttributeError: 'unicode' object has no attribute 'itervalues'

I know it happens on this line:

order_detail = json_normalize(x)

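Why does this line work for a single JSON string but not for the list? How can I get the list of nested JSON into a DataFrame?

Thanks in advance for your help.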

Edit:

Try this:

In [28]: lst = list(set(order_detail) - set(['products','coupons','payments','taxes']))

In [29]: pd.io.json.json_normalize(order_detail, ['products'], lst, meta_prefix='p_')
Out[29]:
   category_id           created_at discount_total     id  item_master_id              name original_unit_price pricing_weight_id  \
0            1  2017-12-02 19:49:45           0.00  48479          239687  QA_FacewreckHaze              150.00              None

   quantity  sold_weight         ...          p_tax_total  p_order_source p_consumer_id p_payment_complete p_coupon_total  \
0         1           10         ...                    0        in_store        415372               None              0

   p_fulfillment_method  p_order_type p_is_submitted  p_balance_due         p_updated_at
0              in_store         sales              0            150  2017-12-02 20:07:25

[1 rows x 29 columns]
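The same record_path/meta approach can be applied to the whole list of responses at once. The following is a minimal sketch, not a definitive implementation. It assumes pandas 1.0 or later, where the function is exposed as pd.json_normalize, and it uses two trimmed-down responses shaped like the order JSON in the question. The second order and its product names are made up for illustration.

```python
import json
import pandas as pd

# Two trimmed-down responses shaped like the order JSON in the question.
# The second order is invented for illustration only.
responses = [
    '{"id": 26509, "name": "ORD.00001", "order_total": 150, '
    '"products": [{"id": 48479, "name": "QA_FacewreckHaze", "quantity": 1}], '
    '"payments": [], "coupons": [], "taxes": []}',
    '{"id": 26510, "name": "ORD.00002", "order_total": 80, '
    '"products": [{"id": 48480, "name": "ItemA", "quantity": 2}, '
    '{"id": 48481, "name": "ItemB", "quantity": 1}], '
    '"payments": [], "coupons": [], "taxes": []}',
]

# Parse each JSON string into a dict before normalizing.
orders = [json.loads(text) for text in responses]

# Every top-level key that is not a nested list becomes order-level metadata.
nested = {"products", "payments", "coupons", "taxes"}
meta_cols = sorted(set(orders[0]) - nested)

# One row per product; order-level fields are repeated with a "p_" prefix
# so they do not collide with the product's own "id" and "name" columns.
products = pd.json_normalize(
    orders, record_path="products", meta=meta_cols, meta_prefix="p_"
)
print(products)
```

Note that an order whose products list is empty contributes no rows with this approach, so it may be worth checking for such orders separately.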

Use the response.json() method and feed the result directly to json_normalize. For example:

df = json_normalize([
    requests.get("URL/v1/orders/{id}".format(id=id), headers = headers_order).json()
    for id in df.id
])
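This also explains the AttributeError in the question: inside the loop, the result of json.loads(x) is stored in order_detail, but json_normalize is then called with x, which is still the raw string. A minimal offline sketch of the corrected pattern follows. It assumes pandas 1.0 or later (pd.json_normalize) and uses two short made-up strings in place of the real API responses.

```python
import json
import pandas as pd

# Stand-ins for the response bodies collected in the question's `lists`;
# the values are invented for illustration.
lists = [
    '{"id": 1, "name": "ORD.00001", "order_total": 150, "products": []}',
    '{"id": 2, "name": "ORD.00002", "order_total": 80, "products": []}',
]

# Parse every string first, then normalize the parsed objects in one call.
# Passing the raw string x to json_normalize is what raised the error.
parsed = [json.loads(x) for x in lists]
order_details = pd.json_normalize(parsed)
print(order_details)
```

Normalizing the whole list in one call yields one row per order, instead of building a separate one-row DataFrame on each loop iteration.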

UPD: a fail-safe version that handles invalid responses:

def gen():
    for id in df.id:
        try:
            yield requests.get("URL/v1/orders/{id}".format(id=id), headers = headers_order).json()
        except ValueError:  # incorrect API response
            pass

df = json_normalize(list(gen()))
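Silently skipping bad responses makes it hard to tell which IDs were dropped. The variation below is only a sketch under stated assumptions: collect_orders and fetch_text are hypothetical names, not part of requests or pandas, and a dictionary of made-up bodies stands in for the API so the example runs offline. It assumes pandas 1.0 or later for pd.json_normalize.

```python
import json
import pandas as pd


def collect_orders(ids, fetch_text):
    """Return (DataFrame of parsed orders, list of ids whose body was not valid JSON).

    fetch_text is a hypothetical callable mapping an id to the response body string.
    """
    parsed, failed = [], []
    for order_id in ids:
        try:
            parsed.append(json.loads(fetch_text(order_id)))
        except ValueError:  # json decoding errors are ValueError subclasses
            failed.append(order_id)
    return pd.json_normalize(parsed), failed


# Stand-in for the API so the sketch runs offline; id 2 returns a non-JSON body.
fake_bodies = {
    1: '{"id": 1, "order_total": 150}',
    2: "<html>error</html>",
    3: '{"id": 3, "order_total": 80}',
}
orders_df, failed_ids = collect_orders([1, 2, 3], fake_bodies.__getitem__)
print(orders_df)
print("failed ids:", failed_ids)
```

With the real API, fetch_text could wrap the requests.get call from the answer and return the response's text attribute. The list of failed IDs can then be retried or inspected instead of being lost.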

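Thanks for the reply, @Marat. I tried your line and got the error ValueError: No JSON object could be decoded

Is it raised by pandas or by requests?

So it is raised by requests, because the API returns invalid JSON. I edited the answer to account for this, assuming it is safe to ignore those responses.

Wow, that works! Can you tell me how you knew it was a requests problem?

If you look at the stack trace, the first line after your code is File /Users/bob/anaconda/lib/python2.7/site-packages/requests/models.py, which means everything below it is inside requests.

Thanks for your response. I got the error TypeError: sequence item 0: expected string, numpy.int64 found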
