Python: how to efficiently add a dimension to a DataFrame created from a complex dictionary

python pandas

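I think melt (as discussed earlier) might be useful here, but I can't quite see how to use it to solve my problem.

I start with a complex dictionary like this: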

orders = [
    {
        "order_id" : 0,
        "lines" : [
            {
                "line_id" : 1,
                "line_amount" : 3.45,
                "line_description" : "first line"
            },
            {
                "line_id" : 2,
                "line_amount" : 6.66,
                "line_description" : "second line"
            },
            {
                "line_id" : 3,
                "line_amount" : 5.43,
                "line_description" : "third line"
            },
        ]
    },
    {
        "order_id" : 1,
        "lines" : [
        ...
    }
]

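I want a DataFrame with one row per order line (not one row per order) that still carries the attributes of the original order (in this case just the order's id). Currently, the most efficient way I have found to achieve this is: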

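import pandas

# Orders DataFrame: one row per order, with the raw list of line dicts in "lines"
odf = pandas.DataFrame(orders)

line_dfs = []
for oid, line_list in zip(odf["order_id"], odf["lines"]):
    # Build a small DataFrame from this order's lines and tag it with the order id
    line_df = pandas.DataFrame(line_list)
    line_df["order_id"] = oid
    line_dfs.append(line_df)

# Line DataFrame: one row per order line
ldf = pandas.concat(line_dfs, sort=False, ignore_index=True)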

Is there a more efficient, "vectorized" way to achieve this, something like:

ldf = odf.lines.apply(...?...)

Any help is appreciated, including links to answers on SO or elsewhere that already address this and that I have not found yet.

Have you tried read_json?

df = pd.read_json(orders)

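Use a list comprehension with pop to extract the lines key, merge the order and line dictionaries, and pass the resulting list of dictionaries to the DataFrame constructor: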
orders = [
    {
        "order_id" : 0,
        "lines" : [
            {
                "line_id" : 1,
                "line_amount" : 3.45,
                "line_description" : "first line"
            },
            {
                "line_id" : 2,
                "line_amount" : 6.66,
                "line_description" : "second line"
            },
            {
                "line_id" : 3,
                "line_amount" : 5.43,
                "line_description" : "third line"
            },
        ]
    },
    {
        "order_id" : 1,
        "lines" : [
            {
                "line_id" : 1,
                "line_amount" : 30.45,
                "line_description" : "first line"
            },
            {
                "line_id" : 2,
                "line_amount" : 60.66,
                "line_description" : "second line"
            },
            {
                "line_id" : 3,
                "line_amount" : 50.43,
                "line_description" : "third line"
            },
        ]
    }
]
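
The list comprehension itself appears to be missing from this copy of the answer. A minimal sketch of what the sentence above describes might look like the following; the shortened `orders` data and the names `rows` and `ldf` are assumptions made for illustration, not the answerer's original code.

```python
import pandas as pd

# Shortened stand-in for the `orders` structure shown above.
orders = [
    {"order_id": 0, "lines": [
        {"line_id": 1, "line_amount": 3.45, "line_description": "first line"},
        {"line_id": 2, "line_amount": 6.66, "line_description": "second line"},
    ]},
    {"order_id": 1, "lines": [
        {"line_id": 1, "line_amount": 30.45, "line_description": "first line"},
    ]},
]

# pop() removes "lines" from each order dict (so `orders` is mutated);
# what remains of the order is merged into every one of its line dicts.
rows = [{**order, **line} for order in orders for line in order.pop("lines")]

# One row per order line, with the order-level keys repeated on each row.
ldf = pd.DataFrame(rows)
print(ldf)
```

Because `pop` modifies the input in place, work on a copy (for example via `copy.deepcopy(orders)`) if the nested structure is still needed afterwards.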


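Another solution with a loop:
import pandas as pd

L = []
for x in orders:
    # pop() removes "lines" from the order dict, leaving only order-level keys in x
    for y in x.pop('lines'):
        # merge the order-level keys with each line's keys
        L.append({**x, **y})

odf = pd.DataFrame(L)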
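
As an alternative that is not part of the answer above, pandas also provides `pd.json_normalize`, which can flatten this kind of nested list of records without mutating the input. The sketch below assumes pandas 1.0 or later (where `json_normalize` is available at the top level) and uses a shortened, illustrative `orders` list.

```python
import pandas as pd

# Shortened stand-in for the `orders` structure shown above.
orders = [
    {"order_id": 0, "lines": [
        {"line_id": 1, "line_amount": 3.45, "line_description": "first line"},
        {"line_id": 2, "line_amount": 6.66, "line_description": "second line"},
    ]},
    {"order_id": 1, "lines": [
        {"line_id": 1, "line_amount": 30.45, "line_description": "first line"},
    ]},
]

# record_path names the nested list to expand into rows;
# meta lists the order-level keys to repeat on every line row.
ldf = pd.json_normalize(orders, record_path="lines", meta=["order_id"])
print(ldf)
```

Unlike the `pop`-based versions, this leaves `orders` untouched; the `order_id` column ends up last rather than first.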

This would give me a DataFrame of orders, whereas I want a DataFrame of lines. I should probably restate the question with the "blob" above being simply a dict (so that the basic constructor can be used).