Pandas nested dictionary


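This is my first question on Stack Overflow.

I have a three-level nested dictionary that I want to convert into a DataFrame. The dictionary has the following structure: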

dictionary = {'CompanyA': {'Revenue': {date1 : $1}, {date2: $2}},... 
                          {'ProfitLoss': {date1 : $0}, {date2: $1}}},
              'CompanyB': {'Revenue': {date1 : $1}, {date2: $2}},... 
                          {'ProfitLoss': {date1 : $0}, {date2: $1}}},
              'CompanyC': {'Revenue': {date1 : $1}, {date2: $2}},...
                          {'ProfitLoss': {date1 : $0}, {date2: $1}}}}
So far, I have been able to build a DataFrame with:

df = pd.DataFrame.from_dict(dictionary)
But the result is a DataFrame whose values are dictionaries, like this:

          CompanyA          CompanyB          CompanyC
Revenue   {date1:$0,..}     {date1:$1,..}     {date1:$0,..} 
ProfitLoss{date1:$0,..}     {date1:$0,..}     {date1:$0,..}
I would like the table to look like this:

                    CompanyA    CompanyB    CompanyC  
Revenue    Date1       $1          $1          $1 
           Date2       $2          $2          $2

ProfitLoss Date1       $0          $0          $0
           Date2       $1          $1          $1

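I tried using pd.MultiIndex.from_dict (and .from_product) and changing the index, but without success. Do you know what I should do next? Any hints would be greatly appreciated.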

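I know you are new here, but a similar question has probably already been answered. Next time, try searching for similar questions by keyword. For example, I searched for "pandas nested dict" and the very first result was the relevant SO post.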

In any case, you need to reshape your input dict. You need a dict structured like this:

{
    'CompanyA': {
        ('Revenue', 'date1'): 1,
        ('ProfitLoss', 'date1'): 0,
    }
    ...
}
I would do it like this:

import pandas as pd

data = {
    'CompanyA': {
        'Revenue': {
            "date1": 1,
            "date2": 2
        },
        'ProfitLoss': {
            "date1": 0,
            "date2": 1
        }
    },
    'CompanyB': {
        'Revenue': {
            "date1": 4,
            "date2": 5
        },
        'ProfitLoss': {
            "date1": 2,
            "date2": 3
        }
    }
}

# Reshape your data and pass it to `DataFrame.from_dict`
df = pd.DataFrame.from_dict({i: {(j, k): data[i][j][k]
                                    for j in data[i] for k in data[i][j]}
                                for i in data}, orient="columns")

print(df)
Output:

                  CompanyA  CompanyB
ProfitLoss date1         0         2
           date2         1         3
Revenue    date1         1         4
           date2         2         5
Edit

In reply to your comment, here is the same approach using actual datetimes:

import pandas as pd
import datetime as dt

date1 = dt.datetime.now()
date2 = date1 + dt.timedelta(days=365)

data = {
    'CompanyA': {
        'Revenue': {
            date1: 1,
            date2: 2
        },
        'ProfitLoss': {
            date1: 0,
            date2: 1
        }
    },
    'CompanyB': {
        'Revenue': {
            date1: 4,
            date2: 5
        },
        'ProfitLoss': {
            date1: 2,
            date2: 3
        }
    }
}

# Reshape your data and pass it to `DataFrame.from_dict`
df = pd.DataFrame.from_dict({i: {(j, k): data[i][j][k]
                                    for j in data[i] for k in data[i][j]}
                                for i in data}, orient="columns")

print(df)
Output:

                                       CompanyA  CompanyB
ProfitLoss 2018-10-08 11:19:09.006375         0         2
           2019-10-08 11:19:09.006375         1         3
Revenue    2018-10-08 11:19:09.006375         1         4
           2019-10-08 11:19:09.006375         2         5

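Possibly a duplicate.

Edgar, thank you for taking the time to answer my question. Following your example, I got the same result as before, when I was using a Pandas MultiIndex. In both cases I get the correct table layout, but there is no data in the table. Could the date format (datetime.date(2018,3,31): 1.0) be related to the table not being filled correctly?

@edmond dantes, check the structure of your dictionary. I think dictionary={'CompanyA':{'Revenue':{date1:$1},{date2:$2}, should be dictionary={'CompanyA':{'Revenue':{date1:$1,date2:$2},

The structure of the original dictionary is fine, I just misplaced a {}. Sorry about that.

Moving on, it looks like there is a bug reported on GitHub. This is exactly my problem. I will consider my question answered!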
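
If the empty table really is related to the datetime.date keys, as the comments above suspect, one thing that may be worth trying is converting the keys to pd.Timestamp while reshaping. This is only a sketch under that assumption; I have not confirmed that it avoids the issue on every pandas version, and the sample values below are made up for illustration.

```python
import datetime as dt

import pandas as pd

# Made-up sample data using datetime.date keys, as in the comment above
data = {
    "CompanyA": {
        "Revenue": {dt.date(2018, 3, 31): 1.0, dt.date(2019, 3, 31): 2.0},
        "ProfitLoss": {dt.date(2018, 3, 31): 0.0, dt.date(2019, 3, 31): 1.0},
    },
    "CompanyB": {
        "Revenue": {dt.date(2018, 3, 31): 4.0, dt.date(2019, 3, 31): 5.0},
        "ProfitLoss": {dt.date(2018, 3, 31): 2.0, dt.date(2019, 3, 31): 3.0},
    },
}

# Same reshaping as in the answer, but each date key becomes a pd.Timestamp
normalized = {
    company: {
        (metric, pd.Timestamp(date)): value
        for metric, dates in metrics.items()
        for date, value in dates.items()
    }
    for company, metrics in data.items()
}

df = pd.DataFrame.from_dict(normalized)

print(df)
```

A quick `df.isna().any().any()` afterwards shows whether any cells are still left unfilled.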