Python Pandas: GroupBy a DataFrame and create a dict with missing data


Here is my df:

In [78]: df
Out[78]: 
   site        date     race  count
0     1  1999-01-31    Asian    100
1     1  1999-01-31  African     25
2     2  1999-01-31    Asian    200
3     1  2001-01-21    Asian     95
4     2  2001-01-21    Asian    130
5     1  2003-01-12    Asian     80
6     2  2003-01-12  Mexican     35
I want to group the above by race and date, and create the following output:

Expected:

{
    "dates":[
    "1999-01-31",
    "2001-01-21",
    "2003-01-12"
    ]
},
{
    "race": "Asian",
    "data": [
    300,
    225,
    80
    ]
},
{
    "race": "African",
    "data": [
    25,
    0,
    0
    ]
},
{
    "race": "Mexican",
    "data": [
    0,
    0,
    35
    ]
}
My attempt:

In [77]: df.groupby(['race', 'date'])['count'].sum().reset_index(level=1)
Out[77]: 
               date  count
race                      
African  1999-01-31     25
Asian    1999-01-31    300
Asian    2001-01-21    225
Asian    2003-01-12     80
Mexican  2003-01-12     35

I can get the above with a groupby, but I am not sure how to create the expected output.

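The dates need to be handled differently from the other values here, so pivot first (filling missing race/date combinations with 0) and then build the custom format with a list comprehension: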

df = df.pivot_table(index='date',columns='race',values='count',fill_value=0, aggfunc='sum')

L = [{"dates": list(df.index)}] + [dict(race=k, data=list(v)) for k, v in df.items()]
print (L)
[{'dates': ['1999-01-31', '2001-01-21', '2003-01-12']}, 
 {'race': 'African', 'data': [25, 0, 0]}, 
 {'race': 'Asian', 'data': [300, 225, 80]},
 {'race': 'Mexican', 'data': [0, 0, 35]}]
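
For comparison, the groupby attempt from the question can also be carried through to the expected structure. The following is a minimal sketch, not the answer's method: it swaps pivot_table for groupby plus unstack(fill_value=0), rebuilds the sample df from the table in the question, and uses the names wide and result, which are illustrative assumptions.

```python
import json

import pandas as pd

# Sample data copied from the question's DataFrame.
df = pd.DataFrame({
    "site": [1, 1, 2, 1, 2, 1, 2],
    "date": ["1999-01-31", "1999-01-31", "1999-01-31", "2001-01-21",
             "2001-01-21", "2003-01-12", "2003-01-12"],
    "race": ["Asian", "African", "Asian", "Asian", "Asian", "Asian", "Mexican"],
    "count": [100, 25, 200, 95, 130, 80, 35],
})

# Sum counts per (date, race), then move race into columns.
# fill_value=0 supplies a zero for every race/date pair absent from the data.
wide = df.groupby(["date", "race"])["count"].sum().unstack(fill_value=0)

# First element holds the dates; one dict per race follows.
# tolist() converts NumPy integers to plain Python ints so json can serialize them.
result = [{"dates": wide.index.tolist()}]
result += [{"race": race, "data": wide[race].tolist()} for race in wide.columns]

print(json.dumps(result, indent=4))
```

As with the pivot_table version, the races come out in the sorted column order (African, Asian, Mexican) rather than in order of first appearance.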