Python Pandas: converting a grouped df to a list of dicts, with two columns as key/value pairs


I have the following df:

       YEAR      MONTH        VALUE
0   2010    january          1
1   2010   february          0
2   2010      march          2
3   2010      april          1
4   2010        may         -2
5   2010       june         -0
6   2010       july          1
7   2010     august          0
8   2010  september          1
9   2010    october          2
10  2010   november         -0
11  2010   december          0
12  2011    january          1
13  2011   february          0
14  2011      march          0
15  2011      april         -0
16  2011        may          0
17  2011       june         -0
18  2011       july         -0
19  2011     august         -1
20  2011  september         -1
21  2011    october          1
22  2011   november          0
23  2011   december          1
I need to convert it to the following format:

[{"id":0,"year":2010,"january":1,"february":1,"march":2,"april":1,"may":null,"june":null,"july":null,"august":null,"september":null,"october":null,"november":null,"december":null
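Basically, I have grouped the df by year. Now I want a separate dictionary for each group, with the months as keys and the corresponding values as values. Each dictionary should also carry two extra keys: the year and the group number (id=0 for the first group).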


PS: Ignore the null values in my desired format. Every month should have its corresponding value.

I store the dicts in a list, still using groupby plus a for loop:

l = []
count = 0
for x, y in df.groupby('YEAR'):
    # x is the year, y is the sub-DataFrame for that year
    d = y.set_index('MONTH').VALUE.to_dict()  # month -> value
    d['id'] = count
    d['year'] = x
    l.append(d)
    count = count + 1
l
Out[821]: 
[{'april': 1.56,
  'august': 0.95,
  'december': 0.83,
  'february': 0.81,
  'id': 0,
  'january': 1.02,
  'july': 1.32,
  'june': -0.57,
  'march': 2.66,
  'may': -2.02,
  'november': -0.53,
  'october': 2.17,
  'september': 1.79,
  'year': 2010},
 {'april': -0.17,
  'august': -1.81,
  'december': 1.36,
  'february': 0.84,
  'id': 1,
  'january': 1.06,
  'july': -0.04,
  'june': -0.27,
  'march': 0.11,
  'may': 0.15,
  'november': 0.75,
  'october': 1.95,
  'september': -1.55,
  'year': 2011}]
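
To try the loop above on a small input, here is a minimal self-contained sketch. The four-row DataFrame and the helper name groups_to_records are made up for illustration and are not part of the original answer; enumerate replaces the manual counter, and the year is cast to a plain int on the assumption that the result will later be serialized (for example with json.dumps).

```python
import pandas as pd

# Hypothetical sample data: two years with two months each.
df = pd.DataFrame({
    "YEAR": [2010, 2010, 2011, 2011],
    "MONTH": ["january", "february", "january", "february"],
    "VALUE": [1.02, 0.81, 1.06, 0.84],
})


def groups_to_records(frame):
    """Return one dict per YEAR group: month -> value, plus 'id' and 'year'."""
    records = []
    for group_id, (year, group) in enumerate(frame.groupby("YEAR")):
        # Month names become the keys, VALUE entries become the values.
        record = group.set_index("MONTH")["VALUE"].to_dict()
        record["id"] = group_id
        record["year"] = int(year)  # plain int instead of a NumPy integer
        records.append(record)
    return records


records = groups_to_records(df)
print(records)
```

groupby sorts the group keys by default, so the id values follow ascending year order.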

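You can build a dictionary from a two-column array of values simply by calling dict() on it, for example dict(df.values) when df holds only the key and value columns. After that, you just need to iterate over the groups in the right way to build the list.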

out = []
for idx, (key, group) in enumerate(df.groupby('YEAR')):
    year = dict(group.iloc[:, ~group.columns.isin(['YEAR'])].values)
    year.update({'id': idx})
    out.append(year)
Or as a list comprehension:

groups = df.groupby('YEAR')
# update() returns None, so "or a" hands back the updated dict
dict_merge = lambda a, b: a.update(b) or a
out = [dict_merge(dict(group.iloc[:, 1:].values), {'id': idx}) for idx, (key, group) in enumerate(groups)]
print(out)
[{'april': 1.56,
  'august': 0.95,
  'december': 0.83,
  'february': 0.81,
  'id': 0,
  'january': 1.02,
  'july': 1.32,
  'june': -0.57,
  'march': 2.66,
  'may': -2.02,
  'november': -0.53,
  'october': 2.17,
  'september': 1.79},
 {'april': -0.17,
  'august': -1.81,
  'december': 1.36,
  'february': 0.84,
  'id': 1,
  'january': 1.06,
  'july': -0.04,
  'june': -0.27,
  'march': 0.11,
  'may': 0.15,
  'november': 0.75,
  'october': 1.95,
  'september': -1.55}]
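
The output above contains id but not the year key that the question asked for. The following is a sketch of one way to add it while keeping the dict(values) idea; the sample DataFrame is hypothetical, and selecting the MONTH and VALUE columns by name (rather than by position) is a choice made here so that dict() always receives exactly two columns.

```python
import pandas as pd

# Hypothetical sample data: two years with two months each.
df = pd.DataFrame({
    "YEAR": [2010, 2010, 2011, 2011],
    "MONTH": ["january", "february", "january", "february"],
    "VALUE": [1.02, 0.81, 1.06, 0.84],
})

out = []
for idx, (year, group) in enumerate(df.groupby("YEAR")):
    # A two-column array is a sequence of (key, value) pairs,
    # which is exactly what dict() accepts.
    record = dict(group[["MONTH", "VALUE"]].values)
    record.update({"id": idx, "year": int(year)})
    out.append(record)

print(out)
```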

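For an O(n) solution, you can use collections.defaultdict.

Then simply use the {**x, **y} syntax (Python 3.5+) inside a list comprehension to merge two dictionaries, which adds the id and year keys.

Note that calling sorted on the dictionary items ensures the result is ordered by year.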

from collections import defaultdict

d = defaultdict(lambda: defaultdict(int))

for row in df.itertuples():
    d[row[1]][row[2]] = row[3]

res = [{**{'id': i, 'year': k}, **v} for i, (k, v) in enumerate(sorted(d.items()))]
Result:

[{'april': 1,
  'august': 0,
  'december': 0,
  'february': 0,
  'id': 0,
  'january': 1,
  'july': 1,
  'june': 0,
  'march': 2,
  'may': -2,
  'november': 0,
  'october': 2,
  'september': 1,
  'year': 2010},
 {'april': 0,
  'august': -1,
  'december': 1,
  'february': 0,
  'id': 1,
  'january': 1,
  'july': 0,
  'june': 0,
  'march': 0,
  'may': 0,
  'november': 0,
  'october': 1,
  'september': -1,
  'year': 2011}]
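
As a self-contained variation on the defaultdict approach above, the sketch below uses a hypothetical DataFrame whose years are deliberately out of order, to show that sorted puts them back in sequence. Two things here are assumptions rather than part of the original answer: rows are read by attribute name (row.YEAR) instead of by position, and the inner container is a plain dict instead of defaultdict(int).

```python
from collections import defaultdict

import pandas as pd

# Hypothetical sample data with the years interleaved on purpose.
df = pd.DataFrame({
    "YEAR": [2011, 2010, 2011, 2010],
    "MONTH": ["january", "january", "february", "february"],
    "VALUE": [1.06, 1.02, 0.84, 0.81],
})

# year -> {month: value}, filled in a single pass over the rows.
by_year = defaultdict(dict)
for row in df.itertuples(index=False):
    by_year[row.YEAR][row.MONTH] = row.VALUE

# {**a, **b} merges two dicts into a new one; keys from b win on conflict.
res = [
    {**{"id": i, "year": int(year)}, **months}
    for i, (year, months) in enumerate(sorted(by_year.items()))
]
print(res)
```

Because every year appears only once among the dictionary keys, sorted compares the years alone and never has to compare the inner dicts.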

Did one of the solutions below help? If so, feel free to accept it (green tick on the left); otherwise feel free to ask for clarification.