Pandas: how to group by and transform the count back into the DataFrame
Suppose I have the following df:
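import pandas as pd

years = []
months = []
ys = [2003, 2003, 2004, 2005]
# Repeat months 1..3 for every entry in ys (2003 is listed twice on purpose)
for y in ys:
    for i in range(1, 4):
        years.append(y)
        months.append(i)
df = pd.DataFrame({"year": years, "month": months})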
df
year month
0 2003 1
1 2003 2
2 2003 3
3 2003 1
4 2003 2
5 2003 3
6 2004 1
7 2004 2
8 2004 3
9 2005 1
10 2005 2
11 2005 3
- Note that 2003 appears twice.

The desired output is:
year month count
0 2003 1 1
1 2003 2 2
2 2003 3 3
3 2003 1 1
4 2003 2 2
5 2003 3 3
6 2004 1 4
7 2004 2 5
8 2004 3 6
9 2005 1 7
10 2005 2 8
11 2005 3 9
I tried df['count'] = df.groupby(['year','month']).transform('count'), but I got "Wrong number of items passed 0, placement implies 1".

Use .ngroup():
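A minimal sketch of the .ngroup() approach, assuming the df built in the question. The + 1 offset is an assumption chosen so that numbering starts at 1, as in the output shown. For contrast, the sketch also computes transform('count') on a selected column (the name group_size is illustrative only), which gives the number of rows in each group rather than the group's number.

```python
import pandas as pd

# Rebuild the example frame from the question.
ys = [2003, 2003, 2004, 2005]
df = pd.DataFrame({
    "year": [y for y in ys for _ in range(1, 4)],
    "month": [m for _ in ys for m in range(1, 4)],
})
cols = ["year", "month"]

# For contrast: transform("count") on a selected column returns, for every row,
# how many non-null rows share that row's (year, month) key.
group_size = df.groupby(cols)["month"].transform("count")

# ngroup() labels each distinct (year, month) pair with an integer starting at 0,
# in sorted key order by default; + 1 (an assumption) makes the labels start at 1.
df["count"] = df.groupby(cols).ngroup() + 1

print(df)
```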
Prints:
year month count
0 2003 1 1
1 2003 2 2
2 2003 3 3
3 2003 1 1
4 2003 2 2
5 2003 3 3
6 2004 1 4
7 2004 2 5
8 2004 3 6
9 2005 1 7
10 2005 2 8
11 2005 3 9
Adding another method with pd.factorize and zip, which returns a unique number for each group in order of appearance:
df["count"] = pd.factorize([*zip(df['year'],df['month'])])[0]+1
Or use to_records together with factorize:
cols = ['year','month']
df["count"] = pd.factorize(df[cols].to_records(index=False))[0]+1
print(df)
year month count
0 2003 1 1
1 2003 2 2
2 2003 3 3
3 2003 1 1
4 2003 2 2
5 2003 3 3
6 2004 1 4
7 2004 2 5
8 2004 3 6
9 2005 1 7
10 2005 2 8
11 2005 3 9
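On the question's data, which is already sorted, numbering by sorted key and numbering by order of appearance give the same result. They can differ on unsorted input. The sketch below uses a small hypothetical frame (not from the question) to show the difference. Building a string key before calling pd.factorize is an assumption made here so that a Series can be passed in. It is not the method used in the answers above.

```python
import pandas as pd

# Hypothetical unsorted data: 2005 appears before 2003.
df = pd.DataFrame({"year": [2005, 2003, 2005, 2004], "month": [1, 1, 1, 2]})
cols = ["year", "month"]

# Default sort=True: groups are numbered in sorted key order.
sorted_ids = df.groupby(cols).ngroup() + 1

# sort=False: groups are numbered in order of first appearance.
appearance_ids = df.groupby(cols, sort=False).ngroup() + 1

# pd.factorize also numbers values in order of first appearance.
# The combined string key is an illustrative assumption.
key = df["year"].astype(str) + "-" + df["month"].astype(str)
factorized_ids = pd.factorize(key)[0] + 1

print(df.assign(sorted_ids=sorted_ids,
                appearance_ids=appearance_ids,
                factorized_ids=factorized_ids))
```

If the numbering must follow the row order of the data rather than the sorted keys, sort=False or pd.factorize is probably the closer fit.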