Python: rolling count of occurrences across two columns
In pandas, I have two Series (columns) of x rows each, and I want to add a column that gives, for each row, the running number of times the value in col1 has appeared in either column from the first row up to row x-1. The df looks like this:
  col1 col2
0    B    A
1    B    C
2    A    B
3    A    B
4    A    C
5    B    A
The desired output is:
  col1 col2  freq
0    B    A     0
1    B    C     1
2    A    B     1
3    A    B     2
4    A    C     3  # A appears 3 times in the two columns from row 0 to 3
5    B    A     4  # B appears 4 times in the two columns from row 0 to 4
Thanks in advance from a beginner,
G
Prints:
  col1 col2  freq
0    B    A     0
1    B    C     1
2    A    B     1
3    A    B     2
4    A    C     3
5    B    A     4
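For reference, the requirement can be written as a direct, brute-force computation. This is only a minimal sketch that follows the wording of the question: the helper name prior_count is made up for illustration, and the approach is quadratic in the number of rows, so it is better suited to checking other solutions than to large frames.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({
    "col1": list("BBAAAB"),
    "col2": list("ACBBCA"),
})


def prior_count(frame, i):
    # Hypothetical helper: count how often the col1 value of row i
    # occurs in any column of the rows before it (rows 0 .. i-1).
    value = frame["col1"].iloc[i]
    return int((frame.iloc[:i] == value).to_numpy().sum())


source = df[["col1", "col2"]]
df["freq"] = [prior_count(source, i) for i in range(len(df))]
print(df)
```

On the example data this gives 0, 1, 1, 2, 3, 4 for freq, which matches the desired output.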
Edit (solution for any number of columns):
This solves the problem no matter how many columns df has.
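import pandas as pd
import numpy as np


def add(d1, d2):
    # Merge the counts in d2 into d1 (d1 is modified in place and returned)
    for i in d2.keys():
        if i in d1.keys():
            d1[i] = d1[i] + d2[i]
        else:
            d1[i] = d2[i]
    return d1


if __name__ == '__main__':
    counts = {}
    df = pd.DataFrame({"a": [1, 2, 3, 1, 2], "b": [2, 1, 2, 3, 1]})
    col = list(df)  # column labels; col[0] is the column whose value is looked up
    for ind, it in df.iterrows():
        # Count each distinct value in the current row
        unique, count = np.unique(it, return_counts=True)
        unique_dict = dict(zip(unique, count))
        counts = add(counts, unique_dict)
        # Running total so far, including the current row
        df.loc[ind, "freq"] = counts[it[col[0]]]
    # Subtract 1 to exclude the current row's own occurrence
    df["freq"] = df["freq"] - 1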
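One thing to watch in the loop above: it adds the current row to the counts before looking up the value and then subtracts 1, so a row whose first-column value also appears in another column of the same row would count that extra occurrence. If only strictly earlier rows should count, a possible variation is to look the value up first and update the counts afterwards. The sketch below uses collections.Counter from the standard library and the same sample data; the names seen and freq are just illustrative.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 1, 2], "b": [2, 1, 2, 3, 1]})

seen = Counter()  # value -> number of occurrences in the rows processed so far
freq = []
for row in df.itertuples(index=False):
    # Look up the first column's value before this row is counted,
    # so only earlier rows contribute.
    freq.append(seen[row[0]])
    # Then add every value of the current row to the running counts.
    seen.update(row)

df["freq"] = freq
print(df)
```

On this sample data both versions give the same freq values (0, 1, 0, 2, 3). They only differ when a value repeats within a single row.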
Let's use some DataFrame reshaping, groupby, and cumcount:
dfs = df.stack()
df['freq'] = dfs.groupby(dfs).cumcount().unstack()['col1']
print(df)
Output:
  col1 col2  freq
0    B    A     0
1    B    C     1
2    A    B     1
3    A    B     2
4    A    C     3
5    B    A     4
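To see why the one-liner works, it may help to split it into its intermediate steps. The sketch below rebuilds the question's example frame; the variable names stacked, running and running_wide are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col1": list("BBAAAB"), "col2": list("ACBBCA")})

# 1. stack() flattens the frame row by row into a single Series
#    indexed by (row label, column name).
stacked = df.stack()

# 2. Grouping the Series by its own values and calling cumcount()
#    numbers each occurrence of a value 0, 1, 2, ... in that order.
running = stacked.groupby(stacked).cumcount()

# 3. unstack() moves the column names back out of the index,
#    restoring the original row/column layout.
running_wide = running.unstack()

# col1 is the leftmost column, so its counter only reflects earlier rows.
df["freq"] = running_wide["col1"]
print(running_wide)
print(df)
```

Because the stacked Series runs left to right within each row, the counter for col2 also includes a match in col1 of the same row. That is why the result is read from col1 here.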
What if I have more columns?
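As far as I can tell, the stack/groupby/cumcount expression does not depend on the number of columns, as long as the column you read the result from is the leftmost one. Here is a small sketch with a third column; col3 and its values are made up for illustration and are not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": list("BBAAAB"),
    "col2": list("ACBBCA"),
    "col3": list("ABCABC"),  # hypothetical extra column
})

# Same expression as with two columns: flatten, number the occurrences, reshape back
dfs = df.stack()
df["freq"] = dfs.groupby(dfs).cumcount().unstack()["col1"]
print(df)
```

For a column that is not the leftmost one, the counter would also include matches in the columns to its left in the same row, so you may want to reorder the columns first in that case.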