Python pandas - splitting DataFrame rows based on custom logic
I have the following DataFrame:
day user score total
0 1 A 10 10
1 1 A 5 15
2 2 B 5 20
3 3 C 10 30
4 3 B 5 35
5 3 B 5 40
6 4 C 0 40
7 4 C 5 45
The total column was created with the cumsum method:
import pandas as pd
df = pd.DataFrame({
'day' : [1,1,2,3,3,3,4,4],
'user' : ['A','A','B','C','B','B','C','C'],
'score': [10,5,5,10,5,5,0,5]
})
df["total"] = df["score"].cumsum()  # running total of the score column only
print(df.head(10))
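Now I want to split the DataFrame into a collection of groups, each covering two consecutive days (the number of rows per day varies), to get the following groups: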
day user score total
0 1 A 10 10
1 1 A 5 15 <--- days 1 & 2
2 2 B 5 20
-------------------------
3 3 C 10 30
4 3 B 5 35
5 3 B 5 40 <--- days 3 & 4
6 4 C 0 40
7 4 C 5 45
Let's use factorize and then floor-divide the resulting codes by 2:
d={x : y for x , y in df.groupby(df.day.factorize()[0]//2)}
...
...
{0: day user score total
0 1 A 10 10
1 1 A 5 15
2 2 B 5 20, 1: day user score total
3 3 C 10 30
4 3 B 5 35
5 3 B 5 40
6 4 C 0 40
7 4 C 5 45}
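To see why this works, it may help to look at the intermediate values. The following is a minimal sketch on the same data; the names `codes`, `group_key` and `groups` are only illustrative and do not appear in the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'day': [1, 1, 2, 3, 3, 3, 4, 4],
    'user': ['A', 'A', 'B', 'C', 'B', 'B', 'C', 'C'],
    'score': [10, 5, 5, 10, 5, 5, 0, 5],
})
df['total'] = df['score'].cumsum()

# factorize() numbers each distinct day 0, 1, 2, ... in order of first appearance
codes = df['day'].factorize()[0]

# Integer division by 2 puts every two consecutive distinct days in one bucket
group_key = codes // 2

# One sub-DataFrame per bucket, keyed by the bucket number
groups = {key: frame for key, frame in df.groupby(group_key)}

print(codes.tolist())      # [0, 0, 1, 2, 2, 2, 3, 3]
print(group_key.tolist())  # [0, 0, 0, 1, 1, 1, 1, 1]
```

Changing the divisor from 2 to n should give groups of n distinct days in the same way.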
Not sure if this is what you want... Convert the day column to timedelta and group on it:
df.day = pd.to_timedelta(df.day, 'D')
# you could change this to 4 days or any other interval
grouping = df.resample(rule='2D', on='day')
[v for k,v in grouping]
[ day user score total
0 1 days A 10 10
1 1 days A 5 15
2 2 days B 5 20, day user score total
3 3 days C 10 30
4 3 days B 5 35
5 3 days B 5 40
6 4 days C 0 40
7 4 days C 5 45, Empty DataFrame
Columns: [day, user, score, total]
Index: []]
[v for k,v in grouping][0]
day user score total
0 1 days A 10 10
1 1 days A 5 15
2 2 days B 5 20
[v for k,v in grouping][1]
day user score total
3 3 days C 10 30
4 3 days B 5 35
5 3 days B 5 40
6 4 days C 0 40
7 4 days C 5 45
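Both approaches yield the same two non-empty groups here because the days are consecutive. If some days were missing, grouping by position among the days that actually occur (the factorize approach) and grouping by fixed two-day windows could differ. The following is a minimal sketch of that difference. It assumes hypothetical data with gaps, and it uses plain integer arithmetic on the day numbers as a stand-in for time-based windows rather than resample itself.

```python
import pandas as pd

# Hypothetical data with gaps: days 2 and 5 are missing
gaps = pd.DataFrame({'day': [1, 3, 4, 6], 'score': [10, 5, 5, 0]})

# Position-based: every two distinct days that actually occur
by_position = gaps['day'].factorize()[0] // 2

# Window-based: fixed two-day windows counted from the first day
by_window = (gaps['day'] - gaps['day'].min()) // 2

print(by_position.tolist())  # [0, 0, 1, 1]
print(by_window.tolist())    # [0, 1, 1, 2]
```

Which of the two is right depends on whether "two days" means two days that have data or a two-day span on the calendar.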