Python: How do I get unique rows from a DataFrame while keeping the row with the maximum value in a given column?
I have a DataFrame (loaded from the CSV below). I want to drop duplicates on the timestamp column while keeping the row with the largest 'load' value. The data in this case:
load,timestamp,timestr
0,1576147339.49,124219
0,1576147339.502,124219
2,1576147339.637,124219
1,1576147339.641,124219
9,1576147339.662,124219
8,1576147339.663,124219
0,1576147339.673,124219
3,1576147341.567,124221
2,1576147341.568,124221
1,1576147341.569,124221
0,1576147341.57,124221
4,1576147341.581,124221
The maximum 'load' value does not necessarily come first.
What is the best way to do this?
Try using groupby:
print(df.groupby('timestamp', as_index=False)['load'].max().join(df['timestr']))
Output:
timestamp load timestr
0 1.576147e+09 0 124219
1 1.576147e+09 0 124219
2 1.576147e+09 2 124219
3 1.576147e+09 1 124219
4 1.576147e+09 9 124219
5 1.576147e+09 8 124219
6 1.576147e+09 0 124219
7 1.576147e+09 3 124221
8 1.576147e+09 2 124221
9 1.576147e+09 1 124221
10 1.576147e+09 0 124221
11 1.576147e+09 4 124221
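One likely reason the output above still contains all 12 rows: every timestamp value in the sample is distinct, so grouping by timestamp produces one group per row, and the repeated values are in timestr instead. A minimal sketch to check this, assuming the CSV from the question (the inline CSV string and the variable names are only there to keep the example self-contained):

```python
import io

import pandas as pd

# Sample data copied from the question.
CSV = """load,timestamp,timestr
0,1576147339.49,124219
0,1576147339.502,124219
2,1576147339.637,124219
1,1576147339.641,124219
9,1576147339.662,124219
8,1576147339.663,124219
0,1576147339.673,124219
3,1576147341.567,124221
2,1576147341.568,124221
1,1576147341.569,124221
0,1576147341.57,124221
4,1576147341.581,124221
"""

df = pd.read_csv(io.StringIO(CSV))

# True when no timestamp value occurs more than once.
timestamp_is_unique = df["timestamp"].is_unique

# Number of rows sharing each timestr value.
rows_per_timestr = df.groupby("timestr").size()

print(timestamp_is_unique)
print(rows_per_timestr)
```

If the first value printed is True, a groupby on timestamp cannot reduce the number of rows, whichever aggregation follows it.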
Reset the display precision and use groupby to show the maximum:
pd.options.display.float_format = '{:.3f}'.format
df.groupby('timestamp').max()
Output:
load timestr
timestamp
1576147339.490 0 124219
1576147339.502 0 124219
1576147339.637 2 124219
1576147339.641 1 124219
1576147339.662 9 124219
1576147339.663 8 124219
1576147339.673 0 124219
1576147341.567 3 124221
1576147341.568 2 124221
1576147341.569 1 124221
1576147341.570 0 124221
1576147341.581 4 124221
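groupby() + max()?
I'm new to pandas. I suppose that could work, but how?
I accepted too quickly... I want to find the maximum in the load column. How should I write it?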
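A possible way to keep the whole row with the largest load per group is groupby followed by idxmax, then selecting those rows with loc. The sketch below assumes the intended grouping key is timestr (the only column in the sample with repeated values); that is an assumption about the asker's intent, not something the thread confirms. Replace "timestr" with whichever column actually holds the duplicates.

```python
import io

import pandas as pd

# Sample data copied from the question.
CSV = """load,timestamp,timestr
0,1576147339.49,124219
0,1576147339.502,124219
2,1576147339.637,124219
1,1576147339.641,124219
9,1576147339.662,124219
8,1576147339.663,124219
0,1576147339.673,124219
3,1576147341.567,124221
2,1576147341.568,124221
1,1576147341.569,124221
0,1576147341.57,124221
4,1576147341.581,124221
"""

df = pd.read_csv(io.StringIO(CSV))

# Assumption: "timestr" is the column to de-duplicate on.
# idxmax returns, for each group, the index label of the row with the largest load.
best_index = df.groupby("timestr")["load"].idxmax()

# Select those rows in full, so timestamp and timestr come from the same row as the max load.
best_rows = df.loc[best_index]

print(best_rows)
```

Unlike groupby(...).max(), which takes the maximum of each column independently, this keeps every column from the same original row. Sorting by load and then calling drop_duplicates on the key column with keep='last' is a commonly used alternative.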