Python DataFrame: keep only the rows where a given column has its maximum value

I have a DataFrame:

In [72]: import pandas as pd
In [73]: data = {'ID':[1234,1234,1234,1234,1235,1235,1236,1237,1237,1237,1237], 'Date':['1/4/2001','1/4/2001','6/1/2003','6/1/2003', '7/1/1998', '7/1/1998', '4/23/2005', '7/1/2005','7/1/2005','7/1/2005','7/1/2005'], 'CalcYr': [2018, 2019, 2018, 2019, 2007, 2008, 2018, 2016, 2017, 2018, 2019], 'Values':[0.1,0.1,0.2,0.3,0.3,0.4,0.6,0,0.1,0,0.2]}
In [74]: df = pd.DataFrame(data)

In [75]: df
Out[75]: 
      ID       Date  CalcYr  Values
0   1234   1/4/2001    2018     0.1
1   1234   1/4/2001    2019     0.1
2   1234   6/1/2003    2018     0.2
3   1234   6/1/2003    2019     0.3
4   1235   7/1/1998    2007     0.3
5   1235   7/1/1998    2008     0.4
6   1236  4/23/2005    2018     0.6
7   1237   7/1/2005    2016     0.0
8   1237   7/1/2005    2017     0.1
9   1237   7/1/2005    2018     0.0
10  1237   7/1/2005    2019     0.2
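What I want is to keep only one row for each combination of ID and Date, namely the row where CalcYr is largest. For example, for ID 1234 and Date 1/4/2001, I would keep only the row where CalcYr is 2019. The result would be: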

         ID       Date  CalcYr  Values
    0   1234   1/4/2001    2019     0.1
    1   1234   6/1/2003    2019     0.3
    2   1235   7/1/1998    2008     0.4
    3   1236  4/23/2005    2018     0.6
    4   1237   7/1/2005    2019     0.2
Use groupby with transform('max') and keep the rows whose CalcYr equals the maximum of their group:


df[df['CalcYr'].eq(df.groupby(['ID','Date'])['CalcYr'].transform('max'))]
Thanks. This seems to work, but I am curious how it looks up the max in the CalcYr column rather than in the Values column.
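To address the question in the comment, here is a minimal sketch that splits the one-liner into steps. The intermediate names group_max, mask and result are only illustrative and do not appear in the original answer. The point is that ['CalcYr'] is selected right after the groupby, so the max is computed on that column alone, and transform returns one value per original row rather than one per group.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1234, 1234, 1234, 1234, 1235, 1235, 1236, 1237, 1237, 1237, 1237],
    'Date': ['1/4/2001', '1/4/2001', '6/1/2003', '6/1/2003', '7/1/1998', '7/1/1998',
             '4/23/2005', '7/1/2005', '7/1/2005', '7/1/2005', '7/1/2005'],
    'CalcYr': [2018, 2019, 2018, 2019, 2007, 2008, 2018, 2016, 2017, 2018, 2019],
    'Values': [0.1, 0.1, 0.2, 0.3, 0.3, 0.4, 0.6, 0, 0.1, 0, 0.2],
})

# Selecting ['CalcYr'] after the groupby restricts the aggregation to that
# column, so 'Values' never takes part in the max.
# transform('max') returns a Series aligned with df's index: every row
# receives the maximum CalcYr of its own (ID, Date) group.
group_max = df.groupby(['ID', 'Date'])['CalcYr'].transform('max')

# True where a row's CalcYr equals its group's maximum.
mask = df['CalcYr'].eq(group_max)

# Boolean indexing keeps whole rows, so 'Values' comes from the same row.
result = df[mask]
print(result)
```

Two things to keep in mind: the filtered frame keeps the original index labels, so add .reset_index(drop=True) if a fresh 0..n-1 index is wanted, and if several rows in a group tie for the largest CalcYr, all of them are kept.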
df.groupby(['ID','Date'], as_index=False).max()

     ID       Date  CalcYr  Values
0  1234   1/4/2001    2019     0.1
1  1234   6/1/2003    2019     0.3
2  1235   7/1/1998    2008     0.4
3  1236  4/23/2005    2018     0.6
4  1237   7/1/2005    2019     0.2
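One caveat about the groupby(...).max() approach: it takes the maximum of each column independently, so it matches the desired output here only because, in this data, the largest Values in every group happens to sit on the row with the largest CalcYr. The sketch below uses made-up data (not from the question) where the two disagree, and shows idxmax as one possible way to pick whole rows instead.

```python
import pandas as pd

# Hypothetical data: the largest Values (0.9) is NOT on the latest CalcYr row.
df = pd.DataFrame({
    'ID': [1237, 1237, 1237],
    'Date': ['7/1/2005', '7/1/2005', '7/1/2005'],
    'CalcYr': [2017, 2018, 2019],
    'Values': [0.9, 0.0, 0.2],
})

# Column-wise max: CalcYr and Values can come from different rows.
per_column = df.groupby(['ID', 'Date'], as_index=False).max()

# Row-wise selection: idxmax gives the index label of the row holding the
# largest CalcYr in each group, and .loc pulls those rows out intact.
latest_rows = df.loc[df.groupby(['ID', 'Date'])['CalcYr'].idxmax()].reset_index(drop=True)

print(per_column)   # CalcYr 2019 paired with Values 0.9 (mixed rows)
print(latest_rows)  # CalcYr 2019 paired with Values 0.2 (the actual row)
```

Unlike the transform-based mask, idxmax returns a single row per group, so ties on CalcYr are resolved by keeping the first occurrence.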