Getting values from a large DataFrame in Python

python pandas

I have a large DataFrame structured like this:

ID  Year Consumption
1   2012      24
2   2012      20
3   2012      21
1   2013      22
2   2013      23
3   2013      24
4   2013      25

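I want another DataFrame containing the first year in which each ID appears and the maximum consumption for each ID across all years, like this: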

ID   First_Year  Max_Consumption
1       2012          24
2       2012          23
3       2012          24
4       2013          25

Is there a way to extract this data without a loop? I tried this:

year = list(set(df.Year))
ids = list(set(df.ID))

antiq = list()
max_con = list()

for i in ids:
    df_id = df[df['ID'] == i]
    antiq.append(min(df_id['Year']))
    max_con.append(max(df_id['Consumption']))

But it is too slow. Thanks, everyone!

Use GroupBy.agg:

res = df.groupby('ID', as_index=False).agg({'Year': 'min', 'Consumption': 'max'})

print(res)

   ID  Year  Consumption
0   1  2012           24
1   2  2012           23
2   3  2012           24
3   4  2013           25
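
To get the column names the question asks for (First_Year and Max_Consumption) directly, named aggregation is one option. The following is a minimal sketch, assuming pandas 0.25 or later; the sample frame is rebuilt from the data in the question so that the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1, 2, 3, 4],
    "Year": [2012, 2012, 2012, 2013, 2013, 2013, 2013],
    "Consumption": [24, 20, 21, 22, 23, 24, 25],
})

# Named aggregation: output_column=(input_column, aggregation).
res = df.groupby("ID", as_index=False).agg(
    First_Year=("Year", "min"),
    Max_Consumption=("Consumption", "max"),
)

print(res)
```

This avoids a separate rename step; the aggregation itself is the same min/max per ID as above.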

An alternative to groupby is pivot_table:

pd.pivot_table(df, index="ID", aggfunc={"Year": "min", "Consumption": "max"})
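
As a rough, self-contained sketch of the pivot_table route: the result is indexed by ID, so reset_index is used here to turn ID back into a column, and the rename to First_Year and Max_Consumption is my own addition to match the layout in the question. The aggregations are given as strings; passing the builtins min and max also works but may emit a warning on recent pandas versions.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1, 2, 3, 4],
    "Year": [2012, 2012, 2012, 2013, 2013, 2013, 2013],
    "Consumption": [24, 20, 21, 22, 23, 24, 25],
})

# Aggregate each column with its own function; ID becomes the index.
res = pd.pivot_table(df, index="ID", aggfunc={"Year": "min", "Consumption": "max"})

# Move ID back to a column, rename, and fix the column order explicitly
# instead of relying on the order pivot_table returns.
res = (
    res.reset_index()
       .rename(columns={"Year": "First_Year", "Consumption": "Max_Consumption"})
       [["ID", "First_Year", "Max_Consumption"]]
)

print(res)
```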