Python: How do I add a function using groupby in pandas?


I have tables like these:

import pandas as pd

ID1 = [2002070, 2002070,2002070, 2002070, 2002070, 2002740,2002740,2002740,2003010,2003010]
ID2 = [2002070, 2002070,200800, 200800, 200800, 300540,300540,300540,2002740,2002740]
ID3 = [2002740, 2002740,2002740, 2002070, 2002070, 2002070,3000540,3000540,5001020,5001020]
Value1 = [4.5, 4.2, 3.7, 4.8, 4.4, 4.6, 3.3, 5.3, 3.8 ,2.6]
Value2 = [7.2, 6.4, 10, 2.3, 1.5, 4.7, 9.5, 4.2, 4.6 ,1.5]
Value3 = [8.4, 8.4, 8.4, 7.4, 7.4, 7.4, 5.3, 5.3, 6.1 ,6.1]
date1 = ['2005-12-07', '2008-05-14', '2008-10-27', '2009-04-20', '2012-03-01', '2013-11-28','2012-08-13', '2011-07-27', '2011-11-02', '2011-08-04']
date2 = ['2003-10-10', '2005-12-07', '2004-05-14', '2011-06-03', '2015-07-05', '2013-04-22','2002-01-14', '2005-04-12', '2011-06-26', '2004-10-18']
date3 = ['2010-10-22', '2012-03-01', '2013-11-28', '2005-12-07', '2012-03-01', '2009-04-20','2012-10-02', '2008-01-30', '2006-08-09', '2006-02-12']
date1=pd.to_datetime(date1)
date2=pd.to_datetime(date2)
date3=pd.to_datetime(date3)
df1=pd.DataFrame({'ID': ID1, 'Value1': Value1, 'Date1':date1}).sort_values('Date1')
df2=pd.DataFrame({'ID': ID2, 'Value2': Value2, 'Date2':date2}).sort_values('Date2')
df3=pd.DataFrame({'ID': ID3, 'Value3': Value3, 'Date3':date3}).sort_values('Date3')


        ID  Value1  Date1         ID    Value2  Date2         ID    Value3  Date3
0   2002070 4.5 2005-12-07      2002070 7.2 2003-10-10      2002740 8.4 2010-10-22
1   2002070 4.2 2008-05-14      2002070 6.4 2005-12-07      2002740 8.4 2012-03-01
2   2002070 3.7 2008-10-27       200800 10  2004-05-14      2002740 8.4 2013-11-28
3   2002070 4.8 2009-04-20       200800 2.3 2011-06-03      2002070 7.4 2005-12-07
4   2002070 4.4 2012-03-01       200800 1.5 2015-07-05      2002070 7.4 2012-03-01
5   2002740 4.6 2013-11-28       300540 4.7 2013-04-22      2002070 7.4 2009-04-20
6   2002740 3.3 2012-08-13       300540 9.5 2002-01-14      3000540 5.3 2012-10-02
7   2002740 5.3 2011-07-27       300540 4.2 2005-04-12      3000540 5.3 2008-01-30
8   2003010 3.8 2011-11-02      2002740 4.6 2011-06-26      5001020 6.1 2006-08-09
9   2003010 2.6 2011-08-04      2002740 1.5 2004-10-18      5001020 6.1 2006-02-12
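I would like to do the following steps:

  • Check whether ID1, ID2 and ID3 are equal.
  • Check whether date1 and date2 differ, i.e. whether value2 and value3 exist (for any date3, but with ID3 = ID2 = ID1) while value1 does not exist for that date.
  • If so, compute, for ID3 = ID2 = ID1 (for any date3; date3 is not relevant), value1_new = value2 / (value3^2), and set date1_new = date2 for that value1_new.
  • For example, consider ID = 2002070. The only Date2 that differs from every Date1 is Date2.iloc[0], so I would obtain Value1_new = Value2.iloc[0] / (Value3.iloc[3]^2) and Date1_new = Date2.iloc[0]. I would then append these new values to the columns with subscript 1 (Value1 and Date1). For Date2.iloc[1], which equals Date1.iloc[0], I would keep Value1.iloc[0], so nothing needs to be done.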
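
The arithmetic in the example above can be checked directly. This is only a small sketch using the two numbers from the question's lists; the variable names are made up for illustration. Note that in Python the power operator is `**`, while `^` is bitwise XOR and raises a TypeError on floats.

```python
# Values copied from the question's lists for ID 2002070.
value2_first = 7.2  # Value2[0], dated 2003-10-10 (a date with no matching Date1)
value3_id = 7.4     # Value3[3], the Value3 recorded for ID 2002070

# value1_new = value2 / value3^2, written with Python's power operator.
value1_new = value2_first / value3_id ** 2
print(round(value1_new, 4))  # 0.1315
```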

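Maybe I should use groupby('ID'), but I don't know how to integrate the last two steps into a groupby function. Do you know of any functions that do this? Or should I use loops?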
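
One possible approach, offered as a sketch rather than a definitive answer, avoids both loops and groupby by using two merges: one to attach Value3 to each df2 row with the same ID, and one with `indicator=True` to find the (ID, Date2) pairs that have no matching (ID, Date1) row. It assumes that Value3 is constant per ID (as it is in the question's data) and uses a small subset of the data; the names `v3`, `cand`, `missing`, `new_rows` and `result` are made up for illustration.

```python
import pandas as pd

# Small illustrative subset of the question's data.
df1 = pd.DataFrame({
    "ID": [2002070, 2002070],
    "Value1": [4.5, 4.2],
    "Date1": pd.to_datetime(["2005-12-07", "2008-05-14"]),
})
df2 = pd.DataFrame({
    "ID": [2002070, 2002070, 200800],
    "Value2": [7.2, 6.4, 10.0],
    "Date2": pd.to_datetime(["2003-10-10", "2005-12-07", "2004-05-14"]),
})
df3 = pd.DataFrame({
    "ID": [2002070, 2002740],
    "Value3": [7.4, 8.4],
    "Date3": pd.to_datetime(["2005-12-07", "2010-10-22"]),
})

# Assumption: Value3 is constant per ID, so one row per ID is enough
# (Date3 is irrelevant according to the question).
v3 = df3.drop_duplicates("ID")[["ID", "Value3"]]

# Keep df2 rows whose ID appears in df1 and df3, and attach Value3.
cand = df2[df2["ID"].isin(df1["ID"])].merge(v3, on="ID", how="inner")

# Mark (ID, Date2) pairs that have no (ID, Date1) counterpart in df1.
cand = cand.merge(
    df1[["ID", "Date1"]].drop_duplicates(),
    left_on=["ID", "Date2"],
    right_on=["ID", "Date1"],
    how="left",
    indicator=True,
)
missing = cand[cand["_merge"] == "left_only"]

# Build the new rows: value1_new = value2 / value3**2, date1_new = date2.
new_rows = pd.DataFrame({
    "ID": missing["ID"],
    "Value1": missing["Value2"] / missing["Value3"] ** 2,
    "Date1": missing["Date2"],
})

# Append the new rows to df1 and restore chronological order per ID.
result = (
    pd.concat([df1, new_rows], ignore_index=True)
    .sort_values(["ID", "Date1"])
    .reset_index(drop=True)
)
print(result)
```

With this subset, only the df2 row dated 2003-10-10 lacks a matching Date1, so `result` gains one row for ID 2002070. The df2 row for ID 200800 is ignored because that ID does not occur in df1.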