Python 按日期之间的最小绝对差值、按组选择行

Python 按日期之间的最小绝对差值、按组选择行,python,pandas,datetime,dataframe,Python,Pandas,Datetime,Dataframe,我想为每个A组选择A日期在时间上更接近B日期的行 输出应为: A B C 0 2002-01-16 2002-02-28 Jack 1 2002-01-16 2002-01-30 Helen 2 2002-01-16 2002-02-28 Peter 3 2002-01-16 2002-01-30 Jud 4 2002-04-27 2002-04-30 Nick 5 2002-04-27 2002-0

我想为每个
A
组选择
A
日期在时间上更接近
B
日期的行

输出应为:

            A           B      C
0  2002-01-16  2002-02-28   Jack
1  2002-01-16  2002-01-30  Helen
2  2002-01-16  2002-02-28  Peter
3  2002-01-16  2002-01-30    Jud
4  2002-04-27  2002-04-30   Nick
5  2002-04-27  2002-05-25  Wendy
6  2002-04-27  2002-04-30  Bryan
7  2002-04-27  2002-05-25  Sarah
使用:

详细信息

df = df[df['B'].sub(df['A']).groupby(df['A']).transform(lambda x: x == x.min())]
print (df)
           A          B      C
1 2002-01-16 2002-01-30  Helen
3 2002-01-16 2002-01-30    Jud
4 2002-04-27 2002-04-30   Nick
6 2002-04-27 2002-04-30  Bryan
这是一种方式

print (df['B'].sub(df['A']))

0   43 days
1   14 days
2   43 days
3   14 days
4    3 days
5   28 days
6    3 days
7   28 days
dtype: timedelta64[ns]

print (df['B'].sub(df['A']).groupby(df['A']).transform(lambda x: x == x.min()))
0    False
1     True
2    False
3     True
4     True
5    False
6     True
7    False
dtype: bool
结果:

# convert columns to datetime
df[['A', 'B']] = df[['A', 'B']].apply(pd.to_datetime)

# calculate absolute difference
df['Diff'] = (df['B'] - df['A']).abs()

# filter for difference equal to mapped minimum
res = df.loc[df['Diff'] == df['A'].map(df.groupby('A')['Diff'].min())]

2002年2月有30天吗?很好Jpp,哈哈。编辑
# convert columns to datetime
df[['A', 'B']] = df[['A', 'B']].apply(pd.to_datetime)

# calculate absolute difference
df['Diff'] = (df['B'] - df['A']).abs()

# filter for difference equal to mapped minimum
res = df.loc[df['Diff'] == df['A'].map(df.groupby('A')['Diff'].min())]
           A          B      C    Diff
1 2002-01-16 2002-01-30  Helen 14 days
3 2002-01-16 2002-01-30    Jud 14 days
4 2002-04-27 2002-04-30   Nick  3 days
6 2002-04-27 2002-04-30  Bryan  3 days