Python: select rows by group based on the minimum absolute difference between dates
Tags: python, pandas, datetime, dataframe

For each group in A, I want to select the rows whose B date is closest in time to the A date.

Sample data:
A B C
0 2002-01-16 2002-02-28 Jack
1 2002-01-16 2002-01-30 Helen
2 2002-01-16 2002-02-28 Peter
3 2002-01-16 2002-01-30 Jud
4 2002-04-27 2002-04-30 Nick
5 2002-04-27 2002-05-25 Wendy
6 2002-04-27 2002-04-30 Bryan
7 2002-04-27 2002-05-25 Sarah
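To make the thread reproducible, the table above can be rebuilt as a DataFrame. This is only a minimal sketch: the column names and values come from the table, while loading the dates as strings and parsing them with pd.to_datetime is an assumption about how the data was originally created.

```python
import pandas as pd

# Rows copied from the table above
df = pd.DataFrame({
    'A': ['2002-01-16'] * 4 + ['2002-04-27'] * 4,
    'B': ['2002-02-28', '2002-01-30', '2002-02-28', '2002-01-30',
          '2002-04-30', '2002-05-25', '2002-04-30', '2002-05-25'],
    'C': ['Jack', 'Helen', 'Peter', 'Jud', 'Nick', 'Wendy', 'Bryan', 'Sarah'],
})

# Parse both date columns so that subtracting them yields timedeltas
df[['A', 'B']] = df[['A', 'B']].apply(pd.to_datetime)

print(df.dtypes)
```

With both columns parsed as datetimes, B minus A produces a timedelta per row, which is what the approaches in this thread compare.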
Use groupby with transform to build a boolean mask per group:
# Boolean mask: True where a row's |B - A| equals the minimum of its A group
mask = df['B'].sub(df['A']).abs().groupby(df['A']).transform(lambda x: x == x.min())
df = df[mask]
print(df)
A B C
1 2002-01-16 2002-01-30 Helen
3 2002-01-16 2002-01-30 Jud
4 2002-04-27 2002-04-30 Nick
6 2002-04-27 2002-04-30 Bryan
Details (computed on the original, unfiltered df):
print(df['B'].sub(df['A']).abs())
0 43 days
1 14 days
2 43 days
3 14 days
4 3 days
5 28 days
6 3 days
7 28 days
dtype: timedelta64[ns]
print(df['B'].sub(df['A']).abs().groupby(df['A']).transform(lambda x: x == x.min()))
0 False
1 True
2 False
3 True
4 True
5 False
6 True
7 False
dtype: bool
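The two intermediate results above (the per-row gap and the per-group comparison) can also be combined without a lambda, by broadcasting each group's minimum with transform('min'). The sketch below is an alternative formulation, not the answer's exact code. It rebuilds the sample data so that it runs on its own, and the names diff, group_min and closest are illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'A': pd.to_datetime(['2002-01-16'] * 4 + ['2002-04-27'] * 4),
    'B': pd.to_datetime(['2002-02-28', '2002-01-30', '2002-02-28', '2002-01-30',
                         '2002-04-30', '2002-05-25', '2002-04-30', '2002-05-25']),
    'C': ['Jack', 'Helen', 'Peter', 'Jud', 'Nick', 'Wendy', 'Bryan', 'Sarah'],
})

# Absolute gap between the two dates, one timedelta per row
diff = (df['B'] - df['A']).abs()

# Broadcast each group's smallest gap back onto every row of that group
group_min = diff.groupby(df['A']).transform('min')

# Keep the rows whose gap equals the group minimum (ties are all kept)
closest = df[diff == group_min]
print(closest)
```

Because the comparison is an equality test against the group minimum, every tied row survives, which is why each group keeps two rows here.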
Here is another way, using a helper Diff column:
import pandas as pd
# convert columns to datetime
df[['A', 'B']] = df[['A', 'B']].apply(pd.to_datetime)
# calculate absolute difference
df['Diff'] = (df['B'] - df['A']).abs()
# filter for difference equal to mapped minimum
res = df.loc[df['Diff'] == df['A'].map(df.groupby('A')['Diff'].min())]
Result:
A B C Diff
1 2002-01-16 2002-01-30 Helen 14 days
3 2002-01-16 2002-01-30 Jud 14 days
4 2002-04-27 2002-04-30 Nick 3 days
6 2002-04-27 2002-04-30 Bryan 3 days

Comments:
Does February 2002 have 30 days?
Nice one, jpp, haha. Edited.
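Taking the absolute value matters once some B dates fall before their A date: a signed B - A would then treat the most negative gap as the minimum. The sketch below illustrates this with made-up rows that are not from the question. The helper name closest_rows and its ref/other parameters are hypothetical.

```python
import pandas as pd

def closest_rows(frame, ref='A', other='B'):
    """Return the rows whose |other - ref| is smallest within each ref group."""
    gap = (frame[other] - frame[ref]).abs()
    return frame[gap == gap.groupby(frame[ref]).transform('min')]

# Hypothetical data: the first B is 30 days before A, the second 2 days after
demo = pd.DataFrame({
    'A': pd.to_datetime(['2002-01-16', '2002-01-16']),
    'B': pd.to_datetime(['2001-12-17', '2002-01-18']),
    'C': ['early', 'late'],
})

# Signed difference: the minimum is the most negative gap (the 'early' row)
signed = demo['B'] - demo['A']
print(demo[signed == signed.min()])

# Absolute difference: the genuinely closest date wins (the 'late' row)
print(closest_rows(demo))
```

Unlike the Diff-column approach, this helper leaves the input frame unchanged, which may be preferable when the extra column is not wanted.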