Python 熊猫聚集-如何根据事件日期查找最近的事件

Python 熊猫聚集-如何根据事件日期查找最近的事件,python,pandas,dataframe,datetime,date-range,Python,Pandas,Dataframe,Datetime,Date Range,我有一个很长的事件列表,这些事件都在熊猫数据框中,每个事件都有开始和结束日期。如何使用pandas.groupby()在每个城市和场馆组合中找到“最近”的活动 最近的事件可能是过去的事件,如果没有新的即将发生的事件。i、 e本例中数据帧中的最新事件,它恰好发生在过去。 如果有多个即将发生的事件,则未来最接近的事件将被视为最近的事件 我尝试了下面的groupby.agg(“max”),但这将始终提供未来最远的事件: dfp.groupby(['CITY', 'VENUE'], as_index=F

我有一个很长的事件列表,这些事件都在熊猫数据框中,每个事件都有开始和结束日期。如何使用
pandas.groupby()
在每个城市和场馆组合中找到“最近”的活动

最近的事件可能是过去的事件,如果没有新的即将发生的事件。i、 e本例中数据帧中的最新事件,它恰好发生在过去。 如果有多个即将发生的事件,则未来最接近的事件将被视为最近的事件

我尝试了下面的
groupby.agg(“max”)
,但这将始终提供未来最远的事件:

dfp.groupby(['CITY', 'VENUE'], as_index=False).agg({"EVENT_START" : "max", "EVENT_END": "max"})
寻找一种方法来获取时间上最近的事件(如果没有未来事件,则为过去最近的事件)

样本数据:

EVENT_START,EVENT_END,Event Description,City,Venue
2/5/2016,3/12/2016,event 1,Chicago,Art Institute of Chicago
11/2/2014,12/2/2014,event 2,Los Angelos,Party Haus
1/25/2018,1/31/2018,event 3,Long Beach,Precious Lamb
8/26/2018,8/31/2018,event 4,West Columbia,New Brookland Tavern
11/20/2017,12/17/2017,event 5,Paris,Orsay Museum
6/26/2018,7/9/2018,event 6,Lahaina,Bamboo Fresh
4/3/2010,5/2/2010,event 7,Mitchell,The Corn Palace
9/21/2015,10/18/2015,event 8,San Diego,San Diego Zoo
1/4/2014,1/15/2014,event 9,Portland,Doug Fir Lounge
9/21/2019,9/26/2019,event 10,St. Louis,Krispy Kreme
3/15/2015,2/13/2018,event 11,Corvallis,The Beanery
9/23/2005,10/2/2005,event 12,San Jose,Winchester Mystery House
12/11/2019,12/14/2019,event 13,Chicago,Art Institute of Chicago
6/1/2013,6/26/2013,event 14,Los Angelos,Party Haus
7/10/2020,9/4/2020,event 15,Long Beach,Precious Lamb
10/18/2020,11/26/2020,event 16,West Columbia,New Brookland Tavern
5/14/2004,5/16/2004,event 17,Paris,Orsay Museum
11/16/2020,11/20/2020,event 18,Lahaina,Bamboo Fresh
7/19/2020,10/22/2020,event 19,Mitchell,The Corn Palace
11/1/2017,11/30/2017,event 20,San Diego,San Diego Zoo
7/31/2015,8/1/2015,event 21,Portland,Doug Fir Lounge
10/12/2012,10/20/2012,event 22,St. Louis,Krispy Kreme
2/28/2003,3/13/2003,event 23,Corvallis,The Beanery
9/16/2019,9/20/2019,event 24,San Jose,Winchester Mystery House
3/1/2022,4/1/2022,event 25,Chicago,Art Institute of Chicago
2/19/2009,2/25/2009,event 26,Los Angelos,Party Haus
4/16/2015,5/8/2015,event 27,Long Beach,Precious Lamb
9/7/2016,9/11/2016,event 28,West Columbia,New Brookland Tavern
8/4/2001,8/26/2001,event 29,Paris,Orsay Museum
4/27/2017,6/11/2017,event 30,Lahaina,Bamboo Fresh
5/21/2011,6/19/2011,event 31,Mitchell,The Corn Palace
6/3/2020,8/10/2020,event 32,San Diego,San Diego Zoo
10/29/2012,11/15/2012,event 33,Portland,Doug Fir Lounge
9/1/2027,10/15/2027,event 34,St. Louis,Krispy Kreme
6/23/2017,6/25/2017,event 35,Corvallis,The Beanery
4/25/2007,5/26/2007,event 36,San Jose,Winchester Mystery House
5/30/2003,7/1/2003,event 37,Chicago,Art Institute of Chicago
3/14/2008,4/12/2008,event 38,Los Angelos,Party Haus
5/29/2017,7/27/2017,event 39,Long Beach,Precious Lamb
1/31/2015,3/7/2015,event 40,West Columbia,New Brookland Tavern
4/1/2017,4/21/2017,event 41,Paris,Orsay Museum
12/29/2003,1/31/2004,event 42,Lahaina,Bamboo Fresh
7/3/2021,7/17/2021,event 43,Mitchell,The Corn Palace
9/1/2004,4/30/2005,event 44,San Diego,San Diego Zoo
10/14/2006,10/27/2006,event 45,Portland,Doug Fir Lounge
7/18/2017,7/19/2017,event 46,St. Louis,Krispy Kreme
6/1/2006,6/1/2006,event 47,Corvallis,The Beanery
10/1/2012,11/4/2012,event 48,San Jose,Winchester Mystery House
9/5/2011,9/19/2011,event 49,Chicago,Art Institute of Chicago
5/28/2020,6/2/2020,event 50,Los Angelos,Party Haus
3/1/2023,4/1/2023,event 51,Chicago,Art Institute of Chicago
结果应该是:

3/1/2022    4/1/2022    event 25    Chicago Art Institute of Chicago
5/28/2020   6/2/2020    event 50    Los Angelos Party Haus
7/10/2020   9/4/2020    event 15    Long Beach  Precious Lamb
10/18/2020  11/26/2020  event 16    West Columbia   New Brookland Tavern
11/20/2017  12/17/2017  event 5 Paris   Orsay Museum
11/16/2020  11/20/2020  event 18    Lahaina Bamboo Fresh
7/3/2021    7/17/2021   event 43    Mitchell    The Corn Palace
6/3/2020    8/10/2020   event 32    San Diego   San Diego Zoo
7/31/2015   8/1/2015    event 21    Portland    Doug Fir Lounge
9/1/2027    10/15/2027  event 34    St. Louis   Krispy Kreme
6/23/2017   6/25/2017   event 35    Corvallis   The Beanery
9/16/2019   9/20/2019   event 24    San Jose    Winchester Mystery House
IIUC

输出:

   EVENT_START  EVENT_END Event Description           City                     Venue
24  2022-03-01 2022-04-01          event 25        Chicago  Art Institute of Chicago
34  2017-06-23 2017-06-25          event 35      Corvallis               The Beanery
17  2020-11-16 2020-11-20          event 18        Lahaina              Bamboo Fresh
14  2020-07-10 2020-09-04          event 15     Long Beach             Precious Lamb
49  2020-05-28 2020-06-02          event 50    Los Angelos                Party Haus
42  2021-07-03 2021-07-17          event 43       Mitchell           The Corn Palace
4   2017-11-20 2017-12-17           event 5          Paris              Orsay Museum
20  2015-07-31 2015-08-01          event 21       Portland           Doug Fir Lounge
31  2020-06-03 2020-08-10          event 32      San Diego             San Diego Zoo
23  2019-09-16 2019-09-20          event 24       San Jose  Winchester Mystery House
33  2027-09-01 2027-10-15          event 34      St. Louis              Krispy Kreme
15  2020-10-18 2020-11-26          event 16  West Columbia      New Brookland Tavern
   EVENT_START  EVENT_END Event Description           City                     Venue                        diff
33  2027-09-01 2027-10-15          event 34      St. Louis              Krispy Kreme   2422 days 02:40:47.022263
50  2023-03-01 2023-04-01          event 51        Chicago  Art Institute of Chicago    777 days 02:40:47.022263
42  2021-07-03 2021-07-17          event 43       Mitchell           The Corn Palace    171 days 02:40:47.022263
15  2020-10-18 2020-11-26          event 16  West Columbia      New Brookland Tavern   -48 days +02:40:47.022263
17  2020-11-16 2020-11-20          event 18        Lahaina              Bamboo Fresh   -54 days +02:40:47.022263
14  2020-07-10 2020-09-04          event 15     Long Beach             Precious Lamb  -131 days +02:40:47.022263
31  2020-06-03 2020-08-10          event 32      San Diego             San Diego Zoo  -156 days +02:40:47.022263
49  2020-05-28 2020-06-02          event 50    Los Angelos                Party Haus  -225 days +02:40:47.022263
23  2019-09-16 2019-09-20          event 24       San Jose  Winchester Mystery House  -481 days +02:40:47.022263
10  2015-03-15 2018-02-13          event 11      Corvallis               The Beanery -1065 days +02:40:47.022263
4   2017-11-20 2017-12-17           event 5          Paris              Orsay Museum -1123 days +02:40:47.022263
20  2015-07-31 2015-08-01          event 21       Portland           Doug Fir Lounge -1992 days +02:40:47.022263
那么,如果事件已过,您可能希望使用事件结束:

df['diff'] = np.where(df['EVENT_START'] - pd.Timestamp('now') > pd.Timedelta(days=0), 
         df['EVENT_START'] - pd.Timestamp('now'), 
         df['EVENT_END'] - pd.Timestamp('now'))

df.loc[df.groupby(['City', 'Venue'])['diff'].idxmax()]
输出(使用事件51更新):

使用
排序\u值
删除重复项

df['diff'] = np.where(df['EVENT_START'] - pd.Timestamp('now') > pd.Timedelta(days=0), 
         df['EVENT_START'] - pd.Timestamp('now'), 
         df['EVENT_END'] - pd.Timestamp('now'))

df.sort_values('diff', ascending=False).drop_duplicates(['City', 'Venue'])
输出:

   EVENT_START  EVENT_END Event Description           City                     Venue
24  2022-03-01 2022-04-01          event 25        Chicago  Art Institute of Chicago
34  2017-06-23 2017-06-25          event 35      Corvallis               The Beanery
17  2020-11-16 2020-11-20          event 18        Lahaina              Bamboo Fresh
14  2020-07-10 2020-09-04          event 15     Long Beach             Precious Lamb
49  2020-05-28 2020-06-02          event 50    Los Angelos                Party Haus
42  2021-07-03 2021-07-17          event 43       Mitchell           The Corn Palace
4   2017-11-20 2017-12-17           event 5          Paris              Orsay Museum
20  2015-07-31 2015-08-01          event 21       Portland           Doug Fir Lounge
31  2020-06-03 2020-08-10          event 32      San Diego             San Diego Zoo
23  2019-09-16 2019-09-20          event 24       San Jose  Winchester Mystery House
33  2027-09-01 2027-10-15          event 34      St. Louis              Krispy Kreme
15  2020-10-18 2020-11-26          event 16  West Columbia      New Brookland Tavern
   EVENT_START  EVENT_END Event Description           City                     Venue                        diff
33  2027-09-01 2027-10-15          event 34      St. Louis              Krispy Kreme   2422 days 02:40:47.022263
50  2023-03-01 2023-04-01          event 51        Chicago  Art Institute of Chicago    777 days 02:40:47.022263
42  2021-07-03 2021-07-17          event 43       Mitchell           The Corn Palace    171 days 02:40:47.022263
15  2020-10-18 2020-11-26          event 16  West Columbia      New Brookland Tavern   -48 days +02:40:47.022263
17  2020-11-16 2020-11-20          event 18        Lahaina              Bamboo Fresh   -54 days +02:40:47.022263
14  2020-07-10 2020-09-04          event 15     Long Beach             Precious Lamb  -131 days +02:40:47.022263
31  2020-06-03 2020-08-10          event 32      San Diego             San Diego Zoo  -156 days +02:40:47.022263
49  2020-05-28 2020-06-02          event 50    Los Angelos                Party Haus  -225 days +02:40:47.022263
23  2019-09-16 2019-09-20          event 24       San Jose  Winchester Mystery House  -481 days +02:40:47.022263
10  2015-03-15 2018-02-13          event 11      Corvallis               The Beanery -1065 days +02:40:47.022263
4   2017-11-20 2017-12-17           event 5          Paris              Orsay Museum -1123 days +02:40:47.022263
20  2015-07-31 2015-08-01          event 21       Portland           Doug Fir Lounge -1992 days +02:40:47.022263
IIUC

输出:

   EVENT_START  EVENT_END Event Description           City                     Venue
24  2022-03-01 2022-04-01          event 25        Chicago  Art Institute of Chicago
34  2017-06-23 2017-06-25          event 35      Corvallis               The Beanery
17  2020-11-16 2020-11-20          event 18        Lahaina              Bamboo Fresh
14  2020-07-10 2020-09-04          event 15     Long Beach             Precious Lamb
49  2020-05-28 2020-06-02          event 50    Los Angelos                Party Haus
42  2021-07-03 2021-07-17          event 43       Mitchell           The Corn Palace
4   2017-11-20 2017-12-17           event 5          Paris              Orsay Museum
20  2015-07-31 2015-08-01          event 21       Portland           Doug Fir Lounge
31  2020-06-03 2020-08-10          event 32      San Diego             San Diego Zoo
23  2019-09-16 2019-09-20          event 24       San Jose  Winchester Mystery House
33  2027-09-01 2027-10-15          event 34      St. Louis              Krispy Kreme
15  2020-10-18 2020-11-26          event 16  West Columbia      New Brookland Tavern
   EVENT_START  EVENT_END Event Description           City                     Venue                        diff
33  2027-09-01 2027-10-15          event 34      St. Louis              Krispy Kreme   2422 days 02:40:47.022263
50  2023-03-01 2023-04-01          event 51        Chicago  Art Institute of Chicago    777 days 02:40:47.022263
42  2021-07-03 2021-07-17          event 43       Mitchell           The Corn Palace    171 days 02:40:47.022263
15  2020-10-18 2020-11-26          event 16  West Columbia      New Brookland Tavern   -48 days +02:40:47.022263
17  2020-11-16 2020-11-20          event 18        Lahaina              Bamboo Fresh   -54 days +02:40:47.022263
14  2020-07-10 2020-09-04          event 15     Long Beach             Precious Lamb  -131 days +02:40:47.022263
31  2020-06-03 2020-08-10          event 32      San Diego             San Diego Zoo  -156 days +02:40:47.022263
49  2020-05-28 2020-06-02          event 50    Los Angelos                Party Haus  -225 days +02:40:47.022263
23  2019-09-16 2019-09-20          event 24       San Jose  Winchester Mystery House  -481 days +02:40:47.022263
10  2015-03-15 2018-02-13          event 11      Corvallis               The Beanery -1065 days +02:40:47.022263
4   2017-11-20 2017-12-17           event 5          Paris              Orsay Museum -1123 days +02:40:47.022263
20  2015-07-31 2015-08-01          event 21       Portland           Doug Fir Lounge -1992 days +02:40:47.022263
那么,如果事件已过,您可能希望使用事件结束:

df['diff'] = np.where(df['EVENT_START'] - pd.Timestamp('now') > pd.Timedelta(days=0), 
         df['EVENT_START'] - pd.Timestamp('now'), 
         df['EVENT_END'] - pd.Timestamp('now'))

df.loc[df.groupby(['City', 'Venue'])['diff'].idxmax()]
输出(使用事件51更新):

使用
排序\u值
删除重复项

df['diff'] = np.where(df['EVENT_START'] - pd.Timestamp('now') > pd.Timedelta(days=0), 
         df['EVENT_START'] - pd.Timestamp('now'), 
         df['EVENT_END'] - pd.Timestamp('now'))

df.sort_values('diff', ascending=False).drop_duplicates(['City', 'Venue'])
输出:

   EVENT_START  EVENT_END Event Description           City                     Venue
24  2022-03-01 2022-04-01          event 25        Chicago  Art Institute of Chicago
34  2017-06-23 2017-06-25          event 35      Corvallis               The Beanery
17  2020-11-16 2020-11-20          event 18        Lahaina              Bamboo Fresh
14  2020-07-10 2020-09-04          event 15     Long Beach             Precious Lamb
49  2020-05-28 2020-06-02          event 50    Los Angelos                Party Haus
42  2021-07-03 2021-07-17          event 43       Mitchell           The Corn Palace
4   2017-11-20 2017-12-17           event 5          Paris              Orsay Museum
20  2015-07-31 2015-08-01          event 21       Portland           Doug Fir Lounge
31  2020-06-03 2020-08-10          event 32      San Diego             San Diego Zoo
23  2019-09-16 2019-09-20          event 24       San Jose  Winchester Mystery House
33  2027-09-01 2027-10-15          event 34      St. Louis              Krispy Kreme
15  2020-10-18 2020-11-26          event 16  West Columbia      New Brookland Tavern
   EVENT_START  EVENT_END Event Description           City                     Venue                        diff
33  2027-09-01 2027-10-15          event 34      St. Louis              Krispy Kreme   2422 days 02:40:47.022263
50  2023-03-01 2023-04-01          event 51        Chicago  Art Institute of Chicago    777 days 02:40:47.022263
42  2021-07-03 2021-07-17          event 43       Mitchell           The Corn Palace    171 days 02:40:47.022263
15  2020-10-18 2020-11-26          event 16  West Columbia      New Brookland Tavern   -48 days +02:40:47.022263
17  2020-11-16 2020-11-20          event 18        Lahaina              Bamboo Fresh   -54 days +02:40:47.022263
14  2020-07-10 2020-09-04          event 15     Long Beach             Precious Lamb  -131 days +02:40:47.022263
31  2020-06-03 2020-08-10          event 32      San Diego             San Diego Zoo  -156 days +02:40:47.022263
49  2020-05-28 2020-06-02          event 50    Los Angelos                Party Haus  -225 days +02:40:47.022263
23  2019-09-16 2019-09-20          event 24       San Jose  Winchester Mystery House  -481 days +02:40:47.022263
10  2015-03-15 2018-02-13          event 11      Corvallis               The Beanery -1065 days +02:40:47.022263
4   2017-11-20 2017-12-17           event 5          Paris              Orsay Museum -1123 days +02:40:47.022263
20  2015-07-31 2015-08-01          event 21       Portland           Doug Fir Lounge -1992 days +02:40:47.022263
#拆分过去事件和未来事件
cond=df['EVENT_START']>datetime.now()
df_future=df[cond]
df_pass=df[~cond]
#保持最近的距离
dfn_furture=df_furture.sort_值(['City','vention','EVENT_START']))\
.drop_duplicates(['City','vention'],keep='first')
#为每个城市和过去的事件合并一个最近的未来事件
dfn=pd.concat([dfn未来,df过去])
df_result=dfn.sort_值(['City','vention','EVENT_START']))\
.drop_duplicates(['City','vention'],keep='last')。sort_index()
结果:

   EVENT_START  EVENT_END Event Description           City  \
4   2017-11-20 2017-12-17           event 5          Paris   
14  2020-07-10 2020-09-04          event 15     Long Beach   
15  2020-10-18 2020-11-26          event 16  West Columbia   
17  2020-11-16 2020-11-20          event 18        Lahaina   
20  2015-07-31 2015-08-01          event 21       Portland   
23  2019-09-16 2019-09-20          event 24       San Jose   
24  2022-03-01 2022-04-01          event 25        Chicago   
31  2020-06-03 2020-08-10          event 32      San Diego   
33  2027-09-01 2027-10-15          event 34      St. Louis   
34  2017-06-23 2017-06-25          event 35      Corvallis   
42  2021-07-03 2021-07-17          event 43       Mitchell   
49  2020-05-28 2020-06-02          event 50    Los Angelos   

                       Venue  
4               Orsay Museum  
14             Precious Lamb  
15      New Brookland Tavern  
17              Bamboo Fresh  
20           Doug Fir Lounge  
23  Winchester Mystery House  
24  Art Institute of Chicago  
31             San Diego Zoo  
33              Krispy Kreme  
34               The Beanery  
42           The Corn Palace  
49                Party Haus  
#拆分过去事件和未来事件
cond=df['EVENT_START']>datetime.now()
df_future=df[cond]
df_pass=df[~cond]
#保持最近的距离
dfn_furture=df_furture.sort_值(['City','vention','EVENT_START']))\
.drop_duplicates(['City','vention'],keep='first')
#为每个城市和过去的事件合并一个最近的未来事件
dfn=pd.concat([dfn未来,df过去])
df_result=dfn.sort_值(['City','vention','EVENT_START']))\
.drop_duplicates(['City','vention'],keep='last')。sort_index()
结果:

   EVENT_START  EVENT_END Event Description           City  \
4   2017-11-20 2017-12-17           event 5          Paris   
14  2020-07-10 2020-09-04          event 15     Long Beach   
15  2020-10-18 2020-11-26          event 16  West Columbia   
17  2020-11-16 2020-11-20          event 18        Lahaina   
20  2015-07-31 2015-08-01          event 21       Portland   
23  2019-09-16 2019-09-20          event 24       San Jose   
24  2022-03-01 2022-04-01          event 25        Chicago   
31  2020-06-03 2020-08-10          event 32      San Diego   
33  2027-09-01 2027-10-15          event 34      St. Louis   
34  2017-06-23 2017-06-25          event 35      Corvallis   
42  2021-07-03 2021-07-17          event 43       Mitchell   
49  2020-05-28 2020-06-02          event 50    Los Angelos   

                       Venue  
4               Orsay Museum  
14             Precious Lamb  
15      New Brookland Tavern  
17              Bamboo Fresh  
20           Doug Fir Lounge  
23  Winchester Mystery House  
24  Art Institute of Chicago  
31             San Diego Zoo  
33              Krispy Kreme  
34               The Beanery  
42           The Corn Palace  
49                Party Haus  


你能发布完整的数据集,而不仅仅是前两列吗?数据如下:对不起,看不到任何数据。请用数据编辑您的原始帖子同样,添加所需的输出将更有帮助。添加更多事件扫描您发布的是完整的数据集,而不仅仅是前两列?数据如下:抱歉,没有看到任何数据。请用数据编辑您的原始帖子同样,添加所需的输出将更有帮助。添加更多事件添加一个条目,如下,到数据集:3/1/2023,4/1/2023,事件51,芝加哥艺术学院Chicago@Ganga第二个语句使用新的更新输出。谢谢您的回答!如果我只想参加最远的活动,是的,但我要找的是“最近的”,例如芝加哥艺术学院的多个活动Chicago@Ganga啊。。。见。在数据集中添加如下条目:3/1/2023,4/1/2023,事件51,芝加哥艺术学院Chicago@Ganga第二个语句使用新的更新输出。谢谢您的回答!如果我只想参加最远的活动,是的,但我要找的是“最近的”,例如芝加哥艺术学院的多个活动Chicago@Ganga啊。。。我明白了。非常感谢您的回复,问题很快就解决了!令人惊叹的!!非常感谢您的回复,问题很快就解决了!令人惊叹的!!