Python: compare a DataFrame against lists of dates and assign a string based on the result

python pandas dataframe

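I have a DataFrame with a datetime index. I also have three lists of dates, each describing a condition. I want to compare every date in the DataFrame against the three lists and assign a string to each row.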

df = 
  index                   data
2019-02-04 14:52:00    73.923746
2019-02-05 10:48:00    73.335315
2019-02-05 11:28:00    72.021457
2019-02-06 10:49:00    72.367468
2019-02-07 10:16:00    73.434296
2019-02-14 10:54:00    73.094386
2019-02-27 12:08:00    70.930997
2019-02-28 12:41:00    70.444107
2019-02-28 13:21:00    70.426729
2019-03-29 11:29:00    70.758032
2019-04-29 11:29:00    70.758032
2019-12-14 14:30:00    73.515568
2019-12-23 10:54:00    72.812583

bad_dates = [dates_bwn_twodates('2019-03-22','2019-04-09'),'bad_day']
good_dates= [dates_bwn_twodates('2019-4-10','2019-4-29'),'good_day']

explist = [bad_dates,good_dates]

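I want to compare each index value in df against the two lists above and create a new column that indicates the condition for that day. My current code: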

df['test'] =  'normal_day'
for i in explist:
    for j in df.index:
        if bool(set(i[0])&set(j.strftime('%Y-%m-%d'))) == True:
            df['test'].loc[j] = i[1]

My current output is:

  index                   data       test 
2019-02-04 14:52:00    73.923746     normal_day 
2019-02-05 10:48:00    73.335315     normal_day 
2019-02-05 11:28:00    72.021457     normal_day 
2019-02-06 10:49:00    72.367468     normal_day 
2019-02-07 10:16:00    73.434296     normal_day 
2019-02-14 10:54:00    73.094386     normal_day 
2019-02-27 12:08:00    70.930997     normal_day 
2019-02-28 12:41:00    70.444107     normal_day 
2019-02-28 13:21:00    70.426729     normal_day 
2019-03-29 11:29:00    70.758032     normal_day 
2019-04-29 11:29:00    70.758032     normal_day 
2019-12-14 14:30:00    73.515568     normal_day 
2019-12-23 10:54:00    72.812583     normal_day

My code is not working properly: every row stays normal_day.

Create the masks:

bad = df['index'].between('2019-03-22', '2019-04-09')
good = df['index'].between('2019-04-10', '2019-04-29')

Then use them to assign the labels:

df['test'] =  'normal_day'
df.loc[bad, 'test'] = 'bad_day'
df.loc[good, 'test'] = 'good_day'
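A self-contained sketch of the approach above, under two assumptions that are not in the original answer. First, the timestamps live in the index (as in the question), so the index is moved into a column before calling between. Second, the time of day is stripped with .dt.normalize(), because a bare end date such as 2019-04-29 means midnight and would otherwise exclude 2019-04-29 11:29:00. Only a few rows of the sample data are used.

```python
import pandas as pd

# A few rows from the question's sample, with the timestamps as the index.
df = pd.DataFrame(
    {"data": [73.923746, 70.758032, 70.758032, 73.515568]},
    index=pd.to_datetime([
        "2019-02-04 14:52:00",
        "2019-03-29 11:29:00",
        "2019-04-29 11:29:00",
        "2019-12-14 14:30:00",
    ]),
)

# Series.between needs a Series, so move the unnamed index into a column;
# reset_index() names that column "index".
df = df.reset_index()

# Drop the time of day so the end date of each range is matched inclusively.
day = df["index"].dt.normalize()
bad = day.between(pd.Timestamp("2019-03-22"), pd.Timestamp("2019-04-09"))
good = day.between(pd.Timestamp("2019-04-10"), pd.Timestamp("2019-04-29"))

# Default label first, then overwrite the rows selected by each mask.
df["test"] = "normal_day"
df.loc[bad, "test"] = "bad_day"
df.loc[good, "test"] = "good_day"
print(df)
```

If the end date is meant to be exclusive of that day's later hours, the normalize step can simply be dropped.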

Your solution is so simple and elegant. I got an error, though:

AttributeError: 'DatetimeIndex' object has no attribute 'between'

I found this approach:

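mask = (df['date'] > start_date) & (df['date'] …

You can also convert to str in order to use between: df['index'].astype(str).between(…)

Alternatively, I am trying to use between_time, which looks nicer than masking. But df.between_time(pd.to_datetime('2019-04-30'), pd.to_datetime('2019-05-09')) raises ValueError: Cannot convert arg [Timestamp('2019-04-30 00:00')] to a time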

If your df['index'].dtype is datetime64, between should work fine: df['index'].between('2019-02-05', '2019-04-28')
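When the timestamps stay in the index rather than in a column, the same masks can be built with plain comparisons on the DatetimeIndex, which avoids the missing between attribute. The sketch below is one possible way to do that, not the answer's own code: the ranges list of (start, end, label) tuples is a hypothetical replacement for explist, and normalize() is used on the assumption that both end dates should be inclusive.

```python
import pandas as pd

# A few rows from the question's sample, indexed by timestamp.
df = pd.DataFrame(
    {"data": [73.335315, 70.758032, 70.758032, 72.812583]},
    index=pd.to_datetime([
        "2019-02-05 10:48:00",
        "2019-03-29 11:29:00",
        "2019-04-29 11:29:00",
        "2019-12-23 10:54:00",
    ]),
)

# Hypothetical replacement for explist: one (start, end, label) per range.
ranges = [
    ("2019-03-22", "2019-04-09", "bad_day"),
    ("2019-04-10", "2019-04-29", "good_day"),
]

# Midnight-normalised copy of the index, so end dates match inclusively.
days = df.index.normalize()

df["test"] = "normal_day"
for start, end, label in ranges:
    # Comparing a DatetimeIndex with a Timestamp yields a boolean array.
    mask = (days >= pd.Timestamp(start)) & (days <= pd.Timestamp(end))
    df.loc[mask, "test"] = label
print(df)
```

Note that between_time filters on the time of day only, which is why passing full dates to it fails; it is not a substitute for a date-range mask.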
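
What exactly does "my code is not working properly" mean? Why use a loop? Why if ... == True:? Have you not read the pandas documentation? Does this answer your question?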
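On why the loop in the question leaves every row as normal_day: one likely cause is that set(j.strftime('%Y-%m-%d')) builds a set of single characters, not a set containing one date string, so its intersection with a collection of dates is always empty. The sketch below illustrates this. dates_between is a hypothetical stand-in for dates_bwn_twodates, which the question does not show; it is assumed here to produce 'YYYY-MM-DD' strings.

```python
import pandas as pd

# Hypothetical stand-in for dates_bwn_twodates (not shown in the question):
# one 'YYYY-MM-DD' string per calendar day, both ends included.
def dates_between(start, end):
    return list(pd.date_range(start, end).strftime("%Y-%m-%d"))

bad_days = dates_between("2019-03-22", "2019-04-09")

stamp = pd.Timestamp("2019-03-29 11:29:00")
day = stamp.strftime("%Y-%m-%d")

# set() on a string splits it into characters, so no full date can match.
chars = set(day)
no_overlap = set(bad_days) & chars

# A membership test compares whole date strings instead.
is_bad = day in set(bad_days)

print(sorted(chars))
print(no_overlap, is_bad)
```

The == True comparison is redundant as well, since the if statement already tests truthiness. Separately, df['test'].loc[j] = ... is chained assignment and may not write back to df; df.loc[j, 'test'] = ... is the form pandas recommends.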