Python: How do I get the missing dates within a given range of datetime64 values?

I have the following DataFrame df in pandas:

dti                  id_n
2016-07-27 13:55:00  1
2016-07-29 13:50:07  1
2016-07-29 14:50:08  1
2016-07-30 23:50:01  2
2016-08-01 12:50:00  3
2016-08-02 12:50:00  3
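
The dtype of dti is datetime64. I want to get a new DataFrame result containing the dates that are missing between the minimum and maximum values of dti:

result =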

2016-07-28
2016-07-31

How can I get it?

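One approach is to strip the time component from each timestamp, build the full daily range between the minimum and maximum, and keep the dates that do not appear in the data.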
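The code for this answer did not survive in the text, so the following is only a minimal sketch of the three steps described, not the answerer's exact code. It assumes `Series.dt.normalize`, `pd.date_range` and `DatetimeIndex.difference` as the building blocks; the names `days`, `full_range` and `missing` are made up for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'dti': pd.to_datetime(['2016-07-27 13:55:00', '2016-07-29 13:50:07',
                           '2016-07-29 14:50:08', '2016-07-30 23:50:01',
                           '2016-08-01 12:50:00', '2016-08-02 12:50:00']),
    'id_n': [1, 1, 1, 2, 3, 3],
})

# Step 1: drop the time of day so every timestamp falls on midnight.
days = df['dti'].dt.normalize()

# Step 2: build every calendar day between the first and last observed day.
full_range = pd.date_range(days.min(), days.max(), freq='D')

# Step 3: keep the days of the full range that never occur in the data.
missing = full_range.difference(pd.DatetimeIndex(days))

print(missing)
```

For the sample data, `missing` should hold 2016-07-28 and 2016-07-31, matching the result requested in the question.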

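Another solution is to downsample to daily frequency with mean and take the index of the NaN values:

# df is the DataFrame from the question, with dti already converted to datetime64.
# Resampling to calendar days ('D') yields NaN for days that have no rows.
a = df.resample('D', on='dti').mean()
print(a)
#             id_n
# dti
# 2016-07-27   1.0
# 2016-07-28   NaN
# 2016-07-29   1.0
# 2016-07-30   2.0
# 2016-07-31   NaN
# 2016-08-01   3.0
# 2016-08-02   3.0

# Index labels of the days whose mean is NaN, i.e. days without data.
b = a.index[a['id_n'].isnull()]
print(b)
# DatetimeIndex(['2016-07-28', '2016-07-31'], dtype='datetime64[ns]', name='dti', freq=None)

Here is another solution, for comparison. I use normalize to remove the times and then perform a set comparison:

import pandas as pd

df = pd.DataFrame([['2016-07-27 13:55:00', 1], ['2016-07-29 13:50:07', 1],
                   ['2016-07-29 14:50:08', 1], ['2016-07-30 23:50:01', 2],
                   ['2016-08-01 12:50:00', 3], ['2016-08-02 12:50:00', 3]],
                  columns=['dti', 'id_n'])

df['dti'] = pd.to_datetime(df['dti'])

# Every day between the first and last date, as a set of Timestamps.
full = set(pd.to_datetime(pd.date_range(df['dti'].dt.date.min(), df['dti'].dt.date.max(), normalize=True)))
# The days that actually occur in the data, with times set to midnight.
select = set(df['dti'].dt.normalize())

# Set difference: days in the full range that are absent from the data.
full - select

# Output as originally posted (newer pandas versions omit freq; set order may vary):
# {Timestamp('2016-07-28 00:00:00', freq='D'),
#  Timestamp('2016-07-31 00:00:00', freq='D')}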
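
The question asks for the result as a DataFrame, while the snippets here produce an index or a set. As a rough sketch, the same idea can be wrapped in a small helper that returns a DataFrame. The function name `missing_dates` and its `column` parameter are hypothetical, chosen only for this example.

```python
import pandas as pd


def missing_dates(frame: pd.DataFrame, column: str = 'dti') -> pd.DataFrame:
    """Return a one-column DataFrame of calendar days absent from `column`."""
    # Drop the time of day so comparisons happen at day granularity.
    days = frame[column].dt.normalize()
    # Every calendar day from the earliest to the latest observed day.
    full_range = pd.date_range(days.min(), days.max(), freq='D')
    # Boolean mask marking which days of the full range occur in the data.
    present = full_range.isin(days)
    return pd.DataFrame({column: full_range[~present]})


# Sample data from the question.
df = pd.DataFrame({
    'dti': pd.to_datetime(['2016-07-27 13:55:00', '2016-07-29 13:50:07',
                           '2016-07-29 14:50:08', '2016-07-30 23:50:01',
                           '2016-08-01 12:50:00', '2016-08-02 12:50:00']),
    'id_n': [1, 1, 1, 2, 3, 3],
})

result = missing_dates(df)
print(result)
```

This sketch assumes the column is already datetime64 and the frame is not empty; an empty frame would need separate handling before calling `pd.date_range`.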