Python: dropping a list of days from data with a different time resolution (minute data)


python pandas


I have a DataFrame like this (the timestamps only cover 09:00 to 20:00).

I also have a list of days that I want to exclude from df (stored in incomplete_days):

0    2020-05-18
1    2020-05-19
3    2020-05-21
4    2020-05-22
5    2020-05-23
6    2020-05-24

Name: Time, dtype: datetime64[ns]

I simply tried

df[df['Time'] != incomplete_days]

but it fails with this error:

ValueError: Can only compare identically-labeled Series objects
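For context, the error can be reproduced on a small scale. The sketch below uses made-up stand-ins for df and incomplete_days (the real data is not shown in full), so the values and lengths are assumptions. It illustrates that `!=` between two Series compares them element by element and requires both to carry the same index labels.

```python
import pandas as pd

# Hypothetical stand-ins for the question's data (values are assumptions).
df = pd.DataFrame({"Time": pd.to_datetime([
    "2020-05-18 10:18", "2020-05-18 10:19", "2020-07-20 12:00"])})
incomplete_days = pd.Series(
    pd.to_datetime(["2020-05-18", "2020-05-19"]), name="Time")

message = None
try:
    # Element-wise comparison of two Series whose indexes differ
    # (three labels versus two), so pandas refuses to align them.
    df["Time"] != incomplete_days
except ValueError as exc:
    message = str(exc)

print(message)
```

The comparison never gets as far as looking at the dates: it fails because the two Series are labelled differently.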

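Should I build timestamps at 1-minute resolution from the list of days to exclude, so that they match the ones in df? If so, how do I generate the times for the given days from a start time and an end time?

Or is there a way to do this without building timestamps at 1-minute resolution?

(I have already removed the irrelevant times between 20:01 and 08:59 and kept the times from 09:00 to 20:00 in df. I would rather not build timestamps again from the list of days to exclude. I used the following variables to remove the irrelevant times.)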

----- Edit: yes

gives

and

gives

When I do this

df.drop([df['Time'].dt.date not in incomplete_days],inplace=True)

I get the following error:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

I can see that this is very close, but something is going wrong.
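One way to keep the `drop` approach is sketched below, under the assumption that incomplete_days is a datetime64 Series (the sample values are made up). Python's `not in` performs a single membership test on the whole object rather than one test per row, and `drop` expects index labels rather than a boolean result. A row-wise test with `isin` followed by `drop` on the matching labels covers both points.

```python
import pandas as pd

# Hypothetical sample data; the column name "Time" follows the question.
df = pd.DataFrame({"Time": pd.to_datetime([
    "2020-05-18 10:18", "2020-05-19 09:00", "2020-07-20 12:00"])})
incomplete_days = pd.Series(pd.to_datetime(["2020-05-18", "2020-05-19"]))

# Row-wise membership test on calendar dates: True where the row's
# date appears among the days to exclude.
on_incomplete_day = df["Time"].dt.date.isin(incomplete_days.dt.date)

# drop() takes index labels, so pass the labels of the rows to remove.
df = df.drop(df[on_incomplete_day].index)

print(df)
```

Only the row from 2020-07-20 remains, and the Time column is still datetime64 because the dates were only used to build the mask.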

Suppose you have two DataFrames, df and df1, whose columns are in datetime format:

df

    Date
0   2020-05-18 10:18:00
1   2020-05-18 10:19:00
2   2020-05-18 10:20:00
3   2020-05-18 10:21:00
4   2020-05-18 10:22:00
5   2020-07-20 12:00:00

df1

    incomplete_days
0   2020-05-18
1   2020-05-19
3   2020-05-21
4   2020-05-22
5   2020-05-23
6   2020-05-24

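You can use boolean indexing and convert both columns to strings in the same format for the comparison. Use ~ together with isin (which is effectively "not in") instead of !=. You cannot use != to compare each row against an entire Series: it compares two Series element by element and requires identical labels, which is why your current approach raises the ValueError. Converting the format inside the boolean index [] keeps the DataFrame's original format, so the column does not change from datetime to string.

df = df[~(df['Date'].dt.strftime('%Y-%m-%d').isin(df1['incomplete_days'].dt.strftime('%Y-%m-%d')))]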

Out[38]: 
Date
5 2020-07-20 12:00:00
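As a variation on the same idea, the string conversion can be avoided by truncating each timestamp to midnight with `dt.normalize()` and testing membership directly on datetime values. This is a sketch rather than the answer's own method; it rebuilds the sample frames shown above with a default index and assumes every entry in incomplete_days is already at midnight.

```python
import pandas as pd

# Sample frames rebuilt from the tables above (default integer index).
df = pd.DataFrame({"Date": pd.to_datetime([
    "2020-05-18 10:18:00", "2020-05-18 10:19:00", "2020-05-18 10:20:00",
    "2020-05-18 10:21:00", "2020-05-18 10:22:00", "2020-07-20 12:00:00"])})
df1 = pd.DataFrame({"incomplete_days": pd.to_datetime([
    "2020-05-18", "2020-05-19", "2020-05-21",
    "2020-05-22", "2020-05-23", "2020-05-24"])})

# normalize() sets the time part to 00:00 but keeps the datetime64 dtype,
# so each minute-level row can be matched against a day-level value.
same_day = df["Date"].dt.normalize().isin(df1["incomplete_days"])

# ~ inverts the mask: keep only rows whose day is not in the list.
filtered = df[~same_day]

print(filtered)
```

No 1-minute timestamps need to be generated for the excluded days, because the comparison happens at day resolution.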

Sorry for the confusion. It was a small mistake on my part. It works perfectly! Thanks a lot, David!
[Timestamp('2020-05-18 00:00:00'),
 Timestamp('2020-05-19 00:00:00'),
 Timestamp('2020-05-21 00:00:00'),
 Timestamp('2020-05-22 00:00:00'),
 Timestamp('2020-05-23 00:00:00'),
 Timestamp('2020-05-24 00:00:00'),
 Timestamp('2020-05-25 00:00:00'),
 Timestamp('2020-05-26 00:00:00'),
 Timestamp('2020-05-27 00:00:00'),
 Timestamp('2020-05-28 00:00:00'),
 Timestamp('2020-05-29 00:00:00'),
 Timestamp('2020-05-30 00:00:00'),
 Timestamp('2020-05-31 00:00:00'),
 Timestamp('2020-06-01 00:00:00'),
 Timestamp('2020-06-02 00:00:00'),
 Timestamp('2020-06-03 00:00:00'),
 Timestamp('2020-06-10 00:00:00'),
 Timestamp('2020-07-02 00:00:00'),
 Timestamp('2020-07-05 00:00:00'),
 Timestamp('2020-07-06 00:00:00')]
