Python: convert negative datetime to NaT
I have two columns, "asked" and "answered", but "answered" is an object while "asked" is datetime64[ns]. So I converted "answered" to datetime:
df['answered'] = pd.to_datetime(df['answered'])
index, asked, answered
0 2016-07-04 07/07/2016
1 2016-07-03 07/01/2016
2 2016-07-05 07/09/2016
3 NaT NaN
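For anyone who wants to reproduce this step, here is a minimal sketch that rebuilds the sample data above. The explicit format string is an assumption (the strings look like month/day/year, but the question does not say so), and the way the frame is constructed is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; 'answered' starts out as plain strings.
df = pd.DataFrame({
    "asked": pd.to_datetime(["2016-07-04", "2016-07-03", "2016-07-05", None]),
    "answered": ["07/07/2016", "07/01/2016", "07/09/2016", np.nan],
})

# Assumption: the strings are month/day/year. Missing values become NaT.
df["answered"] = pd.to_datetime(df["answered"], format="%m/%d/%Y")
print(df.dtypes)
```

Passing the format explicitly avoids any day/month ambiguity when parsing strings such as 07/01/2016.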
Then I created a third column that gives the time difference between the two:
df['Days']= df['answered'] - df['asked']
index, asked, answered, Days
0 2016-07-04 07/07/2016 3 days
1 2016-07-03 07/01/2016 -2 days
2 2016-07-05 07/09/2016 4 days
3 NaT NaN NaT
With the help of @piRSquared, I tried to turn the negative days into NaT, but when I do this, nothing happens:
df.update(df[['Days']].mask(df < 0))
How can I turn the negative days into NaT?
Use mask. You can then use update:
df.update(df[['Days', 'col2']].mask(df < 0))
Suppose you want to grab all columns that are timedeltas:
df.select_dtypes([np.timedelta64]).mask(df < 0)
Update:
df.update(df.select_dtypes([np.timedelta64]).mask(df < 0))
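What works for me is comparing the Series (column) with a zero Timedelta and then creating NaT with mask or loc. Comparing the underlying NumPy array also works, but raises a warning:
FutureWarning: In the future, 'NAT ...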
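I get "invalid type comparison". Is that because I have more columns and only put these two here? I tried this, but nothing worked: df.mask(df['Days']<0)
For some reason it still doesn't work. When I print out the negative days, I get: "-5 days +00:00:00". The solution you provided doesn't throw an error, but it doesn't convert the negatives to NaT. Is there any other way to solve this?
@AdamSchroeder What is the dtype? df.Days.dtype
I get this: Sorry, my mistake. I get this: dtype('
Thanks @jezrael, I have been working on this for hours. Your solution and detailed explanation really helped me.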
mask = df['Days'] < pd.Timedelta(0)
df['Days'] = df['Days'].mask(mask)
print (df)
asked answered Days
0 2016-07-04 2016-07-07 3 days
1 2016-07-03 2016-07-01 NaT
2 2016-07-05 2016-07-09 4 days
3 NaT NaT NaT
mask = df['Days'] < pd.Timedelta(0)
df.loc[mask, 'Days'] = np.nan
print (df)
asked answered Days
0 2016-07-04 2016-07-07 3 days
1 2016-07-03 2016-07-01 NaT
2 2016-07-05 2016-07-09 4 days
3 NaT NaT NaT
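The two variants above can also be written with Series.where, which keeps the values that satisfy a condition instead of masking the ones that fail it. This is only a sketch on data rebuilt from the question; the variable names (asked, answered, days, cleaned) are illustrative, not from the original post.

```python
import pandas as pd

# Rebuild the two datetime columns from the question as standalone Series.
asked = pd.to_datetime(pd.Series(["2016-07-04", "2016-07-03", "2016-07-05", None]))
answered = pd.to_datetime(pd.Series(["2016-07-07", "2016-07-01", "2016-07-09", None]))

# Subtracting two datetime Series gives a timedelta64 Series.
days = answered - asked

# Compare against a zero Timedelta (not the integer 0) and keep only
# non-negative differences; negative and missing values end up as NaT.
cleaned = days.where(days >= pd.Timedelta(0))
print(cleaned)
```

Comparing against pd.Timedelta(0) rather than a bare 0 keeps both sides of the comparison the same type, which is the point of the approach shown above.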
print (df)
asked answered Days Days2
0 2016-07-04 2016-07-07 3 days 3 days
1 2016-07-03 2016-07-01 -2 days -2 days
2 2016-07-05 2016-07-09 4 days 4 days
3 NaT NaT NaT NaT
df1 = df.select_dtypes([np.timedelta64])
# comparing the whole DataFrame returns the wrong mask
m1 = df1 < pd.Timedelta(0)
print (m1)
Days Days2
0 False False
1 False False
2 False False
3 True True
# comparing each Series via apply works
m2 = df1.apply(lambda x: x < pd.Timedelta(0))
print (m2)
Days Days2
0 False False
1 True True
2 False False
3 False False
# comparing the NumPy array works, but raises a warning
m3 = df1.values < np.array(0, dtype=np.timedelta64)
print (m3)
[[False False]
[ True True]
[False False]
[ True True]]
df[df1.columns] = df1.mask(m2)
print (df)
asked answered Days Days2
0 2016-07-04 2016-07-07 3 days 3 days
1 2016-07-03 2016-07-01 NaT NaT
2 2016-07-05 2016-07-09 4 days 4 days
3 NaT NaT NaT NaT
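If you have several timedelta columns, the per-column comparison can be wrapped in a small helper. This is a sketch under assumptions: negatives_to_nat is a hypothetical name, Days2 is just a copy of Days as in the example above, and the columns are selected with select_dtypes(include="timedelta").

```python
import pandas as pd

df = pd.DataFrame({
    "asked": pd.to_datetime(["2016-07-04", "2016-07-03", "2016-07-05", None]),
    "answered": pd.to_datetime(["2016-07-07", "2016-07-01", "2016-07-09", None]),
})
df["Days"] = df["answered"] - df["asked"]
df["Days2"] = df["Days"]  # second timedelta column, only for illustration


def negatives_to_nat(frame):
    """Return a copy in which negative values in timedelta columns are NaT."""
    out = frame.copy()
    td_cols = out.select_dtypes(include="timedelta").columns
    for col in td_cols:
        # Compare one Series at a time against a zero Timedelta.
        out[col] = out[col].mask(out[col] < pd.Timedelta(0))
    return out


result = negatives_to_nat(df)
print(result)
```

Working column by column sidesteps the whole-DataFrame comparison that produced the wrong mask above, and returning a copy leaves the original frame untouched.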