Python: change a pd datetime object to an integer
I have a pandas DataFrame with two date columns. I want to calculate the difference between them in days, but the resulting difference looks like a string, e.g. "7 days". Is there a way to change this to an integer number of days?
y['datePulled'] = pd.to_datetime(y['datePulled'])
y['Dates'] = pd.to_datetime(y['Dates'])
y['Datediff'] = y['datePulled'] - y['Dates']
y['Datediff']
0 7 days
1 6 days
2 5 days
3 4 days
4 3 days
5 2 days
6 1 days
You can use:
(y['Datediff'] / np.timedelta64(1, 'D')).astype(int)
Or:
y['Datediff'].dt.days
Sample:
import pandas as pd
import numpy as np
y = pd.DataFrame({ 'datePulled': ['2016-01-05','2016-01-04'],
'Dates': ['2016-01-01','2016-01-02']})
y['datePulled'] = pd.to_datetime(y['datePulled'])
y['Dates'] = pd.to_datetime(y['Dates'])
y['Datediff'] = y['datePulled'] - y['Dates']
print (y)
#output is float, cast to int
y['Datediff1'] = (y['Datediff'] / np.timedelta64(1, 'D')).astype(int)
y['Datediff2'] = y['Datediff'].dt.days
print (y)
Dates datePulled Datediff Datediff1 Datediff2
0 2016-01-01 2016-01-05 4 days 4 4
1 2016-01-02 2016-01-04 2 days 2 2
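The two columns agree in the sample because every difference is a whole, positive number of days. When a difference carries a time component they can diverge: the division yields fractional days that astype(int) truncates toward zero, while .dt.days returns the days component of the timedelta. A minimal sketch with made-up values (36 hours, minus 12 hours, 7 days) that are not from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical differences: +36 hours, -12 hours, +7 days
diff = pd.Series(pd.to_timedelta([36, -12, 168], unit='h'))

# Method 1: divide by one day (float result), then cast to int
by_division = (diff / np.timedelta64(1, 'D')).astype(int)

# Method 2: take the days component of each timedelta
by_accessor = diff.dt.days

print(by_division.tolist())  # [1, 0, 7]
print(by_accessor.tolist())  # [1, -1, 7]
```

The minus 12 hours value is stored as "-1 days +12:00:00", so .dt.days reports -1, whereas -0.5 cast to int becomes 0. For date-only columns like those in the question this difference does not arise.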
On a larger DataFrame, the first method is faster:
y = pd.concat([y]*1000).reset_index(drop=True)
In [236]: %timeit (y['Datediff'] / np.timedelta64(1, 'D')).astype(int)
1000 loops, best of 3: 789 µs per loop
In [237]: %timeit y['Datediff'].dt.days
100 loops, best of 3: 15.3 ms per loop
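These timings come from one IPython session and will likely differ across pandas and NumPy versions and hardware. A rough sketch for repeating the comparison outside IPython with the standard-library timeit module; the helper names by_division and by_accessor are invented for this illustration:

```python
import timeit

import numpy as np
import pandas as pd

# Rebuild the sample frame and enlarge it as in the answer
y = pd.DataFrame({'datePulled': pd.to_datetime(['2016-01-05', '2016-01-04']),
                  'Dates': pd.to_datetime(['2016-01-01', '2016-01-02'])})
y['Datediff'] = y['datePulled'] - y['Dates']
y = pd.concat([y] * 1000).reset_index(drop=True)

def by_division():
    # Method 1: divide by one day, then cast the float result to int
    return (y['Datediff'] / np.timedelta64(1, 'D')).astype(int)

def by_accessor():
    # Method 2: days component via the .dt accessor
    return y['Datediff'].dt.days

# Total seconds for 100 calls of each method
t_div = timeit.timeit(by_division, number=100)
t_acc = timeit.timeit(by_accessor, number=100)
print(f"division: {t_div:.4f}s  .dt.days: {t_acc:.4f}s")
```

Measure on your own data before choosing one method for speed alone.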
I get the following error: Traceback (most recent call last): File "", line 1, TypeError: Invalid datetime unit "d" in metadata
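The message quotes a lowercase "d", which suggests the unit was typed as 'd' rather than 'D'. NumPy's unit codes are case-sensitive, and the code for days is uppercase 'D'. A small sketch of the difference, assuming that is the cause here:

```python
import numpy as np

# Uppercase 'D' is NumPy's unit code for days
one_day = np.timedelta64(1, 'D')

try:
    # Lowercase 'd' is not a valid unit code
    np.timedelta64(1, 'd')
    lowercase_ok = True
except TypeError as exc:
    lowercase_ok = False
    print(exc)
```

If that is what happened, changing the unit to 'D' in np.timedelta64(1, 'D') should clear the error.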