Python: convert a pandas datetime object to an integer

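I have a pandas DataFrame with two date columns. I want to compute the difference between them in days, but the result looks like a string, e.g. "7 days". Is there a way to convert this to an integer number of days?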

y['datePulled'] = pd.to_datetime(y['datePulled'])
y['Dates'] = pd.to_datetime(y['Dates'])
y['Datediff'] = y['datePulled'] - y['Dates']
y['Datediff']
0    7 days
1    6 days
2    5 days
3    4 days
4    3 days
5    2 days
6    1 days
You can use:

(y['Datediff'] / np.timedelta64(1, 'D')).astype(int)
Or:

y['Datediff'].dt.days

Sample:

import pandas as pd
import numpy as np

y = pd.DataFrame({ 'datePulled': ['2016-01-05','2016-01-04'], 
                    'Dates': ['2016-01-01','2016-01-02']})

y['datePulled'] = pd.to_datetime(y['datePulled'])
y['Dates'] = pd.to_datetime(y['Dates'])
y['Datediff'] = y['datePulled'] - y['Dates']
print(y)

# dividing by one day yields floats, so cast to int
y['Datediff1'] = (y['Datediff'] / np.timedelta64(1, 'D')).astype(int)

# or read the days component directly
y['Datediff2'] = y['Datediff'].dt.days
print(y)
       Dates datePulled  Datediff  Datediff1  Datediff2
0 2016-01-01 2016-01-05    4 days          4          4
1 2016-01-02 2016-01-04    2 days          2          2
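
The two methods agree in the sample because every difference is a whole number of days. Once the timedeltas have a time-of-day part, they can diverge: casting the quotient with astype(int) truncates toward zero, while .dt.days returns the days component of the timedelta, which is floored. A minimal sketch, using made-up values that are not from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical differences: whole days, a fractional day, and a negative half day
delta = pd.Series(pd.to_timedelta(['4 days', '1 days 12:00:00', '-12 hours']))

# Division gives 4.0, 1.5, -0.5; astype(int) truncates toward zero
truncated = (delta / np.timedelta64(1, 'D')).astype(int)

# .dt.days reads the days component; -12 hours is stored as -1 days +12:00:00
floored = delta.dt.days

print(truncated.tolist())  # [4, 1, 0]
print(floored.tolist())    # [4, 1, -1]
```

If the columns hold pure dates, as in the question, either method gives the same integers.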
On larger DataFrames, the first method is faster:

y = pd.concat([y]*1000).reset_index(drop=True)

In [236]: %timeit (y['Datediff'] / np.timedelta64(1, 'D')).astype(int)
1000 loops, best of 3: 789 µs per loop

In [237]: %timeit y['Datediff'].dt.days
100 loops, best of 3: 15.3 ms per loop
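
The numbers above come from one machine and one pandas version, so it may be worth re-measuring locally. A rough sketch using the standard timeit module outside IPython; the helper names by_division and by_accessor are only for illustration, and the results will differ between installs:

```python
import timeit

import numpy as np
import pandas as pd

# Rebuild the sample frame and repeat it 1000 times, as in the answer
y = pd.DataFrame({'datePulled': pd.to_datetime(['2016-01-05', '2016-01-04']),
                  'Dates': pd.to_datetime(['2016-01-01', '2016-01-02'])})
y = pd.concat([y] * 1000).reset_index(drop=True)
y['Datediff'] = y['datePulled'] - y['Dates']


def by_division():
    # first method: divide by one day, then cast to int
    return (y['Datediff'] / np.timedelta64(1, 'D')).astype(int)


def by_accessor():
    # second method: read the days component
    return y['Datediff'].dt.days


# Best of 3 runs of 100 calls each, reported per call
t_div = min(timeit.repeat(by_division, number=100, repeat=3)) / 100
t_acc = min(timeit.repeat(by_accessor, number=100, repeat=3)) / 100
print(f"division: {t_div * 1e6:.0f} µs per call")
print(f"dt.days:  {t_acc * 1e6:.0f} µs per call")
```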

I'm getting the following error: Traceback (most recent call last): File "", line 1, TypeError: Invalid datetime unit "d" in metadata
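
The message quotes a lowercase "d", which suggests the unit was typed in lowercase. NumPy unit codes are case-sensitive, and the code for days is an uppercase 'D'. A small sketch of the difference, assuming that is what happened here:

```python
import numpy as np

# Uppercase 'D' is the unit code for days
one_day = np.timedelta64(1, 'D')
print(one_day)

# Lowercase 'd' is not a recognised unit code
try:
    np.timedelta64(1, 'd')
    lowercase_accepted = True
except TypeError as exc:
    lowercase_accepted = False
    print(exc)
```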