
Python: Calculating time difference in a DataFrame


I have a pandas DataFrame whose index looks like this:

Index([16/May/2013:23:56:43, 16/May/2013:23:56:42, 16/May/2013:23:56:43, ..., 17/May/2013:23:54:45, 17/May/2013:23:54:45, 17/May/2013:23:54:45], dtype=object)
I calculated the time difference between consecutive events as follows:

df2['tvalue'] = df2.index
df2['tvalue'] = np.datetime64(df2['tvalue'])
df2['delta'] = (df2['tvalue']-df2['tvalue'].shift()).fillna(0)
This gives me the following output:

    Time                      tvalue delta                                          
16/May/2013:23:56:43   2013-05-01 13:23:56 00:00:00  
16/May/2013:23:56:42   2013-05-01 13:23:56 00:00:00  
16/May/2013:23:56:43   2013-05-01 13:23:56 00:00:00  
16/May/2013:23:56:43   2013-05-01 13:23:56 00:00:00  
16/May/2013:23:56:48   2013-05-01 13:23:56 00:00:00  
16/May/2013:23:56:48   2013-05-01 13:23:56 00:00:00  
16/May/2013:23:56:48   2013-05-01 13:23:56 00:00:00  
16/May/2013:23:57:44   2013-05-01 13:23:57 00:00:01  
16/May/2013:23:57:44   2013-05-01 13:23:57 00:00:00  
16/May/2013:23:57:44   2013-05-01 13:23:57 00:00:00  

But the time difference seems to be calculated in hours, and the dates are different as well. What could be the problem here?

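Parsing your dates is non-trivial. I think strptime could probably do it, but it did not work for me. In your example above, the times are just strings, not datetimes.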

In [140]: from dateutil import parser

In [130]: def parse(x):
   .....:     date, hh, mm, ss = x.split(':')
   .....:     dd, mo, yyyy = date.split('/')
   .....:     return parser.parse("%s %s %s %s:%s:%s" % (yyyy,mo,dd,hh,mm,ss))
   .....: 

In [131]: map(parse,idx)
Out[131]: 
[datetime.datetime(2013, 5, 16, 23, 56, 43),
 datetime.datetime(2013, 5, 16, 23, 56, 42),
 datetime.datetime(2013, 5, 16, 23, 56, 43),
 datetime.datetime(2013, 5, 17, 23, 54, 45),
 datetime.datetime(2013, 5, 17, 23, 54, 45),
 datetime.datetime(2013, 5, 17, 23, 54, 45)]

In [132]: pd.to_datetime(map(parse,idx))
Out[132]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-16 23:56:43, ..., 2013-05-17 23:54:45]
Length: 6, Freq: None, Timezone: None

In [133]: df = DataFrame(dict(time = pd.to_datetime(map(parse,idx))))

In [134]: df
Out[134]: 
                 time
0 2013-05-16 23:56:43
1 2013-05-16 23:56:42
2 2013-05-16 23:56:43
3 2013-05-17 23:54:45
4 2013-05-17 23:54:45
5 2013-05-17 23:54:45

In [138]: df['delta'] = (df['time']-df['time'].shift()).fillna(0)

In [139]: df
Out[139]: 
                 time     delta
0 2013-05-16 23:56:43  00:00:00
1 2013-05-16 23:56:42 -00:00:01
2 2013-05-16 23:56:43  00:00:01
3 2013-05-17 23:54:45  23:58:02
4 2013-05-17 23:54:45  00:00:00
5 2013-05-17 23:54:45  00:00:00
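As a possible alternative to the dateutil-based parse helper above, recent pandas versions can usually parse this layout directly when given an explicit strptime-style format. The following is only a sketch: it assumes every string follows day/abbreviated-month/year:HH:MM:SS, that the month abbreviations are English (`%b` depends on the locale), and it reuses the six sample values from the question. The variable names are illustrative.

```python
import pandas as pd

# Sample index values copied from the question (plain strings, dtype=object).
idx = pd.Index([
    "16/May/2013:23:56:43",
    "16/May/2013:23:56:42",
    "16/May/2013:23:56:43",
    "17/May/2013:23:54:45",
    "17/May/2013:23:54:45",
    "17/May/2013:23:54:45",
])

# Parse all strings at once with an explicit format instead of a per-row helper.
times = pd.to_datetime(idx, format="%d/%b/%Y:%H:%M:%S")

df = pd.DataFrame({"time": times})

# Difference to the previous row; the first row has no predecessor,
# so its missing value is replaced with a zero-length Timedelta.
df["delta"] = (df["time"] - df["time"].shift()).fillna(pd.Timedelta(0))
print(df)
```

Filling with `pd.Timedelta(0)` rather than a bare `0` keeps the column a timedelta column. Parsing the whole index in one call also avoids calling a Python function once per row, which may help if the per-row parser is slow on large data.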
You can also use df.diff() instead of df['time']-df['time'].shift(). Slightly cleaner.

@Jeff: Works well! :) But the code takes a while to run!
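The diff() suggestion can be sketched on a small made-up frame, applied here to the single column as `df['time'].diff()`. This is only an illustration under the assumption that the column already holds real datetimes. The `delta_seconds` column is a hypothetical addition, not part of the original answer.

```python
import pandas as pd

# A small illustrative frame with an already-parsed datetime column.
df = pd.DataFrame({"time": pd.to_datetime([
    "2013-05-16 23:56:43",
    "2013-05-16 23:56:42",
    "2013-05-17 23:54:45",
])})

# Both expressions give the difference to the previous row (NaT in row 0).
via_shift = df["time"] - df["time"].shift()
via_diff = df["time"].diff()
assert via_diff.equals(via_shift)

# Replace the leading NaT with a zero-length Timedelta.
df["delta"] = via_diff.fillna(pd.Timedelta(0))

# Optional: express the differences as plain seconds.
df["delta_seconds"] = df["delta"].dt.total_seconds()
print(df)
```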