Python 2.7: getting hours and minutes from a numpy array of encoded times


I have a large numpy array of encoded times. Suppose we have something like this:

from pandas import DataFrame
t = {'time': ['08:35', '08:38', '13:42', '13:46']}
df = DataFrame(t)

import numpy as np
time_array = np.array(df.time)
print time_array
Output:

['08:35' '08:38' '13:42' '13:46']
Is there an efficient way to get the hours and minutes separately from the time array?

Of course, this can be done in a loop:

for i in range(len(time_array)):
    print np.fromstring(time_array[i], dtype=int, sep=":")
Output:

[ 8 35]
[ 8 38]
[13 42]
[13 46]
But I am looking for a faster, vectorized way, if there is one.

Edit:

I have timed the solutions (see the code below).

Output: 1 loops, best of 3: 3.02 s per loop

Paul H's solution 1:

def foo2(df):
    df['hour'] = df['time'].apply(lambda x: int(x.split(':')[0]))
    df['minute'] = df['time'].apply(lambda x: int(x.split(':')[1]))

%timeit foo2(df)
Output: 1 loops, best of 3: 4.31 s per loop

Paul H's solution 2:

import time
def foo3(df):
    df['hour'] = df['time'].apply(lambda x: time.strptime(x, '%H:%M').tm_hour)
    df['minute'] = df['time'].apply(lambda x: time.strptime(x, '%H:%M').tm_min)

%timeit foo3(df)

Output: 1 loops, best of 3: 42.1 s per loop
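The `%timeit` lines above only work inside IPython. As a rough sketch, the same comparison can be run as a plain script with the standard-library `timeit` module. The frame size `n_rows` and the helper names `split_parse` and `strptime_parse` are assumptions made for illustration, since the post does not say how large the original array was.

```python
import time
import timeit

import pandas as pd

# Assumed size; the original post does not say how large the array was.
n_rows = 10000
base = ['08:35', '08:38', '13:42', '13:46']
df = pd.DataFrame({'time': base * (n_rows // len(base))})


def split_parse(frame):
    # Split each 'HH:MM' string on ':' and convert both halves to int.
    hours = frame['time'].apply(lambda x: int(x.split(':')[0]))
    minutes = frame['time'].apply(lambda x: int(x.split(':')[1]))
    return hours, minutes


def strptime_parse(frame):
    # Parse each string with time.strptime and read the struct_time fields.
    hours = frame['time'].apply(lambda x: time.strptime(x, '%H:%M').tm_hour)
    minutes = frame['time'].apply(lambda x: time.strptime(x, '%H:%M').tm_min)
    return hours, minutes


# Best of 3 runs, one call per run, mirroring "1 loops, best of 3".
for func in (split_parse, strptime_parse):
    best = min(timeit.repeat(lambda: func(df), repeat=3, number=1))
    print('%s: %.4f s per loop' % (func.__name__, best))
```

Absolute numbers will differ from those quoted in the post, because they depend on the machine and on the assumed frame size.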

from pandas import DataFrame
t = {'time': ['08:35', '08:38', '13:42', '13:46']}
df = DataFrame(t)
df['hour'] = df['time'].apply(lambda x: int(x.split(':')[0]))
df['minute'] = df['time'].apply(lambda x: int(x.split(':')[1]))
print(df)

    time  hour  minute
0  08:35     8      35
1  08:38     8      38
2  13:42    13      42
3  13:46    13      46
You can then do
df['hour'].values
to get an array of the hours.
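If the aim is to avoid a per-element Python call altogether, one possible numpy-only route is to reinterpret the fixed-width strings as character codes and do the arithmetic on whole columns. This is only a sketch and not part of the original answer. It assumes every entry is exactly five characters in zero-padded `HH:MM` form, and the helper name `split_hhmm` is hypothetical.

```python
import numpy as np


def split_hhmm(time_array):
    # Assumption: every entry is exactly 'HH:MM' (5 chars, zero-padded).
    fixed = np.ascontiguousarray(np.asarray(time_array).astype('U5'))
    # View each 5-character string as five 32-bit code points.
    codes = fixed.view(np.uint32).reshape(-1, 5).astype(np.int64)
    digits = codes - ord('0')  # column 2 holds ':' and is ignored
    hours = digits[:, 0] * 10 + digits[:, 1]
    minutes = digits[:, 3] * 10 + digits[:, 4]
    return hours, minutes


hours, minutes = split_hhmm(np.array(['08:35', '08:38', '13:42', '13:46']))
print(hours)
print(minutes)
```

Entries that break the assumed format (for example `'8:35'`) would give wrong results without raising an error, so the input should be validated first if its format is not guaranteed.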

Edit: just for grins, you could also do:

import time
# The column in the example frame is named 'time', so index it by that name.
df['hour'] = df['time'].apply(lambda x: time.strptime(x, '%H:%M').tm_hour)
df['minute'] = df['time'].apply(lambda x: time.strptime(x, '%H:%M').tm_min)
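
The timings in the question put the `strptime` variant at roughly ten times the cost of the `split` variant (42.1 s versus 4.31 s). A possible alternative that keeps datetime-style parsing but hands the whole column to pandas in one call is sketched here. It assumes a pandas version that provides `pd.to_datetime` with a `format` argument and the `.dt` accessor, and it has not been timed against the figures above.

```python
import pandas as pd

df = pd.DataFrame({'time': ['08:35', '08:38', '13:42', '13:46']})

# Parse the whole column in one call; the date part defaults to 1900-01-01
# and is simply ignored here.
parsed = pd.to_datetime(df['time'], format='%H:%M')
df['hour'] = parsed.dt.hour
df['minute'] = parsed.dt.minute

print(df)
```

As before, `df['hour'].values` and `df['minute'].values` give the results as numpy arrays.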