How do I read times from an .xlsx file in Python?
I am trying to use Pandas to read an .xlsx file that contains two columns: a time duration (mm:ss.cs) and a date (mm/dd/yyyy). I simply called read_excel().
After running it, df is:
Time Date
0 1900-02-02 12:01:28.010000 1/11/2020
1 1900-02-02 12:01:47.250000 5/19/2019
2 1900-02-02 12:01:57.160000 1/11/2020
3 1900-02-02 12:01:20.820000 1/12/2020
4 1900-02-02 12:01:25.180000 3/2/2019
5 1900-02-02 12:01:52 12/17/2017
6 1900-02-02 12:01:34.820000 5/19/2019
7 1900-02-02 12:01:29.020000 2/29/2020
8 1900-02-02 12:02:14.420000 11/17/2017
9 1900-02-02 12:01:44.160000 6/30/2019
10 1900-02-02 12:01:38.900000 6/29/2019
11 1900-02-02 12:01:35.830000 11/16/2019
12 1900-02-02 12:01:39.990000 1/13/2019
13 1900-02-02 12:01:43.040000 3/3/2019
14 1900-02-02 12:01:51 10/13/2018
15 1900-02-02 12:02:52.830000 1/12/2020
16 33.50048090277778 3/1/2020
17 33.50048275462963 11/16/2019
18 33.50049525462963 4/13/2019
19 33.50051261574074 10/13/2019
20 33.50052291666667 1/13/2019
21 33.50052708333333 12/16/2018
22 33.50053125 3/3/2019
23 33.500586921296296 10/14/2018
24 1900-02-02 12:01:04.010000 1/21/2018
25 1900-02-02 12:01:04.580000 10/15/2017
26 1900-02-02 12:01:06.100000 11/18/2017
27 33.500560416666666 6/29/2019
28 33.50057731481481 6/1/2019
29 33.5005806712963 7/7/2019
30 1900-02-02 12:01:00.750000 6/2/2018
31 33.500622800925925 1/12/2020
32 33.500644791666666 10/13/2019
33 33.50069386574074 1/12/2019
34 1900-02-02 12:01:06.200000 7/7/2019
35 1900-02-02 12:01:08.090000 5/19/2019
36 33.50058518518519 12/15/2018
37 1900-02-02 12:01:14.410000 12/17/2017
38 33.50064733796296 6/1/2019
39 1900-02-02 12:01:16.870000 6/2/2018
40 33.50038888888889 2/29/2020
41 33.50040439814815 1/11/2020
42 33.50040833333333 11/16/2019
43 33.50044074074074 1/12/2019
44 33.50044988425926 3/3/2019
45 33.50046435185185 12/16/2018
46 33.50046759259259 10/13/2018
47 33.50056875 12/16/2017
48 33.500570949074074 1/20/2018
49 33.50060983796296 11/17/2017
50 33.50061643518519 10/14/2017
51 33.50050231481482 7/7/2019
52 33.500507407407405 6/1/2019
53 33.500539699074075 6/2/2018
54 33.500544328703704 6/30/2019
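Some of the times were converted by adding the prefix "1900-02-02 12:", and others were changed to "33.5xxxxxxx", which looks like a complete mess.
I tried reading the time as a string, but got the same df: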
df = pd.read_excel(r'test.xlsx', dtype={'Time':str})
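So my question is: how do I read the time values correctly, whether by keeping their original format, converting them to milliseconds, or using any time format that Python recognizes?
I would read the data with Time as a string, as you did, and parse the string: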
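import pandas as pd

df = pd.read_excel(r'test.xlsx', dtype={'Time': str})
# parse the string time: the trailing "minutes:seconds" part of each value
# (raw string so that \d is not treated as a string escape)
min_sec = ('0:' + df.Time).str.extract(r'(?P<min>\d+):(?P<sec>[\d.]+)$').astype(float)
# convert to timedelta
df['Time'] = pd.to_timedelta(min_sec['min'] * 60 + min_sec['sec'], unit='s')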
Works as expected. Thanks, everything works fine. Thank you very much.
df = pd.read_excel(r'test.xlsx', dtype={'Time': str})
# parse the string time
min_sec = ('0:' + df.Time).str.extract(r'(?P<min>\d+):(?P<sec>[\d.]+)$').astype(float)
# convert to timedelta
df['Time'] = pd.to_timedelta(min_sec['min'] * 60 + min_sec['sec'], unit='s')
# min_sec
min sec
0 1.0 28.01
1 1.0 29.02
2 2.0 14.42
3 1.0 44.16
4 1.0 47.25
5 1.0 57.16
6 1.0 20.82
7 1.0 25.18
8 1.0 52.00
9 1.0 34.82
10 1.0 38.90
11 1.0 35.83
12 1.0 39.99
13 1.0 43.04
14 1.0 51.00
15 2.0 52.83
16 0.0 41.55
17 0.0 41.71
18 0.0 42.79
19 0.0 44.29
# df
Time Date
0 00:01:28.010000 1/11/2020
1 00:01:29.020000 2/29/2020
2 00:02:14.420000 11/17/2017
3 00:01:44.160000 6/30/2019
4 00:01:47.250000 5/19/2019
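The "33.5xxxxxxx" values in the question look like Excel serial day numbers: in Excel's 1900 date system, 33.5 corresponds to 1900-02-02 12:00, which matches the prefix on the other rows. If the column arrives as a mix of datetime objects and floats rather than as strings, a conversion along the following lines might work. This is only a sketch under that assumption; to_duration and the two sample values are hypothetical names and data chosen for illustration, and, like the string parsing above, it keeps only the minutes and seconds and so assumes every duration is shorter than one hour.

```python
import datetime as dt

import pandas as pd

SECONDS_PER_DAY = 86_400


def to_duration(value):
    """Return the minutes-and-seconds part of a cell as a pandas Timedelta."""
    if isinstance(value, (dt.datetime, dt.time)):
        # Cell was parsed as a date/time: drop the date and the hour.
        seconds = value.minute * 60 + value.second + value.microsecond / 1e6
    else:
        # Assumed to be an Excel serial day number: the fractional part is
        # the time of day, and the remainder after whole hours is mm:ss.
        day_fraction = float(value) % 1
        seconds = (day_fraction * SECONDS_PER_DAY) % 3600
    # Round to centiseconds, the resolution of the original mm:ss.cs format.
    return pd.to_timedelta(round(seconds, 2), unit="s")


# Two sample cells taken from the question's output (rows 0 and 16).
mixed = pd.Series(
    [dt.datetime(1900, 2, 2, 12, 1, 28, 10000), 33.50048090277778],
    dtype=object,
)
durations = mixed.map(to_duration)
print(durations)
```

On a real DataFrame the same function could be applied with df['Time'].map(to_duration) after a plain read_excel() call, provided the column really contains those two kinds of values.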