Pandas: loading a time of day from a CSV as a datetime
I have a CSV in the following format:
start,end,name
12:00:00,14:00:00,thomas
14:00:00,16:00:00,hans
16:00:00,18:00:00,toby
18:00:00,20:00:00,ken
20:00:00,22:00:00,lisa
22:00:00,00:00:00,joe
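How can I tell pandas to treat start and end as datetimes when loading the CSV, even though they have no date attached?

You can use parse_dates when you read the CSV:
df = pd.read_csv(files, parse_dates=['start', 'end'])
# Keep only the time of day (datetime.time objects), dropping the date pandas fills in
df['start'] = df['start'].dt.time
df['end'] = df['end'].dt.time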
Another approach is to use converters:
# Parse each value as a Timedelta, i.e. an offset from midnight
cov = dict(start=pd.to_timedelta, end=pd.to_timedelta)
df = pd.read_csv(files, converters=cov)
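Loading the columns as timedeltas makes arithmetic on them straightforward. The following is a minimal sketch rather than a definitive recipe. It assumes the sample data from the question is held in an in-memory string (`csv_text` is a hypothetical name) and that an end time of 00:00:00 means midnight of the following day, which the question does not state.

```python
import io

import pandas as pd

# Sample data from the question, kept in memory instead of a file (illustrative).
csv_text = """start,end,name
12:00:00,14:00:00,thomas
14:00:00,16:00:00,hans
16:00:00,18:00:00,toby
18:00:00,20:00:00,ken
20:00:00,22:00:00,lisa
22:00:00,00:00:00,joe"""

df = pd.read_csv(io.StringIO(csv_text))

# Convert the time-of-day strings to offsets from midnight.
start = pd.to_timedelta(df["start"])
end = pd.to_timedelta(df["end"])

# Assumption: an end that is not after its start belongs to the next day.
end = end.where(end > start, end + pd.Timedelta(days=1))

df["start"] = start
df["end"] = end
df["duration"] = end - start
```

With that assumption, the last row (22:00:00 to 00:00:00) gets a positive duration instead of a negative one.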
You only need to specify parse_dates=['start', 'end'] when calling pd.read_csv.
For example, if your data is in a file named "data.csv", this code loads start and end as datetimes:
df = pd.read_csv('data.csv', parse_dates=['start', 'end'])
Minimal example:
import pandas as pd
contents = """start,end,name
12:00:00,14:00:00,thomas
14:00:00,16:00:00,hans
16:00:00,18:00:00,toby
18:00:00,20:00:00,ken
20:00:00,22:00:00,lisa
22:00:00,00:00:00,joe"""
with open('data.csv', 'w') as f_handle:
    f_handle.write(contents)
df = pd.read_csv('data.csv', parse_dates=['start', 'end'])
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 start 6 non-null datetime64[ns]
1 end 6 non-null datetime64[ns]
2 name 6 non-null object
dtypes: datetime64[ns](2), object(1)
memory usage: 272.0+ bytes
Printing df gives:
start end name
0 2020-04-01 12:00:00 2020-04-01 14:00:00 thomas
1 2020-04-01 14:00:00 2020-04-01 16:00:00 hans
2 2020-04-01 16:00:00 2020-04-01 18:00:00 toby
3 2020-04-01 18:00:00 2020-04-01 20:00:00 ken
4 2020-04-01 20:00:00 2020-04-01 22:00:00 lisa
5 2020-04-01 22:00:00 2020-04-01 00:00:00 joe
As you can see, pandas assumes the times fall on today's date.

Ah, you beat me to it! Of course, the only issue is the added date, although you can get back to a time-only string with strftime. Perhaps it would be better to use pd.to_datetime with an explicit format?
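The suggestion in the comment can be sketched as follows. This is a minimal example under stated assumptions: the column is first read as plain strings, then parsed with an explicit `%H:%M:%S` format, and `csv_text` is a hypothetical in-memory excerpt of the question's data. With an explicit format and no date in the input, the parsed values carry the placeholder date 1900-01-01 rather than the current date, and `strftime` turns them back into time-only strings.

```python
import io

import pandas as pd

# Two rows of the question's data, kept in memory (illustrative).
csv_text = """start,end,name
12:00:00,14:00:00,thomas
22:00:00,00:00:00,joe"""

df = pd.read_csv(io.StringIO(csv_text))

# Parse with an explicit format; the missing date defaults to 1900-01-01.
parsed = pd.to_datetime(df["start"], format="%H:%M:%S")

# Convert back to time-only strings when the date part is not wanted.
as_text = parsed.dt.strftime("%H:%M:%S")
```

Because the format is fixed, the result does not depend on the day the code is run.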