Python: extracting the date from timestamps in multiple time zones


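I have a pandas DataFrame in which I convert hour to local_hour based on the time_zone column. Now I want to extract the date from local_hour as local_date, but I get the error "Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True". How can I do this?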

# Create dataframe
import pandas as pd
df = pd.DataFrame({
   'hour': ['2019-01-01 05:00:00', '2019-01-01 07:00:00', '2019-01-01 08:00:00'],
   'time_zone': ['US/Eastern', 'US/Central', 'US/Mountain']
})

# Convert hour to datetime and localize to UTC
df['hour'] = pd.to_datetime(df['hour']).dt.tz_localize('UTC')

# Get local_hour
df['local_hour'] = df.apply(lambda row: row['hour'].tz_convert(row['time_zone']), axis=1)

# Try to get local_date from local_hour
df['local_date'] = pd.to_datetime(df['local_hour'].dt.date)
ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

The problem seems to occur only when the local_hour column contains a mix of time zones. If everything is in the same time zone, it works:

# Works: the whole column is in a single time zone
df['local_hour'] = df['hour'].dt.tz_convert('America/New_York')
df['local_hour'].dt.date

# Does not work: the column contains a mix of time zones
df['local_hour'] = df.apply(lambda row: row['hour'].tz_convert(row['time_zone']), axis=1)
df['local_hour'].dt.date

ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True
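
To illustrate the difference between the two cases, here is a minimal sketch (the variable names are made up for this example). A column converted to a single time zone keeps a timezone-aware datetime64 dtype, while a column holding timestamps in different time zones falls back to the generic object dtype, which is presumably why the datetime accessor cannot handle it.

```python
import pandas as pd

# Two UTC-aware timestamps, built the same way as in the question
utc_hours = pd.to_datetime(
    pd.Series(['2019-01-01 05:00:00', '2019-01-01 07:00:00'])
).dt.tz_localize('UTC')

# Whole column in one time zone: stays a tz-aware datetime64 column
single_tz = utc_hours.dt.tz_convert('US/Eastern')

# One time zone per element: pandas can only store this as Python objects
mixed_tz = pd.Series([
    utc_hours[0].tz_convert('US/Eastern'),
    utc_hours[1].tz_convert('US/Central'),
])

print(single_tz.dtype)  # a timezone-aware datetime64 dtype
print(mixed_tz.dtype)   # object
```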

I suggest you file an issue with the pandas team. In the meantime, you can use apply, which is essentially a loop:

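# Extract the local date and local hour-of-day from each tz-aware timestamp.
# The keys are named so they do not clash with the existing 'hour' column.
tmp = df['local_hour'].apply(lambda t: pd.Series({
    'local_date': t.date(),
    'local_hour_of_day': t.hour
}))
df = pd.concat([df, tmp], axis=1)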
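
As a possible alternative to the row-wise apply, the conversion can be done once per time zone. This is only a sketch, and it assumes it is acceptable to store the local times as naive wall-clock values (without time zone information), so that the column keeps a regular datetime64 dtype. The helper `to_local_naive` is a hypothetical name introduced for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'hour': ['2019-01-01 05:00:00', '2019-01-01 07:00:00', '2019-01-01 08:00:00'],
    'time_zone': ['US/Eastern', 'US/Central', 'US/Mountain'],
})
df['hour'] = pd.to_datetime(df['hour']).dt.tz_localize('UTC')


def to_local_naive(hours: pd.Series, tz: str) -> pd.Series:
    """Hypothetical helper: convert UTC-aware timestamps to `tz`, then drop the tz info."""
    return hours.dt.tz_convert(tz).dt.tz_localize(None)


# Convert each time-zone group in one vectorized call, then stitch the pieces together.
parts = [to_local_naive(group['hour'], tz) for tz, group in df.groupby('time_zone')]

# Assignment aligns on the index, so rows end up in their original positions.
df['local_hour'] = pd.concat(parts)

# The column is naive datetime64, so the .dt accessor works again.
df['local_date'] = df['local_hour'].dt.normalize()
print(df[['time_zone', 'local_hour', 'local_date']])
```

The trade-off is that local_hour no longer records which time zone each value belongs to; that information remains in the time_zone column.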

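The following solution worked for me:

df['local_date'] = pd.to_datetime(df['hour'], infer_datetime_format=True, utc=True)
# The result is already tz-aware (UTC), so use tz_convert; tz_localize would raise a TypeError here
df['local_date'] = df['local_date'].dt.tz_convert('Europe/Amsterdam')
These calls can be chained, but they are kept separate here for readability.

  • utc=True: returns UTC-aware datetimes, which can then be converted to another time zone with tz_convert()
  • infer_datetime_format=True: another useful argument that tries to infer the format of the datetime strings (deprecated since pandas 2.0, where format inference is the default behavior)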
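
To make the role of utc=True concrete, here is a small sketch with made-up sample data. It shows that parsing with utc=True yields tz-aware values, that tz_convert moves them to another time zone, and that tz_localize is rejected on values that already carry a time zone.

```python
import pandas as pd

hours = pd.Series(['2019-01-01 05:00:00', '2019-01-01 07:00:00'])

# utc=True yields tz-aware (UTC) timestamps
aware_utc = pd.to_datetime(hours, utc=True)

# tz_convert changes the time zone of already-aware timestamps
amsterdam = aware_utc.dt.tz_convert('Europe/Amsterdam')

# tz_localize is meant for naive timestamps and raises on aware ones
try:
    aware_utc.dt.tz_localize('Europe/Amsterdam')
    localize_error = None
except TypeError as exc:
    localize_error = exc

print(amsterdam.dt.date.tolist())
print(type(localize_error).__name__)
```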