Python: How do I add a timedelta to a DataFrame?
I have a DataFrame df2 and, because of a time difference, I want to create a new time column. The CSV file the DataFrame comes from looks like this:
ip date time zone
162.93.65.ggf 2014-03-06 2014-03-06 00:00:14 0
162.93.65.ggf 2014-07-10 2014-07-10 00:00:28 500
162.93.65.ggf 2013-11-21 2013-11-21 00:00:45 500
162.93.65.ggf 2014-02-22 2014-02-22 00:00:51 0
162.93.65.ggf 2014-03-06 2014-03-06 00:01:05 0
162.93.65.ggf 2013-11-21 2013-11-21 00:01:06 0
162.93.65.ggf 2014-02-22 2014-02-22 00:01:11 400
162.93.65.ggf 2014-03-06 2014-03-06 00:01:13 400
162.93.65.ggf 2013-11-21 2013-11-21 00:01:32 400
162.93.65.ggf 2014-03-06 2014-03-06 00:01:58 0
162.93.65.ggf 2013-11-21 2013-11-21 00:02:10 0
...
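The zone column contains the values 0, 400 and 500, which means that 0, 4 or 5 hours must be added to the datetime in the time column. After that, hours and minutes must be added depending on which time zone the IP address comes from.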
My code:
df2 = pd.read_csv("file.csv", parse_dates=True)
df2['time'] = pd.to_datetime(df2['time'])
df2['zone2']= df2['zone'].astype(str).str[0]
df2['new_time']= df2['time']+ timedelta(hours=df2['zone2'])
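The hours and minutes are adjusted from another CSV file, which is too complicated to describe here; they can be set to zero. The problem is in hours=hours+df2['zone2'], but I don't know how to solve it.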
My expected output is:
ip date time zone new_time
162.93.65.ggf 2014-03-06 2014-03-06 00:00:14 0 2014-03-06 00:00:14
162.93.65.ggf 2014-07-10 2014-07-10 00:00:28 500 2014-07-10 05:00:28
162.93.65.ggf 2013-11-21 2013-11-21 00:00:45 500 2013-11-21 05:00:45
162.93.65.ggf 2014-02-22 2014-02-22 00:00:51 0 ...
162.93.65.ggf 2014-03-06 2014-03-06 00:01:05 0
162.93.65.ggf 2013-11-21 2013-11-21 00:01:06 0
162.93.65.ggf 2014-02-22 2014-02-22 00:01:11 400
162.93.65.ggf 2014-03-06 2014-03-06 00:01:13 400
162.93.65.ggf 2013-11-21 2013-11-21 00:01:32 400
162.93.65.ggf 2014-03-06 2014-03-06 00:01:58 0
162.93.65.ggf 2013-11-21 2013-11-21 00:02:10 0
...
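Assuming the second and later digits of zone can safely be ignored, use the first digit as the number of hours to add to time. Output: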
ip date time zone new_time
0 162.93.65.ggf 2014-03-06 2014-03-06 00:00:14 0 2014-03-06 00:00:14
1 162.93.65.ggf 2014-07-10 2014-07-10 00:00:28 500 2014-07-10 05:00:28
2 162.93.65.ggf 2013-11-21 2013-11-21 00:00:45 500 2013-11-21 05:00:45
3 162.93.65.ggf 2014-02-22 2014-02-22 00:00:51 0 2014-02-22 00:00:51
4 162.93.65.ggf 2014-03-06 2014-03-06 00:01:05 0 2014-03-06 00:01:05
5 162.93.65.ggf 2013-11-21 2013-11-21 00:01:06 0 2013-11-21 00:01:06
6 162.93.65.ggf 2014-02-22 2014-02-22 00:01:11 400 2014-02-22 04:01:11
7 162.93.65.ggf 2014-03-06 2014-03-06 00:01:13 400 2014-03-06 04:01:13
8 162.93.65.ggf 2013-11-21 2013-11-21 00:01:32 400 2013-11-21 04:01:32
9 162.93.65.ggf 2014-03-06 2014-03-06 00:01:58 0 2014-03-06 00:01:58
10 162.93.65.ggf 2013-11-21 2013-11-21 00:02:10 0 2013-11-21 00:02:10
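One way to get this result is sketched below. It is a minimal example, not necessarily the code the answer originally used. It assumes that zone is an integer column and that its first digit is the number of hours to add. The small DataFrame is rebuilt by hand from three of the rows above instead of being read from file.csv. The line in the question most likely fails because datetime.timedelta expects a single number for hours, not a whole Series; pd.to_timedelta accepts a Series and converts it element by element.

```python
import pandas as pd

# Three sample rows taken from the question (built by hand, not read from file.csv).
df2 = pd.DataFrame({
    "ip": ["162.93.65.ggf"] * 3,
    "time": pd.to_datetime([
        "2014-03-06 00:00:14",
        "2014-07-10 00:00:28",
        "2014-02-22 00:01:11",
    ]),
    "zone": [0, 500, 400],
})

# Assumption: only the first digit of zone matters (0 -> 0 h, 500 -> 5 h, 400 -> 4 h).
hours = df2["zone"].astype(str).str[0].astype(int)

# pd.to_timedelta turns the whole Series into timedeltas, so the addition is vectorized.
df2["new_time"] = df2["time"] + pd.to_timedelta(hours, unit="h")

print(df2)
```

If zone is always a multiple of 100, df2["zone"] // 100 would give the same hours without the string conversion. That is also an assumption about the data.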
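Comments:
What is the time in your code?
time is the column in the CSV file I posted. You can ignore the others; I will edit it.
What exactly are you adding? Just minutes? So the second row would change from 00:00:28 to 00:00:32?
Show your expected output.
So the first character of zone is the number of hours that must be added to df['time'], and the remaining characters can safely be ignored?
I edited the first three rows of new_time.
I was about to post my answer, but realized it is the same as yours!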