Python: best way to compare the values of a given column two by two

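To create a new column, I want to compare the values of a given column in a DataFrame two by two (each value with the previous one).

My input df looks like this: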

            timestamp  charging
0 2017-10-15 18:36:46         1
1 2017-10-15 18:41:54         1
2 2017-10-15 18:46:54         1
3 2017-10-15 18:50:35         1
4 2017-10-15 18:54:14        -1
5 2017-10-15 18:57:54        -1
6 2017-10-15 19:02:47        -1
7 2017-10-15 19:11:41         1
8 2017-10-15 19:21:25         1
9 2017-10-15 19:31:04        -1
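I want to create a new column that holds the row's timestamp only where the charging value changes from positive to negative or from negative to positive. The output should be: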

            timestamp  charging period start/end time
0 2017-10-15 18:36:46         1                   NaT
1 2017-10-15 18:41:54         1                   NaT
2 2017-10-15 18:46:54         1                   NaT
3 2017-10-15 18:50:35         1   2017-10-15 18:50:35
4 2017-10-15 18:54:14        -1   2017-10-15 18:54:14
5 2017-10-15 18:57:54        -1                   NaT
6 2017-10-15 19:02:47        -1   2017-10-15 19:02:47
7 2017-10-15 19:11:41         1   2017-10-15 19:11:41
8 2017-10-15 19:21:25         1   2017-10-15 19:21:25
9 2017-10-15 19:31:04        -1   2017-10-15 19:31:04
My current approach is not good (although it works), using the following code:

df['period start/end time'] = pd.NaT

for ind in df.index:
    if ind > 0:
        # negative -> positive: mark the rows on both sides of the change
        if df.at[ind, 'charging'] > 0 and df.at[ind-1, 'charging'] < 0:
            df.at[ind-1, 'period start/end time'] = df.at[ind-1, 'timestamp']
            df.at[ind, 'period start/end time'] = df.at[ind, 'timestamp']

        # positive -> negative: mark the rows on both sides of the change
        if df.at[ind, 'charging'] < 0 and df.at[ind-1, 'charging'] > 0:
            df.at[ind-1, 'period start/end time'] = df.at[ind-1, 'timestamp']
            df.at[ind, 'period start/end time'] = df.at[ind, 'timestamp']
This is far too slow! Is there a faster and better way to do this?

IIUC

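# True where charging differs from the previous row (row 0 is compared with itself)
mask = df.charging != df.charging.shift().bfill()
# Also flag the row just before each change, then copy the timestamp into 'new'
df.loc[mask | mask.shift(-1, fill_value=False), 'new'] = df.timestamp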

    timestamp             charging  new
0   2017-10-15 18:36:46   1         NaT
1   2017-10-15 18:41:54   1         NaT
2   2017-10-15 18:46:54   1         NaT
3   2017-10-15 18:50:35   1         2017-10-15 18:50:35
4   2017-10-15 18:54:14  -1         2017-10-15 18:54:14
5   2017-10-15 18:57:54  -1         NaT
6   2017-10-15 19:02:47  -1         2017-10-15 19:02:47
7   2017-10-15 19:11:41   1         2017-10-15 19:11:41
8   2017-10-15 19:21:25   1         2017-10-15 19:21:25
9   2017-10-15 19:31:04  -1         2017-10-15 19:31:04
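
To see why the mask picks out both sides of every change, it can help to lay the intermediate steps side by side. The following is only a sketch on the charging values from the question; the column names `previous`, `changed`, `before_change` and `flagged` are made up for illustration, and `shift(-1, fill_value=False)` assumes a reasonably recent pandas.

```python
import pandas as pd

# charging values copied from the question
charging = pd.Series([1, 1, 1, 1, -1, -1, -1, 1, 1, -1], name="charging")

# Value of the previous row; row 0 has no predecessor, so bfill() makes it
# compare against itself and never count as a change
previous = charging.shift().bfill()

# True on the first row AFTER each change
changed = charging != previous

# Shifting the mask up by one marks the last row BEFORE each change
before_change = changed.shift(-1, fill_value=False)

# Illustrative table with one column per step (names are hypothetical)
steps = pd.DataFrame({
    "charging": charging,
    "previous": previous,
    "changed": changed,
    "before_change": before_change,
    "flagged": changed | before_change,
})
print(steps)

# flagged rows: 3, 4, 6, 7, 8, 9
flagged_rows = steps.index[steps["flagged"]].tolist()
print(flagged_rows)
```

The `flagged` column is what gets passed to `df.loc` to decide which rows receive a timestamp.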
IIUC

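Create the mask:

delta = df.charging.diff()
# fillna(0) keeps the NaN introduced by shift(-1) from flagging the last row
condition = delta.bfill().ne(0) | delta.shift(-1).fillna(0).ne(0)

Use np.where:

df['new'] = np.where(condition, df.timestamp, pd.NaT)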

             timestamp  charging                  new
0  2017-10-15 18:36:46         1                  NaT
1  2017-10-15 18:41:54         1                  NaT
2  2017-10-15 18:46:54         1                  NaT
3  2017-10-15 18:50:35         1  2017-10-15 18:50:35
4  2017-10-15 18:54:14        -1  2017-10-15 18:54:14
5  2017-10-15 18:57:54        -1                  NaT
6  2017-10-15 19:02:47        -1  2017-10-15 19:02:47
7  2017-10-15 19:11:41         1  2017-10-15 19:11:41
8  2017-10-15 19:21:25         1  2017-10-15 19:21:25
9  2017-10-15 19:31:04        -1  2017-10-15 19:31:04
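
One way to gain confidence in a diff-based mask is to compare it against the row-by-row loop from the question on a small frame. The sketch below does that under a few assumptions: `mark_transitions` and `mark_transitions_loop` are hypothetical helper names, the frame has a default integer index, and `timestamp` is a real datetime column. It uses `Series.where` rather than `np.where`, a swap made here because `Series.where` keeps the datetime dtype and fills `NaT` on its own. The test data deliberately ends without a change, so the last row should stay empty.

```python
import pandas as pd


def mark_transitions(df):
    """Hypothetical helper: timestamps on both sides of each change, NaT elsewhere."""
    delta = df["charging"].diff()
    # diff() is NaN on the first row and, after shift(-1), on the last row.
    # Both are treated as "no change" so edge rows are not flagged by accident.
    after_change = delta.fillna(0).ne(0)
    before_change = delta.shift(-1).fillna(0).ne(0)
    return df["timestamp"].where(after_change | before_change)


def mark_transitions_loop(df):
    """Reference version: the loop from the question (assumes a 0..n-1 index)."""
    out = pd.Series(pd.NaT, index=df.index, dtype=df["timestamp"].dtype)
    for ind in df.index[1:]:
        prev, cur = df.at[ind - 1, "charging"], df.at[ind, "charging"]
        if (cur > 0 and prev < 0) or (cur < 0 and prev > 0):
            out.at[ind - 1] = df.at[ind - 1, "timestamp"]
            out.at[ind] = df.at[ind, "timestamp"]
    return out


# Made-up sample: two sign changes, and no change on the last row
sample = pd.DataFrame({
    "timestamp": pd.date_range("2017-10-15 18:00:00", periods=6, freq="5min"),
    "charging": [1, 1, -1, -1, 1, 1],
})

fast = mark_transitions(sample)
slow = mark_transitions_loop(sample)

# Raises AssertionError if the two approaches disagree
pd.testing.assert_series_equal(fast, slow, check_names=False)
print(fast)
```

Running the same comparison on a slice of the real data is a cheap check before dropping the loop entirely.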

Shouldn't row 8 also have a timestamp?

Yes, my mistake.

It took me a while to figure out the logic, but it's pretty clever! Thx :)