Python: How do I filter rows in a DataFrame that meet a specified condition and put them into a new DataFrame? (python, pandas)


The data in test.csv looks like this:

staff_id,clock_time,device_id,latitude,longitude
1001,2020/9/14 04:43:00,d_1,24.59652556,118.0824644
1001,2020/9/14 05:34:40,d_1,24.59732974,118.0859631
1001,2020/9/14 06:33:34,d_1,24.73208312,118.0957197
1001,2020/9/14 08:17:29,d_1,24.59222786,118.0955275
1001,2020/9/20 05:30:56,d_1,24.59689407,118.2863806
1001,2020/9/20 07:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 08:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 09:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 17:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 19:26:05,d_1,24.70237852,118.2858955
1001,2020/9/20 22:26:05,d_1,24.71237852,118.2858955
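I want to find every row where the latitude or longitude differs by more than 0.1 between two consecutive rows, and then put the result into a new DataFrame.

In my example, rows 2, 3, 4, 9, and 10 have a latitude difference greater than 0.1, and rows 4 and 5 have a longitude difference greater than 0.1.

I want the new DataFrame to look like this: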

staff_id,clock_time,device_id,latitude,longitude
1001,2020/9/14 05:34:40,d_1,24.59732974,118.0859631
1001,2020/9/14 06:33:34,d_1,24.73208312,118.0957197
1001,2020/9/14 08:17:29,d_1,24.59222786,118.0955275
1001,2020/9/20 05:30:56,d_1,24.59689407,118.2863806
1001,2020/9/20 17:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 19:26:05,d_1,24.70237852,118.2858955
My code:

import pandas as pd

df = pd.read_csv(r'E:/test.csv', encoding='utf-8', parse_dates=[1])
m1 = df[['latitude', 'longitude']].diff().abs().gt(0.1)
m2 = df[['latitude', 'longitude']].shift().diff().abs().gt(0.1)
new_dataframe = [...]
How can I do this?

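Use DataFrame.any to convert the boolean DataFrame into a Series, and Series.shift to build the shifted mask. Chain the two masks with | for a bitwise OR. Finally, add DataFrame.copy to avoid a warning if new_dataframe will be processed in some way after filtering.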

m1 = df[['latitude', 'longitude']].diff().abs().gt(0.1).any(axis=1)

new_dataframe = df[m1 | m1.shift(-1)].copy()
print (new_dataframe)
   staff_id          clock_time device_id   latitude   longitude
1      1001  2020/9/14 05:34:40       d_1  24.597330  118.085963
2      1001  2020/9/14 06:33:34       d_1  24.732083  118.095720
3      1001  2020/9/14 08:17:29       d_1  24.592228  118.095527
4      1001  2020/9/20 05:30:56       d_1  24.596894  118.286381
8      1001  2020/9/20 17:26:05       d_1  24.582379  118.285896
9      1001  2020/9/20 19:26:05       d_1  24.702379  118.285896
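
For anyone who wants to try this without the CSV file, here is a minimal self-contained sketch of the same idea. It reads the sample rows from an in-memory string instead of E:/test.csv, and it passes fill_value=False to shift, which is my own small variation (not in the answer above) so that the shifted mask stays boolean instead of picking up a NaN in the last row.

```python
import io

import pandas as pd

# Sample rows copied from test.csv in the question.
CSV_TEXT = """staff_id,clock_time,device_id,latitude,longitude
1001,2020/9/14 04:43:00,d_1,24.59652556,118.0824644
1001,2020/9/14 05:34:40,d_1,24.59732974,118.0859631
1001,2020/9/14 06:33:34,d_1,24.73208312,118.0957197
1001,2020/9/14 08:17:29,d_1,24.59222786,118.0955275
1001,2020/9/20 05:30:56,d_1,24.59689407,118.2863806
1001,2020/9/20 07:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 08:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 09:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 17:26:05,d_1,24.58237852,118.2858955
1001,2020/9/20 19:26:05,d_1,24.70237852,118.2858955
1001,2020/9/20 22:26:05,d_1,24.71237852,118.2858955
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# True where latitude or longitude differs from the previous row by more than 0.1.
m1 = df[["latitude", "longitude"]].diff().abs().gt(0.1).any(axis=1)

# Shift the mask up by one row so the earlier row of each pair is selected too.
# fill_value=False is an assumption added here; the answer uses plain shift(-1).
m2 = m1.shift(-1, fill_value=False)

# Keep a row if it is either side of a large jump; copy() gives an independent frame.
new_dataframe = df[m1 | m2].copy()
print(new_dataframe)
```

The selected index labels should match the rows printed in the answer above.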

Can I use new_dataframe = df[m1 | m1.shift(-1)]?
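
Going by the note in the answer, the filter itself selects the same rows with or without .copy(); the copy only matters if new_dataframe is modified afterwards. As far as I know, older pandas versions without copy-on-write may emit a SettingWithCopyWarning in that case. The sketch below uses a tiny made-up frame and a hypothetical "flagged" column (neither is from the question) to show that the copied result can be changed without touching df.

```python
import pandas as pd

# Tiny made-up frame, not the data from the question.
df = pd.DataFrame(
    {
        "latitude": [24.59, 24.73, 24.59],
        "longitude": [118.08, 118.09, 118.09],
    }
)

m1 = df[["latitude", "longitude"]].diff().abs().gt(0.1).any(axis=1)
mask = m1 | m1.shift(-1, fill_value=False)

# With .copy(), new_dataframe is an independent object.
new_dataframe = df[mask].copy()

# Hypothetical follow-up processing: add a column to the filtered result only.
new_dataframe["flagged"] = True

print("flagged" in new_dataframe.columns)
print("flagged" in df.columns)
```

If the filtered frame is only printed or saved and never modified, leaving out .copy() should make no visible difference.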