Python Pandas - Dropping rows from a DataFrame based on a previously obtained subset


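I'm running Python 2.7 with the Pandas 0.11.0 library installed.

I've been looking around and haven't found an answer to this question, so I'm hoping someone more experienced than I am can suggest a solution.

Let's say my data (in df1) looks like this: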

df1=

  zip  x  y  access
  123  1  1    4
  123  1  1    6
  133  1  2    3
  145  2  2    3
  167  3  1    1
  167  3  1    2

For example, using

df2=df1[df1['zip']==123]

and then

df2=df2.join(df1[df1['zip']==133])

I get the following subset of the data:

df2=

 zip  x  y  access
 123  1  1    4
 123  1  1    6
 133  1  2    3

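What I want to do is either:

1) Remove the rows from df1 as they are defined/joined into df2

or

2) After df2 has been created, remove from df1 the rows that df2 is composed of (a difference?)

I hope all of that makes sense. Please let me know if more information is needed.

EDIT:

Ideally, a third DataFrame would be created that looks like this:
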
df2=

 zip  x  y  access
 145  2  2    3
 167  3  1    1
 167  3  1    2

That is, everything from df1 that is not in df2. Thanks!

Two options come to mind. First, using isin and a boolean mask:
>>> df
   zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2
>>> keep = [123, 133]
>>> df_yes = df[df['zip'].isin(keep)]
>>> df_no = df[~df['zip'].isin(keep)]
>>> df_yes
   zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3
>>> df_no
   zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2
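
If df2 already exists, the same idea can be applied to it directly instead of to a hand-written keep list. The following is a minimal sketch, not tested against Pandas 0.11.0. It assumes that df2 was built by selecting rows from df1, so it still carries df1's index labels, and it builds df2 with pd.concat (a substitute for the join call in the question, since join combines columns rather than stacking rows). The names df3_by_index and df3_by_value are made up for illustration.

```python
import pandas as pd

# Sample data from the question.
df1 = pd.DataFrame({
    "zip": [123, 123, 133, 145, 167, 167],
    "x": [1, 1, 1, 2, 3, 3],
    "y": [1, 1, 2, 2, 1, 1],
    "access": [4, 6, 3, 3, 1, 2],
})

# Assumption: df2 is a row-wise subset of df1, so it keeps df1's index labels.
# pd.concat stacks the two selections on top of each other.
df2 = pd.concat([df1[df1["zip"] == 123], df1[df1["zip"] == 133]])

# Option A: drop the rows of df1 whose index labels appear in df2.
df3_by_index = df1.drop(df2.index)

# Option B: mask out the rows whose zip value appears anywhere in df2.
df3_by_value = df1[~df1["zip"].isin(df2["zip"])]

print(df3_by_index)
```

Option A only works while df2 keeps the original index labels. If the index has been reset along the way, the value-based mask in Option B is the safer of the two.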

Second, using groupby:
>>> grouped = df.groupby(df['zip'].isin(keep))

and then any of the following:
>>> grouped.get_group(True)
   zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3
>>> grouped.get_group(False)
   zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2
>>> [g for k,g in list(grouped)]
[   zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2,    zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3]
>>> dict(list(grouped))
{False:    zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2, True:    zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3}
>>> dict(list(grouped)).values()
[   zip  x  y  access
3  145  2  2       3
4  167  3  1       1
5  167  3  1       2,    zip  x  y  access
0  123  1  1       4
1  123  1  1       6
2  133  1  2       3]

Which of these fits best depends on the context, but I think you get the idea.
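
The session above was run under Python 2, where dict.values() returns a list. Under Python 3 it returns a view instead, so a dict comprehension may be a more convenient way to hold both halves. This is a sketch under that assumption, and the names parts, df_yes and df_no are illustrative only.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "zip": [123, 123, 133, 145, 167, 167],
    "x": [1, 1, 1, 2, 3, 3],
    "y": [1, 1, 2, 2, 1, 1],
    "access": [4, 6, 3, 3, 1, 2],
})
keep = [123, 133]

# Group on the boolean mask: one group for True, one for False.
grouped = df.groupby(df["zip"].isin(keep))

# Collect both groups in a dict keyed by the mask value.
parts = {key: group for key, group in grouped}

df_yes = parts[True]    # rows whose zip is in keep
df_no = parts[False]    # all remaining rows

print(df_no)
```

This splits the frame in a single pass, which can be handy when both halves are needed. When only the remainder matters, the plain mask with ~ is shorter.
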
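I'm not sure what output you want. Do you just want to split the DataFrame into two new DataFrames, one made up of the rows whose zip column is 123 or 133 and the other made up of the remaining rows?

@DSM I edited the question; what I'm looking for is at the bottom. Thanks.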