Python pandas: remove rows whose labels are not included

python pandas dataframe

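I have two DataFrames (A and B) with overlapping indexes. I want to remove the rows of DataFrame B whose index values do not exist in DataFrame A.

I have looked at the pandas drop method for DataFrames, but it removes the rows with the given labels, whereas I want to remove the rows that do not have the given labels.

So far, I have managed to do it like this: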

B.drop(B.drop(A.index).index)

But this is clearly not the best way to do it (it is neither efficient nor readable). Is there a better way?

Example:

DataFrame A:

   index       col1
     1       some_data
     2       some_data
     3       some_data
     4       some_data

DataFrame B:

   index       col2
     1       other_data
     2       other_data
     3       other_data
     4       other_data
     5       other_data
     6       other_data

I would like to get DataFrame B':

   index       col2
     1       other_data
     2       other_data
     3       other_data
     4       other_data
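To make the example concrete, here is a minimal sketch that rebuilds A and B as described in the question and applies the asker's nested drop. The constructor calls and the variable name B_prime are assumptions made for illustration; only the final expression comes from the question.

```python
import pandas as pd

# Rebuild the example frames (constructor details are assumed, not from the question).
A = pd.DataFrame({"col1": ["some_data"] * 4},
                 index=pd.Index([1, 2, 3, 4], name="index"))
B = pd.DataFrame({"col2": ["other_data"] * 6},
                 index=pd.Index([1, 2, 3, 4, 5, 6], name="index"))

# The inner drop removes the labels found in A, leaving rows 5 and 6.
# The outer drop then removes those leftover rows from B.
B_prime = B.drop(B.drop(A.index).index)
print(B_prime)
```

This works here because every label in A also exists in B; drop raises a KeyError by default when asked to remove a label that is not present.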

I think you can use:

You can use difference to find the row labels that are not in the other df's index:

In [6]:
df2.drop(df2.index.difference(df1.index))

Out[6]:
             col2
index            
1      other_data
2      other_data
3      other_data
4      other_data
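A self-contained version of the session above might look like the following sketch. The frames df1 and df2 are assumed to match A and B from the question, and the name extra is introduced here only for readability.

```python
import pandas as pd

# df1 and df2 are assumed to mirror A and B from the question.
df1 = pd.DataFrame({"col1": ["some_data"] * 4},
                   index=pd.Index([1, 2, 3, 4], name="index"))
df2 = pd.DataFrame({"col2": ["other_data"] * 6},
                   index=pd.Index([1, 2, 3, 4, 5, 6], name="index"))

# Labels present in df2 but absent from df1.
extra = df2.index.difference(df1.index)

# Dropping exactly those labels keeps only the rows shared with df1.
result = df2.drop(extra)
print(result)
```

Because extra only ever contains labels taken from df2's own index, the drop call cannot fail on a missing label, even when df1 has labels that df2 lacks.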

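A.index gives you the index of the rows you want, and .loc then lets you select that data. In the output above I got a NaN. I am using version 0.13.1, so this may be somewhat out of sync with the latest version, 0.18.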

ix[some index]: lets you subset a DataFrame by its index
DataFrame.index.intersection(some index): returns the intersection of the indexes
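The two pieces above can be combined as in this sketch. It uses .loc rather than ix, since ix was removed in pandas 1.0; the frames and the names common and B_prime are assumptions for illustration.

```python
import pandas as pd

# Example frames, assumed to match A and B from the question.
A = pd.DataFrame({"col1": ["some_data"] * 4}, index=[1, 2, 3, 4])
B = pd.DataFrame({"col2": ["other_data"] * 6}, index=[1, 2, 3, 4, 5, 6])

# Labels that appear in both indexes.
common = B.index.intersection(A.index)

# Select only those rows of B by label.
B_prime = B.loc[common]
print(B_prime)
```

Selecting by the intersection reads as a positive statement of which rows to keep, which may be easier to follow than dropping the complement.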

Setup

Solution

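The only problem here is that if A's index contains row labels that are not in B, this will raise a KeyError.

I did not get an error here, just NaN in those cases. I expected this to raise a KeyError, so perhaps the behaviour has changed.

@EdChum A KeyError is raised only if none of A's index labels are present in B; otherwise, as long as at least one label matches, no error is raised, and the non-matching labels return NaN.

@Abbas, I guess that makes sense, since there is not a single matching label.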
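Since the comments show that label selection with labels missing from B has behaved differently across pandas versions, one way to avoid depending on that behaviour is sketched below. The frames are hypothetical: A is given a label 7 that B does not have. reindex is used to get the NaN-filled result explicitly, and the intersection is used to keep only the labels that really exist in B.

```python
import pandas as pd

# Hypothetical frames: label 7 exists in A but not in B.
A = pd.DataFrame({"col1": ["some_data"] * 3}, index=[1, 2, 7])
B = pd.DataFrame({"col2": ["other_data"] * 6}, index=[1, 2, 3, 4, 5, 6])

# reindex never raises for missing labels; the row for 7 is filled with NaN.
with_nan = B.reindex(A.index)

# Restricting to the shared labels first yields only rows present in B.
shared_only = B.loc[B.index.intersection(A.index)]

print(with_nan)
print(shared_only)
```

Which of the two is appropriate depends on whether the missing labels should appear as empty rows or be left out entirely.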