Python: subsetting the rows of a DataFrame based on its index

python pandas indexing

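I have two DataFrames with the same kind of index (userid), but neither is a subset of the other. I want to drop from the smaller one every row whose index does not appear in the larger one. I was under the impression that this is what loc is meant for, but it actually adds rows: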

Largedf.shape
Out[2]: (7341253, 39)

Smalldf.shape
Out[3]: (588939, 2)

Smalldf = Smalldf.loc[Largedf.index]

Smalldf.shape
Out[5]: (7341253, 2)

Largedf.shape
Out[6]: (7341253, 39)

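Several users in Smalldf are not in Largedf, so I expected Smalldf to get smaller. Is there a better way to do this? Note that the rows are not sorted in any way; the only thing I need to keep track of is the index.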

How about isin:

Smalldf = Smalldf[Smalldf.index.isin(Largedf.index)]
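To make the isin approach concrete, here is a minimal sketch on toy data. The frames `large_df` and `small_df`, their column names, and the userid values are made up for illustration; they only stand in for Largedf and Smalldf from the question.

```python
import pandas as pd

# Hypothetical toy frames standing in for Largedf and Smalldf, indexed by userid
large_df = pd.DataFrame({"a": [1, 2, 3, 4]}, index=[10, 20, 30, 40])
small_df = pd.DataFrame({"b": [5, 6, 7]}, index=[20, 99, 40])

# Boolean mask: True where a small_df label also occurs in large_df's index
mask = small_df.index.isin(large_df.index)

# Keep only the rows of small_df whose userid exists in large_df
filtered = small_df[mask]

print(filtered.shape)  # (2, 1): userid 99 is dropped
```

Because the mask is computed label by label, the rows do not need to be sorted, and the original row order of the smaller frame is kept.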

This is not a bug; see my comment: when loc is given a list-like indexer, it behaves like reindex.
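The row count in the question matches what reindex does, which the following sketch shows on toy data. The names `small_df` and `large_index` and all values are hypothetical. The sketch calls reindex directly rather than loc because, as far as I know, pandas 1.0 and later raise a KeyError when loc is given a list of labels with any missing entries, so the behaviour described above applies to older versions.

```python
import pandas as pd

# Hypothetical toy data: small_df has one userid (99) that the large index lacks
small_df = pd.DataFrame({"b": [5, 6, 7]}, index=[20, 99, 40])
large_index = pd.Index([10, 20, 30, 40])

# reindex conforms small_df to large_index: labels missing from small_df
# become NaN rows, and labels absent from large_index (99) are dropped
reindexed = small_df.reindex(large_index)

print(reindexed.shape)              # (4, 1): one row per label in large_index
print(reindexed["b"].isna().sum())  # 2: userids 10 and 30 have no data
```

The result always has one row per label in the target index, which is why Smalldf ended up with as many rows as Largedf.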