Python: How to get only the rows *and columns* of a DataFrame that contain a given value (or set of values)


python pandas dataframe select indexing


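I want to keep only the rows and columns that contain a value from a given set of values.

I can get the rows, but I can't restrict the columns.

Suppose I have this DataFrame: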

print(df)

# year     1970  1971  1972  1973  1974  1975  1976  1977  1978
# country  
# Malawi    NaN   NaN   NaN   123   NaN   234   NaN   NaN   NaN
# OtherC    NaN   NaN   NaN   124   NaN   234   NaN   NaN   NaN
# OtherD    NaN   NaN   NaN   124   NaN   235   NaN   NaN   NaN

What I want returned is only the rows and columns that contain 123 or 234:

# year     1973  1975
# country  
# Malawi    123   234
# OtherC    124   234

I can do this, which returns only the rows with the given values but does not select columns:

print(df[df.isin([123, 234]).any(axis=1)])

# year     1970  1971  1972  1973  1974  1975  1976  1977  1978
# country  
# Malawi    NaN   NaN   NaN   123   NaN   234   NaN   NaN   NaN
# OtherC    NaN   NaN   NaN   124   NaN   234   NaN   NaN   NaN

However, when I try either of these two statements, I get an error:

print(df[df.isin([123, 234]).any(axis=1)]\
[df.isin([123, 234]).any(axis=0)])

print(df[df.isin([123, 234]).any(axis=0)]\
[df.isin([123, 234]).any(axis=1)])

...
IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

Or, with the actual dataset:

print(df[[df.isin([123, 234]).any(axis=1)],\
[df.isin([123, 234]).any(axis=0)]])

# The following output is from using the real dataset, so it
# includes more rows and columns, and the sought-after values
# are in different locations, but you get the idea:

# TypeError: '([country
# Afghanistan            False
# Albania                False
# Andorra                False
# Angola                 False
# Antigua and Barbuda    False
#                        ...  
# Uruguay                False
# Vanuatu                False
# Venezuela              False
# Vietnam                False
# Zimbabwe               False
# Length: 160, dtype: bool], [year
# 1970    False
# 1971    False
# 1972    False
# 1973    False
# 1974    False
# 1975    False
# 1976    False
# 1977    False
# 1978    False
# 1979    False
# 1980    False
# 1981    False
# 1982    False
# 1983    False
# 1984    False
# 1985    False
# 1986    False
# 1987    False
# 1988    False
# 1989    False
# 1990    False
# 1991    False
# 1992    False
# 1993    False
# 1994    False
# 1995    False
# 1996    False
# 1997    False
# 1998    False
# 1999     True
# 2000     True
# 2001    False
# 2002    False
# 2003    False
# 2004    False
# 2005    False
# 2006    False
# 2007    False
# 2008    False
# 2009    False
# 2010    False
# 2011    False
# 2012    False
# 2013    False
# 2014    False
# 2015    False
# 2016    False
# 2017    False
# 2018    False
# dtype: bool])' is an invalid key

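Take `any(axis=1)` of the `isin` mask to select the rows and `any(axis=0)` to select the columns, then pass both to a single `df.loc[rows, columns]` call. Remember to avoid chained indexing such as `df[][]` whenever possible.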
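A minimal sketch of that approach, assuming the small example frame from the question (rebuilt here by hand, so the values are floats because of the NaNs). The names `mask` and `result` are illustrative and not from the original post:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question: all NaN except 1973 and 1975.
years = list(range(1970, 1979))
df = pd.DataFrame(np.nan, index=["Malawi", "OtherC", "OtherD"], columns=years)
df.index.name = "country"
df.columns.name = "year"
df.loc["Malawi", [1973, 1975]] = [123, 234]
df.loc["OtherC", [1973, 1975]] = [124, 234]
df.loc["OtherD", [1973, 1975]] = [124, 235]

# Boolean frame: True wherever a cell holds one of the sought values.
mask = df.isin([123, 234])

# Row mask (any hit along the columns) and column mask (any hit along the rows),
# applied together in one .loc call instead of chaining df[...][...].
result = df.loc[mask.any(axis=1), mask.any(axis=0)]
print(result)
```

Because each mask is indexed like the axis it filters (`country` for rows, `year` for columns), `.loc` can align both, which the chained `df[...][...]` attempts in the question cannot do.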

Output:

year     1973  1975
country            
Malawi    123   234
OtherC    124   234


Thanks! You wouldn't believe how many different ways I tried to get this! I'm curious why chained indexing should be avoided, Quang.

See the answer below. Also, with chained indexing you may get a copy of the data, and modifying that copy may fail with a SettingWithCopyWarning.

Thanks, I'll check it out. When Python copies and when it doesn't is definitely on my radar now. It seems a bit arbitrary.
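To illustrate the copy issue mentioned in the comments, here is a small sketch on a made-up frame (the columns `a` and `b` are hypothetical, not from the question). It assumes default pandas settings, where a chained assignment only warns; the exact warning class depends on the pandas version, so it is simply silenced here:

```python
import warnings
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Chained indexing: the first [] with a boolean mask returns a new object,
# so the assignment lands on that temporary copy and df is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas normally warns about this line
    df[df["a"] > 1]["b"] = 0
after_chained = df["b"].tolist()
print(after_chained)

# A single .loc call selects rows and column together and writes into df itself.
df.loc[df["a"] > 1, "b"] = 0
after_loc = df["b"].tolist()
print(after_loc)
```

The first assignment silently changes nothing in `df`, while the `.loc` version updates it, which is the practical reason for preferring one `.loc` call over `df[][]`.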