Python: how to get only the rows *and columns* of a DataFrame that contain a given value (or set of values)
I want to find only the rows and columns that contain a value from a given set. I can get the rows, but I can't restrict the columns. Suppose I have this DataFrame:
print(df)
# year 1970 1971 1972 1973 1974 1975 1976 1977 1978
# country
# Malawi NaN NaN NaN 123 NaN 234 NaN NaN NaN
# OtherC NaN NaN NaN 124 NaN 234 NaN NaN NaN
# OtherD NaN NaN NaN 124 NaN 235 NaN NaN NaN
What I want returned is only the rows and columns that contain 123 or 234:
# year 1973 1975
# country
# Malawi 123 234
# OtherC 124 234
I can do the following, which returns only the rows that contain a given value but does not restrict the columns:
print(df[df.isin([123, 234]).any(axis=1)])
# year 1970 1971 1972 1973 1974 1975 1976 1977 1978
# country
# Malawi NaN NaN NaN 123 NaN 234 NaN NaN NaN
# OtherC NaN NaN NaN 124 NaN 234 NaN NaN NaN
However, when I try either of these two statements, I get an error:
print(df[df.isin([123, 234]).any(axis=1)]\
[df.isin([123, 234]).any(axis=0)])
print(df[df.isin([123, 234]).any(axis=0)]\
[df.isin([123, 234]).any(axis=1)])
...
IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
Or, with this variant run against the actual dataset:
print(df[[df.isin([123, 234]).any(axis=1)],\
[df.isin([123, 234]).any(axis=0)]])
# The following output is from using the real dataset, so it
# includes more rows and columns, and the sought-after values
# are in different locations, but you get the idea:
# TypeError: '([country
# Afghanistan False
# Albania False
# Andorra False
# Angola False
# Antigua and Barbuda False
# ...
# Uruguay False
# Vanuatu False
# Venezuela False
# Vietnam False
# Zimbabwe False
# Length: 160, dtype: bool], [year
# 1970 False
# 1971 False
# 1972 False
# 1973 False
# 1974 False
# 1975 False
# 1976 False
# 1977 False
# 1978 False
# 1979 False
# 1980 False
# 1981 False
# 1982 False
# 1983 False
# 1984 False
# 1985 False
# 1986 False
# 1987 False
# 1988 False
# 1989 False
# 1990 False
# 1991 False
# 1992 False
# 1993 False
# 1994 False
# 1995 False
# 1996 False
# 1997 False
# 1998 False
# 1999 True
# 2000 True
# 2001 False
# 2002 False
# 2003 False
# 2004 False
# 2005 False
# 2006 False
# 2007 False
# 2008 False
# 2009 False
# 2010 False
# 2011 False
# 2012 False
# 2013 False
# 2014 False
# 2015 False
# 2016 False
# 2017 False
# 2018 False
# dtype: bool])' is an invalid key
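Use any(axis=1) to select the rows and any(axis=0) to select the columns, and pass both masks to a single indexing call. Remember to avoid chained indexing, such as df[][], wherever possible.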
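The code for this answer did not survive in the page, so the following is a minimal sketch of the approach it describes rather than the answerer's exact code. It assumes the example frame from the question (rebuilt here by hand from the printed values) and passes the row mask and the column mask to one `.loc` call; the name `mask` is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the small example frame from the question (assumed from its printout).
df = pd.DataFrame(
    np.nan,
    index=pd.Index(["Malawi", "OtherC", "OtherD"], name="country"),
    columns=pd.Index(range(1970, 1979), name="year"),
)
df.loc["Malawi", [1973, 1975]] = [123, 234]
df.loc["OtherC", [1973, 1975]] = [124, 234]
df.loc["OtherD", [1973, 1975]] = [124, 235]

# Boolean frame: True wherever a cell holds one of the sought values.
mask = df.isin([123, 234])

# One .loc call: rows with any match (axis=1), columns with any match (axis=0).
result = df.loc[mask.any(axis=1), mask.any(axis=0)]
print(result)
```

Because the frame contains NaN, pandas stores the values as floats, so the result may print as 123.0 rather than 123.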
Output:
year 1973 1975
country
Malawi 123 234
OtherC 124 234
Comments:
Thanks! You wouldn't believe how many different ways I tried to get this! I'm curious why chained indexing should be avoided, Quang.
Take a look at the answer below. Also, with chained indexing you may get a copy of the data, and modifying that copy may fail with a SettingWithCopyWarning.
Thanks, I'll check it out. When Python copies and when it doesn't is definitely on my radar now. It seems a bit arbitrary.
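The point about copies can be illustrated with a small sketch. It uses a hypothetical toy frame (not the question's data) and compares a chained assignment with a single `.loc` assignment. Which warning pandas emits for the chained form depends on the pandas version, so the warning is simply silenced here.

```python
import warnings

import pandas as pd

# Hypothetical toy frame, used only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
rows = df["a"] > 1

# Chained indexing: df[rows] builds a new object, so the assignment lands on
# that temporary object and df itself is left unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df[rows]["b"] = 0
chained_b = df["b"].tolist()

# Single .loc call: rows and column are selected in one step, so df is modified.
df.loc[rows, "b"] = 0
loc_b = df["b"].tolist()

print(chained_b)
print(loc_b)
```

The first list should still hold the original values, while the second should show the zeros written by `.loc`. That is one practical reason to prefer a single indexing call.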