Python pandas DataFrame (selection)

python pandas

I have a simple question about a DataFrame.

I create a DataFrame by reading a CSV file and then print it:

    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 176 entries, 0 to 175
    Data columns (total 8 columns):
    ID            176  non-null values
    study         176  non-null values
    center        176  non-null values
    initials      176  non-null values
    age           147  non-null values
    sex           133  non-null values
    lesion age    35  non-null values
    group         35  non-null values
    dtypes: float64(2), int64(1), object(5)

Error message:

    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Thank you very much in advance.

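Use & rather than and:

    SUBJECTS[(SUBJECTS.study=='NO2') & (SUBJECTS.center=='Hermann')]

The and keyword makes Python evaluate (SUBJECTS.study=='NO2') and (SUBJECTS.center=='Hermann') in a boolean context (as either True or False).

In your case, you don't want either one to be evaluated as a single boolean. Instead, you want the element-wise logical and. That is specified by &, not by and.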
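As a minimal sketch of that selection: the rows below are invented purely for illustration, and only the column names study and center and the values 'NO2' and 'Hermann' come from the question.

```python
import pandas as pd

# Hypothetical stand-in for the SUBJECTS frame read from the CSV file;
# the rows are made up for illustration.
SUBJECTS = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "study": ["NO2", "NO2", "OTHER", "NO2"],
    "center": ["Hermann", "Elsewhere", "Hermann", "Hermann"],
})

# Each comparison yields a boolean Series; & combines them element by element.
# The parentheses matter because & binds more tightly than ==.
mask = (SUBJECTS.study == "NO2") & (SUBJECTS.center == "Hermann")
selected = SUBJECTS[mask]
print(selected)
```

Only the rows where both conditions hold are kept; the original index labels are preserved.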

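The error

    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

is raised whenever you try to evaluate a NumPy array or a pandas frame in a boolean context. Consider

    bool(np.array([True, False]))

Some users might expect this to return True, since the array is non-empty. Others might expect True because at least one element in the array is True. Still others might expect False, because not all of the elements in the array are True. Since there are multiple equally valid expectations for what the boolean context should return, the designers of NumPy and pandas decided to force the user to be explicit: use .all(), .any(), or len().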
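The three explicit alternatives can be sketched side by side. The variable names here are only illustrative.

```python
import numpy as np

arr = np.array([True, False])

# Asking for the truth value of a multi-element array raises ValueError.
try:
    bool(arr)
    raised = False
except ValueError:
    raised = True

any_true = arr.any()       # is at least one element True?
all_true = arr.all()       # is every element True?
non_empty = len(arr) > 0   # does the array contain any elements?
print(raised, any_true, all_true, non_empty)
```

Each of the three calls answers a different question, which is why the library leaves the choice to you.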

Welcome. The error comes from how numpy works under the hood of pandas. Consider the following examples (numpy is imported as np and pandas as pd):

    In [158]:
    a=np.array([1,2,1,1,1,1,2])
    b=np.array([1,1,1,2,2,2,1])

    In [159]:
    # Array Boolean operation
    a==1
    Out[159]:
    array([ True, False,  True,  True,  True,  True, False], dtype=bool)

    In [160]:
    # Array Boolean operation
    b==1
    Out[160]:
    array([ True,  True,  True, False, False, False,  True], dtype=bool)

    In [161]:
    # and is not an array Boolean operation
    (a==1) and (b==1)
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-161-271ddf20f621> in <module>()
    ----> 1 (a==1) and (b==1)

    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

    In [162]:
    # But & operates on arrays
    (a==1) & (b==1)
    Out[162]:
    array([ True, False,  True, False, False, False, False], dtype=bool)

    In [163]:
    # Or *
    (a==1) * (b==1)
    Out[163]:
    array([ True, False,  True, False, False, False, False], dtype=bool)

    In [164]:
    df=pd.DataFrame({'a':a, 'b':b})

    In [166]:
    # Therefore this is a good approach
    df[(df.a==1) & (df.b==1)]
    Out[166]:
         a   b
    0    1   1
    2    1   1
    2 rows × 2 columns

    In [167]:
    # This will also get you there, but it is not preferred.
    df[df.a==1][df.b==1]
    C:\Anaconda\lib\site-packages\pandas\core\frame.py:1686: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
      "DataFrame index.", UserWarning)
    Out[167]:
         a   b
    0    1   1
    2    1   1
    2 rows × 2 columns
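One way to avoid the chained form, sketched with the same a and b arrays, is to build the combined mask once and pass it to .loc. The names mask and result are just illustrative choices.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 1, 1, 1, 1, 2])
b = np.array([1, 1, 1, 2, 2, 2, 1])
df = pd.DataFrame({"a": a, "b": b})

# Combine both conditions into a single boolean Series aligned with df,
# then select rows in one indexing step instead of two chained ones.
mask = (df.a == 1) & (df.b == 1)
result = df.loc[mask]
print(result)
```

Because the mask is computed against the full frame, its index already matches df and no reindexing is needed.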

@CT Zhu Thank you very much. I have read through all of your code and it helped my understanding a lot :)