Python .where with more than two possible conditional outcomes


python pandas dataframe


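I am trying to use the pandas DataFrame .where method, except that I have more than two possibilities (that is, I have if, elif, else rather than the default if-else behavior).

Please consider the following DataFrame: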

import numpy as np
import pandas as pd

a1 = np.random.rand(7,2)
a2 = np.random.randint(0,3,(7,1))
grid = np.append(a1, a2, axis=1)
df = pd.DataFrame(grid)

I tried:

def test(x):
    if x[2] == 0:
        return 5
    if x[2]==1:
        return 10
    if x[2] ==2:
        return 50

df.where(test)

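But I get the error message "The truth value of a Series is ambiguous". I suspect this is the right direction, but I am confused about how to get there. The documentation says that if the condition is a callable, it is computed on the full df. Even so, x[2] is treated as the whole of column 2 rather than a single value. Is there no way to do this as a vectorized operation? Is the only option to iterate row by row, whether with iterrows or apply?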

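This is a toy example to make the point clear on the forum; I do not want to do a simple .map in my real-life problem. If you answer, please keep the "test" function as a separate function that gets passed in, because that is where my difficulty lies.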
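For context, here is a minimal sketch of what a callable passed to where is expected to return, assuming a small hand-written Series in place of column 2 (the names col, is_zero, and branched are hypothetical). The callable receives the whole object and must return a boolean mask, so where on its own only offers two branches: keep the value, or replace it. Chaining Series.mask is one way to emulate if/elif/else without iterating over rows.

```python
import pandas as pd

# Hypothetical stand-in for column 2 of the example frame.
col = pd.Series([2.0, 2.0, 0.0, 0.0, 1.0, 1.0, 1.0])

def is_zero(s):
    # Receives the whole Series and returns a boolean Series, not a scalar.
    return s == 0

# where keeps values where the mask is True and replaces the rest.
out = col.where(is_zero, other=-1.0)

# Emulating if / elif / else: start from the "else" value,
# then overwrite the rows matching each condition with mask.
branched = pd.Series(50, index=col.index).mask(col == 0, 5).mask(col == 1, 10)
print(branched.tolist())
```

A function that returns 5, 10, or 50 from scalar comparisons cannot serve as the condition here, because the comparison x[2] == 0 yields a whole Series and `if` cannot reduce it to a single True or False.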

np.random.seed(100)
a1 = np.random.rand(7,2)
a2 = np.random.randint(0,3,(7,1))
grid = np.append(a1, a2, axis=1)
df = pd.DataFrame(grid)
print (df)
          0         1    2
0  0.543405  0.278369  2.0
1  0.424518  0.844776  2.0
2  0.004719  0.121569  0.0
3  0.670749  0.825853  0.0
4  0.136707  0.575093  1.0
5  0.891322  0.209202  1.0
6  0.185328  0.108377  1.0

One solution uses map.

Another solution uses numpy.where.

Edit:

For a separate function, you can use apply with the parameter axis=1 to process df row by row:
def test(x):
    #print (x)
    if x[2] == 0:
        return 5
    if x[2]==1:
        return 10
    if x[2] ==2:
        return 50

df['d'] = df.apply(test, axis=1)
print (df)
          0         1    2   d
0  0.543405  0.278369  2.0  50
1  0.424518  0.844776  2.0  50
2  0.004719  0.121569  0.0   5
3  0.670749  0.825853  0.0   5
4  0.136707  0.575093  1.0  10
5  0.891322  0.209202  1.0  10
6  0.185328  0.108377  1.0  10

But if a function is needed:
def test(x):
    return np.where(x == 0, 5, np.where(x== 1, 10,  50))

print (test(df[2]))
[50 50  5  5 10 10 10]

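Hi, thanks. Could you give an answer that keeps the function "test" as a separate function, passed into map or where? That is the example that would help me in my real-life case.

OK, thx. So I understand that I have to use apply or iterrows here; there is no way to get the result with a vectorized operation, as I thought? The documentation of the where method mentions the possibility of using a callable, which is what I was trying to do here.

Yes, I think so. I will run timings, but I suspect where is faster than apply?

In that case, this is exactly what I was looking for.

np.where is very fast, I think map is faster, while apply is slower.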
