Pandas: replacing multiple values in a column (when you don't know them in advance)


What is the best way to change every value in a column ("Status") that differs from the two values I want to analyze?
For example, my df is:

Id  Status      Email   Product Age
1   ok          g@      A       20
5   not ok      l@      J       45
1   A           a@      A       27
2   B           h@      B       25 
2   ok          t@      B       33
3   C           b@      E       23
4   not ok      c@      D       30

In the end, I want:

Id  Status      Email   Product Age
1   ok          g@      A       20
5   not ok      l@      J       45
1   other       a@      A       27
2   other       h@      B       25 
2   ok          t@      B       33
3   other       b@      E       23
4   not ok      c@      D       30

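The main difficulty is that my df is very large, so I don't know all the other values that differ from "ok" and "not ok" (the two I want to analyze).

Thanks in advance.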
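To make the setup concrete, here is a minimal sketch that rebuilds the sample frame from the question and lists the Status values that fall outside the two of interest. The names `keep` and `unknown` are illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Id": [1, 5, 1, 2, 2, 3, 4],
    "Status": ["ok", "not ok", "A", "B", "ok", "C", "not ok"],
    "Email": ["g@", "l@", "a@", "h@", "t@", "b@", "c@"],
    "Product": ["A", "J", "A", "B", "B", "E", "D"],
    "Age": [20, 45, 27, 25, 33, 23, 30],
})

keep = ["ok", "not ok"]  # the two values to analyze (hypothetical name)

# Distinct Status values that are NOT in `keep`
unknown = df.loc[~df["Status"].isin(keep), "Status"].unique()
print(sorted(unknown))  # ['A', 'B', 'C']
```

On a large frame the same expression shows which values would be collapsed, without having to enumerate them by hand.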

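Use np.where with isin:

import numpy as np

# Keep 'ok' and 'not ok'; replace every other value with 'Others'
df['Status'] = np.where(df['Status'].isin(['ok', 'not ok']), df['Status'], 'Others')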
df
Out[384]: 
   Id  Status Email Product  Age
0   1      ok    g@       A   20
1   5  not ok    l@       J   45
2   1  Others    a@       A   27
3   2  Others    h@       B   25
4   2      ok    t@       B   33
5   3  Others    b@       E   23
6   4  not ok    c@       D   30
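A pandas-only variant of the same idea, offered as a sketch rather than as part of the original answer: `Series.where` keeps the values where the condition holds and substitutes a replacement elsewhere. The label "other" follows the question's desired output, and the one-column frame is a cut-down assumption for brevity.

```python
import pandas as pd

# Only the Status column from the question, for brevity
df = pd.DataFrame({"Status": ["ok", "not ok", "A", "B", "ok", "C", "not ok"]})

# Series.where keeps values where the mask is True and
# substitutes the second argument everywhere else
mask = df["Status"].isin(["ok", "not ok"])
df["Status"] = df["Status"].where(mask, "other")

print(df["Status"].tolist())
# ['ok', 'not ok', 'other', 'other', 'ok', 'other', 'not ok']
```

This avoids the NumPy import and returns a Series with the original index, which may be convenient when chaining further pandas operations.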

Use apply:

df['Status'] = df.apply(lambda x: 'other' if x['Status'] not in ['ok', 'not ok'] else x['Status'], axis=1)

A poor choice for an answer: apply is slower than the np.where in Wen's solution.
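The speed claim can be checked with a rough timing sketch such as the one below. The synthetic data, the frame size, and the helper names `with_where` and `with_apply` are assumptions made for illustration; actual timings depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data: the size and the value set are arbitrary assumptions
rng = np.random.default_rng(0)
big = pd.DataFrame({"Status": rng.choice(["ok", "not ok", "A", "B", "C"], size=20_000)})
keep = ["ok", "not ok"]

def with_where(frame):
    # Vectorized: one membership test over the whole column
    return np.where(frame["Status"].isin(keep), frame["Status"], "other")

def with_apply(frame):
    # Row-wise: calls the Python lambda once per row
    return frame.apply(lambda x: x["Status"] if x["Status"] in keep else "other", axis=1)

# Both approaches should produce the same labels
same = bool((with_where(big) == with_apply(big).to_numpy()).all())

t_where = timeit.timeit(lambda: with_where(big), number=3)
t_apply = timeit.timeit(lambda: with_apply(big), number=3)
print(f"same result: {same}")
print(f"np.where: {t_where:.4f}s, apply: {t_apply:.4f}s")
```

The row-wise apply invokes a Python function for every row, which is why it would be expected to fall behind the vectorized version as the frame grows.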