Python: an elegant way to replace multiple values in a DataFrame without a mapping?

Tags: python, python-3.x, pandas, dataframe, str-replace

I have a DataFrame like the one below:

import pandas as pd
df1 = pd.DataFrame({'ethnicity': ['AMERICAN INDIAN/ALASKA NATIVE', 'WHITE - BRAZILIAN', 'WHITE-RUSSIAN','HISPANIC/LATINO - COLOMBIAN',
                                 'HISPANIC/LATINO - MEXICAN','ASIAN','ASIAN - INDIAN','ASIAN - KOREAN','PORTUGUESE','MIDDLE-EASTERN','UNKNOWN',
                                 'USER DECLINED','OTHERS']})

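I would like to replace the values of the `ethnicity` column. For example, if the value is `ASIAN - INDIAN`, I want to replace it with just `ASIAN`.

Similarly, I would like to do the same for strings that contain `AMERICAN`, `WHITE`, or `HISPANIC`, and replace every remaining string with `Others`. This is what I tried: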

df1.loc[df1.ethnicity.str.contains('WHITE'), 'ethnicity'] = "WHITE"
df1.loc[df1.ethnicity.str.contains('ASIAN'), 'ethnicity'] = "ASIAN"
df1.loc[df1.ethnicity.str.contains('HISPANIC'), 'ethnicity'] = "HISPANIC"
df1.loc[df1.ethnicity.str.contains('AMERICAN'), 'ethnicity'] = "AMERICAN"
# Pseudocode: I don't know how to replace all the other ethnicities at once with "Others"
# df1.loc[<all other ethnicities>, 'ethnicity'] = "Others"
I expect the output to keep only the group name in each row (`AMERICAN`, `WHITE`, `HISPANIC`, or `ASIAN`), with every other value replaced by `Others`.

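Use `str.extract` with the values of a list joined by `|` (regex OR). Rows that do not match are returned as `NaN`, so chain `fillna('Others')` afterwards.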
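The list-based variant described above might look like the following minimal sketch. The list name `L` is taken from the comments on this answer; its exact contents and the helper name `pattern` are assumptions, since the original snippet did not survive on this page.

```python
import pandas as pd

df1 = pd.DataFrame({'ethnicity': [
    'AMERICAN INDIAN/ALASKA NATIVE', 'WHITE - BRAZILIAN', 'WHITE-RUSSIAN',
    'HISPANIC/LATINO - COLOMBIAN', 'HISPANIC/LATINO - MEXICAN', 'ASIAN',
    'ASIAN - INDIAN', 'ASIAN - KOREAN', 'PORTUGUESE', 'MIDDLE-EASTERN',
    'UNKNOWN', 'USER DECLINED', 'OTHERS']})

# Keywords to keep (name L follows the comments; contents are assumed)
L = ['WHITE', 'ASIAN', 'HISPANIC', 'AMERICAN']

# Build the capture-group pattern '(WHITE|ASIAN|HISPANIC|AMERICAN)'
pattern = '(' + '|'.join(L) + ')'

# Rows without a match come back as NaN, so fill them with 'Others'
df1['ethnicity'] = (df1['ethnicity'].str.extract(pattern, expand=False)
                                    .fillna('Others'))
```

The list is only used to build the pattern string; `str.extract` itself never sees `L`, so adding a new group means adding one keyword to the list.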

Or you can write the joined pattern out directly:

df1.ethnicity = (df1.ethnicity.str.extract('(WHITE|ASIAN|AMERICAN|HISPANIC)', expand=False)
                    .fillna('Others'))


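Wow! Just one line. Upvoted. Does `str.extract` work like `str.extract('WHITE' | 'ASIAN' | 'AMERICAN' | 'HISPANIC')`?

@SSMK - Yes, you are close: `df1.ethnicity = df1.ethnicity.str.extract('(WHITE|ASIAN|AMERICAN|HISPANIC)', expand=False).fillna('Others')`

So `L` is used to extract the string (`ASIAN - INDIAN`) from the DataFrame and replace it with the value from `L` (`ASIAN`)?

@SSMK - No, it is only used to create `(WHITE|ASIAN|HISPANIC|AMERICAN)` from the values in the list `L`.

@SSMK - The two solutions are the same: the first creates `(WHITE|ASIAN|HISPANIC|AMERICAN)` from the list and then passes it to `extract`; the second passes `(WHITE|ASIAN|HISPANIC|AMERICAN)` to `extract` directly.

What is wrong with `.map()`? You can always use `np.select` to chain your conditions.

I may not always know which other ethnicity values my actual data contains; it has more than a million rows.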
print (df1)
   ethnicity
0   AMERICAN
1      WHITE
2      WHITE
3   HISPANIC
4   HISPANIC
5      ASIAN
6      ASIAN
7      ASIAN
8     Others
9     Others
10    Others
11    Others
12    Others
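
For comparison, the `np.select` suggestion from the comments could be sketched roughly as below. This is not code from the thread: the name `keywords` and the use of `regex=False` are assumptions made for illustration. Because `default='Others'` catches every unmatched row, the other values do not need to be known in advance.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'ethnicity': [
    'AMERICAN INDIAN/ALASKA NATIVE', 'WHITE - BRAZILIAN', 'WHITE-RUSSIAN',
    'HISPANIC/LATINO - COLOMBIAN', 'HISPANIC/LATINO - MEXICAN', 'ASIAN',
    'ASIAN - INDIAN', 'ASIAN - KOREAN', 'PORTUGUESE', 'MIDDLE-EASTERN',
    'UNKNOWN', 'USER DECLINED', 'OTHERS']})

# Hypothetical list of groups to keep
keywords = ['WHITE', 'ASIAN', 'HISPANIC', 'AMERICAN']

# One boolean mask per keyword (plain substring test, no regex)
conditions = [df1['ethnicity'].str.contains(k, regex=False).to_numpy()
              for k in keywords]

# For each row, np.select takes the choice of the first True condition;
# rows where no condition holds get the default
df1['ethnicity'] = np.select(conditions, keywords, default='Others')
```

On the sample data this should give the same column as the `str.extract` approach, at the cost of one pass over the column per keyword.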