Python: How do I create a new column and assign values to it based on criteria?
I have a DataFrame that looks like this:
   lname     fname    rno_cd  eri_cd
0  CRUISE    TOM      E       1
1  DEPP      JOHNNY   Y       0
2  DICAPR    LENARDO          1
3  PITT      BRAD             1
4  MOST      JEFF     A       0
5  HANKS     TOM              1
6  BRANDO    MARLON   C       1
7  WILLIAMS  ROBIN    F       1
8  DOWNEY    ROBERT   B       1
9  PACINO    AL       E       1
The codes in the ['rno_cd'] column are defined as:
A = AI/AK Native
B = Asian
C = Black/AA
D = Hispanic
E = White
F = Asian
G = Asian
H = Haw/Pac Isl.
Y = White
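1) I need to decode these codes and put the result in a new column.
2) I also need to account for the blank values in some way.
The final result should look like this: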
   lname     fname    rno_cd  eri_cd  rno_defined
0  CRUISE    TOM      E       1       White
1  DEPP      JOHNNY   Y       0       White
2  DICAPR    LENARDO          1       Unknown
3  PITT      BRAD             1       Unknown
4  MOST      JEFF     A       0       AI/AK Native
5  HANKS     TOM              1       Unknown
6  BRANDO    MARLON   C       1       Black/AA
7  WILLIAMS  ROBIN    F       1       Asian
8  DOWNEY    ROBERT   B       1       Asian
9  PACINO    AL       E       1       White
=============== My code so far ===============
I used the following approach, but I am not sure whether it is a solid solution:
In[1]:
df1['rno_cd'][df1.rno_cd.str.contains('A')] = 'AI/AK Native'
df1['rno_cd'][df1.rno_cd.str.contains('B')] = 'Asian'
df1['rno_cd'][df1.rno_cd.str.contains('C')] = 'Black/AA'
df1['rno_cd'][df1.rno_cd.str.contains('D')] = 'Hispanic'
df1['rno_cd'][df1.rno_cd.str.contains('E')] = 'White'
df1['rno_cd'][df1.rno_cd.str.contains('F')] = 'Asian'
df1['rno_cd'][df1.rno_cd.str.contains('G')] = 'Asian'
df1['rno_cd'][df1.rno_cd.str.contains('H')] = 'HawPac'
df1['rno_cd'][df1.rno_cd.str.contains('Y')] = 'White'
In[1]: df1
Out[1]:
    lname      fname      rno_cd  eri_cd
0   SONJU      LAURIE     White   1
1   FORTHOFER  KELLY      White   0
2   PLILEY     JODY               1
3   NOEL       HEATHER            1
4   MANNING    CYNTHIA    White   0
5   NAUERTZ    ELIZABETH          1
6   SCHMID     DAVID      White   1
7   HINTHER    VICTORIA   White   1
8   JOHNSON    B.         White   1
9   MOORE      CAROL      White   1
10  MARSHALL   JOY                1
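The limitation of this code is that it does not assign a value to the blank entries in the original dataset. I also can no longer see the original codes to verify that the values are correct.
Any suggestions or comments?
Thanks.

A Series (for example, a column of a DataFrame) has a convenient map method. You just need to express the encoding as a dictionary: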
code_to_ethnicity = {'A': 'AI/AK Native',
                     'B': 'Asian'}  # etc.
df['rno_defined'] = df['rno_cd'].map(code_to_ethnicity)
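When you say "blank values", I assume you mean empty strings: ''. If you want to treat those specially, you can simply add an entry for them to the dictionary:
code_to_ethnicity = {'A': 'AI/AK Native',
                     'B': 'Asian',
                     '': 'other'}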
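
Putting the pieces of this answer together, here is a minimal sketch of the map approach on a few rows of the question's data. It assumes the blanks are empty strings, and it swaps the '' dictionary entry for fillna("Unknown"), which is my own addition rather than part of the answer: map returns NaN for any value that is not a key, so a single fillna labels empty strings, unexpected codes, and real NaN alike.

```python
import pandas as pd

# A few rows from the question; blanks are assumed to be empty strings.
df = pd.DataFrame({
    "lname": ["CRUISE", "DEPP", "DICAPR", "PITT", "MOST"],
    "fname": ["TOM", "JOHNNY", "LENARDO", "BRAD", "JEFF"],
    "rno_cd": ["E", "Y", "", "", "A"],
    "eri_cd": [1, 0, 1, 1, 0],
})

code_to_ethnicity = {
    "A": "AI/AK Native", "B": "Asian", "C": "Black/AA", "D": "Hispanic",
    "E": "White", "F": "Asian", "G": "Asian", "H": "Haw/Pac Isl.", "Y": "White",
}

# map() yields NaN for values that are not dictionary keys;
# fillna() then labels those rows "Unknown".
df["rno_defined"] = df["rno_cd"].map(code_to_ethnicity).fillna("Unknown")
print(df)
```

Because the result goes into a new column, rno_cd keeps its original codes and can still be used to check the mapping.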
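
You can build a dictionary in which the keys are the codes and the values are the names:
D = {"A": "AI/AK Native", "B": "Asian", "C": "Black/AA", "D": "Hispanic", "E": "White", "F": "Asian", "G": "Asian", "H": "Haw/Pac Isl", "Y": "White"}
Then go through the rno_cd column and apply a function that converts the data. You can use apply with a function, transform, that checks whether x is a key of the dictionary and, if so, uses D[x] to get the value. If it is not a key, the function simply returns "Unknown".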
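
The answer describes the apply variant without showing it, so here is a minimal sketch of what it could look like. The function name transform comes from the text above; the sample rows (including a None to stand in for a missing value) are assumptions for illustration.

```python
import pandas as pd

D = {"A": "AI/AK Native", "B": "Asian", "C": "Black/AA", "D": "Hispanic",
     "E": "White", "F": "Asian", "G": "Asian", "H": "Haw/Pac Isl", "Y": "White"}

def transform(x):
    # Look the code up only if it is a known key; anything else
    # (empty string, None, NaN, unexpected code) becomes "Unknown".
    return D[x] if x in D else "Unknown"

# Hypothetical sample column with an empty string and a missing value.
df = pd.DataFrame({"rno_cd": ["E", "", None, "A", "C"]})
df["rno_defined"] = df["rno_cd"].apply(transform)
print(df)
```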
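Another way to do it:
# On Python 3, map() returns a lazy iterator, so materialize it with list().
# D.get() falls back to "Unknown" for any value that is not a key.
df["rno_defined"] = list(map(lambda x: D.get(x, "Unknown"), df["rno_cd"].values))
Output: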
   lname     fname    rno_cd  eri_cd  rno_defined
0  CRUISE    TOM      E       1       White
1  DEPP      JOHNNY   Y       0       White
2  DICAPR    LENARDO  Nan     1       Unknown
3  PITT      BRAD     Nan     1       Unknown
4  MOST      JEFF     A       0       AI/AK Native
5  HANKS     TOM      Nan     1       Unknown
6  BRANDO    MARLON   C       1       Black/AA
7  WILLIAMS  ROBIN    F       1       Asian
8  DOWNEY    ROBERT   B       1       Asian
9  PACINO    AL       E       1       White
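
The question also asks how to verify the decoded values. One possible check, sketched here under the assumption that the decoded labels live in a separate rno_defined column next to the untouched rno_cd, is to list the distinct code/label pairs and compare them with the code table by eye. The sample rows are hypothetical.

```python
import pandas as pd

D = {"A": "AI/AK Native", "B": "Asian", "C": "Black/AA", "D": "Hispanic",
     "E": "White", "F": "Asian", "G": "Asian", "H": "Haw/Pac Isl", "Y": "White"}

# Hypothetical sample; "E" appears twice to show the de-duplication.
df = pd.DataFrame({"rno_cd": ["E", "Y", "", "A", "E"]})
df["rno_defined"] = df["rno_cd"].map(D).fillna("Unknown")

# One row per distinct (code, label) combination actually present in the data.
pairs = df[["rno_cd", "rno_defined"]].drop_duplicates().reset_index(drop=True)
print(pairs)
```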

The two answers from Taha and exp1orer helped me solve this. Thank you both.