Python: create a new column based on the strings in another column
I have a dataframe with column A as shown below, and I want to create a new column named COMPLEXITY based on column A. However, the output does not reflect what I want. Can anyone help?
A
dev DH
dev DHGP
dev SEA
dev MONO
dev SLIM DH
dev SLIM MONO
Calling
complexity_column(df, "A")
gives this output:
A                COMPLEXITY
dev DH           DH
dev DHGP         DH
dev SEA          SEA
dev MONO         MONO
dev SLIM DH      DH
dev SLIM MONO    MONO
My desired output is as follows:
A                COMPLEXITY
dev DH           DH
dev DHGP         DHGP
dev SEA          SEA
dev MONO         MONO
dev SLIM DH      SLIM DH
dev SLIM MONO    SLIM MONO
Rather than using .str.contains, which matches substrings, why not use == for an exact match, i.e.:
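def complexity_column(df, classes):
    # == compares the whole cell, so values such as "dev DH" must have
    # the "dev " prefix removed before these tests can match.
    conditions_region = [
        (df[classes] == "DH"),
        (df[classes] == "DHGP"),
        (df[classes] == "SEA"),
        (df[classes] == "MONO"),
        (df[classes] == "SLIM DH"),
        (df[classes] == "SLIM MONO"),
    ]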
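Because == compares the entire cell, it only matches the sample data once the leading "dev " is gone. The following is a minimal sketch of that idea, not the answerer's full code: it assumes every value in column A starts with "dev ", and the labels list, the regex used to strip the prefix, and the COMPLEXITY column name are illustrative choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": ["dev DH", "dev DHGP", "dev SEA",
                         "dev MONO", "dev SLIM DH", "dev SLIM MONO"]})

def complexity_column(df, classes):
    # Assumed label set, taken from the desired output in the question.
    labels = ["DH", "DHGP", "SEA", "MONO", "SLIM DH", "SLIM MONO"]
    # Assumption: each value begins with "dev " followed by the label.
    stripped = df[classes].str.replace(r"^dev\s+", "", regex=True)
    # Exact comparisons never overlap, so the order of labels does not matter.
    conditions_region = [stripped == label for label in labels]
    out = df.copy()
    out["COMPLEXITY"] = np.select(conditions_region, labels, default="")
    return out

result = complexity_column(df, "A")
print(result)
```

With exact matches at most one condition is true per row, so the ordering problem that affects substring tests does not arise.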
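From the documentation: numpy.select(condlist, choicelist, default=0)
condlist: the list of conditions that determine from which array in choicelist the output elements are taken. When multiple conditions are satisfied, the first one encountered in condlist is used.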
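A small sketch of that first-match rule, using a two-row Series invented for illustration. "DH" is a substring of "DHGP", so when the "DH" test is listed first it wins for both rows:

```python
import numpy as np
import pandas as pd

s = pd.Series(["dev DH", "dev DHGP"])

# Both conditions are True for "dev DHGP" because "DH" is a substring of "DHGP".
conditions = [
    s.str.contains("DH", regex=False),
    s.str.contains("DHGP", regex=False),
]
choices = ["DH", "DHGP"]

# The first True condition in the list decides the result for each row.
general_first = np.select(conditions, choices, default="")
print(list(general_first))
```

This reproduces the behaviour in the question, where "dev DHGP" is labelled "DH".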
You need to reorder the elements of conditions_region so that the more specific conditions come first and the more general ones come last, that is:
conditions_region = [
    df[classes].str.contains("SLIM DH"),
    df[classes].str.contains("SLIM MONO"),
    df[classes].str.contains("DHGP"),
    df[classes].str.contains("DH"),
    df[classes].str.contains("SEA"),
    df[classes].str.contains("MONO"),
]
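Put together, the reordered conditions might be used as in the sketch below. The question does not show its choicelist, so the labels list here (doubling as the choices) and the COMPLEXITY column name are assumptions, and the function is one possible shape rather than the asker's original.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": ["dev DH", "dev DHGP", "dev SEA",
                         "dev MONO", "dev SLIM DH", "dev SLIM MONO"]})

def complexity_column(df, classes):
    # Most specific labels first, so "SLIM DH" and "DHGP" are tested before "DH".
    labels = ["SLIM DH", "SLIM MONO", "DHGP", "DH", "SEA", "MONO"]
    conditions_region = [
        df[classes].str.contains(label, regex=False) for label in labels
    ]
    out = df.copy()
    # np.select takes, for each row, the choice of the first True condition.
    out["COMPLEXITY"] = np.select(conditions_region, labels, default="")
    return out

result = complexity_column(df, "A")
print(result)
```

Passing a string as default keeps the result a string array; the documented default of 0 is an integer and does not mix with string choices.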
df['complexity'] = df[column_name].apply(lambda x: column_maker(x, list_of_strings))
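def column_maker(entry_row, list_of_strings):
    # Collect every string from list_of_strings that occurs in entry_row.
    output_string = ''
    for i in list_of_strings:
        if i in entry_row:
            output_string = output_string + " " + i
    # strip() drops the leading space added before the first match.
    return output_string.strip()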
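One caveat with column_maker: `i in entry_row` is a substring test, so with both "DH" and "DHGP" in the list the row "dev DHGP" would yield "DH DHGP". The sketch below swaps in whole-word matching on whitespace-separated tokens instead. The name column_maker_tokens and the contents of list_of_strings are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": ["dev DH", "dev DHGP", "dev SEA",
                         "dev MONO", "dev SLIM DH", "dev SLIM MONO"]})

def column_maker_tokens(entry_row, list_of_strings):
    # Hypothetical variant of column_maker: keep only whole words that
    # appear in list_of_strings, preserving their order in the entry.
    wanted = set(list_of_strings)
    return " ".join(tok for tok in entry_row.split() if tok in wanted)

# Assumed keyword list; "SLIM" is listed on its own so that
# "dev SLIM DH" becomes "SLIM DH".
list_of_strings = ["DH", "DHGP", "SEA", "MONO", "SLIM"]

df["complexity"] = df["A"].apply(lambda x: column_maker_tokens(x, list_of_strings))
print(df)
```

Because tokens are compared as whole words, "DH" no longer matches inside "DHGP", and no ordering of the list is needed.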