How do I combine columns in pandas when working with survey data?

python python-3.x pandas dataframe

I have a survey, and I want to create a single column that merges all the results.

import pandas as pd

survey = pd.DataFrame({
    'username': ['Mat', 'Ryan', 'Judith', 'John'],
    'choice [Website]': ['Yes', 'No', 'No', 'No'],
    'choice [Friend]': ['No', 'Yes', 'No', 'No'],
    'choice [Poster]': ['No', 'No', 'Yes', 'No'],
    'choice [Other]': ['No', 'No', 'No', 'Yes'],
})
survey

Expected result (adding another column)

Applying the function

survey['answer?'] = survey.apply(lambda x: how_this_you_find_about_us(x))

I get an error when I try to apply the function

KeyError: ('choice [Website]', 'occurred at index Response ID')

Types

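Everything is fine except for one small thing: by default, DataFrame.apply applies the function to each column. To fix this, add axis=1 to the function call; it will then apply the function to each row.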
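A minimal sketch of that fix follows. The question does not show the body of how_this_you_find_about_us, so the row function below (first_yes_choice) and the helper list choice_columns are hypothetical stand-ins, assumed to return the label of the first column answered 'Yes'.

```python
import pandas as pd

survey = pd.DataFrame({
    'username': ['Mat', 'Ryan', 'Judith', 'John'],
    'choice [Website]': ['Yes', 'No', 'No', 'No'],
    'choice [Friend]': ['No', 'Yes', 'No', 'No'],
    'choice [Poster]': ['No', 'No', 'Yes', 'No'],
    'choice [Other]': ['No', 'No', 'No', 'Yes'],
})

# Hypothetical helper: the columns that hold the survey choices.
choice_columns = [c for c in survey.columns if c.startswith('choice [')]

def first_yes_choice(row):
    """Hypothetical stand-in for the question's function (its body is not shown)."""
    for column in choice_columns:
        if row[column] == 'Yes':
            # 'choice [Website]' -> 'Website'
            return column[len('choice ['):-1]
    return None  # no option was answered 'Yes'

# axis=1 passes each row (a Series indexed by column name) to the function.
# The default axis=0 passes each column instead, so row['choice [Website]']
# looks for a row label that does not exist and raises the KeyError.
survey['answer?'] = survey.apply(first_yes_choice, axis=1)
```

With the sample data, each respondent ends up with the single option they answered 'Yes' to.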

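There may be a simpler way, but I was able to solve your problem by converting the answers to booleans, finding idxmax, and concatenating the result back.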

# Compare against 'Yes' to get booleans, then take the first True column in each row
flags = survey.drop('username', axis=1).eq('Yes')
temp = flags.idxmax(axis=1).rename('answer?')
pd.concat([survey, temp], axis=1)
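One caveat may be worth a sketch: idxmax returns the first column label when a row contains no True at all, so a respondent who answered 'No' everywhere would be reported as having picked the first choice. The variant below masks such rows. The extra respondent 'Ann' and the shortened column set are assumptions made up for illustration.

```python
import pandas as pd

# Reduced, partly made-up data: 'Ann' answered 'No' to every option.
survey = pd.DataFrame({
    'username': ['Mat', 'Ryan', 'Ann'],
    'choice [Website]': ['Yes', 'No', 'No'],
    'choice [Friend]': ['No', 'Yes', 'No'],
})

# Boolean frame: True where the respondent answered 'Yes'.
flags = survey.drop(columns='username').eq('Yes')

# First True column per row; for an all-False row this is the first column.
first_yes = flags.idxmax(axis=1)

# Keep the label only where the row really contains a 'Yes'; otherwise NaN.
survey['answer?'] = first_yes.where(flags.any(axis=1))
```

Whether a missing value is the right result for such rows depends on how the survey should treat unanswered questions.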

Use filter to select the relevant columns, then take a boolean dot product with the column names, then use str.extract to pull out the label.

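# Keep only the 'choice [...]' columns
s = survey.filter(like='choice')
# True cells contribute their column name; the regex then keeps the text inside the brackets
survey['New'] = s.eq('Yes').dot(s.columns).str.extract(r'\[(\w+)\]', expand=False)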
survey
  choice [Friend] choice [Other] choice [Poster] choice [Website] username  \
0              No             No              No              Yes      Mat   
1             Yes             No              No               No     Ryan   
2              No             No             Yes               No   Judith   
3              No            Yes              No               No     John   
       New  
0  Website  
1   Friend  
2   Poster  
3    Other
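To see why the dot step works, it may help to look at the intermediate values. This is only an illustrative breakdown on the same survey data; the variable names mask, joined, and labels are introduced here and are not part of the answer.

```python
import pandas as pd

survey = pd.DataFrame({
    'username': ['Mat', 'Ryan', 'Judith', 'John'],
    'choice [Website]': ['Yes', 'No', 'No', 'No'],
    'choice [Friend]': ['No', 'Yes', 'No', 'No'],
    'choice [Poster]': ['No', 'No', 'Yes', 'No'],
    'choice [Other]': ['No', 'No', 'No', 'Yes'],
})

# Step 1: keep only the columns whose name contains 'choice'.
s = survey.filter(like='choice')

# Step 2: boolean frame, True where the answer is 'Yes'.
mask = s.eq('Yes')

# Step 3: dot product with the column names. In Python, True * 'text' is
# 'text' and False * 'text' is '', so each row sums to the concatenation of
# the names of its True columns.
joined = mask.dot(s.columns)

# Step 4: pull the word inside the square brackets out of each name.
labels = joined.str.extract(r'\[(\w+)\]', expand=False)
```

If a row had more than one 'Yes', joined would hold the column names run together, and the regex would keep only the first bracketed label.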

There is also a way to do it with numpy:

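import numpy as np

# Column position of each 'Yes' cell, in row order (assumes exactly one 'Yes' per row)
answer_ixs = np.argwhere(survey.values == 'Yes').T[1]
survey['answer?'] = survey.columns[answer_ixs].str.extract(r'\[(\w+)\]', expand=False)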
