Python Pandas: converting a NoneType array retrieved via an API call from list to string (to enable use of a pivot table)
I query an internal database, retrieve the data, and create a dataframe df that looks similar to the following:
import pandas as pd
df = pd.DataFrame({'issue_key':['MED-187', 'MED-188', 'MED-190', 'MED-191'],
'creator': ['Smith, J', 'Williams, S', 'Wilson, T', 'Smith, J'],
'manufacturer': ['Mercedes', 'Audi', 'Mercedes', 'BMW'],
'department': [['Sales'], ['Finance'], ['Operations'], ['Sales']],
'expense': [29181, 12809, 837, 2817]})
issue_key  creator      manufacturer  department    expense
MED-187    Smith, J     Mercedes      [Sales]       29181
MED-188    Williams, S  Audi          [Finance]     12809
MED-190    Wilson, T    Mercedes      [Operations]  837
MED-191    Smith, J     BMW           [Sales]       2817
I then tried to pivot the data using the following:
import numpy as np
pivoted_data = pd.pivot_table(data=df, index='department', values='expense', aggfunc=np.mean)
However, I get this error:
TypeError: unhashable type: 'list'
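For context, pivot_table has to group rows by the index values, and grouping relies on hashing them. A minimal sketch of why a list cell fails while a string (or tuple) works; the helper name is_hashable is made up for illustration:

```python
def is_hashable(obj):
    """Return True if Python can hash obj, False otherwise."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

# Lists are mutable, so Python refuses to hash them.
print(is_hashable(['Sales']))    # False
# Strings and tuples are immutable and hash fine.
print(is_hashable('Sales'))      # True
print(is_hashable(('Sales',)))   # True
```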
So I converted the department column from what I believed to be a list (more on that later) to a string using the following command:
df['new_department'] = [','.join(map(str, l)) for l in df['department']]
As expected, this results in:
df = pd.DataFrame({'issue_key':['MED-187', 'MED-188', 'MED-190', 'MED-191'],
'creator': ['Smith, J', 'Williams, S', 'Wilson, T', 'Smith, J'],
'manufacturer': ['Mercedes', 'Audi', 'Mercedes', 'BMW'],
'department': [['Sales'], ['Finance'], ['Operations'], ['Sales']],
'expense': [29181, 12809, 837, 2817],
'new_department': ['Sales', 'Finance', 'Operations', 'Sales']
})
issue_key  creator      manufacturer  department    expense  new_department
MED-187    Smith, J     Mercedes      [Sales]       29181    Sales
MED-188    Williams, S  Audi          [Finance]     12809    Finance
MED-190    Wilson, T    Mercedes      [Operations]  837      Operations
MED-191    Smith, J     BMW           [Sales]       2817     Sales
Now I am able to pivot the data successfully.
However, on the production data I get the following error:
df['new_department'] = [','.join(map(str, l)) for l in df['department']]
TypeError: 'NoneType' object is not iterable
While investigating how the data is returned from the API, I ran the following command:
`df['activity'].unique()`
This results in:
array(["[<JIRA CustomFieldOption: value='Sales', id='174'>]",
"[<JIRA CustomFieldOption: value='Finance', id='179'>]",
"[<JIRA CustomFieldOption: value='Operations', id='102'>]",
'None'], dtype=object)
The result returned by the API shows None for some rows, so the error message makes perfect sense.
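One way to confirm which rows hold None is to count the Python types present in the column. This is only a sketch on made-up data; the Series below stands in for the production department column, whose real contents I can only guess from the output above:

```python
import pandas as pd

# Stand-in for the production column: mostly lists, with one missing value.
department = pd.Series([['Sales'], ['Finance'], None, ['Operations']])

# Count how many cells are of each Python type.
type_counts = department.map(lambda cell: type(cell).__name__).value_counts()
print(type_counts)

# Select the rows whose cell is missing (None or NaN).
missing_rows = department[department.isna()]
print(missing_rows.index.tolist())
```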
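Do you know how to overcome this? Is it possible to convert the None type somehow so that the list can be iterated over? Or, alternatively, to access the values in the array directly?
I have been looking into this for several hours and cannot solve it.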
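Thanks in advance for your help.

If the problem is rogue None values in the 'department' column, you can handle them with a conditional expression inside the list comprehension: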
df['new_department'] = [','.join(map(str, l)) if l is not None else 'NA' for l in df['department']]
Or, more generally:
# np.nan replaces the np.NaN alias, which was removed in NumPy 2.0
unwanted_values = [None, np.nan]
df['new_department'] = ['NA' if l in unwanted_values else ','.join(map(str, l)) for l in df['department']]
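The same idea can be wrapped in a small helper and followed through to the pivot. This is a sketch under assumptions, not a drop-in fix: department_to_str is a made-up name, the sample frame is invented, and reading a .value attribute from each item is a guess based on the value='Sales' shown in the unique() output above (plain strings simply fall through unchanged).

```python
import pandas as pd

def department_to_str(cell, missing='NA'):
    """Join a list-like cell into one string; map None/NaN to `missing`."""
    if isinstance(cell, (list, tuple)):
        # Assumption: option objects expose `.value`; otherwise use the item itself.
        return ','.join(str(getattr(item, 'value', item)) for item in cell)
    if cell is None or pd.isna(cell):
        return missing
    return str(cell)

# Invented sample data with both kinds of missing value.
df = pd.DataFrame({
    'department': [['Sales'], None, ['Operations'], ['Sales'], float('nan')],
    'expense': [29181, 12809, 837, 2817, 100],
})

df['new_department'] = df['department'].map(department_to_str)

# The string column is hashable, so it can serve as the pivot index.
pivoted_data = pd.pivot_table(df, index='new_department', values='expense', aggfunc='mean')
print(pivoted_data)
```

Rows without a department end up grouped under 'NA' rather than being dropped; filter them out before pivoting if that is not what you want.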