Python: "No numeric types to aggregate" in pivot table after changing data to float


I need to pivot a DataFrame (dfM) that looks like this:

Task Question Answer analystID
x    a        1      u
y    b        2      i
z    c        3      o
At first I thought my pivot

 dfM = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID', 
                     values = ['Answer'])
was raising that error (No numeric types to aggregate) because the numbers in the Answer column might be strings, so I tried

dfM.apply(pd.to_numeric, errors='ignore')
but I still end up with the same error.

Is there any way to fix this error?

Try assigning the result of your apply statement back to dfM:

dfM = dfM.apply(pd.to_numeric, errors='ignore')
dfM = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID', 
                     values = ['Answer'])

              Answer          
analystID          i    o    u
Task Question                 
x    a           NaN  NaN  1.0
y    b           2.0  NaN  NaN
z    c           NaN  3.0  NaN
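
The reason reassignment matters is that the conversion returns a new object instead of modifying dfM in place. A minimal sketch of that behaviour, assuming a hypothetical sample frame in which Answer is stored as strings (the question does not show the actual dtypes), and using pd.to_numeric with its default error handling because recent pandas releases deprecate errors='ignore':

```python
import pandas as pd

# Hypothetical sample mirroring the question, with Answer stored as strings.
dfM = pd.DataFrame({
    "Task": ["x", "y", "z"],
    "Question": ["a", "b", "c"],
    "Answer": ["1", "2", "3"],
    "analystID": ["u", "i", "o"],
})

# Calling the conversion without keeping the result leaves dfM untouched.
pd.to_numeric(dfM["Answer"])
is_numeric_before = pd.api.types.is_numeric_dtype(dfM["Answer"])

# Assigning the result back is what actually changes the column.
dfM["Answer"] = pd.to_numeric(dfM["Answer"])
is_numeric_after = pd.api.types.is_numeric_dtype(dfM["Answer"])

print(is_numeric_before, is_numeric_after)
```

Checking dfM.dtypes before calling pivot_table is a quick way to confirm whether the values column is numeric.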

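I think you only need to convert the Answer column to int or float, and then remove the [] from the values parameter so that the columns do not get a MultiIndex: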

dfM['Answer'] = dfM['Answer'].astype(int)
df = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID', 
                     values = 'Answer')
print (df)
analystID        i    o    u
Task Question               
x    a         NaN  NaN  1.0
y    b         2.0  NaN  NaN
z    c         NaN  3.0  NaN
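If the first solution fails, there are some non-numeric values. In that case use to_numeric with the parameter errors='coerce', which replaces non-numeric values with NaN: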

dfM['Answer'] = pd.to_numeric(dfM['Answer'], errors='coerce')
df = pd.pivot_table(dfM, index = ['Task', 'Question'], columns = 'analystID', 
                     values = 'Answer')
print (df)
analystID        i    o    u
Task Question               
x    a         NaN  NaN  1.0
y    b         2.0  NaN  NaN
z    c         NaN  3.0  NaN
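
Because errors='coerce' silently turns unparseable entries into NaN, it can help to look at which entries those are first. A small sketch, assuming a hypothetical Answer column in which one value is the string 'r':

```python
import pandas as pd

# Hypothetical Answer column containing one non-numeric entry.
answers = pd.Series(["r", "2", "3"], name="Answer")

# Unparseable entries become NaN instead of raising an error.
converted = pd.to_numeric(answers, errors="coerce")

# Entries that were present originally but are NaN after conversion.
bad = answers[converted.isna() & answers.notna()]
print(bad.tolist())  # ['r']
```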
A one-line version of the same conversion and pivot is also possible.
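
The code for the one-liner is not shown here. One way it could be written, assuming DataFrame.assign is used to convert the column on the fly (that choice and the sample data are assumptions, not necessarily what the answer contained):

```python
import pandas as pd

# Hypothetical sample mirroring the question, with Answer stored as strings.
dfM = pd.DataFrame({
    "Task": ["x", "y", "z"],
    "Question": ["a", "b", "c"],
    "Answer": ["1", "2", "3"],
    "analystID": ["u", "i", "o"],
})

# assign() returns a copy with Answer converted; dfM itself is not modified.
df = dfM.assign(
    Answer=pd.to_numeric(dfM["Answer"], errors="coerce")
).pivot_table(index=["Task", "Question"], columns="analystID", values="Answer")

print(df)
```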

EDIT:

If you use the parameter errors='ignore' and there are some non-numeric values, you still get the error:

print (dfM)
  Task Question Answer analystID
0    x        a      r         u <-first answer value was changed to `r`
1    y        b      2         i
2    z        c      3         o

dfM['Answer'] = pd.to_numeric(dfM['Answer'], errors='ignore')
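print (dfM)
  Task Question Answer analystID
0    x        a      r         u
1    y        b      2         i
2    z        c      3         o

Try
dfM.pivot_table(index=['Task','Question'], columns='analystID', values='Answer', aggfunc='sum')
- explicitly passing the aggregation function?

Hey, this seems to work, but for some reason my questions are now all out of order. Is there a possible explanation?

Yes, the problem is the sorting of the index and columns. How many columns are there after pivot_table? If there are not too many values, you can use
df = df.reindex(columns=['o','i','u'])
after pivot_table, and for the index
df = df.reindex(index=['x','z','y'], level=0)

There are 4 columns after the pivot table, which is expected (the example doesn't show the whole dataset). I also noticed that everything in the Answer column is NaN.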
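
The reordering discussed in the comments can be sketched without typing the labels by hand, by taking the order of first appearance from the source frame. This is only an illustration under assumptions: the sample data is hypothetical and deliberately unsorted, and Answer is already numeric.

```python
import pandas as pd

# Hypothetical, deliberately unsorted sample data.
dfM = pd.DataFrame({
    "Task": ["z", "x", "y"],
    "Question": ["c", "a", "b"],
    "Answer": [3, 1, 2],
    "analystID": ["o", "u", "i"],
})

# pivot_table sorts both the row index and the column labels.
df = dfM.pivot_table(index=["Task", "Question"], columns="analystID",
                     values="Answer", aggfunc="sum")

# Recover the order in which labels first appear in the source data.
col_order = list(dfM["analystID"].unique())
row_order = pd.MultiIndex.from_frame(dfM[["Task", "Question"]].drop_duplicates())

# Reindex rows and columns back to that original order.
df = df.reindex(index=row_order, columns=col_order)
print(df)
```

With more than a handful of labels this avoids hard-coding lists such as ['o','i','u'].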