Python: applying a function to columns raises the error "bad operand type for abs(): 'str'"

Tags: python, pandas

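I want to find the absolute maximum of every 5 values in my pandas DataFrame, i.e. for each block of 5 rows, the value in each column with the largest absolute value. This is what I did: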

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.uniform(-100,100,size=(20, 4)), columns=list('ABCD'))

n = 5
absmax = lambda x: max(x, key=abs)
df_max = df.groupby(np.arange(len(df))//n).apply(absmax)
I keep getting the following error:

TypeError: bad operand type for abs(): 'str'
However, if I do the same thing on a single column, the code works:

df_max = df.iloc[:,0].groupby(np.arange(len(df))//n).apply(absmax)
What is wrong with applying absmax to the whole DataFrame?

Edit: checking the column dtypes yields

A    float64
B    float64
C    float64
D    float64
dtype: object
I don't recommend you [...]

Use [...] + [...]

Output:

           A          B          C          D
0  95.483784  72.261024  90.289008 -99.557204
1  92.303663 -92.933734  98.741863 -73.221129
2 -94.858459  91.925163  90.394739  94.129047
3  85.727608  96.168424  69.747412  74.943672


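The problem is that GroupBy.apply calls the function once per group and passes each group as a whole DataFrame, so max receives a DataFrame argument. Iterating over a DataFrame yields its column labels ('A', 'B', ...), not its values, so abs() ends up being called on a string. We can check what the function receives as follows: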

def absmax(x):
    print('This is a DataFrame arg: \n', x)
    return x.apply(lambda y: max(y, key=abs))

df_max = df.groupby(np.arange(len(df))//n).apply(absmax)

This is a DataFrame arg: 
            A          B          C          D
0  23.080753  89.918599 -62.273324   0.674636
1 -61.096176  68.840583  20.359616 -82.110220
2 -86.942716  97.269852  57.320944  84.340152
3  36.632979  53.376849  95.817563  39.398515
4 -36.960907 -12.796490 -79.833804  32.708664
This is a DataFrame arg: 
            A          B          C          D
5  24.360600 -49.486819 -65.995965 -93.884078
6 -46.623600 -57.999896 -86.946372  -0.250644
7 -35.103092  61.971385  86.165203 -32.619381
8  50.155064 -38.999355  98.856747 -56.195841
9 -92.646512  94.217864 -86.628196 -55.859978
This is a DataFrame arg: 
             A          B          C          D
10 -55.846413  34.281246 -90.523268  71.148029
11  22.753896 -33.659637  74.225409  24.498337
12 -52.384172  16.169118 -10.788839 -99.874961
13  49.235215 -74.372901  11.509361 -43.676953
14  67.255287 -84.477123  12.725054 -85.892184
This is a DataFrame arg: 
             A          B          C          D
15  72.522972 -13.079824  48.973703 -87.913843
16 -64.110924 -81.324560   7.067080  97.073997
17  87.319482  76.021534  80.780322 -90.320084
18 -84.848110  70.732111  34.160013  99.269365
19 -17.924337  12.191496  46.020178  30.532568
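
To make the difference between the two cases concrete, here is a minimal sketch (the names labels, col_absmax and error_message are only for illustration). It shows that iterating over a DataFrame yields column labels while iterating over a Series yields values, which is why the single-column version works and the whole-DataFrame version fails:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.uniform(-100, 100, size=(20, 4)), columns=list("ABCD"))

# Iterating over a DataFrame yields its column labels, not its values.
labels = list(df)

# Iterating over a Series yields its values, so max(..., key=abs) works here.
col_absmax = max(df["A"], key=abs)

# On a DataFrame, max() iterates over the labels and calls abs('A'), which fails.
try:
    max(df, key=abs)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(labels)
print(col_absmax)
print(error_message)
```

This is why the function above calls x.apply(...) on the group: DataFrame.apply hands each column to the lambda as a Series.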

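The error suggests that you are passing a string to abs(). Try checking the dtypes of the DataFrame you are passing, to be sure.

@FlavioMoraes I added the result of dtypes to show that all columns are floats.

Thanks, but I want to keep the sign of the values. I have changed the example to show that my data also contains negative values.

Please check now :) I think the problem is that max() receives a DataFrame-like input when used with groupby.apply. Please check the updated solution.

@Anjam Thank you, that makes sense!

Out of curiosity, I tried profiling both solutions. On a frame of shape (1e6, 4), the apply solution finished in about 2 minutes, while the other, replace-based solution rapidly consumed my memory (about 12 GB out of 16 GB) and still had not finished.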
The sample DataFrame that produces the output shown above:

print(df)
# df = pd.read_clipboard()  # read the table below from the clipboard

            A          B          C          D
0  -49.710250 -44.714960  90.289008  78.021054
1   15.779849  72.261024 -80.564941 -99.557204
2  -25.535893  44.418568  -3.654055  -4.656012
3   -1.792691  52.828214 -24.383645  54.337480
4   95.483784 -33.604944  60.210426 -68.157859
5   85.614669 -88.756766  14.634241 -73.221129
6   -1.461207  41.078068  98.741863   0.152652
7   92.303663 -77.230608  63.205845 -45.439176
8   36.255957 -92.933734  -8.668916  24.251590
9   15.387012 -17.044411 -84.098159  53.797730
10 -15.277586  91.925163  90.394739  94.129047
11 -94.858459  19.069534  62.672051  10.852176
12 -48.550836  59.084142 -22.185758 -58.797477
13 -16.430060 -26.718411 -23.169127  90.198812
14  14.495206  14.054623 -59.593696  35.043442
15  21.148221  16.673029  42.788121 -23.932640
16  74.617433 -53.114081  69.747412  74.943672
17  85.727608  96.168424  41.474511   7.672894
18  -9.282754  -0.151546  -8.765613 -26.973899
19 -39.272002  85.819052  21.355006  67.018427
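
Since the comments raise performance on large frames, here is one possible vectorized sketch. It is not the code of either solution discussed above, just an illustration that assumes the signed absolute maximum of a group is always either its maximum or its minimum. The names key, gmax, gmin and expected are hypothetical, and a fixed seed is used only to make the example reproducible:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(-100, 100, size=(20, 4)), columns=list("ABCD"))

n = 5
key = np.arange(len(df)) // n   # 0,0,0,0,0,1,1,... : one label per block of n rows

grouped = df.groupby(key)
gmax = grouped.max()            # per-group, per-column maximum
gmin = grouped.min()            # per-group, per-column minimum

# Keep the maximum where its magnitude is at least that of the minimum,
# otherwise take the minimum. On an exact tie (+v and -v) this picks +v,
# which may differ from max(x, key=abs), which returns the first one seen.
df_max = gmax.where(gmax.abs() >= gmin.abs(), gmin)

# Cross-check against the per-column apply approach shown earlier.
expected = grouped.apply(lambda g: g.apply(lambda col: max(col, key=abs)))

print(df_max)
print(np.allclose(df_max.to_numpy(), expected.to_numpy()))
```

This only uses built-in groupby reductions, so it avoids calling a Python function per group and per column. Whether it is faster on a given dataset would need to be measured.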