Python: applying a function to columns raises the error "bad operand type for abs(): 'str'"
I want to find the absolute maximum of every 5 values in my pandas DataFrame. This is what I did:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.uniform(-100,100,size=(20, 4)), columns=list('ABCD'))
n = 5
absmax = lambda x: max(x, key=abs)
df_max = df.groupby(np.arange(len(df))//n).apply(absmax)
I keep getting the following error:
TypeError: bad operand type for abs(): 'str'
However, if I do the same thing on a single column, the code works:
df_max = df.iloc[:,0].groupby(np.arange(len(df))//n).apply(absmax)
What is wrong with applying absmax to the whole DataFrame?
Edit:
df.dtypes yields
A float64
B float64
C float64
D float64
dtype: object
I don't recommend you …
use … + …
Output:
A B C D
0 95.483784 72.261024 90.289008 -99.557204
1 92.303663 -92.933734 98.741863 -73.221129
2 -94.858459 91.925163 90.394739 94.129047
3 85.727608 96.168424 69.747412 74.943672
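The problem is that Groupby.apply passes each group to the function as a DataFrame, so max receives a DataFrame argument. Iterating over a DataFrame yields its column labels ('A', 'B', ...) rather than its values, so abs ends up being called on a string. We can check what the function receives like this: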
def absmax(x):
    print('This is a DataFrame arg: \n', x)
    return x.apply(lambda y: max(y, key=abs))

df_max = df.groupby(np.arange(len(df))//n).apply(absmax)
This is a DataFrame arg:
A B C D
0 23.080753 89.918599 -62.273324 0.674636
1 -61.096176 68.840583 20.359616 -82.110220
2 -86.942716 97.269852 57.320944 84.340152
3 36.632979 53.376849 95.817563 39.398515
4 -36.960907 -12.796490 -79.833804 32.708664
This is a DataFrame arg:
A B C D
5 24.360600 -49.486819 -65.995965 -93.884078
6 -46.623600 -57.999896 -86.946372 -0.250644
7 -35.103092 61.971385 86.165203 -32.619381
8 50.155064 -38.999355 98.856747 -56.195841
9 -92.646512 94.217864 -86.628196 -55.859978
This is a DataFrame arg:
A B C D
10 -55.846413 34.281246 -90.523268 71.148029
11 22.753896 -33.659637 74.225409 24.498337
12 -52.384172 16.169118 -10.788839 -99.874961
13 49.235215 -74.372901 11.509361 -43.676953
14 67.255287 -84.477123 12.725054 -85.892184
This is a DataFrame arg:
A B C D
15 72.522972 -13.079824 48.973703 -87.913843
16 -64.110924 -81.324560 7.067080 97.073997
17 87.319482 76.021534 80.780322 -90.320084
18 -84.848110 70.732111 34.160013 99.269365
19 -17.924337 12.191496 46.020178 30.532568
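The printout above shows that each group arrives as a DataFrame. As a minimal sketch of why that matters, the snippet below uses a small made-up DataFrame (the data and the names labels, error_message and series_result are illustrative, not from the original post) to show that iterating over a DataFrame yields column labels, while iterating over a Series yields values.

```python
import pandas as pd

# Small illustrative frame; the values are made up for this sketch.
df = pd.DataFrame({"A": [1.0, -3.0], "B": [-5.0, 2.0]})

# Iterating over a DataFrame yields its column labels, not its values.
labels = list(df)

# So max(df, key=abs) calls abs() on a column label (a str) and fails.
try:
    max(df, key=abs)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Iterating over a Series yields its values, so the same call works there.
series_result = max(df["A"], key=abs)

print(labels)
print(error_message)
print(series_result)
```

This is why the single-column version in the question works, and why wrapping the call in x.apply(...) fixes the DataFrame case: DataFrame.apply hands max one column (a Series) at a time.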
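The error says you are passing a string to abs(). Try checking the types in the DataFrame you are passing, to be sure.
@FlavioMoraes I added the result of dtypes to show that all columns are floats.
Thanks, but I want to keep the sign of the values. I have changed the example to show that my data also contains negative values.
Please check now :) I think the problem is that max() receives DataFrame-like input when groupby.apply is used. Please check the solution update @Anjam
Thanks, that makes a lot of sense!
Out of curiosity, I tried profiling both solutions. For shape (1e6, 4), the apply solution finished in about 2 minutes, while the other, replace-based solution rapidly consumed my memory (about 12 GB of 16 GB) and still had not finished.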
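Since the last comment is about speed and memory, here is a hedged sketch of a different, vectorized way to get the signed absolute maximum per group. It is not the replace-based solution from the answer (which is not shown in full here), and it has not been benchmarked. It relies on the fact that the value with the largest magnitude in a group is always either the group's maximum or its minimum. The seed and the names vectorized and reference are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Reproducible random data in the same shape as the question's example.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.uniform(-100, 100, size=(20, 4)), columns=list("ABCD"))
n = 5
groups = np.arange(len(df)) // n

# Per-group maximum and minimum, computed with built-in reductions.
grouped = df.groupby(groups)
group_max = grouped.max()
group_min = grouped.min()

# Keep the maximum where its magnitude is at least that of the minimum,
# otherwise take the minimum. The sign of the chosen value is preserved.
vectorized = group_max.where(group_max.abs() >= group_min.abs(), group_min)

# Reference result using the per-column apply approach from the answer.
reference = df.groupby(groups).apply(
    lambda g: g.apply(lambda col: max(col, key=abs))
)

print(vectorized)
```

One difference to be aware of: if a group contains both x and -x as its extreme values, this sketch returns the positive one, whereas max(col, key=abs) returns whichever appears first.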
print(df)
# df = pd.read_clipboard() # read
A B C D
0 -49.710250 -44.714960 90.289008 78.021054
1 15.779849 72.261024 -80.564941 -99.557204
2 -25.535893 44.418568 -3.654055 -4.656012
3 -1.792691 52.828214 -24.383645 54.337480
4 95.483784 -33.604944 60.210426 -68.157859
5 85.614669 -88.756766 14.634241 -73.221129
6 -1.461207 41.078068 98.741863 0.152652
7 92.303663 -77.230608 63.205845 -45.439176
8 36.255957 -92.933734 -8.668916 24.251590
9 15.387012 -17.044411 -84.098159 53.797730
10 -15.277586 91.925163 90.394739 94.129047
11 -94.858459 19.069534 62.672051 10.852176
12 -48.550836 59.084142 -22.185758 -58.797477
13 -16.430060 -26.718411 -23.169127 90.198812
14 14.495206 14.054623 -59.593696 35.043442
15 21.148221 16.673029 42.788121 -23.932640
16 74.617433 -53.114081 69.747412 74.943672
17 85.727608 96.168424 41.474511 7.672894
18 -9.282754 -0.151546 -8.765613 -26.973899
19 -39.272002 85.819052 21.355006 67.018427
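As a small worked check, the sketch below rebuilds only the first five rows of the DataFrame printed above and applies the per-column max(..., key=abs). The names first_group and row0 are illustrative. The result matches row 0 of the output table shown in the answer.

```python
import pandas as pd

# First group (rows 0-4) copied from the printed DataFrame.
first_group = pd.DataFrame(
    {
        "A": [-49.710250, 15.779849, -25.535893, -1.792691, 95.483784],
        "B": [-44.714960, 72.261024, 44.418568, 52.828214, -33.604944],
        "C": [90.289008, -80.564941, -3.654055, -24.383645, 60.210426],
        "D": [78.021054, -99.557204, -4.656012, 54.337480, -68.157859],
    }
)

# DataFrame.apply passes one column (a Series) at a time, so max iterates
# over values and abs only ever sees floats.
row0 = first_group.apply(lambda col: max(col, key=abs))

print(row0)
```

Column D shows the sign being kept: the largest magnitude there is 99.557204, and the value returned is -99.557204.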