
Python sort_values error

I can't figure out what is wrong with my code.

import pandas as pd
import numpy as np
woe = [1.1147295474833758,0.364043491078754,-0.05525053172192353,-0.3950007109750665,-0.6784658191115104,-0.9522135140050229,-1.1441658353033486]
iv = [0.29078213954085946,0.29078213954085946,0.29078213954085946,0.29078213954085946,0.29078213954085946,0.29078213954085946,0.29078213954085946]
lis = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
fin = [lis,woe,iv]
fin = np.array(fin).T  
df_disc = pd.DataFrame(fin,columns=['Label','WoE','IV'])
print(df_disc)
df_disc = df_disc.sort_values(by=['WoE'])
df_disc = df_disc.reset_index(drop=True)
print(df_disc)
Result:

  Label                   WoE                   IV
0     A    1.1147295474833758  0.29078213954085946
1     B     0.364043491078754  0.29078213954085946
2     C  -0.05525053172192353  0.29078213954085946
3     D   -0.3950007109750665  0.29078213954085946
4     E   -0.6784658191115104  0.29078213954085946
5     F   -0.9522135140050229  0.29078213954085946
6     G   -1.1441658353033486  0.29078213954085946
  Label                   WoE                   IV
0     C  -0.05525053172192353  0.29078213954085946
1     D   -0.3950007109750665  0.29078213954085946
2     E   -0.6784658191115104  0.29078213954085946
3     F   -0.9522135140050229  0.29078213954085946
4     G   -1.1441658353033486  0.29078213954085946
5     B     0.364043491078754  0.29078213954085946
6     A    1.1147295474833758  0.29078213954085946

I think the correct label order should be G, F, E, D, C, B, A, but the result seems to be wrong.

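The problem is that the columns of the DataFrame are filled with objects (strings), not numbers, so sort_values compares them as text.

In your code, building a single NumPy array from a mix of strings and numbers converts every value to a string: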

fin = np.array(fin).T  
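A quick way to confirm this is to inspect the array's dtype. The sketch below uses a shortened, made-up version of the data (the names `labels`, `values`, and `arr` are illustrative and not from the original code) and shows that the numbers end up as strings, which then sort character by character rather than by value:

```python
import numpy as np

# Shortened, illustrative data (not the original lists).
labels = ['A', 'B', 'C', 'D']
values = [1.11, 0.36, -0.05, -0.39]

# Mixing strings and floats in one array forces a common string dtype.
arr = np.array([labels, values]).T
print(arr.dtype.kind)  # 'U' means fixed-width unicode strings

# Sorting the string column is lexicographic: '-0.05' sorts before '-0.39'
# because the character '0' comes before '3'.
as_text = sorted(arr[:, 1].tolist())
as_numbers = sorted(values)
print(as_text)
print(as_numbers)
```

This is the same ordering seen in the question's output, where C comes before D, E, F and G.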
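The solution is to build a dictionary keyed by column name and pass it to the DataFrame constructor. Each column then keeps its own dtype, which prevents the conversion to strings: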

df_disc = pd.DataFrame({'Label':lis,'WoE':woe,'IV':iv})
print(df_disc)
    
df_disc = df_disc.sort_values(by=['WoE'], ignore_index=True)
print(df_disc)
  Label       WoE        IV
0     G -1.144166  0.290782
1     F -0.952214  0.290782
2     E -0.678466  0.290782
3     D -0.395001  0.290782
4     C -0.055251  0.290782
5     B  0.364043  0.290782
6     A  1.114730  0.290782
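If the data is more naturally handled row by row, a similar effect can be had without NumPy by zipping the lists into tuples, since pandas then infers a dtype per column. This is only a sketch with shortened, made-up values; `rows` and `df_rows` are illustrative names:

```python
import pandas as pd

# Shortened, illustrative data (not the original lists).
lis = ['A', 'B', 'C']
woe = [1.11, 0.36, -0.05]
iv = [0.29, 0.29, 0.29]

# Each tuple is one row; pandas infers the dtype of every column separately,
# so WoE and IV stay numeric while Label stays text.
rows = list(zip(lis, woe, iv))
df_rows = pd.DataFrame(rows, columns=['Label', 'WoE', 'IV'])
print(df_rows.dtypes)

df_rows = df_rows.sort_values(by='WoE', ignore_index=True)
print(df_rows)
```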

Your WoE and IV columns have dtype object. They need to be converted to float for the sort to work correctly:

In [2723]: df_disc.dtypes
Out[2723]: 
Label    object
WoE      object
IV       object
dtype: object

In [2725]: df_disc.WoE = df_disc.WoE.astype(float)

In [2726]: df_disc.sort_values(by=['WoE'])
Out[2726]: 
  Label       WoE                   IV
6     G -1.144166  0.29078213954085946
5     F -0.952214  0.29078213954085946
4     E -0.678466  0.29078213954085946
3     D -0.395001  0.29078213954085946
2     C -0.055251  0.29078213954085946
1     B  0.364043  0.29078213954085946
0     A  1.114730  0.29078213954085946
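In the session above only WoE is converted, so IV is still stored as text. As a sketch of one way to convert several columns at once, the example below applies `pd.to_numeric` to both. The data is shortened and made up, and `num_cols` and `df_fix` are illustrative names:

```python
import numpy as np
import pandas as pd

# Rebuild a small frame the same way as in the question, so every column is text.
fin = np.array([['A', 'B', 'C'], [1.11, 0.36, -0.05], [0.29, 0.29, 0.29]]).T
df_fix = pd.DataFrame(fin, columns=['Label', 'WoE', 'IV'])

# Convert every numeric-looking column in one step.
num_cols = ['WoE', 'IV']
df_fix[num_cols] = df_fix[num_cols].apply(pd.to_numeric)
print(df_fix.dtypes)

df_fix = df_fix.sort_values(by='WoE', ignore_index=True)
print(df_fix)
```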

As noted above, the column contains strings. To preserve the full precision of the values, convert the series to Decimal:

from decimal import Decimal

# ...

df_disc['WoE'] = df_disc['WoE'].apply(Decimal)
df_disc = df_disc.sort_values(by='WoE')
print(df_disc)
Prints:

  Label                   WoE                   IV
6     G   -1.1441658353033486  0.29078213954085946
5     F   -0.9522135140050229  0.29078213954085946
4     E   -0.6784658191115104  0.29078213954085946
3     D   -0.3950007109750665  0.29078213954085946
2     C  -0.05525053172192353  0.29078213954085946
1     B     0.364043491078754  0.29078213954085946
0     A    1.1147295474833758  0.29078213954085946
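If the column should stay as text and only the ordering needs to be numeric, the `key` argument of `sort_values` (available in pandas 1.1 and later) may be an option. The sketch below uses three of the WoE strings from the question; `df_txt` and `sorted_txt` are illustrative names. It also shows why Decimal is built from the string and not from a float:

```python
from decimal import Decimal

import pandas as pd

# WoE kept as strings, as in the question's DataFrame.
df_txt = pd.DataFrame({
    'Label': ['A', 'B', 'C'],
    'WoE': ['1.1147295474833758', '0.364043491078754', '-0.05525053172192353'],
})

# The key is used only for ordering; the stored strings are left untouched.
sorted_txt = df_txt.sort_values(by='WoE', key=lambda s: s.map(Decimal))
print(sorted_txt)

# Decimal built from a string keeps the digits exactly as written,
# while Decimal built from a float exposes the binary rounding error.
print(Decimal('0.1') == Decimal(0.1))
```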