Python: why does DataFrame correlation show these results?
Given a DataFrame, I am trying to select, from several attempts (represented as columns), the one that best fits the target column.
import pandas as pd # tried with pandas 0.22 and pandas 0.20
data = {0.1: {10000.0: 1.1417023723316702,
20000.0: 1.675669860738065,
30000.0: 2.1391047345794565,
40000.0: 2.588884897140648},
0.3: {10000.0: 3.4251071169950102,
20000.0: 5.027009582214195,
30000.0: 6.4173142037383695,
40000.0: 7.766654691421943},
0.5: {10000.0: 5.708511861658351,
20000.0: 8.378349303690324,
30000.0: 10.695523672897282,
40000.0: 12.94442448570324},
0.7: {10000.0: 7.99191660632169,
20000.0: 11.729689025166454,
30000.0: 14.973733142056194,
40000.0: 18.122194279984534},
0.9: {10000.0: 10.275321350985031,
20000.0: 15.081028746642584,
30000.0: 19.25194261121511,
40000.0: 23.29996407426583},
'target': {10000.0: 8.95547589186585,
20000.0: 12.664955463781974,
30000.0: 15.511339250669858,
40000.0: 17.9109517837317}}
values = pd.DataFrame(data)
values
Out[4]:
              0.1       0.3        0.5        0.7        0.9     target
10000.0  1.141702  3.425107   5.708512   7.991917  10.275321   8.955476
20000.0  1.675670  5.027010   8.378349  11.729689  15.081029  12.664955
30000.0  2.139105  6.417314  10.695524  14.973733  19.251943  15.511339
40000.0  2.588885  7.766655  12.944424  18.122194  23.299964  17.910952
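My plan was to use pandas correlation to get a quick hint. However, the results are not useful, because the correlation of target with every attempt comes out identical and approximately equal to 1.
What is wrong with this approach?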
values.corr()
Out[5]:
             0.1       0.3       0.5       0.7       0.9    target
0.1     1.000000  1.000000  1.000000  1.000000  1.000000  0.998252
0.3     1.000000  1.000000  1.000000  1.000000  1.000000  0.998252
0.5     1.000000  1.000000  1.000000  1.000000  1.000000  0.998252
0.7     1.000000  1.000000  1.000000  1.000000  1.000000  0.998252
0.9     1.000000  1.000000  1.000000  1.000000  1.000000  0.998252
target  0.998252  0.998252  0.998252  0.998252  0.998252  1.000000
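All of your columns are correlated with the target and with each other, so by this measure it does not matter which one you use. The attempt columns are scalar multiples of one another (the 0.3 column is 3 times the 0.1 column, and so on), and Pearson correlation is unaffected by positive rescaling, which is why every attempt gives the same 0.998252 against target.
Well, I was only trying to simulate this. Every column is correlated with every other column (also known as "pairwise correlation"), but the data is random, which is not what you seem to assume.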
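To illustrate why correlation cannot rank these attempts, here is a minimal sketch. It rebuilds a frame shaped like the one in the question from rounded values, then compares each attempt with target in two ways. The names base, attempts, rmse and best_attempt are made up for this example, and using RMSE as the selection criterion is an assumption rather than something the question asked for; any scale-sensitive error measure would serve the same purpose.

```python
import numpy as np
import pandas as pd

# Rounded values taken from the question. Each attempt column there equals
# its column label times one common curve (called `base` here, a made-up name).
index = [10000.0, 20000.0, 30000.0, 40000.0]
base = np.array([11.417024, 16.756699, 21.391047, 25.888849])
target = np.array([8.955476, 12.664955, 15.511339, 17.910952])

values = pd.DataFrame({k: k * base for k in (0.1, 0.3, 0.5, 0.7, 0.9)}, index=index)
values["target"] = target

attempts = values.drop("target", axis=1)

# Pearson correlation ignores positive scaling, so every attempt gets the
# same score against target and the scores cannot be used to pick one.
corr_with_target = attempts.corrwith(values["target"])

# Assumption: rank the attempts by root-mean-square error instead,
# which does depend on the scale of each column.
rmse = attempts.sub(values["target"], axis=0).pow(2).mean().pow(0.5)
best_attempt = rmse.idxmin()

print(corr_with_target)
print(rmse)
print("closest attempt by RMSE:", best_attempt)
```

With this data the correlations are all equal, while the RMSE values differ from column to column, so only the second measure separates the attempts.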