Python: SciPy Spearman correlation

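I am trying to take the column names from a DataFrame (df) and associate them with the result arrays produced by the spearmanr correlation function. I need to pair the column names (a-j) with the correlation values (spearman) and the p-values (spearman_pvalue). Is there an intuitive way to do this?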

from scipy.stats import pearsonr,spearmanr
import numpy as np
import pandas as pd

df=pd.DataFrame(np.random.randint(0,100,size= (100,10)),columns=list('abcdefghij'))

def binary(row):
    if row>=50:
        return 1
    else:
        return 0
df['target']=df.a.apply(binary)

spearman,spearman_pvalue=spearmanr(df.drop(['target'],axis=1),df.target)
print(spearman)
print(spearman_pvalue)
It seems you need:

import numpy as np
import pandas as pd
from scipy.stats import spearmanr

df=pd.DataFrame(np.random.randint(0,100,size= (100,10)),columns=list('abcdefghij'))
#print (df)

# vectorized comparison, faster than apply() for building the binary column
df['target'] = (df['a'] >= 50).astype(int)
#print (df)

spearman,spearman_pvalue=spearmanr(df.drop(['target'],axis=1),df.target)

df1 = pd.DataFrame(spearman.reshape(-1, 11), columns=df.columns)
#print (df1)

df2 = pd.DataFrame(spearman_pvalue.reshape(-1, 11), columns=df.columns)
#print (df2)

# label the rows with the column names too, so the full matrix is indexed by name
df2=df2.set_index(df.columns)
df1=df1.set_index(df.columns)
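Or:

df1 = pd.DataFrame(spearman.reshape(-1, 11), 
                  columns=df.columns, 
                  index=df.columns)
df2 = pd.DataFrame(spearman_pvalue.reshape(-1, 11), 
                   columns=df.columns, 
                   index=df.columns)

Hi jezrael, I tried to implement this with df['target'], but it failed at the reshape. Could you adjust the code so that spearman is computed as follows: spearman,spearman_pvalue=spearmanr(df.drop(['target'],axis=1),df.target)? I need this to relate the statistics to the binary target using Spearman correlation; otherwise I would use Pearson (discrete vs. continuous).

Oops, I forgot the target column. It should work fine now.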
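
If only the relationship between each column (a-j) and target is of interest, the last row of the 11x11 result can be pulled out into a small table keyed by column name. The following is a minimal sketch rather than part of the original answer: the names rng, features and summary are chosen here for illustration, and the fixed seed is an assumption added only to make the run reproducible.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

# Fixed seed is an assumption for reproducibility; the original uses np.random.randint.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 10)), columns=list('abcdefghij'))
df['target'] = (df['a'] >= 50).astype(int)

features = df.drop(columns='target')

# With a 2-D first argument and a 1-D second argument, spearmanr stacks them
# column-wise and returns (n+1) x (n+1) matrices; target is the last row/column.
rho, pval = spearmanr(features, df['target'])

# Take the target row and drop its final entry (target vs. itself).
summary = pd.DataFrame(
    {'spearman': rho[-1, :-1], 'spearman_pvalue': pval[-1, :-1]},
    index=features.columns,
)
print(summary)
```

Each row of summary is then addressable by column name, for example summary.loc['b', 'spearman_pvalue'].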