What is wrong with my Kruskal-Wallis class?

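I am trying to build a class that performs the Kruskal-Wallis test. The class computes H using the following formula, as implemented in get_h:

$$H = \frac{12}{N(N-1)} \sum_{i} \frac{R_i^2}{n_i} - 3(N+1)$$

where N is the total number of observations, n_i is the size of group i, and R_i is the sum of ranks in group i.

However, it yields an H value that differs from the one returned by scipy's kruskal function. Does anyone know why this happens?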

import numpy as np
from scipy.stats import rankdata
from scipy.stats import kruskal

class Kruskal_Wallis():
    def __init__(self):
        pass

    def fit(self, groups):
        """
        Performs Kruskal-Wallis test.

        :param groups: list containing 1D group arrays

        Adds the following attributes:
            - n: size of total population
            - n_groups: number of groups (n_groups = len(n_i) = len(r_i))
            - n_i: array containing group sizes
            - df: degrees of freedom
            - r2_i: array containing the square of the sum of ranks for each group
            - h: kruskal-wallis statistic
        """

        def sum_ranks_per_group(groups):
            n_groups = len(groups)
            n_i = np.array([group.shape[0] for group in groups])

            data = np.array([])
            for group in groups:
                data = np.concatenate((data, group), axis=0)

            ranked_data = rankdata(data, method="average")
            ranked_groups = ranked_data.reshape((n_groups, n_i[0])) #works only if groups have equal size
            summed_ranks = ranked_groups.sum(axis=1)

            return summed_ranks

        def get_h(n, r2_i, n_i):
            summed_r2_i_per_n_i = (r2_i/n_i).sum()
            h = (12/(n*(n-1)) * summed_r2_i_per_n_i) - 3*(n+1)

            return h

        n_groups = len(groups)
        n_i = np.array([group.shape[0] for group in groups])
        n = sum(n_i)
        df = n_groups - 1
        r2_i = sum_ranks_per_group(groups)**2
        h = get_h(n, r2_i, n_i)

        self.n_groups = n_groups
        self.n_i = n_i
        self.n = n
        self.df = df
        self.r2_i = r2_i
        self.h = h


## Compare results yielded by  scipy.stats.kruskal and Kruskal_Wallis class
groups = [np.arange(1,3),
        np.arange(3,5)]

res = kruskal(groups[0], groups[1])

kruskal_wallis = Kruskal_Wallis()
kruskal_wallis.fit(groups)
print(res)
print(kruskal_wallis.h)

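The difference between the results may be due to how Python handles ...

Comment: Instead of using Python's division operator (/), try using numpy's ...

Reply: Thanks for your reply. I solved the problem. The formula given in my textbook was wrong. The correct formula for H uses 12/(N*(N+1)), not 12/(N*(N-1)).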
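
The fix described in the reply can be sketched as a standalone function. This is a minimal illustration, not the asker's final code: the name kruskal_wallis_h is hypothetical, the ranks are split with np.cumsum and np.split so that groups of unequal size also work, and no tie correction is applied (scipy.stats.kruskal does apply one, so the two values are only expected to agree when the pooled data has no ties).

```python
import numpy as np
from scipy.stats import kruskal, rankdata


def kruskal_wallis_h(groups):
    """Return the Kruskal-Wallis H statistic (hypothetical helper, no tie correction)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    n_i = np.array([g.size for g in groups])  # group sizes
    n = n_i.sum()                             # total number of observations

    # Rank the pooled data, then split the ranks back into the original groups.
    ranks = rankdata(np.concatenate(groups), method="average")
    split_points = np.cumsum(n_i)[:-1]
    rank_sums = np.array([r.sum() for r in np.split(ranks, split_points)])

    # Corrected denominator: N * (N + 1), not N * (N - 1).
    return 12.0 / (n * (n + 1)) * np.sum(rank_sums**2 / n_i) - 3 * (n + 1)


groups = [np.arange(1, 3), np.arange(3, 5)]
h = kruskal_wallis_h(groups)
print(h, kruskal(*groups).statistic)
```

For the two groups from the question, both printed values should be approximately 2.4, since the data contains no ties.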