How do I create a non-central Student's t distribution in Python, and which priors should I use with it?

python statistics

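I have been working from the following link.

I ran my data through the code from that link and found that the best-fitting common distribution for my data is the non-central Student's t distribution. I could not find this distribution in the pymc3 package, so I looked at SciPy to understand how the distribution is constructed. I then wrote a custom distribution, and I have a few questions:

Is the way I created the distribution correct?
How do I use the custom distribution in a model?
Regarding the priors: do I follow the same steps as for a normal likelihood (priors on mu and sigma), combined with half-normal priors on the degrees of freedom and the non-centrality parameter?

My custom distribution: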

import numpy as np
import theano.tensor as tt
from scipy import stats
from scipy.special import hyp1f1, nctdtr
import warnings
from pymc3.theanof import floatX
from pymc3.distributions.dist_math import bound, gammaln
from pymc3.distributions.continuous import assert_negative_support, get_tau_sigma
from pymc3.distributions.distribution import Continuous, draw_values, generate_samples

class NonCentralStudentT(Continuous):
    """
    Parameters
    ----------
    nu: float
        Degrees of freedom, also known as normality parameter (nu > 0).
    mu: float
        Location parameter.
    sigma: float
        Scale parameter (sigma > 0). Converges to the standard deviation as nu increases. (only required if lam is not specified)
    lam: float
        Scale parameter (lam > 0). Converges to the precision as nu increases. (only required if sigma is not specified)
    """

    def __init__(self, nu, nc, mu=0, lam=None, sigma=None, sd=None, *args, **kwargs):
        # Call the parent constructor once; the second, identical call was redundant.
        super().__init__(*args, **kwargs)
        if sd is not None:
            sigma = sd
            warnings.warn("sd is deprecated, use sigma instead", DeprecationWarning)
        self.nu = nu = tt.as_tensor_variable(floatX(nu))
        self.nc = nc = tt.as_tensor_variable(floatX(nc))
        lam, sigma = get_tau_sigma(tau=lam, sigma=sigma)
        self.lam = lam = tt.as_tensor_variable(lam)
        self.sigma = self.sd = sigma = tt.as_tensor_variable(sigma)
        self.mean = self.median = self.mode = self.mu = mu = tt.as_tensor_variable(mu)
        self.variance = tt.switch((nu > 2) * 1, (1 / self.lam) * (nu / (nu - 2)), np.inf)

        assert_negative_support(lam, 'lam (sigma)', 'NonCentralStudentT')
        assert_negative_support(nu, 'nu', 'NonCentralStudentT')
        assert_negative_support(nc, 'nc', 'NonCentralStudentT')

    def random(self, point=None, size=None):
        """
        Draw random values from Non-Central Student's T distribution.
        Parameters
        ----------
        point: dict, optional
            Dict of variable values on which random values are to be
            conditioned (uses default point if not specified).
        size: int, optional
            Desired size of random sample (returns one sample if not
            specified).
        Returns
        -------
        array
        """
        nu, nc, mu, lam = draw_values([self.nu, self.nc, self.mu, self.lam], point=point, size=size)
        return generate_samples(stats.nct.rvs, nu, nc, loc=mu, scale=lam ** -0.5, dist_shape=self.shape, size=size)

    def logp(self, value):
        """
        Calculate log-probability of Non-Central Student's T distribution at specified value.
        Parameters
        ----------
        value: numeric
            Value(s) for which log-probability is calculated. If the log probabilities for multiple
            values are desired the values must be provided in a numpy array or theano tensor
        Returns
        -------
        TensorVariable
        """
        nu = self.nu
        nc = self.nc
        mu = self.mu
        lam = self.lam

        n = nu * 1.0
        nc = nc * 1.0
        x2 = value * value
        ncx2 = nc * nc * x2
        fac1 = n + x2
        trm1 = n / 2. * tt.log(n) + gammaln(n + 1)
        trm1 -= n * tt.log(2) + nc * nc / 2. + (n / 2.) * tt.log(fac1) + gammaln(n / 2.)
        Px = tt.exp(trm1)
        valF = ncx2 / (2 * fac1)
        trm1 = tt.sqrt(2) * nc * value * hyp1f1(n / 2 + 1, 1.5, valF)
        trm1 /= np.asarray(fac1 * tt.gamma((n + 1) / 2))
        trm2 = hyp1f1((n + 1) / 2, 0.5, valF)
        trm2 /= np.asarray(np.sqrt(fac1) * tt.gamma(n / 2 + 1))
        Px *= trm1 + trm2
        return bound(Px, lam > 0, nu > 0, nc > 0)

    def logcdf(self, value):
        """
        Compute the log of the cumulative distribution function for Non-Central Student's T distribution
        at the specified value.
        Parameters
        ----------
        value: numeric
            Value(s) for which log CDF is calculated. If the log CDF for multiple
            values are desired the values must be provided in a numpy array or theano tensor.
        Returns
        -------
        TensorVariable
        """
        nu = self.nu
        nc = self.nc

        return nctdtr(nu, nc, value)
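
One way to check the density used above is to compare it with SciPy outside of any model. The sketch below is a NumPy/SciPy reference version, not Theano code. It takes the same series expression as the `logp` method above, returns its logarithm (the method above returns the density `Px` itself, although a `logp` is expected to return a log-density), and applies the `mu` and scale adjustment that the method above reads but never uses. The function name `nct_logpdf_reference` and the test values are made up for illustration.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln, hyp1f1


def nct_logpdf_reference(value, nu, nc, mu=0.0, sigma=1.0):
    """Log-density of a shifted, scaled non-central Student's t (NumPy only)."""
    value = np.asarray(value, dtype=float)
    x = (value - mu) / sigma  # standardize before using the series expression
    fac1 = nu + x * x
    # Log of the common prefactor (the exponent of `Px` in the class above).
    log_pre = (nu / 2.0 * np.log(nu) + gammaln(nu + 1)
               - (nu * np.log(2.0) + nc * nc / 2.0
                  + (nu / 2.0) * np.log(fac1) + gammaln(nu / 2.0)))
    val_f = nc * nc * x * x / (2.0 * fac1)
    # gamma(a) is written as exp(gammaln(a)); valid because a > 0 here.
    trm1 = (np.sqrt(2.0) * nc * x * hyp1f1(nu / 2.0 + 1, 1.5, val_f)
            / (fac1 * np.exp(gammaln((nu + 1) / 2.0))))
    trm2 = (hyp1f1((nu + 1) / 2.0, 0.5, val_f)
            / (np.sqrt(fac1) * np.exp(gammaln(nu / 2.0 + 1))))
    # Change of variables: subtract log(sigma) to account for the scale.
    return log_pre + np.log(trm1 + trm2) - np.log(sigma)


x = np.array([-1.0, 0.0, 0.5, 2.0, 4.0])
ours = nct_logpdf_reference(x, nu=5.0, nc=1.0, mu=0.3, sigma=2.0)
scipy_ref = stats.nct.logpdf(x, 5.0, 1.0, loc=0.3, scale=2.0)
print(np.max(np.abs(ours - scipy_ref)))  # largest absolute disagreement
```

Inside a PyMC3 model the same expression would have to be built from Theano operations, because `scipy.special.hyp1f1` and `nctdtr` work on numeric arrays rather than on symbolic tensors.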

My custom model:

with pm.Model() as model:
    # Prior Distributions for unknown model parameters:
    mu = pm.Normal('mu', 0, 10)  # was named 'sigma', which clashed with the next variable
    sigma = pm.Normal('sigma', 0, 10)
    nc = pm.HalfNormal('nc', sigma=10)
    nu = pm.HalfNormal('nu', sigma=1)

    # Observed data come from the likelihood (sampling distribution of the observations):
    # => (insert the custom distribution here) observed_data = pm.Beta('observed_data', alpha=alpha, beta=beta, observed=data)

    # draw 5000 posterior samples
    trace = pm.sample(draws=5000, tune=2000, chains=3, cores=1)

    # Obtaining Posterior Predictive Sampling:
    post_pred = pm.sample_posterior_predictive(trace, samples=3000)
    print(post_pred['observed_data'].shape)
    print('\nSummary: ')
    print(pm.stats.summary(data=trace))
    print(pm.stats.summary(data=post_pred))

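Edit 1:

I reworked the model to include the custom distribution, but I keep getting errors from the equations used to compute the likelihood, and sometimes the tensors lock up and the code freezes. My code is below: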

with pm.Model() as model:
                # Prior Distributions for unknown model parameters:
                mu = pm.Normal('mu', mu=0, sigma=1)
                sd = pm.HalfNormal('sd', sigma=1)
                nc = pm.HalfNormal('nc', sigma=10)
                nu = pm.HalfNormal('nu', sigma=1)

                # Custom distribution:
                # observed_data = pm.DensityDist('observed_data', NonCentralStudentT, observed=data_list)

                # Observed data is from a Likelihood distributions (Likelihood (sampling distribution) of observations):
                observed_data = NonCentralStudentT('observed_data', mu=mu, sd=sd, nc=nc, nu=nu, observed=data_list)

                # draw 5000 posterior samples
                trace_S = pm.sample(draws=5000, tune=2000, chains=3, cores=1)

                # Obtaining Posterior Predictive Sampling:
                post_pred_S = pm.sample_posterior_predictive(trace_S, samples=3000)
                print(post_pred_S['observed_data'].shape)
                print('\nSummary: ')
                print(pm.stats.summary(data=trace_S))
                print(pm.stats.summary(data=post_pred_S))

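Edit 2:

To convert the function to Theano I searched online, and the only way I found to define the function comes from the GitHub link below.

Is this sufficient for converting the function to Theano?
Also, can NumPy arrays be used together with Theano?

I also thought of another approach, though I am not sure it can be implemented. I looked at the nct distribution in SciPy, whose documentation says the following:

If Y is a standard normal random variable and V is an independent chi-square random variable (chi2) with k degrees of freedom, then

X = (Y + c) / sqrt(V / k)

has a non-central Student's t distribution on the real line. The degrees of freedom parameter k (denoted df in the implementation) satisfies k > 0, and the noncentrality parameter c (denoted nc in the implementation) is a real number.

The probability density above is defined in the "standardized" form. To shift and/or scale the distribution, use the loc and scale parameters. Specifically, nct.pdf(x, df, nc, loc, scale) is identically equivalent to nct.pdf(y, df, nc) / scale with y = (x - loc) / scale.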
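
The construction quoted above can be checked numerically. The following is a minimal sketch using only NumPy and SciPy; the parameter values, seed, and sample size are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k, c = 5.0, 1.5          # degrees of freedom and non-centrality (arbitrary values)
n = 200_000

y = rng.standard_normal(n)            # Y ~ N(0, 1)
v = rng.chisquare(k, size=n)          # V ~ chi2(k), drawn independently of Y
x = (y + c) / np.sqrt(v / k)          # X = (Y + c) / sqrt(V / k)

# Compare empirical quantiles of X with the quantiles of scipy.stats.nct.
probs = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
empirical = np.quantile(x, probs)
theoretical = stats.nct.ppf(probs, k, c)
print(np.round(empirical, 3))
print(np.round(theoretical, 3))

# loc/scale identity: nct.pdf(x, df, nc, loc, scale) == nct.pdf(y, df, nc) / scale
loc, scale = 0.3, 2.0
grid = np.linspace(-3.0, 8.0, 7)
lhs = stats.nct.pdf(grid, k, c, loc=loc, scale=scale)
rhs = stats.nct.pdf((grid - loc) / scale, k, c) / scale
print(np.allclose(lhs, rhs))
```

The simulated quantiles should land close to the SciPy ones, which suggests that sampling from the distribution needs only a normal and a chi-square draw. Evaluating a likelihood for observed data is a different matter, since it still needs the density itself.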

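So would it be enough to define only the normal variable and the chi2 random variable as priors, using the degrees-of-freedom variable in their distributions, and then plug them into the equation from the SciPy documentation to obtain the distribution?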

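Edit 3:

I managed to run the code from the link on fitting empirical distributions and found that the second-best fit is the Student's t distribution, so I will use that instead. Thanks for the help. I have one small remaining question: I ran my model with the Student's t distribution, but I got the following warnings:

There were 52 divergences after tuning. Increase `target_accept` or reparameterize. The acceptance probability does not match the target. It is 0.7037574708196309, but should be close to 0.8. Try to increase the number of tuning steps. The number of effective samples is smaller than 10% for some parameters.

I am confused by these warnings. Do you know what they mean? I understand that they do not stop my code from running, but can I reduce the divergences? And regarding the effective samples, do I need to increase the number of samples in the trace code?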
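
These warnings come from the NUTS sampler, and the message itself names the usual first steps: raise `target_accept` and tune for longer. The sketch below assumes PyMC3 3.7 or later, where `target_accept` can be passed to `pm.sample` directly. It uses synthetic data and priors chosen only for illustration, so it is not the model from the question. In particular, the `Gamma(2, 0.1)` prior on `nu` is an assumption, picked because it puts less mass on very small degrees of freedom than `HalfNormal(1)` does.

```python
import numpy as np


def fit_student_t(data, target_accept=0.95, tune=4000, draws=2000):
    """Fit a Student's t likelihood with a stricter NUTS acceptance target."""
    import pymc3 as pm  # imported lazily; only needed when the function is called

    with pm.Model():
        mu = pm.Normal('mu', mu=0, sigma=10)
        sd = pm.HalfNormal('sd', sigma=10)
        nu = pm.Gamma('nu', alpha=2, beta=0.1)  # assumed prior, see the note above
        pm.StudentT('observed_data', nu=nu, mu=mu, sigma=sd, observed=data)
        # A higher target_accept means smaller steps, which often reduces divergences.
        trace = pm.sample(draws=draws, tune=tune, chains=3, cores=1,
                          target_accept=target_accept)
        summary = pm.summary(trace)
    return trace, summary


# Synthetic heavy-tailed data, standing in for the real observations.
rng = np.random.default_rng(1)
fake_data = rng.standard_t(df=4, size=500) * 2.0 + 1.0

# trace, summary = fit_student_t(fake_data)  # uncomment to run (requires PyMC3)
# print(summary)
```

If divergences persist at a higher `target_accept`, that usually points to a posterior geometry problem that calls for reparameterization or different priors. Drawing more samples alone tends to raise the effective sample size only slowly when the chains mix poorly.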

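For several days I have been trying to generate random numbers from a shifted t distribution, but this seems to be related to something called the non-centrality parameter. Here is the scenario: I want to generate two samples, one from f and another from f(x - delta), that is, X ~ f(x) and Y ~ f(x - delta). I am generating from the t distribution, so I tried x <- rt(10, 1) and y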