Python: computing the inverse CDF from a data sample

python numpy statistics

Suppose I have a random data sample:

X=np.random.random(100)*100

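I need the value X_i at which the CDF equals 34% (or some other fraction). The only approach I can think of is to use the inverse CDF. I assumed the percentile would be equivalent, but I was told that it is close but not exact.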

This will give you the index of X at which the cdf is 0.34:

import numpy as np

X = np.random.random(100) * 100
cdf_frac_to_find = 0.34
# Cumulative sum of X, normalized so that its maximum value is 1
cdf = np.cumsum(X) / np.sum(X)
# Index where the normalized cumulative sum is closest to the target fraction
X_index = np.argmin(np.abs(cdf - cdf_frac_to_find))
X_index
#out: 32 -- note that this will likely change because you're generating random numbers for X.
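The snippet above normalizes the cumulative sum of the values in X themselves. If what is wanted is instead the empirical CDF of the sample (the fraction of observations less than or equal to x), one possible sketch is to sort the sample and pick the smallest value whose empirical CDF reaches the target. The helper name empirical_inverse_cdf and the seeded generator are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np

def empirical_inverse_cdf(sample, p):
    """Return the smallest sample value x with (#observations <= x) / n >= p."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = xs.size
    # The empirical CDF at xs[k] is (k + 1) / n, so we need the smallest k
    # with (k + 1) / n >= p. The small tolerance guards against float round-off.
    k = int(np.ceil(p * n - 1e-9)) - 1
    k = min(max(k, 0), n - 1)  # clamp to a valid index
    return xs[k]

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
X = rng.random(100) * 100

x34 = empirical_inverse_cdf(X, 0.34)
frac = np.mean(X <= x34)              # fraction of the sample at or below x34
interpolated = np.percentile(X, 34)   # NumPy's default linear interpolation
print(x34, frac, interpolated)
```

Because np.percentile interpolates linearly between neighbouring order statistics by default, its result generally differs slightly from the step-function inverse above, which may be the "close but not exact" behaviour mentioned in the question.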

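CDF stands for cumulative distribution function, so it is the integral of the underlying probability density function. Do you know the distribution of your random values? np.random.random is uniformly distributed, and in that case the CDF and the percentiles agree statistically. For a particular sample, however, especially a small one, the CDF (in a sense, the expected value) and the actual sample percentiles can differ considerably. Do you want the true CDF of the distribution, or should it be estimated from the sample?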
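To illustrate the distinction, here is a small sketch comparing the true inverse CDF with the sample percentile. It assumes the data really are Uniform(0, 100), as in the question, so the true inverse CDF is simply 100 * p. The sample sizes and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
p = 0.34

# For Uniform(0, 100) the CDF is F(x) = x / 100, so the inverse is F^-1(p) = 100 * p.
true_quantile = 100 * p

errors = {}
for n in (100, 10_000, 1_000_000):
    sample = rng.random(n) * 100
    estimate = np.percentile(sample, 100 * p)  # quantile estimated from the sample
    errors[n] = abs(estimate - true_quantile)
    print(n, estimate, errors[n])
```

With only 100 points the estimate can be off by several units, while for large samples it should settle near the true value of 34.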