Plotting a distribution function onto a histogram in MATLAB

matlab statistics

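Suppose y is a vector of random numbers that follow the distribution f(x) = sqrt(4-x^2)/(2*pi). At the moment I use the command hist(y,30). How can I plot the distribution function f(x) = sqrt(4-x^2)/(2*pi) onto the same histogram?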

Let's take another distribution function as an example, the standard normal. To do exactly what you asked, you can do this:

nRand = 10000;
y = randn(1,nRand);
[myHist, bins] = hist(y,30);
pdf = normpdf(bins);
figure, bar(bins, myHist,1); hold on; plot(bins,pdf,'rx-'); hold off;

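But this is probably not what you really want. Why? You will notice that the density function looks like a thin line at the bottom of the histogram. That is because the histogram holds counts of values in each bin, while the density function is normalized to integrate to one. If a bin contains hundreds of items, the density function cannot match that on the same scale, so you have a scaling (normalization) problem. You must either normalize the histogram or plot a scaled distribution function. I prefer to scale the distribution function, so that the counts still make sense when I look at the histogram: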

normalizedpdf = pdf/sum(pdf)*sum(myHist);
figure, bar(bins, myHist,1); hold on; plot(bins,normalizedpdf,'rx-'); hold off;

Your case is the same, except that you use your own function f(x) in place of the normpdf command.
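As a rough sketch of that substitution for the f(x) in the question: the sampling step below is an assumption (the question does not say how y was generated). It uses the fact that the x-coordinate of a point drawn uniformly from a disc of radius 2 has density sqrt(4-x^2)/(2*pi). The handle name f and the scaling by sample count times bin width are illustrative choices, not part of the original answer.

```matlab
% Assumed sampling method (not from the question): the x-coordinate of a
% point uniform in the disc of radius 2 has density sqrt(4-x^2)/(2*pi).
nRand = 10000;
r = 2*sqrt(rand(1,nRand));        % radius of a uniform point in the disc
theta = 2*pi*rand(1,nRand);       % angle of that point
y = r.*cos(theta);                % samples following f(x)

[myHist, bins] = hist(y,30);      % counts and bin centers

% The density from the question, written element-wise; max(...,0) only
% guards against tiny negative values under the square root.
f = @(x) sqrt(max(4 - x.^2, 0))/(2*pi);

binwidth = bins(2) - bins(1);
scaledpdf = f(bins) * numel(y) * binwidth;   % density scaled to counts

figure, bar(bins, myHist, 1); hold on; plot(bins, scaledpdf, 'rx-'); hold off;
```

If y already exists, skip the three sampling lines and keep the rest unchanged.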

Instead of normalizing numerically, you can also find a theoretical scale factor as follows:

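nbins = 30;
nsamples = numel(y);
binsize = (max(y)-min(y)) / nbins;   % bin width used by hist(y,nbins)
hist(y,nbins)
hold on
x1 = linspace(min(y),max(y),100);
scalefactor = nsamples * binsize;    % area under the histogram
y1 = scalefactor * sqrt(4-x1.^2)/(2*pi);   % element-wise, evaluated on x1
plot(x1,y1)
hold off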

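Update: how it works.

For any data set large enough to approximate the pdf well (call the pdf f(x)), the integral of f(x) over the sampled domain is approximately 1. The area under a histogram with equal-width bins, however, is exactly the total number of samples times the bin width.

So a very simple scale factor that brings the pdf in line with the histogram is Ns*Wb, the total number of sample points times the bin width.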
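The argument can be written out in a few lines. The symbols c_k (count in bin k) and x_k (center of bin k) are introduced here for illustration only; N_s and W_b are the sample count and bin width from the text, and equal-width bins are assumed.

```latex
% Area under the histogram: sum of bar areas (count times width)
A_{\text{hist}} = \sum_{k} c_k \, W_b = W_b \sum_{k} c_k = N_s W_b

% Expected count in bin k for samples drawn from f
\mathbb{E}[c_k] = N_s \int_{x_k - W_b/2}^{x_k + W_b/2} f(x)\,dx \;\approx\; N_s W_b \, f(x_k)
```

The second line shows why multiplying f by N_s W_b puts the curve at roughly the height of the bars: each bar estimates N_s W_b f(x_k), with the approximation improving as the bins get narrower.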

Let me add another example to the mix:

%# some normally distributed random data
data = randn(1e3,1);

%# histogram
numbins = 30;
hist(data, numbins);
h(1) = get(gca,'Children');
set(h(1), 'FaceColor',[.8 .8 1])

%# figure out how to scale the pdf (with area = 1), to the area of the histogram
[bincounts,binpos] = hist(data, numbins);
binwidth = binpos(2) - binpos(1);
histarea = binwidth*sum(bincounts);

%# fit a gaussian
[muhat,sigmahat] = normfit(data);
x = linspace(binpos(1),binpos(end),100);
y = normpdf(x, muhat, sigmahat);
h(2) = line(x, y*histarea, 'Color','b', 'LineWidth',2);

%# kernel estimator
[f,x,u] = ksdensity( data );
h(3) = line(x, f*histarea, 'Color','r', 'LineWidth',2);

legend(h, {'freq hist','fitted Gaussian','kernel estimator'})

Aren't you already plotting it with hist? The rough idea is to call hist with two outputs so that it gives you the bin centers and counts. Then draw the bar plot manually and overlay the scaled density function on top of the bars. See my answer below.

This answer does not have enough context on its own. Can you improve it so that it stands alone?

I think the calculation of "scalefactor" is clear from the code provided. However, I can add some more detail on why this particular scale factor works. See the updated answer.