Randomly picking elements from a vector of counts

Tags: algorithm, matlab, statistics, octave

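I'm currently trying to optimize some MATLAB/Octave code by changing the algorithm, but I can't figure out how to handle some of the randomness. Suppose I have a vector V of integers, where each element is a count of some things, photons in my case. Now I want to randomly pick some of those "things" and create a new vector of the same size, but with the counts adjusted.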

This is how I do it at the moment:

function W = photonfilter(V, eff)
% W = photonfilter(V, eff)
% Randomly takes photons from V according to the given efficiency.
%
% Args:
%  V: Input vector containing the number of emitted photons in each
%     timeslot (one element is one timeslot). The elements are rounded
%     to integers before processing.
%  eff: Filter efficiency. On average, one in every 1/eff photons is
%       taken. This value must be in the range 0 < eff <= 1.
%  W: Output row vector with the same length as V and containing the number
%     of received photons in each timeslot.
%
% WARNING: This function operates on a photon-by-photon basis in that it
% constructs a vector with one element per photon. The storage requirements
% therefore directly depend on sum(V), not only on the length of V.

% Round V and make it flat.
Ntot = length(V);
V = round(V);
V = V(:);

% Initialize the photon-based vector, so that each element contains
% the original index of the photon.
idxV = zeros(1, sum(V), 'uint32');
iout = 1;
for i = 1:Ntot
  N = V(i);
  idxV(iout:iout+N-1) = i;
  iout = iout + N;
end;

% Take random photons.
idxV = idxV(randperm(length(idxV)));
idxV = idxV(1:round(length(idxV)*eff));

% Generate the output vector by placing the remaining photons back
% into their timeslots.
[W, trash] = hist(idxV, 1:Ntot);
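
As a side note on the loop above: the per-photon index vector can also be built without an explicit for loop. The following is only a sketch under the assumption that sum(V) > 0; the variable names nz, ends, starts and marks are illustrative and not part of the original function. It does not reduce the memory requirement, which still grows with sum(V).

```octave
% Sketch: build the per-photon index vector without an explicit loop.
% Assumes sum(V) > 0; nz, ends, starts and marks are illustrative names.
V = round([3 0 2 1]);              % example counts per timeslot
V = V(:);
nz = find(V > 0);                  % timeslots that hold at least one photon
ends = cumsum(V(nz));              % last photon position of each such slot
starts = [1; ends(1:end-1) + 1];   % first photon position of each such slot
marks = zeros(ends(end), 1);
marks(starts) = 1;                 % flag the positions where a new slot begins
idxV = nz(cumsum(marks)).';        % slot index of every photon, as a row vector
```

For the example counts [3 0 2 1], idxV contains three photons from slot 1, none from slot 2, two from slot 3 and one from slot 4.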
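Some optimization is indeed possible. One option: instead of computing photon indices, imagine a set of bins. Each bin has some probability, and the probabilities of all bins sum to 1. We divide the segment [0, 1] into parts whose lengths are proportional to the probabilities of the bins. Now, for every random number we generate in [0, 1], we can quickly find the bin it belongs to. Finally, we count the numbers that landed in each bin to obtain the result. The code below illustrates the idea.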

% Population size (number of photons).
N = 1000000;
% Number of bins, i.e. the size of V and W.
% For convenience of plotting, V and W are of the same size, but
% the algorithm doesn't enforce this constraint.
M = 10000;
% Number of Monte Carlo iterations, greater numbers give better quality.
K = 100000;

% Generate population of counts, use gaussian distribution to test the method.
% If implemented correctly histograms should have the same shape eventually.
V = hist(randn(1, N), M);
P = cumsum(V / sum(V));
% For every generated random value find its bin and then count the bins.
% lookup returns 0-based bin indices, so the bin centers are 0:M-1.
% Finally we normalize counts by the ratio N / K.
W = hist(lookup(P, rand(1, K)), 0:M-1) * N / K;
% Compare distribution plots, they should be the same.
hold on;
plot(W, '+r');
plot(V, '*b');
pause
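
To make the bin lookup easier to follow, here is a tiny hand-checkable sketch of the same idea. It is an illustration only: the "random" numbers r are fixed values chosen for this example, and the lookup is done with a plain comparison (bsxfun) instead of Octave's lookup, which costs memory proportional to numel(r) * numel(V) and is therefore only sensible for toy sizes.

```octave
% Toy example: four bins holding 1, 3, 0 and 4 photons.
V = [1 3 0 4];
P = cumsum(V / sum(V));            % upper edges of the segments: [0.125 0.5 0.5 1]
% Fixed stand-ins for rand() output, so the outcome can be checked by hand.
r = [0.05 0.3 0.6 0.99];
% Bin of each number = 1 + number of upper edges at or below it.
bin = sum(bsxfun(@ge, r, P(:)), 1) + 1;
% Tally the hits per bin.
W = accumarray(bin(:), 1, [numel(V) 1]).';
```

Note that the empty third bin has a zero-length segment and is never hit, which is the desired behavior.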

Based on Alexander Solovets' answer, the code now looks like this:

% Note: the default value in the signature (impl=1) is Octave syntax;
% MATLAB does not accept it.
function W = photonfilter(V, eff, impl=1)

Ntot = length(V);
V = V(:);

if impl == 0
  % Original "straightforward" solution.
  V = round(V);
  idxV = zeros(1, sum(V), 'uint32');
  iout = 1;
  for i = 1:Ntot
    N = V(i);
    idxV(iout:iout+N-1) = i;
    iout = iout + N;
  end;
  idxV = idxV(randperm(length(idxV)));
  idxV = idxV(1:round(length(idxV)*eff));
  [W, trash] = hist(idxV, 1:Ntot);

else
  % Monte Carlo approach.
  Nphot = sum(V);
  P = cumsum(V / Nphot);
  W = hist(lookup(P, rand(1, round(Nphot * eff))), 0:Ntot-1);

end;
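As long as eff is not too close to 1, the results are quite comparable. With eff = 1, the original solution yields W = V, whereas the Monte Carlo approach still has some randomness: each draw is independent (sampling with replacement), so an element of W can exceed the corresponding element of V, which violates the upper-bound constraint.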
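
If the upper bound matters, one possible alternative (a different technique from the Monte Carlo code above, sketched here under assumptions) is to draw photon positions without replacement and then map them back to timeslots. The sketch assumes randperm(n, k) and histc are available, as they are in current Octave and MATLAB; the variable names are illustrative. Memory use then depends on how randperm is implemented rather than on an explicit per-photon vector.

```octave
% Sketch (not from the original post): draw photon positions without
% replacement, so that W(i) <= V(i) always holds.
V = round([3 0 2 1]);              % example counts per timeslot
eff = 0.5;                         % example efficiency
V = V(:);
Nphot = sum(V);
k = round(Nphot * eff);            % number of photons to keep
W = zeros(1, numel(V));
if k > 0
  nz = find(V > 0);                % skip empty timeslots
  edges = [1; cumsum(V(nz)) + 1];  % first photon position of each slot
  pos = randperm(Nphot, k);        % k distinct positions in 1..Nphot
  counts = histc(pos, edges);      % photons drawn per non-empty slot
  W(nz) = counts(1:end-1);         % the last histc bin is always empty here
end
```

With eff = 1 this reproduces W = V exactly, like the original solution, and sum(W) always equals round(sum(V) * eff).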

Testing in the interactive Octave shell:

octave:1> T=linspace(0,10*pi,10000);
octave:2> V=100*(1+sin(T));
octave:3> W1=photonfilter(V, 0.1, 0);
octave:4> W2=photonfilter(V, 0.1, 1);
octave:5> plot(T,V,T,W1,T,W2);
octave:6> legend('V','Random picking','Monte Carlo')
octave:7> sum(W1)
ans =  100000
octave:8> sum(W2)
ans =  100000
Plot: V, random picking (W1) and Monte Carlo (W2) against T (image not available).

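For K=100K the plotted vector doesn't look very good, although it still resembles the original shape. I suggest trying K=1M. Also, the normalization is not the best; I just couldn't come up with a better one without adding unnecessary complexity to the example.

Great, thanks! I had to adapt it slightly to bring my "efficiency" factor into play (in fact, my K is not arbitrary; it is the number of photons wanted). This makes the performance depend on the number of output photons, but that is fine, because it is usually much smaller than sum(V). If you agree, I'll integrate my modifications into your answer and mark it as accepted.

Yes, of course! You can also send me the code and I'll update the answer.

OK, the edit to the answer was rejected because "the edit doesn't make sense, make it an answer". I'll do that, since I can't find a way to contact you directly.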