Calculating percentiles (Ruby)


My code is based on a previously described method.

Output:

0.0: 95.061
0.1: 95.1205
0.2: 95.1325
0.3: 95.1689
0.4: 95.1692
0.5: 95.1615
0.6: 95.1773
0.7: 95.1862
0.8: 95.2102
0.9: 95.1981
1.0: 95.199
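The problem here is that the 80th percentile is higher than the 90th and 100th percentiles. However, as far as I can tell, my implementation follows the description, and it returns the correct answer for the given example (0.9).

Is there an error in my code that I am not seeing? Or is there a better way to do this?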
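A quick way to confirm the symptom is to check that the percentile values never decrease as the percentage grows. The sketch below is only an illustration: the helper names non_decreasing? and drop_positions are hypothetical, and the values are copied from the output listed in the question.

```ruby
# Hypothetical helper: true when every value is >= the one before it.
def non_decreasing?(values)
  values.each_cons(2).all? { |a, b| a <= b }
end

# Hypothetical helper: positions (0-based) whose value is lower than
# the value immediately before it.
def drop_positions(values)
  values.each_cons(2).with_index(1).select { |(a, b), _i| a > b }.map { |_pair, i| i }
end

# Values for percentages 0.0, 0.1, ..., 1.0, copied from the question.
reported = [95.061, 95.1205, 95.1325, 95.1689, 95.1692, 95.1615,
            95.1773, 95.1862, 95.2102, 95.1981, 95.199]

puts non_decreasing?(reported)        # false
puts drop_positions(reported).inspect # [5, 9], i.e. the 0.5 and 0.9 entries
```

The 0.8 entry (95.2102) is also larger than the 1.0 entry (95.199), which an interpolation between sorted neighbours should not produce.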

This sounds like a homework problem. Either way, it was fun to work on.

# Score class
class Score
  attr_accessor :value, :percentile
  def initialize(score)
    self.value = score.to_f
  end
  def <=>(foo)
    self.value <=> foo.value
  end
end

# load scores
scores = []
DATA.each do |line|
  scores << Score.new(line)
end
scores.sort!
scores_count = scores.size

# iterate through scores and calculate percentile
scores.each_with_index do |s, i|

  # L/N(100) = P
  # L = number of scores beneath this score (score array index)
  # N = total number of scores
  # P = percentile
  s.percentile = (i.to_f/scores_count.to_f*100).ceil
end

# output
puts "What is the precise percentile of each score"
scores.each_with_index do |s,i|
  puts "#{s.value} is in the #{s.percentile} percentile"
end

# bonus: what score is in the Xth percentile?
puts "\nWhat score is in the Xth percentile?"
percentiles = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
percentiles.each do |p|

  # P/100(N) = L
  # P = percentile
  # N = total number of scores
  # L = score array index
  # clamp the index so that the 100th percentile maps to the last score
  l = [(p.to_f/100*scores_count).ceil, scores_count - 1].min
  puts "#{p} percentile? #{scores[l].value}"
end


__END__
95.1772
95.1567
95.1937
95.1959
95.1442
95.061
95.1591
95.1195
95.1065
95.0925
95.199
95.1682
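
The two formulas in the script's comments (P = L / N * 100 and L = P / 100 * N) can also be sketched as plain methods that take an array, which makes them easier to try out without a DATA section. The method names percentile_ranks and score_at_percentile are hypothetical and not part of the script above; the index clamp is an assumption added so that the 100th percentile stays inside the array.

```ruby
# Hypothetical helper: pairs each value with P = (L / N * 100).ceil,
# where L is the value's index in the sorted array and N is the array size.
def percentile_ranks(values)
  sorted = values.sort
  n = sorted.size
  sorted.each_with_index.map { |v, i| [v, (i.to_f / n * 100).ceil] }
end

# Hypothetical helper: looks up the value at L = (P / 100 * N).ceil,
# clamped to the last valid index (an assumption, so that P = 100 works).
def score_at_percentile(values, p)
  sorted = values.sort
  n = sorted.size
  index = [(p.to_f / 100 * n).ceil, n - 1].min
  sorted[index]
end

sample = [40, 10, 30, 20]
puts percentile_ranks(sample).inspect   # [[10, 0], [20, 25], [30, 50], [40, 75]]
puts score_at_percentile(sample, 50)    # 30
puts score_at_percentile(sample, 100)   # 40
```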

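Got it working. I added -Infinity to the array so that I could use indices in the range 1..N. I was also multiplying the value in the last line by the wrong variable.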

def percentile(param_array, percentage)
  another_array = param_array.to_a.dup
  another_array.push(-1.0/0.0)                   # add -Infinity to be 0th index
  another_array.sort!
  another_array_size = another_array.size - 1    # disregard -Infinity
  r = percentage.to_f * (another_array_size - 1) + 1
  if r <= 1 then return another_array[1]
  elsif r >= another_array_size then return another_array[another_array_size]
  end
  ir = r.truncate
  fr = r - ir                                    # fractional part of the rank
  another_array[ir] + fr*(another_array[ir+1] - another_array[ir])
end
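
The same linear interpolation can probably be written with 0-based indices, which avoids the -Infinity sentinel. The following is a minimal sketch under that assumption; the method name percentile_interpolated is hypothetical, and clamping out-of-range percentages is a choice made here to mirror the early returns above.

```ruby
# Hypothetical 0-based variant of the interpolation above.
# fraction is expected in 0.0..1.0; values outside are clamped.
def percentile_interpolated(values, fraction)
  sorted = values.to_a.sort
  raise ArgumentError, "values must not be empty" if sorted.empty?

  last = sorted.size - 1
  rank = (fraction.to_f * last).clamp(0, last)  # fractional 0-based rank
  lower = rank.floor
  upper = [lower + 1, last].min                 # stay inside the array at the top end
  weight = rank - lower                         # fractional part of the rank

  sorted[lower] + weight * (sorted[upper] - sorted[lower])
end

# The scores from the DATA section of the earlier script.
scores = [95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.061,
          95.1591, 95.1195, 95.1065, 95.0925, 95.199, 95.1682]

(0..10).each do |step|
  fraction = step / 10.0
  puts "#{fraction}: #{percentile_interpolated(scores, fraction).round(5)}"
end
```

With these scores the results are non-decreasing, and the 0.0 and 1.0 entries are the minimum and maximum of the data.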

You can also monkeypatch Enumerable:

module Enumerable

  def rank value, n_tiles
    count = self.count   # Enumerable provides count; length exists only on some classes such as Array

    raise "You cannot split an array of #{count} elements into #{n_tiles} tiles!" if n_tiles > count 

    ordered_array = self.sort
    split_size = count / n_tiles

    boundaries = []
    (n_tiles - 1).times do |i|
      boundaries << ordered_array[(i + 1) * split_size - 1]
    end

    boundaries.each_with_index do |boundary, i|
      if value > boundaries.last
        return n_tiles
      elsif value <= boundary
        return (i + 1)
      end
    end
  end

end

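@zxcvbnm There is a good description of it: like (most) any other comparison function, <=> returns -1, 0, or 1 depending on whether a is less than, equal to, or greater than b.

@zxcvbnm, the Score class is useful because you don't have to bother keeping array keys in sync to keep the score values aligned with their percentiles. That said, I'm sure other Ruby geniuses could do better.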
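
To illustrate the comment about <=>, here is a small sketch of a score class that defines <=> and also includes Comparable, so that operators such as < come for free. The class name ComparableScore is hypothetical; the Score class in the answer defines <=> without including Comparable, which is enough for sort!.

```ruby
# Hypothetical variant of the Score class, for illustration only.
class ComparableScore
  include Comparable          # derives <, <=, ==, >=, > from <=>
  attr_reader :value

  def initialize(value)
    @value = value.to_f
  end

  # Returns -1, 0, or 1 when self is less than, equal to, or greater than other.
  def <=>(other)
    value <=> other.value
  end
end

low  = ComparableScore.new("95.061")
high = ComparableScore.new("95.199")

puts(low <=> high)                           # -1
puts(high <=> low)                           # 1
puts(low <=> ComparableScore.new(95.061))    # 0
puts(low < high)                             # true
puts [high, low].sort.map(&:value).inspect   # [95.061, 95.199]
```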

Output of the percentile method for the scores listed earlier:

0.0: 95.061
0.1: 95.0939
0.2: 95.1091
0.3: 95.12691
0.4: 95.1492
0.5: 95.1579
0.6: 95.16456
0.7: 95.1745
0.8: 95.1904
0.9: 95.19568
1.0: 95.199

Example usage of rank:

a = [1,4,2,5,3,6]

# Test in which range (rank) the number 1 would be placed if the array is ordered and split into 3 pieces:
a.rank(1,3)
#=> 1
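
If patching Enumerable is not desirable, the same idea can be sketched as a standalone method. This is an assumption-laden rewrite, not the code from the answer: the name tile_rank is hypothetical, and unlike the version above it returns 1 when n_tiles is 1 instead of falling through the loop.

```ruby
# Hypothetical standalone version of the rank idea: returns which of the
# n_tiles equally sized pieces of the sorted values the given value falls into.
def tile_rank(values, value, n_tiles)
  sorted = values.sort
  count = sorted.size
  if n_tiles < 1 || n_tiles > count
    raise ArgumentError, "cannot split #{count} elements into #{n_tiles} tiles"
  end

  split_size = count / n_tiles
  # Upper boundary of every tile except the last one.
  boundaries = (1...n_tiles).map { |i| sorted[i * split_size - 1] }

  # First tile whose upper boundary is >= value; otherwise the last tile.
  index = boundaries.index { |boundary| value <= boundary }
  index ? index + 1 : n_tiles
end

a = [1, 4, 2, 5, 3, 6]
puts tile_rank(a, 1, 3)   # 1
puts tile_rank(a, 4, 3)   # 2
puts tile_rank(a, 6, 3)   # 3
```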