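for small fractions (see the comparison).

Very concise, but I think comparing against the rounded version would be fairer; if getting close to perc% of the elements matters more than the "physical" distribution, this is where you should start.

@jonrsharpe Indeed, you are right; I changed it to the rounded version.

@jonrsharpe With the round version, the maximum deviation drops from 100% to about 45%.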
def select_elements(seq, perc):
    """Select a defined percentage of the elements of seq."""
    return seq[::int(100.0/perc)]
>>> select_elements(range(10), 50)
[0, 2, 4, 6, 8]
>>> select_elements(range(10), 33)
[0, 3, 6, 9]
>>> select_elements(range(10), 25)
[0, 4, 8]
>>> int(3.6)
3
>>> int(round(3.6))
4
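To make the int() versus round() point concrete, here is a minimal sketch of how the two conversions change the slice step and therefore the number of selected elements. The helper `step_for` is hypothetical (it is not part of either answer), and Python 3 is assumed.

```python
def step_for(perc, use_round):
    """Hypothetical helper: slice step for a requested percentage."""
    raw = 100.0 / perc  # ideal, usually non-integer, step
    return int(round(raw)) if use_round else int(raw)

seq = list(range(100))

# 100 / 28 is about 3.57: int() truncates to 3, round() gives 4.
truncated = seq[::step_for(28, use_round=False)]
rounded = seq[::step_for(28, use_round=True)]

# Requested: 28 elements out of 100.
print(len(truncated), len(rounded))
```

In this case truncation overshoots the request by more than rounding undershoots it, which is why the rounded variant looks like the fairer baseline for a comparison.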
from bisect import bisect_left

def equal_dist_els(my_list, fraction):
    """
    Choose a fraction of equally distributed elements.
    :param my_list: The list to draw from
    :param fraction: The ideal fraction of elements
    :return: Elements of the list with the best match
    """
    length = len(my_list)
    list_indexes = range(length)
    nbr_bins = int(round(length * fraction))
    if nbr_bins == 0:  # nothing to select; also avoids dividing by zero below
        return []
    step = length / float(nbr_bins)  # the size of a single bin
    bins = [step * i for i in range(nbr_bins)]  # left edge of each bin
    # distribute indexes into the bins
    splits = [bisect_left(list_indexes, wall) for wall in bins]
    splits.append(length)  # add the end for the last bin
    # get a list of (start, stop) indexes for each bin
    bin_limits = [(splits[i], splits[i + 1]) for i in range(len(splits) - 1)]
    out = []
    for bin_lim in bin_limits:
        f, t = bin_lim
        in_bin = my_list[f:t]  # choose the elements in my_list belonging in this bin
        out.append(in_bin[int(0.5 * len(in_bin))])  # choose the most central element
    return out
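For comparison, the same idea (split the list into equal-width bins and take something near the middle of each) can be sketched more compactly with integer arithmetic. This is an alternative sketch, not the function above: `pick_evenly` is a hypothetical name, it may pick a neighbouring element in some bins, and Python 3 is assumed.

```python
def pick_evenly(seq, fraction):
    """Hypothetical sketch: pick about len(seq) * fraction evenly spaced elements."""
    n = len(seq)
    k = int(round(n * fraction))  # number of elements to return
    # Bin i spans [i * n / k, (i + 1) * n / k); take the index at its midpoint,
    # (i + 0.5) * n / k, computed in integers to avoid float rounding.
    return [seq[((2 * i + 1) * n) // (2 * k)] for i in range(k)]

print(pick_evenly(list(range(10)), 0.3))  # [1, 5, 8]
print(pick_evenly(list(range(10)), 0.5))  # [1, 3, 5, 7, 9]
```

Unlike plain slicing, the number of returned elements here is fixed up front by rounding `len(seq) * fraction`, so the deviation from the requested fraction stays within half an element.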
from matplotlib import pyplot as plt

# def of equal_dist_els see above

def select_els(seq, perc):
    """Select a defined fraction (0 < perc <= 1) of the elements of seq."""
    return seq[::int(round(1./perc if perc != 0 else 0))]

list_length = 50
my_list = range(list_length)
percentages = range(1, 101)
# list comprehensions instead of map(), so the sequences can be reused
fracts = [x * 0.01 for x in percentages]
equal_dist = [abs(len(equal_dist_els(my_list, x)) / float(len(my_list)) - x) for x in fracts]
slicing = [abs(len(select_els(my_list, x)) / float(len(my_list)) - x) for x in fracts]

plt.plot(fracts, equal_dist, color='blue', alpha=0.8, linewidth=2, label=r'equal_dist_elements')
plt.plot(fracts, slicing, color='red', alpha=0.8, linewidth=2, label=r'select_elements by @jonrsharpe')
plt.title('Choosing equally dist. fraction of els from a list of length %s' % str(list_length))
plt.xlabel('requested fraction')
plt.ylabel('absolute deviation')
plt.legend(loc='upper left')
plt.show()
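If matplotlib is not available, the slicing part of the comparison can be reduced to a single number per variant. The sketch below is an assumption-laden stand-in for the plot: `slice_by_step` and `max_deviation` are hypothetical helpers, the deviation is measured as an absolute difference of fractions, and the result depends on the list length, so the figures need not match those quoted in the comments.

```python
def slice_by_step(seq, fraction, use_round):
    """Hypothetical helper: slice with a step of 1/fraction, rounded or truncated."""
    raw = 1.0 / fraction
    step = int(round(raw)) if use_round else int(raw)
    return seq[::step]

def max_deviation(n, use_round):
    """Largest |selected fraction - requested fraction| over 1% .. 100%."""
    seq = list(range(n))
    worst = 0.0
    for percent in range(1, 101):
        fraction = percent * 0.01
        got = len(slice_by_step(seq, fraction, use_round)) / float(n)
        worst = max(worst, abs(got - fraction))
    return worst

print("truncated:", max_deviation(50, use_round=False))
print("rounded:  ", max_deviation(50, use_round=True))
```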