Python: randomly select a percentage from binary arrays, row by row?


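I have an array of binaries. What I want is to be able to select a specific percentage of the ones from every row. For example, say there are 100 ones in every row: I want to randomly get back 20% from the 1st row, 10% from the 2nd, 40% from the 3rd and 30% from the 4th (100% in total, of course).

That part is easy: just do something like random.choice(one_idxs, %) on every row. The problem is that the number of bits in the target also has to be 100. That is, if some bits overlap across rows and the random choice picks them, the total will be different from 100 bits.

Also, on every row it should try to pick bits that were not picked before, at least as an option.

Any ideas?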


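For example, this is the code I use for the straightforward case (it does not account for whether the selected indices repeat across rows, only within a row):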



I can only pick ones, which is why I use indices. Using a list of strings as a convenience instead of bit arrays, and taking 4 samples:

In [39]: data = ['10000101', 
    ...:         '11110000', 
    ...:         '00011000']                                                    

In [40]: idxs = random.sample(range(len(data[0])), 4)                           

In [41]: # 20% row 1, 30% row 2, 50% row 3                                      

In [42]: row_selections = random.choices(range(len(data)), [0.2, 0.3, 0.5], k=len(idxs))                                                               

In [43]: idxs                                                                   
Out[43]: [7, 3, 1, 4]

In [44]: row_selections                                                         
Out[44]: [0, 2, 0, 1]

In [45]: picks = [ data[r][c] for (r, c) in zip(row_selections, idxs)]          

In [46]: picks                                                                  
Out[46]: ['1', '1', '0', '0']
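
Note that the session above can return '0' bits, because the row for each column is chosen without looking at the bit stored there. A minimal sketch of one way to restrict the picks to ones follows. It assumes the same list-of-strings layout; the function name pick_ones_by_column and its parameters are hypothetical, not part of the original answer. Because rows holding a '0' in a column are excluded from that draw, the realized per-row proportions only approximate the requested weights.

```python
import random


def pick_ones_by_column(data, weights, k, rng=random):
    """Sample k distinct columns, then choose a row for each column by weight,
    considering only the rows that hold a '1' in that column."""
    n_cols = len(data[0])
    # Only columns where at least one row has a '1' can yield a pick.
    usable = [c for c in range(n_cols) if any(row[c] == '1' for row in data)]
    if k > len(usable):
        raise ValueError("not enough columns containing a 1")
    picks = []
    for c in rng.sample(usable, k):
        # Rows that have a '1' at this column, with their matching weights.
        rows = [r for r, row in enumerate(data) if row[c] == '1']
        r = rng.choices(rows, weights=[weights[i] for i in rows], k=1)[0]
        picks.append((r, c))
    return picks


data = ['10000101',
        '11110000',
        '00011000']

# 20% row 1, 30% row 2, 50% row 3 (approximate, see the note above)
picks = pick_ones_by_column(data, [0.2, 0.3, 0.5], k=4)
print(picks)  # list of (row, column) pairs, each pointing at a '1'
```

Since every column is sampled at most once, the k picks never collide on the same bit position, so the target always ends up with exactly k bits set.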

OK, based on your comments, this should work better as an example of how to pick only ones from each list/array, proportionally:

import random
a1= '10001010111110001101010101'
a2= '00101010001011010010100010'
a1 = [int(t) for t in a1]
a2 = [int(t) for t in a2]
a1_one_locations= [idx for idx, v in enumerate(a1) if v==1]
a2_one_locations= [idx for idx, v in enumerate(a2) if v==1]

# lists of indices where 1 exists in each list...
print(a1_one_locations)
print(a2_one_locations)

n_samples = 6 # total desired

# 40% from a1, remainder from a2
a1_samples = int(n_samples * 0.4)
a2_samples = n_samples - a1_samples
a1_picks = random.sample(a1_one_locations, a1_samples)
a2_picks = random.sample(a2_one_locations, a2_samples)

# print results
print('indices from a1: ', a1_picks)
print('indices from a2: ', a2_picks)

Output:

[0, 4, 6, 8, 9, 10, 11, 12, 16, 17, 19, 21, 23, 25]
[2, 4, 6, 10, 12, 13, 15, 18, 20, 24]
indices from a1:  [6, 21]
indices from a2:  [10, 15, 4, 20]
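
In the output above the two rows are sampled independently, so the same index could in principle be picked from both a1 and a2 (both have ones at 4, 6, 10 and 12), which would leave fewer than n_samples distinct positions. A possible extension, sketched below under the assumption that rows are processed in order and that earlier rows get priority, removes already-used indices before sampling each row. The function name sample_ones_without_overlap and its parameters are hypothetical.

```python
import random


def sample_ones_without_overlap(rows, fractions, n_samples, rng=random):
    """Return one list of picked indices per row; no index is used twice."""
    # Allocate counts: floor for every row but the last,
    # which takes whatever remains (same rule as the two-row example).
    counts = [int(n_samples * f) for f in fractions[:-1]]
    counts.append(n_samples - sum(counts))

    used = set()
    picks = []
    for row, count in zip(rows, counts):
        # Candidates: positions holding a 1 that no earlier row has taken.
        candidates = [i for i, v in enumerate(row) if v == '1' and i not in used]
        if count > len(candidates):
            raise ValueError("row has too few unused 1-positions")
        chosen = rng.sample(candidates, count)
        used.update(chosen)
        picks.append(chosen)
    return picks


a1 = '10001010111110001101010101'
a2 = '00101010001011010010100010'

# 40% from a1, remainder from a2
picks = sample_ones_without_overlap([a1, a2], [0.4, 0.6], n_samples=6)
print('indices from a1: ', picks[0])
print('indices from a2: ', picks[1])
```

This greedy pass is simple but not exhaustive: a later row can run out of unused ones even when a different choice in an earlier row would have worked, in which case the sketch raises an error instead of retrying.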

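Inferring from your description, do you want to avoid picking bits from the same index position in different rows? (It is not clear what you mean by overlap and by not ending up with 100 bits above.) If so, use random.sample() to select 100 index positions, then for each draw use a weighted distribution to choose which row to pick from.

I see, so you create random 2D indices and then pick the bits at those positions, right? By the way, I need to pick only ones.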