Python: create a numpy array of random integers where each row uses a different range
I need to quickly build a numpy array of random integers in which each row is drawn from a different range. My code works, but it becomes slow when I increase the number of vectors to 300000:
import numpy as np
import random
population_size = 4
vectors_number = population_size * 3
add_matrix = []
for i in range(0, int(vectors_number/population_size)):
    candidates = list(range(population_size*i, population_size*(i+1)))
    random_index = random.sample(candidates, 4)
    add_matrix.append(random_index)
winning_matrix = np.row_stack(add_matrix)
print(winning_matrix)
Each row picks 4 random numbers from its own range.
Output:
[[ 3 0 1 2]
[ 4 6 7 5]
[11 9 8 10]]
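Ideally the matrix would be created with numpy alone, without loops.
In your case, you can use map and a list comprehension to condense the loop: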
d2 = 4  # number of samples per row
winning_matrix = np.vstack([random.sample(candidate, d2) for candidate in map(lambda i: list(range(population_size*i, population_size*(i+1))), range(0, int(vectors_number/population_size)))])
Output:
array([[ 0, 1, 3, 2],
[ 5, 6, 4, 7],
[11, 10, 9, 8]])
This can be broken down as:
# This is your loop generating the lists you are sampling from
range_list = map(lambda i: list(range(population_size*i, population_size*(i+1))), range(0, int(vectors_number/population_size)))
# This generates the matrix, following exactly your method
winning_matrix = np.vstack([random.sample(candidate, d2) for candidate in range_list])
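As a side note on the decomposition above: random.sample accepts a range object directly, so the intermediate list(...) and the map/lambda are not strictly needed. A minimal sketch, assuming 4 samples per row as in the question; the helper name sample_rows is made up for illustration. It is still a Python-level loop, so it is unlikely to be much faster for 300000 rows.

```python
import random
import numpy as np

def sample_rows(n_rows, population_size, k):
    """Pick k distinct integers from each block [i*population_size, (i+1)*population_size)."""
    return np.array([
        # random.sample works on a range without materializing it as a list
        random.sample(range(population_size * i, population_size * (i + 1)), k)
        for i in range(n_rows)
    ])

winning_matrix = sample_rows(n_rows=3, population_size=4, k=4)
print(winning_matrix)
```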
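If you want to generate random integers over different ranges (rather than sampling from a list), you can follow the approach below. How about something like this: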
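from itertools import product
import numpy as np
# Generate the lower and upper bounds for each row.
pair_ranges = product(range(1, 5), range(5, 9))
d2 = 4
# np.random.random_integers is deprecated; randint with high = y + 1 keeps the upper bound inclusive.
np.vstack([np.random.randint(x, y + 1, (1, d2)) for x, y in pair_ranges])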
Output:
array([[2, 5, 2, 5],
[5, 6, 2, 4],
[1, 3, 2, 3],
[4, 2, 4, 4],
[2, 6, 4, 6],
[7, 2, 6, 3],
[4, 5, 3, 5],
[4, 6, 3, 6],
[3, 6, 3, 6]])
The rows will contain random integers within these ranges:
array([[1, 5],
[1, 6],
[1, 7],
[2, 5],
[2, 6],
[2, 7],
[3, 5],
[3, 6],
[3, 7]])
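The per-row loop in this approach can probably be avoided, because NumPy's integer generators accept array-valued bounds. A minimal sketch, assuming the newer np.random.default_rng API and the nine bound pairs listed above; the variable names low, high and result are illustrative.

```python
from itertools import product
import numpy as np

rng = np.random.default_rng()

# One (lower, upper) pair per row; illustrative values matching the listing above.
pair_ranges = np.array(list(product(range(1, 4), range(5, 8))))
low, high = pair_ranges[:, 0], pair_ranges[:, 1]
d2 = 4

# The bounds are passed as column vectors so they broadcast against
# size=(rows, d2); endpoint=True makes the upper bound inclusive.
result = rng.integers(low[:, None], high[:, None], size=(len(low), d2), endpoint=True)
print(result)
```

Values may repeat within a row here, since every entry is drawn independently.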
Here is a vectorized approach for drawing unique random samples:
ncols = 4
N = int(vectors_number/population_size)
offset = np.arange(N)[:,None]*population_size
winning_matrix = np.random.rand(N,population_size).argsort(1)[:,:ncols] + offset
We can also use np.argpartition to replace the last step:
r = np.random.rand(N,population_size)
out = r.argpartition(ncols,axis=1)[:,:ncols] + offset
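A self-contained sketch of the argpartition variant, wrapped in a hypothetical helper sample_blocks (the name, the rng parameter and the use of np.random.default_rng are assumptions for illustration, not part of the answer). It passes ncols - 1 as kth so that it also works when ncols equals population_size, as in the question's small example, where kth = ncols would be out of bounds.

```python
import numpy as np

def sample_blocks(n_rows, population_size, ncols, rng=None):
    """Pick ncols distinct integers from each block of population_size consecutive ids."""
    rng = np.random.default_rng() if rng is None else rng
    offset = np.arange(n_rows)[:, None] * population_size
    keys = rng.random((n_rows, population_size))
    # After partitioning at kth = ncols - 1, the first ncols positions of each
    # row hold the indices of the ncols smallest random keys (in no particular order).
    local = np.argpartition(keys, ncols - 1, axis=1)[:, :ncols]
    return local + offset

print(sample_blocks(n_rows=3, population_size=4, ncols=4))
```

The order of the picked values within a row is whatever argpartition leaves behind, so if a uniformly random order matters, the argsort version may be the safer choice.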
Timings:
In [63]: import numpy as np
...: import random
...:
...: population_size = 64
...: vectors_number = population_size * 300000
In [64]: %%timeit
...: add_matrix = []
...: for i in range(0, int(vectors_number/population_size)):
...:     candidates = list(range(population_size*i, population_size*(i+1)))
...:     random_index = random.sample(candidates, 4)
...:     add_matrix.append(random_index)
...:
...: winning_matrix = np.row_stack(add_matrix)
1 loop, best of 3: 1.82 s per loop
In [65]: %%timeit
...: ncols = 4
...: N = int(vectors_number/population_size)
...: offset = np.arange(N)[:,None]*population_size
...: out = np.random.rand(N,population_size).argsort(1)[:,:ncols] + offset
1 loop, best of 3: 718 ms per loop
In [66]: %%timeit
...: ncols = 4
...: N = int(vectors_number/population_size)
...: offset = np.arange(N)[:,None]*population_size
...: r = np.random.rand(N,population_size)
...: out = r.argpartition(ncols,axis=1)[:,:ncols] + offset
1 loop, best of 3: 428 ms per loop
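To repeat a comparison like this outside IPython, the standard timeit module can be used. A rough sketch, with a much smaller row count than in the timings above so that it finishes quickly; the function names and n_rows are assumptions for illustration, and absolute numbers will differ by machine.

```python
import random
from timeit import timeit
import numpy as np

population_size = 64
n_rows = 2000  # kept small here; the timings above used 300000 rows

def loop_version():
    rows = [random.sample(range(population_size * i, population_size * (i + 1)), 4)
            for i in range(n_rows)]
    return np.array(rows)

def argsort_version():
    offset = np.arange(n_rows)[:, None] * population_size
    return np.random.rand(n_rows, population_size).argsort(1)[:, :4] + offset

for fn in (loop_version, argsort_version):
    # timeit accepts a callable; divide by the repeat count to get time per call
    seconds = timeit(fn, number=5) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```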
Since we are only picking 4 out of 64, collisions are rare, so we can draw with replacement and then correct the collisions:
import numpy as np
def multiperm(y, x, factor=16, remap=False):
    draw = np.random.randint(0, factor*x, (y, x))
    idx = np.full((y, factor*x), -1, dtype=np.int8 if factor*x < 128 else int)
    yi, xi = np.ogrid[:y, :x]
    idx[yi, draw] = xi
    yd, xd = np.where(idx[yi, draw] != xi)
    while yd.size > 0:
        ndraw = np.random.randint(0, factor*x, yd.shape)
        draw[yd, xd] = ndraw
        good = idx[yd, ndraw] == -1
        idx[yd[good], ndraw[good]] = xd[good]
        good[good] = idx[yd[good], ndraw[good]] == xd[good]
        yd, xd = yd[~good], xd[~good]
    if remap:
        idx = np.zeros((y, factor*x), dtype=np.int8)
        idx[yi, draw] = 1
        idx[0, 0] -= 1
        return idx.ravel().cumsum().reshape(idx.shape)[yi, draw]
    else:
        return draw + factor*x*yi
from timeit import timeit
print(timeit("multiperm(300_000, 4)", globals=globals(), number=100)*10, 'ms')
# sanity checks
check = multiperm(300_000, 4)
print(np.all(np.arange(300_000) * 64 <= check.T) and np.all(np.arange(1, 300_001) * 64 > check.T))
print(len(set(check.ravel().tolist())) == check.size)
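The idea of drawing with replacement and then repairing collisions can also be illustrated with a simpler variant that redraws every row containing a repeated value. This is a sketch of the general idea only, not a reimplementation of multiperm (which redraws just the colliding entries); the helper name sample_with_redraw and the use of np.random.default_rng are assumptions.

```python
import numpy as np

def sample_with_redraw(n_rows, population_size, k, rng=None):
    """Draw k values per row with replacement, then redraw rows that contain repeats."""
    if k > population_size:
        raise ValueError("k must not exceed population_size")
    rng = np.random.default_rng() if rng is None else rng
    draw = rng.integers(0, population_size, size=(n_rows, k))
    while True:
        s = np.sort(draw, axis=1)
        # a row has a repeat if two neighbours are equal after sorting
        bad = (s[:, 1:] == s[:, :-1]).any(axis=1)
        if not bad.any():
            break
        draw[bad] = rng.integers(0, population_size, size=(int(bad.sum()), k))
    # shift row i into its own block of ids
    return draw + np.arange(n_rows)[:, None] * population_size

print(sample_with_redraw(n_rows=5, population_size=64, k=4))
```

This is only efficient when k is small compared to population_size, because otherwise most rows need several redraws.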
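Revisiting this problem with a better understanding, and realizing that the result only needs a random first 4 out of 64, I came up with this answer. There is still a loop, but it only runs over the small number of required columns; it basically just swaps each of the first 4 columns (the finalists) with a random other column: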
import numpy as np
PLAYERS = 64 # per game
GAMES = 300000
FINALISTS = 4 # we only want to know the first four
# every player in every game has a unique id
matrix = np.arange(PLAYERS * GAMES).reshape((GAMES, PLAYERS))
games = np.arange(GAMES)
swaps = np.random.randint(0, PLAYERS, size=(FINALISTS, GAMES))
for i in range(FINALISTS):
    # build (row, column) index tuples for fancy indexing
    # (np.int was removed from recent NumPy releases; the builtin int is equivalent)
    dst = tuple(np.vstack([ games, i * np.ones(GAMES, dtype=int) ]))
    src = tuple(np.vstack([ games, swaps[i] ]))
    # swap column i with the randomly chosen column in every game
    matrix[dst], matrix[src] = matrix[src], matrix[dst]
winning_matrix = matrix[:,:FINALISTS]
print(winning_matrix)
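The swap loop above resembles a partial Fisher-Yates shuffle. A minimal sketch of the textbook variant, in which step i draws its swap partner from columns i to PLAYERS - 1 rather than from all columns. That index range, the helper name pick_finalists and the use of np.random.default_rng are assumptions of this sketch, not taken from the answer.

```python
import numpy as np

def pick_finalists(games, players, finalists, rng=None):
    """Return a (games, finalists) array of distinct player ids per game."""
    rng = np.random.default_rng() if rng is None else rng
    # every player in every game has a unique id
    matrix = np.arange(players * games).reshape(games, players)
    rows = np.arange(games)
    for i in range(finalists):
        # one swap partner per game, chosen among columns i .. players-1
        j = rng.integers(i, players, size=games)
        picked = matrix[rows, j]           # fancy indexing returns a copy
        matrix[rows, j] = matrix[rows, i]
        matrix[rows, i] = picked
    return matrix[:, :finalists]

print(pick_finalists(games=5, players=64, finalists=4))
```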
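Why not do it like this:
range_1 = np.array([1,2,3,4])
range_2 = np.array([10,20,30,40])
The first row takes values in [1, 10), the second row in [2, 20), and so on (np.random.randint excludes the upper bound):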
np.transpose(np.random.randint(range_1,range_2,(4,4)))
In [34]: np.transpose(np.random.randint(range_1,range_2,(4,4)))
Out[34]:
array([[ 2, 2, 6, 3],
[ 9, 5, 13, 11],
[ 3, 9, 14, 27],
[22, 15, 22, 32]])
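One caveat worth checking: np.random.randint draws each entry independently, so values can repeat within a row (the first row of the output above contains 2 twice). The sketch below applies the same call to the block ranges from the question and counts the rows that contain a repeated value; the variable names and the row count are illustrative.

```python
import numpy as np

population_size = 64
n_rows = 1000
ncols = 4

low = np.arange(n_rows) * population_size   # inclusive lower bound of each row
high = low + population_size                # exclusive upper bound of each row

# The bounds broadcast along the last axis, so draw (ncols, n_rows) and transpose.
m = np.transpose(np.random.randint(low, high, (ncols, n_rows)))

# A row has a repeat if two neighbours are equal after sorting.
s = np.sort(m, axis=1)
rows_with_repeats = int((s[:, 1:] == s[:, :-1]).any(axis=1).sum())
print(rows_with_repeats, "of", n_rows, "rows contain a repeated value")
```

If the samples must be unique within each row, as with random.sample in the question, such rows would need to be redrawn or one of the permutation-based approaches used instead.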
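In your actual case, what are vectors_number and population_size?
population_size is the number of rows in one block of an array that is not shown here. vector_number is the number of rows of the whole array.
So, in your actual use case, what are typical values for vector_number and population_size? Is it safe to assume 300000 and 4 respectively?
To settle this: I use population_size=64 and vector_number=population_size*300000, and I measure the computation time with time.process_time(), which is why I get a different speedup.
@ZarakiKenpachi If your question has been answered, please consider accepting the best answer.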