
GPU parallelism in JavaScript is slower

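This is a rather specific question. I recently tested gpu.js. The library is supposed to speed up computations by using WebGL to run them in parallel. I ran a quick test: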

var gpu = new GPU();

function product(v, u) {
  return gpu.createKernel(function(X, Y) {
      return X[this.thread.x] * Y[this.thread.x];
  }).dimensions([v.length])(v, u);
}


var before = new Date().getTime();
console.log(product(numeric.random([100000]), numeric.random([100000])).length);
console.log('Parallel Time: ', (new Date().getTime()) - before);

before = new Date().getTime();
var v = numeric.random([100000]);
var u = numeric.random([100000]);
for(var i = 0; i < v.length; i++){
  v[i] = v[i] * u[i];
}
console.log(v.length);
console.log('Procedural Time: ', (new Date().getTime()) - before);
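The parallel time is an order of magnitude slower. Is there any reason this would be the case? I have tried this on several machines with different GPUs, and I have also tried a few similar operations. Am I doing something wrong, or is it a problem with the library? Is there any way to improve this?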

Use:

var t0 = performance.now();
yourFunctionCall();
var t1 = performance.now();

console.log("Call to yourFunctionCall took " + (t1 - t0) + " ms.");

When dealing with a GPU, you have to watch out for overhead.
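One way to make that overhead visible is to time the first call separately from the later ones. The following is only a rough sketch: timeFirstAndRest and fakeWorkload are hypothetical names, and the workload is a CPU stand-in with an artificial one-time setup cost, not gpu.js itself.

```javascript
// Use performance.now() where available, otherwise fall back to Date.now().
const now = (typeof performance !== 'undefined' && typeof performance.now === 'function')
  ? () => performance.now()
  : () => Date.now();

// Hypothetical helper: times the first call on its own, then averages later calls.
function timeFirstAndRest(fn, repeats) {
  let t0 = now();
  fn();
  const firstMs = now() - t0;

  t0 = now();
  for (let i = 0; i < repeats; i++) {
    fn();
  }
  const restMs = (now() - t0) / repeats;
  return { firstMs: firstMs, restMs: restMs };
}

// Stand-in workload (assumption, not gpu.js): expensive setup on first use only.
let ready = false;
function fakeWorkload() {
  if (!ready) {
    // Simulates one-time work such as compiling and linking a shader.
    let x = 0;
    for (let i = 0; i < 5e6; i++) x += Math.sqrt(i);
    ready = x > 0;
  }
  return 1 + 1;
}

const timing = timeFirstAndRest(fakeWorkload, 100);
console.log('first call (ms):', timing.firstMs, 'later calls, average (ms):', timing.restMs);
```

If the first call dominates, the cost is mostly setup rather than the computation itself, which is the situation the rest of this answer describes.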

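Calls to gpu.createKernel are likely very expensive, because it has to parse your JavaScript code, generate the corresponding GLSL code, and send it to WebGL to be compiled and linked.

At the very least, you want to call it once and store the result in a global variable, so that it is reused every time you call product.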
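As a minimal sketch of that advice, the kernel can be built once per vector length and then reused. makeKernelCache and buildProductKernel are hypothetical names; the builder here returns a plain CPU function so the example runs anywhere, and with gpu.js it would instead return the result of gpu.createKernel(...).dimensions([length]) as in the question.

```javascript
// Hypothetical memoizer: builds a kernel once per vector length and reuses it.
function makeKernelCache(buildKernel) {
  const cache = new Map();
  return function getKernel(length) {
    if (!cache.has(length)) {
      cache.set(length, buildKernel(length));
    }
    return cache.get(length);
  };
}

// Stand-in for the expensive step (assumption: with gpu.js this is where
// gpu.createKernel(...).dimensions([length]) would be called).
let buildCount = 0;
function buildProductKernel(length) {
  buildCount++;
  return function (X, Y) {
    const out = new Array(length);
    for (let i = 0; i < length; i++) out[i] = X[i] * Y[i];
    return out;
  };
}

const getProductKernel = makeKernelCache(buildProductKernel);

// product() no longer rebuilds the kernel on every call.
function product(v, u) {
  return getProductKernel(v.length)(v, u);
}

const first = product([1, 2, 3], [4, 5, 6]);
const second = product([2, 2, 2], [1, 2, 3]);
console.log(first, second, 'kernels built:', buildCount);
```

Keying the cache on the length matters because the kernel in the question has its dimensions fixed at creation time.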


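It is also worth noting that moving data to and from the GPU takes a non-zero amount of work, so you will see bigger gains with more complex computations.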

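I read through the source code of their benchmarks and found that you only get a speedup when you run many operations in a row. I do think this is an overhead issue. I created the following very simple benchmark comparing gpu.js with numeric.js. Here it is, if anyone is interested: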

var gpu = new GPU();

var size = 512;
var scale = 10;
var iterations = 100;

// Scaling up the matrices decreases the effect of precision errors
var A = numeric.mul(numeric.random([size, size]), scale);
var B = numeric.mul(numeric.random([size, size]), scale);

// I know eval is dangerous but I couldn't get the size in any other way
function multGen(size) {
  return eval("(function(A, B) { var sum = 0; for (var i=0; i<"+ size +"; i++) {sum += A[this.thread.y][i] * B[i][this.thread.x];} return sum;})")
}

var mat_mult = gpu.createKernel(multGen(size)).dimensions([size, size]);

var before = new Date().getTime();
var parallel = mat_mult(A, B);

// Need to do many computations to get the advantages of the GPU
for(var i = 0; i < iterations; i++) {
  parallel = mat_mult(A, B);
}
var parTime = (new Date().getTime()) - before;
console.log('Parallel Time: ', parTime);

before = new Date().getTime();
var procedural = numeric.dot(A, B);

// Run the same number of iterations on the CPU for a fair comparison
for(var i = 0; i < iterations; i++) {
  procedural = numeric.dot(A, B);
}
var procTime = (new Date().getTime()) - before;
console.log('Procedural Time: ', procTime);

console.log((procTime / parTime) + ' times faster');

// This is for RMSD normalization; flattening and taking the min and max that way exceeded the call stack
var max = Math.max(Math.max(...A.map((function(row) {return Math.max(...row);}))), Math.max(...B.map((function(row) {return Math.max(...row);}))))

var min = Math.min(Math.min(...A.map((function(row) {return Math.min(...row);}))), Math.min(...B.map((function(row) {return Math.min(...row);}))))

// The matrices will differ due to precision issues, so the normalized RMSD gives an idea of the difference
var nrmsd = Math.sqrt(numeric.sum(numeric.pow(numeric.sub(parallel, procedural), 2)) / size) / (max - min);

console.log('Normalized RMSD: ', nrmsd);

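These results are pretty good. The eval unfairly slows down the parallel version, but it is still always faster. I don't think a setup like this is good for production, but it works here.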

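Do you get a significant speedup on http://gpu.rocks/?

Yes, I did that initially and it was 5.72 times faster. I'm not sure what I'm doing wrong.

Thanks for the answer; I was typing mine while reading yours. I do think it is an overhead issue, but I was able to work around it by calling the kernel repeatedly. If I actually use this for something, I will definitely follow your advice. Thanks.

A good way to get a more comparable benchmark is to split .dimensions([v.length])(v, u) into var kernel = ... .dimensions([v.length]); kernel(v, u) and start the timer in between, just before the call to kernel(v, u).

That is not the core of the question, but thanks for the tip. I got it working with Date, but if I continue with this I will use performance. Thanks.

You can specify size as a constant: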
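A rough sketch of what that suggestion could look like: the kernel reads the loop bound from this.constants.size, so no eval is needed. Two assumptions are worth flagging. First, runOnCpu is a hypothetical stand-in written for this example (it is not part of gpu.js) so that the snippet runs without a GPU. Second, the gpu.js call shown in the comment assumes a constants option on createKernel; the exact option names differ between gpu.js versions, so check the documentation for the version in use.

```javascript
// Kernel body in gpu.js style: position comes from this.thread and the loop
// bound from this.constants, rather than from a string passed to eval.
function matMulKernel(A, B) {
  var sum = 0;
  for (var i = 0; i < this.constants.size; i++) {
    sum += A[this.thread.y][i] * B[i][this.thread.x];
  }
  return sum;
}

// Hypothetical CPU stand-in (NOT gpu.js): invokes the kernel once per output
// cell with a matching `this`, mimicking how a 2D kernel is evaluated.
function runOnCpu(kernel, size, constants) {
  return function (A, B) {
    var out = [];
    for (var y = 0; y < size; y++) {
      var row = [];
      for (var x = 0; x < size; x++) {
        var ctx = { thread: { x: x, y: y }, constants: constants };
        row.push(kernel.call(ctx, A, B));
      }
      out.push(row);
    }
    return out;
  };
}

// With gpu.js this might instead be (assumed API, version dependent):
//   var mat_mult = gpu.createKernel(matMulKernel, { constants: { size: size } })
//                     .dimensions([size, size]);
var smallSize = 2;
var matMultCpu = runOnCpu(matMulKernel, smallSize, { size: smallSize });

var result = matMultCpu([[1, 2], [3, 4]], [[5, 6], [7, 8]]);
console.log(result);
```

The kernel function itself is the part intended to carry over to gpu.js unchanged; only the way it is launched differs.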

Console output from the matrix multiplication benchmark (scriptfour.js):

scriptfour.js:26 Parallel Time:  20490
scriptfour.js:36 Procedural Time:  28736
scriptfour.js:38 1.402440214738897 times faster
scriptfour.js:48 Normalized RMSD:  0.009671934749138042