iOS: Computing the mean value in a Metal kernel


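Does anyone know the correct way to compute the mean of a buffer of random floats in a Metal kernel?

I dispatch the work on the compute command encoder like this: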

threadsPerGroup = MTLSizeMake(1, 1, inputTexture.arrayLength);
numThreadGroups = MTLSizeMake(1, 1, inputTexture.arrayLength / threadsPerGroup.depth);

[commandEncoder dispatchThreadgroups:numThreadGroups
               threadsPerThreadgroup:threadsPerGroup];
Kernel code:

kernel void mean(texture2d_array<float, access::read> inTex [[ texture(0) ]],
             device float *means                            [[ buffer(1) ]],
             uint3 id                                       [[ thread_position_in_grid ]]) {

    if (id.x == 0 && id.y == 0) {
        float mean = 0.0;
        for (uint i = 0; i < inTex.get_width(); ++i) {
            for (uint j = 0; j < inTex.get_height(); ++j) {
                    mean += inTex.read(uint2(i, j), id.z)[0];
            }
        }

        float textureArea = inTex.get_width() * inTex.get_height();
        mean /= textureArea;
        means[id.z] = mean;
    }
}

The buffer is represented as a texture of type texture2d_array with the R32Float pixel format.

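If you can use an array of uint (instead of float) as the data source, I suggest using the "atomic fetch and modify functions" (as described in the Metal Shading Language specification) to write to a buffer atomically.

The example kernel function takes an input buffer (data, an array of unsigned integers to match the uint approach) and writes the sum of the buffer to an atomic buffer (sum, a pointer to uint).

In your Swift file, you would set up the buffers: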

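...
// Metal's `uint` is 32 bits wide, so use UInt32 (Swift's UInt is 64-bit on current devices).
let data: [UInt32] = [1, 2, 3, 4]
let dataBuffer = device.makeBuffer(bytes: data, length: data.count * MemoryLayout<UInt32>.stride, options: [])
commandEncoder.setBuffer(dataBuffer, offset: 0, index: 0)

// The sum starts at zero; the kernel accumulates into it atomically.
var sum: UInt32 = 0
guard let sumBuffer = device.makeBuffer(bytes: &sum, length: MemoryLayout<UInt32>.stride, options: []) else {
    fatalError("Could not allocate the sum buffer")
}
commandEncoder.setBuffer(sumBuffer, offset: 0, index: 1)
// The pipeline state and the dispatch call are elided here; they must be encoded before this point.
commandEncoder.endEncoding()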
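
The kernel that the answer describes is not reproduced in this thread, and the Swift snippet leaves out pipeline setup and the dispatch call. As a minimal end-to-end sketch of the atomic approach, the following assumes a hypothetical kernel name `sum_values`, compiles the shader source at runtime with `makeLibrary(source:options:)`, and uses a four-element input that fits in a single threadgroup. It is an illustration under those assumptions, not the answerer's original listing.

```swift
import Metal

// Hypothetical kernel: every thread atomically adds its element to one total.
let source = """
#include <metal_stdlib>
using namespace metal;

kernel void sum_values(device const uint  *data  [[ buffer(0) ]],
                       device atomic_uint *total [[ buffer(1) ]],
                       uint gid                  [[ thread_position_in_grid ]])
{
    atomic_fetch_add_explicit(total, data[gid], memory_order_relaxed);
}
"""

guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue() else {
    fatalError("Metal is not available on this machine")
}

// Force unwraps keep the sketch short; production code should handle failures.
let library = try device.makeLibrary(source: source, options: nil)
let pipeline = try device.makeComputePipelineState(function: library.makeFunction(name: "sum_values")!)

let data: [UInt32] = [1, 2, 3, 4]
var total: UInt32 = 0
let dataBuffer = device.makeBuffer(bytes: data, length: data.count * MemoryLayout<UInt32>.stride, options: [])!
let sumBuffer = device.makeBuffer(bytes: &total, length: MemoryLayout<UInt32>.stride, options: [])!

let commandBuffer = queue.makeCommandBuffer()!
let encoder = commandBuffer.makeComputeCommandEncoder()!
encoder.setComputePipelineState(pipeline)
encoder.setBuffer(dataBuffer, offset: 0, index: 0)
encoder.setBuffer(sumBuffer, offset: 0, index: 1)
// One threadgroup with one thread per element; only valid for tiny inputs like this one.
encoder.dispatchThreadgroups(MTLSize(width: 1, height: 1, depth: 1),
                             threadsPerThreadgroup: MTLSize(width: data.count, height: 1, depth: 1))
encoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

// Read the 32-bit total back and divide in floating point to avoid truncation.
total = sumBuffer.contents().load(as: UInt32.self)
let mean = Float(total) / Float(data.count)
print(mean)
```

For larger inputs the dispatch would need several threadgroups and a bounds check in the kernel, and heavy contention on a single atomic can make this approach slow.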
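Alternatively, if your initial data source has to be an array of floats, you can use a method from the Accelerate framework, which is very fast for this kind of computation.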
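
The Accelerate method is not named here. One candidate is `vDSP_meanv`, which computes the mean of a single-precision vector on the CPU; treating it as the intended method is an assumption. A minimal sketch with made-up input values:

```swift
import Accelerate

// Example values chosen for illustration; negative numbers need no special handling.
let values: [Float] = [-2.0, 0.5, 4.0, 7.5]

var mean: Float = 0
// Stride 1 visits every element; the result is written into `mean`.
vDSP_meanv(values, 1, &mean, vDSP_Length(values.count))
print(mean)
```

To apply this to texture data, the floats would first have to be copied into CPU-accessible memory.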


I hope this helps. Cheers.

My float values range from ~1E-15 to ~1E8, and some of them are negative. I cannot convert them to int or uint with acceptable precision.
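
For float data like this, one option that avoids integer atomics is a threadgroup tree reduction. It is a swapped-in technique and not something proposed in this thread. The sketch below assumes the values sit in a flat float buffer rather than a texture array, uses a hypothetical kernel name `partial_sums`, and assumes a threadgroup size of 256 is supported by the device. Each threadgroup writes one partial sum, and the CPU adds the partials in double precision.

```swift
import Metal

// Hypothetical kernel: each threadgroup reduces up to 256 floats to one partial sum.
let source = """
#include <metal_stdlib>
using namespace metal;

kernel void partial_sums(device const float *data     [[ buffer(0) ]],
                         device float       *partials [[ buffer(1) ]],
                         constant uint      &count    [[ buffer(2) ]],
                         uint gid       [[ thread_position_in_grid ]],
                         uint tid       [[ thread_index_in_threadgroup ]],
                         uint group     [[ threadgroup_position_in_grid ]],
                         uint groupSize [[ threads_per_threadgroup ]])
{
    threadgroup float scratch[256];
    // Threads past the end of the data contribute zero.
    scratch[tid] = (gid < count) ? data[gid] : 0.0f;
    threadgroup_barrier(mem_flags::mem_threadgroup);

    // Tree reduction; assumes groupSize is a power of two no larger than 256.
    for (uint stride = groupSize / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            scratch[tid] += scratch[tid + stride];
        }
        threadgroup_barrier(mem_flags::mem_threadgroup);
    }
    if (tid == 0) {
        partials[group] = scratch[0];
    }
}
"""

guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue() else {
    fatalError("Metal is not available on this machine")
}
let library = try device.makeLibrary(source: source, options: nil)
let pipeline = try device.makeComputePipelineState(function: library.makeFunction(name: "partial_sums")!)

// Made-up input including negative values.
let data: [Float] = (0..<1000).map { Float($0) * 0.5 - 100 }
var count = UInt32(data.count)
let groupSize = 256
let groupCount = (data.count + groupSize - 1) / groupSize  // round up

let dataBuffer = device.makeBuffer(bytes: data, length: data.count * MemoryLayout<Float>.stride, options: [])!
let partialsBuffer = device.makeBuffer(length: groupCount * MemoryLayout<Float>.stride, options: [])!

let commandBuffer = queue.makeCommandBuffer()!
let encoder = commandBuffer.makeComputeCommandEncoder()!
encoder.setComputePipelineState(pipeline)
encoder.setBuffer(dataBuffer, offset: 0, index: 0)
encoder.setBuffer(partialsBuffer, offset: 0, index: 1)
encoder.setBytes(&count, length: MemoryLayout<UInt32>.stride, index: 2)
encoder.dispatchThreadgroups(MTLSize(width: groupCount, height: 1, depth: 1),
                             threadsPerThreadgroup: MTLSize(width: groupSize, height: 1, depth: 1))
encoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

// Finish on the CPU, accumulating the per-group results in Double.
let partials = UnsafeBufferPointer(
    start: partialsBuffer.contents().bindMemory(to: Float.self, capacity: groupCount),
    count: groupCount)
let mean = Float(partials.reduce(0.0) { $0 + Double($1) } / Double(data.count))
print(mean)
```

A tree reduction tends to accumulate less rounding error than a single serial loop, but each partial sum is still a 32-bit float with roughly seven significant digits, so very small values can still be lost next to very large ones.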
For completeness, the rest of the answer's Swift listing, which follows endEncoding(), submits the work and reads the sum back:

commandBuffer.commit()
commandBuffer.waitUntilCompleted()

// The kernel accumulates into a 32-bit uint, so read the result back as UInt32.
let total = sumBuffer.contents().load(as: UInt32.self)

// Convert before dividing; integer division would truncate the mean.
let mean = Float(total) / Float(data.count)
print(mean)