iOS Metal vertex shader to draw the points of a texture [ios, metal, vertex-shader, opengl-es-3.0]

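I want to run a Metal (or OpenGL ES 3.0) shader that draws point primitives with blending. To do that, I need to pass every pixel coordinate of a texture to the vertex shader as a vertex; the vertex shader then computes the position of each vertex that is passed on to the fragment shader. The fragment shader simply outputs the color of the point, with blending enabled. My question is whether there is an efficient way to pass the vertex coordinates to the vertex shader, since a 1920x1080 image produces a very large number of vertices and this has to happen 30 times per second. Ideally it would work like the dispatchThreadgroups command we use with compute shaders, except that a compute shader cannot draw geometry with blending enabled.

EDIT: This is what I did: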

    let vertexFunctionRed = library!.makeFunction(name: "vertexShaderHistogramBlenderRed")
    let fragmentFunctionAccumulator = library!.makeFunction(name: "fragmentShaderHistogramAccumulator")

    let renderPipelineDescriptorRed = MTLRenderPipelineDescriptor()
    renderPipelineDescriptorRed.vertexFunction = vertexFunctionRed
    renderPipelineDescriptorRed.fragmentFunction = fragmentFunctionAccumulator
    renderPipelineDescriptorRed.colorAttachments[0].pixelFormat = .bgra8Unorm
    // Additive blending: result = source + destination
    renderPipelineDescriptorRed.colorAttachments[0].isBlendingEnabled = true
    renderPipelineDescriptorRed.colorAttachments[0].rgbBlendOperation = .add
    renderPipelineDescriptorRed.colorAttachments[0].sourceRGBBlendFactor = .one
    renderPipelineDescriptorRed.colorAttachments[0].destinationRGBBlendFactor = .one

    do {
        histogramPipelineRed = try device.makeRenderPipelineState(descriptor: renderPipelineDescriptorRed)
    } catch {
        print("Unable to compile render pipeline state Histogram Red!")
        return
    }

Drawing code:

    let commandBuffer = commandQueue?.makeCommandBuffer()
    let renderEncoder = commandBuffer?.makeRenderCommandEncoder(descriptor: renderPassDescriptor!)
    renderEncoder?.setRenderPipelineState(histogramPipelineRed!)
    renderEncoder?.setVertexTexture(metalTexture, index: 0)
    // Instanced variant (one vertex, width*height instances); it needs a shader that reads [[instance_id]]:
    // renderEncoder?.drawPrimitives(type: .point, vertexStart: 0, vertexCount: 1, instanceCount: metalTexture!.width*metalTexture!.height)
    // Variant that matches the [[vertex_id]] shader below (width*height vertices, one instance):
    renderEncoder?.drawPrimitives(type: .point, vertexStart: 0, vertexCount: metalTexture!.width*metalTexture!.height, instanceCount: 1)

And the shaders:

    vertex MappedVertex vertexShaderHistogramBlenderRed (texture2d<float, access::sample> inputTexture [[ texture(0) ]],
                                                         unsigned int vertexId [[vertex_id]])
    {
        MappedVertex out;

        // coord::pixel: the sampler expects pixel coordinates, not normalized ones.
        constexpr sampler s(s_address::clamp_to_edge, t_address::clamp_to_edge, min_filter::linear, mag_filter::linear, coord::pixel);

        ushort width = inputTexture.get_width();

        // Decompose the vertex ID into the centre of one pixel.
        float X = (vertexId % width) + 0.5;
        float Y = (vertexId / width) + 0.5;

        // .r is a normalized float in [0, 1]; scale and round it to a bin index in [0, 255].
        int red = int(inputTexture.sample(s, float2(X, Y)).r * 255.0 + 0.5);

        // One bin is 2/256 = 0.0078125 wide in clip space.
        out.position = float4(-1.0 + (red * 0.0078125), 0.0, 0.0, 1.0);
        out.pointSize = 1.0;
        out.colorFactor = half3(1.0, 0.0, 0.0);

        return out;
    }

    fragment half4 fragmentShaderHistogramAccumulator ( MappedVertex in [[ stage_in ]] )
    {
        // Each point adds 1/256 to the red channel through additive blending.
        half3 colorFactor = in.colorFactor;
        return half4(colorFactor*(1.0/256.0), 1.0);
    }
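To check the vertex shader's arithmetic without a GPU, the same index and bin mapping can be reproduced on the CPU. The following is only a sketch: the helper names `pixelCoordinate`, `clipSpaceX` and `referenceHistogram` are hypothetical and do not appear in the question, and the 2/256 step is taken from the shader above.

```swift
// Hypothetical CPU-side helpers that mirror the vertex shader's arithmetic.

/// Splits a linear vertex (or instance) ID into pixel coordinates, row by row.
func pixelCoordinate(forID id: Int, width: Int) -> (x: Int, y: Int) {
    return (x: id % width, y: id / width)
}

/// Maps a histogram bin (0...255) to clip-space x, using the shader's 2/256 step.
func clipSpaceX(forBin bin: Int) -> Float {
    return -1.0 + Float(bin) * 0.0078125
}

/// Reference histogram: the per-bin counts the blended points are meant to accumulate.
func referenceHistogram(of values: [UInt8]) -> [Int] {
    var bins = [Int](repeating: 0, count: 256)
    for value in values {
        bins[Int(value)] += 1
    }
    return bins
}
```

A reference like this is handy for comparing against the rendered result on a small test image. One thing it makes visible: a `.bgra8Unorm` channel has only 256 levels and clamps at 1.0, so a bin in the render target cannot record more than 255 points, while the CPU reference keeps counting.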


Perhaps you could draw a single point, instanced 1920x1080 times. Something like:

vertex float4 my_func(texture2d<float, access::read> image [[texture(0)]],
                      constant uint &width [[buffer(0)]],
                      uint instance_id [[instance_id]])
{
    // decompose the instance ID to a position
    uint2 pos = uint2(instance_id % width, instance_id / width);
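    // sketch only: x is the raw bin value (0 to 255) and still has to be mapped into clip space;
    // w is 1 so that the result is a valid position
    return float4(image.read(pos).r * 255, 0, 0, 1);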
}
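On the host side, a shader shaped like this one needs the texture width at buffer index 0 and an instanced draw call. A minimal sketch of how that might be encoded, assuming a render encoder and texture already exist; `pointInstanceCount` and `encodeInstancedPoints` are hypothetical names, and the Metal part is guarded so the helper also compiles where Metal is unavailable:

```swift
#if canImport(Metal)
import Metal
#endif

/// Number of point instances needed to cover every pixel exactly once.
func pointInstanceCount(width: Int, height: Int) -> Int {
    return width * height
}

#if canImport(Metal)
/// Encodes one point per pixel as instances of a single vertex.
/// The pipeline state is assumed to be set on the encoder already.
func encodeInstancedPoints(on encoder: MTLRenderCommandEncoder, texture: MTLTexture) {
    // The shader reads the width from buffer(0) to decompose the instance ID.
    var width = UInt32(texture.width)
    encoder.setVertexTexture(texture, index: 0)
    encoder.setVertexBytes(&width, length: MemoryLayout<UInt32>.size, index: 0)
    encoder.drawPrimitives(type: .point,
                           vertexStart: 0,
                           vertexCount: 1,
                           instanceCount: pointInstanceCount(width: texture.width,
                                                             height: texture.height))
}
#endif
```

No vertex buffer is bound here: the only per-point input is the instance ID, which is what avoids uploading two million coordinates per frame.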


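Huh? You are trying to use a point primitive for every pixel in the render target? Why are you using (or trying to use) point primitives? This sounds like a task where you just need to draw a quad.

Uff, I have tried all sorts of ways to compute image statistics such as histograms. I tried MPSImageHistogram and a custom Metal compute shader that uses atomic uints to increment the statistics; both take 25 ms per frame. Maybe atomic operations are just that bad, so I am trying another approach: simply map each pixel of the texture to a position (0 to 255), and have the fragment shader write a color at that point with additive blending enabled. At the moment this seems very slow, and I don't know why.

With any of the techniques you have tried, how did you determine that it is the shader that takes the time? Are you sure you are not stalling the pipeline? Have you applied the usual practices?

That's a good point, but how do I know whether the pipeline is stalled or the GPU is busy processing? Please point me to some tools that can help.

Yes, the MPS tuning hints are good. Thanks for pointing them out.

Check out and .

I did something similar, but used [[vertex_id]] instead of [[instance_id]] (updated my post). What exactly is the difference? By the way, the shader is too slow.

There really isn't much of a difference.

Is there any way to make the shader faster? I had expected blending to be faster than atomics, but it is too slow.

Hard to know. Xcode's GPU debugger may offer some insight. Do you need to sample? read may be slightly faster. In any case, in theory, MPSImageHistogram should be about as fast as Apple knows how to make it.

No, I don't need sampling. In fact, I switched from read to sample to see whether that improved performance.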
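One rough way to tell a busy GPU from a stalled pipeline, besides Xcode's GPU debugger, is to read the GPU timestamps that a command buffer records and compare them with the frame budget. The sketch below assumes a 30 fps target as stated in the question; `frameBudgetMilliseconds`, `elapsedMilliseconds` and `logGPUTime` are hypothetical helper names, and the Metal part is guarded so the arithmetic also compiles where Metal is unavailable.

```swift
#if canImport(Metal)
import Metal
#endif

/// Time available per frame at a given frame rate, in milliseconds.
func frameBudgetMilliseconds(framesPerSecond: Double) -> Double {
    return 1000.0 / framesPerSecond
}

/// Converts a pair of timestamps in seconds into elapsed milliseconds.
func elapsedMilliseconds(start: Double, end: Double) -> Double {
    return (end - start) * 1000.0
}

#if canImport(Metal)
/// Logs how long the GPU spent executing a command buffer.
/// Must be called before the command buffer is committed.
@available(macOS 10.15, iOS 10.3, tvOS 10.3, *)
func logGPUTime(of commandBuffer: MTLCommandBuffer) {
    commandBuffer.addCompletedHandler { buffer in
        // gpuStartTime and gpuEndTime are the times at which the GPU
        // started and finished executing this command buffer.
        let gpuTime = elapsedMilliseconds(start: buffer.gpuStartTime, end: buffer.gpuEndTime)
        let budget = frameBudgetMilliseconds(framesPerSecond: 30)
        print("GPU time: \(gpuTime) ms (budget \(budget) ms)")
    }
}
#endif
```

If the logged GPU time is close to the 25 ms observed per frame, the shader work itself is the likely cost; if it is much smaller, the time is probably being lost elsewhere, for example waiting on the CPU between frames.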