Swift RGB to grayscale, but the pixel values seem wrong
I have a model built with TensorFlow 2.0/Keras. The input is a 28x28 image with a single channel. The model is saved, converted to .tflite, and used in my Swift iOS app. Unfortunately, when I invoke the interpreter I get predictions that are wildly different from what I expect. Investigating further, it seems my image preparation may be wrong. These are the steps I take before feeding the pixel array to the model:

1. Load the image
2. Convert the image to grayscale
3. Normalize the pixel values by dividing by 255

My code, and the pixel values it prints, are included at the end of this post.
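Step 3 on its own is just a scale by 1/255; a minimal sketch, assuming the grayscale pixels have already been extracted into a [UInt8] array (the helper name is hypothetical):

// Sketch: scale 8-bit grayscale pixels to Float values in 0.0...1.0.
func normalizedPixels(_ pixels: [UInt8]) -> [Float] {
    return pixels.map { Float($0) / 255.0 }
}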
normalize() does not fit your purpose at all. It normalizes a pixel buffer of Double values to 0.0...1.0, but you are not creating a pixel buffer of Doubles: your pixelBufferGray(width:height:) passes kCVPixelFormatType_OneComponent8 as the pixelFormatType, which creates a pixel buffer of UInt8 values. Remove i.normalize() and inspect the pixel buffer; you will see the values you expect.
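If it helps, a minimal sketch of dumping the buffer's UInt8 values for inspection (assuming a single-channel 8-bit buffer; note the indexing by bytesPerRow, since rows may be padded):

import CoreVideo

func printGrayPixels(_ pixelBuffer: CVPixelBuffer) {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return }
    let pixels = base.assumingMemoryBound(to: UInt8.self)
    for row in 0..<height {
        // Step by bytesPerRow (64 here), not by width (28): rows are padded.
        let rowStart = row * bytesPerRow
        print((0..<width).map { String(pixels[rowStart + $0]) }.joined(separator: " "))
    }
}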
You may also want to pack the pixel buffer, since only 28 bytes of each 64-byte row are used, but that is a separate issue. Check CVPixelBufferGetBytesPerRow(pixelBuffer): here CVPixelBufferGetBytesPerRow(self) is 64, while CVPixelBufferGetDataSize(self) is 1792.

You should also write the code that performs the RGB -> grayscale conversion and test that logic without involving pixel buffers and the CoreVideo APIs; you need to handle color-space conversion and gamma. Then, once that code is fully tested and known to work, apply the logic pixel by pixel to the pixels CoreVideo hands you. For more details on RGB -> grayscale, see this question (source code is provided there as well).
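As a concrete starting point for that advice, here is a sketch of a standalone luminance conversion that can be unit-tested first. It uses the Rec. 709 luma coefficients directly on the gamma-encoded channel values for brevity; a fully correct version would linearize the sRGB gamma before the weighted sum and re-encode afterwards. The function name is illustrative:

// Sketch: convert one sRGB pixel (0...255 per channel) to an 8-bit gray value.
func grayValue(r: UInt8, g: UInt8, b: UInt8) -> UInt8 {
    // Rec. 709 weights; applied to gamma-encoded values for simplicity.
    let y = 0.2126 * Double(r) + 0.7152 * Double(g) + 0.0722 * Double(b)
    return UInt8(y.rounded())
}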
The pixel values printed from the buffer in the question look like this (condensed here; each 64-byte row of the buffer holds the 28 image pixels followed by 36 zero padding bytes):

1 252 255 253 255 255 255 253 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
253 255 253 254 253 255 254 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
... (the same 28-values-then-36-zeros pattern continues for the remaining rows)
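Given that layout, a sketch of the packing step mentioned above, which copies only the 28 meaningful bytes out of each padded row and scales them to 0.0...1.0 in one pass (the helper name modelInput(from:) is hypothetical):

import CoreVideo

// Sketch: pack the padded rows into a contiguous Float array, scaled to 0...1,
// ready to be copied into the TFLite interpreter's input tensor.
func modelInput(from pixelBuffer: CVPixelBuffer) -> [Float] {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }
    let width = CVPixelBufferGetWidth(pixelBuffer)
    let height = CVPixelBufferGetHeight(pixelBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
    guard let base = CVPixelBufferGetBaseAddress(pixelBuffer) else { return [] }
    let pixels = base.assumingMemoryBound(to: UInt8.self)
    var input = [Float]()
    input.reserveCapacity(width * height)
    for row in 0..<height {
        for col in 0..<width {
            input.append(Float(pixels[row * bytesPerRow + col]) / 255.0)
        }
    }
    return input   // 28 * 28 = 784 values for this model
}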
The code from the question:

let im = UIImage(named: "dotsgray")!
let i = (im.pixelBufferGray(width: 28, height: 28))!
i.normalize()
extension UIImage {
    // Renders the image into a single-channel (grayscale) pixel buffer.
    public func pixelBufferGray(width: Int, height: Int) -> CVPixelBuffer? {
        return pixelBuffer(width: width, height: height,
                           pixelFormatType: kCVPixelFormatType_OneComponent8,
                           colorSpace: CGColorSpaceCreateDeviceGray(),
                           alphaInfo: .none)
    }

    func pixelBuffer(width: Int, height: Int, pixelFormatType: OSType,
                     colorSpace: CGColorSpace, alphaInfo: CGImageAlphaInfo) -> CVPixelBuffer? {
        var maybePixelBuffer: CVPixelBuffer?
        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue]
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         width,
                                         height,
                                         pixelFormatType,
                                         attrs as CFDictionary,
                                         &maybePixelBuffer)
        guard status == kCVReturnSuccess, let pixelBuffer = maybePixelBuffer else {
            return nil
        }
        CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer)
        // The CGContext row stride comes from the pixel buffer, which may be
        // padded (64 bytes per row here, not 28).
        guard let context = CGContext(data: pixelData,
                                      width: width,
                                      height: height,
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                      space: colorSpace,
                                      bitmapInfo: alphaInfo.rawValue)
        else {
            return nil
        }
        UIGraphicsPushContext(context)
        context.translateBy(x: 0, y: CGFloat(height))
        context.scaleBy(x: 1, y: -1)
        self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        UIGraphicsPopContext()
        CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        return pixelBuffer
    }
}
extension CVPixelBuffer {
    func normalize() {
        // 1: Derive dimensions from the buffer's byte layout.
        let bytesPerRow = CVPixelBufferGetBytesPerRow(self)
        let totalBytes = CVPixelBufferGetDataSize(self)
        let width = bytesPerRow / MemoryLayout<UInt8>.size
        let height = totalBytes / bytesPerRow
        // 2: Lock the buffer before touching its memory.
        CVPixelBufferLockBaseAddress(self, CVPixelBufferLockFlags(rawValue: 0))
        // 3: Reinterpret the bytes as Doubles (this assumes the buffer
        // actually stores Doubles).
        let floatBuffer = unsafeBitCast(
            CVPixelBufferGetBaseAddress(self),
            to: UnsafeMutablePointer<Double>.self)
        // 4: Track the observed min and max pixel values.
        var minPixel: Double = 1.0
        var maxPixel: Double = 0.0
        // 5: First pass: find the value range.
        for i in 0 ..< width * height {
            let pixel = floatBuffer[i]
            minPixel = min(pixel, minPixel)
            maxPixel = max(pixel, maxPixel)
        }
        // 6: Compute the range.
        let range = maxPixel - minPixel
        // 7: Second pass: rescale every pixel into 0.0...1.0.
        for i in 0 ..< width * height {
            let pixel = floatBuffer[i]
            floatBuffer[i] = (pixel - minPixel) / range
        }
        // 8: Unlock when done.
        CVPixelBufferUnlockBaseAddress(self, CVPixelBufferLockFlags(rawValue: 0))
    }
}
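Putting the pieces together, the usage at the top of the post would then drop the normalize() call entirely; a sketch, using the hypothetical modelInput(from:) helper shown earlier:

let im = UIImage(named: "dotsgray")!
let buffer = im.pixelBufferGray(width: 28, height: 28)!
let input = modelInput(from: buffer)   // 784 Floats in 0.0...1.0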