Python 3.x: Fixing an inefficient image conversion from a PIL image to an OpenCV Mat


python-3.x numpy opencv python-imaging-library


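I am running a neural network on a live stream of 800x600 screenshots. Since I was only getting about 3 fps, I did some troubleshooting and measured roughly how long each step takes:

Screenshot: 12 ms
Image processing: 280 ms
Object detection and box visualisation: 16 ms
Displaying image: 0.5 ms

I am using mss to take the screenshots.

Here is the code without the object detection part: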

import numpy as np
import cv2
from PIL import Image
import mss
monitor = {"top": 40, "left": 0, "width": 800, "height": 600}

with mss.mss() as sct:
    while True:

        # # Screenshot:
        image = sct.grab(monitor)

        # # Image processing:
        image = Image.frombytes("RGB", image.size, image.bgra, "raw", "RGBX")
        (im_width, im_height) = image.size
        image_np = np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

        # # Object detection and box visualisation:
        # ...

        # # Displaying image:
        cv2.imshow("Object Detection", image_np)

Is there any way to speed this up?

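At 280 ms of processing per frame, you will get 3-4 frames per second. You have pretty much just two options.

Either share your code and hope we can improve it.

Or use multiprocessing with, say, 4 CPU cores: give the first frame to the first core, the second frame to the second core and so on, round-robin. You could then probably output a frame every 70 ms, giving about 14 fps.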
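As a minimal sketch of the multiprocessing idea, assuming the slow step can be isolated in a picklable function: `process_frame`, `process_stream` and the `executor_cls` parameter are hypothetical names, and the per-frame work is only a stand-in. Note that `Executor.map` hands frames to whichever worker is free rather than in strict round-robin order, but it returns results in input order.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

import numpy as np


def process_frame(frame):
    """Hypothetical stand-in for the slow per-frame step: drop the 4th channel."""
    return np.ascontiguousarray(frame[:, :, :3])


def process_stream(frames, workers=4, executor_cls=ProcessPoolExecutor):
    """Spread frames over a pool of workers; map() yields results in input order.

    ThreadPoolExecutor can be passed as executor_cls for a quick comparison.
    """
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(process_frame, frames))


if __name__ == "__main__":
    # Eight dummy 800x600 four-channel frames in place of real screenshots.
    frames = [np.full((600, 800, 4), i, dtype=np.uint8) for i in range(8)]
    results = process_stream(frames)
    print(len(results), results[0].shape)
```

Each frame has to be pickled and sent to a worker process, so part of the gain is lost to that overhead. Multiprocessing raises throughput, but it does not reduce the latency of an individual frame.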


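The problem is that your approach starts from the BGRA image format. That is a lot of data and probably unnecessary. There may be more efficient ways to grab a screenshot and convert it to an OpenCV image. This one takes about 56 ms on my slow machine: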

import ctypes
import datetime
import cv2
import numpy as np

from PIL import ImageGrab


# workaround to allow ImageGrab to capture the whole screen
user32 = ctypes.windll.user32
user32.SetProcessDPIAware()

# measure running time
start_time = datetime.datetime.now()

# grab a region of the screen; bbox is (left, top, right, bottom)
image = np.array(ImageGrab.grab(bbox=(40, 0, 800, 600)))

# convert from RGB to BGR order so that colors are displayed correctly
mat = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

# compute elapsed time
delta = datetime.datetime.now() - start_time
elapsed_time_ms = int(delta.total_seconds() * 1000)
print('* Elapsed time:', elapsed_time_ms, 'ms')

cv2.imshow('mat', mat)
cv2.waitKey()
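A single measurement like the one above can be noisy, because the first call often includes one-off setup costs. As a hedged sketch, a small helper can average several runs with `time.perf_counter`. The name `mean_time_ms` and the workload are hypothetical; in practice the lambda would wrap the grab-and-convert step.

```python
import time

import numpy as np


def mean_time_ms(func, repeats=20, warmup=2):
    """Return the mean wall-clock time of func() in milliseconds."""
    for _ in range(warmup):
        func()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) * 1000.0 / repeats


# Stand-in workload: reverse the channel order of an 800x600 frame.
frame = np.zeros((600, 800, 3), dtype=np.uint8)
elapsed = mean_time_ms(lambda: np.ascontiguousarray(frame[:, :, ::-1]))
print(f"* Mean elapsed time: {elapsed:.2f} ms")
```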


Replacing the lines under "Image processing:" in my original post with these lines solved my problem:

image = sct.grab(monitor)
image_np = np.array(image)
image_np = cv2.cvtColor(image_np, cv2.COLOR_RGBA2RGB)

I had already tried using only the first two lines, but I got the following error:

ValueError: Cannot feed value of shape (1, 600, 800, 4) for Tensor 'image_tensor:0', which has shape '(?, ?, ?, 3)'

It had not occurred to me that converting the image from RGBA to RGB would fix this. I now get about 30 fps.
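To illustrate why the original third line was so slow, here is a rough comparison that runs without a screen. It assumes `raw_bgra` is a stand-in for the bytes in `sct.grab(monitor).bgra`; `slow_convert` and `fast_convert` are hypothetical names, and NumPy slicing is used here in place of `cv2.cvtColor` to drop the fourth channel.

```python
import time

import numpy as np
from PIL import Image

WIDTH, HEIGHT = 800, 600

# Stand-in for the raw 4-bytes-per-pixel data of a screenshot.
rng = np.random.default_rng(0)
raw_bgra = rng.integers(0, 256, size=(HEIGHT, WIDTH, 4), dtype=np.uint8).tobytes()


def slow_convert(raw, size):
    """The question's path: bytes -> PIL image -> sequence of pixel tuples -> array."""
    image = Image.frombytes("RGB", size, raw, "raw", "RGBX")
    width, height = image.size
    return np.array(image.getdata()).reshape((height, width, 3)).astype(np.uint8)


def fast_convert(raw, size):
    """View the raw buffer as an array and drop the fourth channel."""
    width, height = size
    frame = np.frombuffer(raw, dtype=np.uint8).reshape((height, width, 4))
    return np.ascontiguousarray(frame[:, :, :3])


start = time.perf_counter()
slow = slow_convert(raw_bgra, (WIDTH, HEIGHT))
slow_ms = (time.perf_counter() - start) * 1000.0

start = time.perf_counter()
fast = fast_convert(raw_bgra, (WIDTH, HEIGHT))
fast_ms = (time.perf_counter() - start) * 1000.0

# The detector expects a batch of 3-channel images: (?, ?, ?, 3).
batch = np.expand_dims(fast, axis=0)

print(f"slow: {slow_ms:.1f} ms, fast: {fast_ms:.1f} ms, same pixels: {np.array_equal(slow, fast)}")
print("batch shape:", batch.shape)
```

The slow path converts every pixel into a Python object before NumPy sees it, while the fast path only reinterprets the existing buffer. Note that dropping the fourth channel keeps the remaining channels in their original order, so BGRA input ends up as BGR. If the model expects RGB, `cv2.COLOR_BGRA2RGB` reorders the channels as well.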


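Without the complete "Image processing" code that takes 280 ms, I can't help you.

I have added some comments to the code to show which lines do what. The 3 lines under "Image processing" take 280 ms. Is that what you meant?

Measure each line separately to find the culprit. My guess is that Image.frombytes() is the cause of the delay.

frombytes() only takes about 2 ms, and size takes 0.5 ms. The third line is what causes the delay.

Why make a PIL image from the screen-grab bytes, then pull all the bytes back out of the PIL image into a list with the wrong shape, then make a Numpy array from that, and then reshape it? You can grab directly into a Numpy array with

img = np.array(sct.grab(monitor))

I'm out of votes for today, but I agree. I can't see anything obviously bad.

Thanks for your reply. I have added the full code, but I don't really understand why that would help, since I had already included the "Image processing" lines in the first code block.

If you don't need the image in BGR format, you can make this solution faster by dropping cv2.cvtColor() and leaving the image as RGB.

By running your answer and trying a few things, I solved my problem myself. I used the mss module instead of ImageGrab to take the screenshot (100 ms versus 10 ms on my machine). Then I used RGBA2RGB because I got an error: Cannot feed value of shape (1, 600, 800, 4) for Tensor 'image_tensor:0', which has shape '(?, ?, ?, 3)'. It now runs at about 30 fps.

Also, what hardware are you running? You mention 56 ms, but on my laptop (i9-9980hk, rtx2080 max q) it takes twice that. I suspect ImageGrab is being limited by the integrated graphics (UHD graphics 630).

A laptop with an Intel i5-8250U @ 1.6GHz 1.8GHz. The video card is an Intel UHD Graphics 620.

We understand that the final implementation you arrived at may include improvements not originally mentioned in this thread, but if you think about who helped you identify the problem and then offered an alternative solution... you might conclude that this answer deserves the checkmark as the official solution to the question.

Your answer did help me, which is why I upvoted it, but wouldn't this page be more helpful to future visitors if I edited the title to "A faster way to feed a screenshot stream into an object detection neural network" and marked my own ans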