How do I interpret GStreamer's video/x-raw, format=RGB output?


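I have been working on a project in which I want a drone to fly autonomously. The drone creates a WiFi network that I connect to from my PC. The video stream is sent over a UDP connection as raw H.264. I display the stream with GStreamer: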

gst-launch-1.0 udpsrc port=5555 ! h264parse ! avdec_h264 ! videoconvert ! autovideosink sync=false

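This works well for viewing the stream. What I actually need, though, is a 3D RGB array for every image in the stream. So I tried to convert the decoded H.264 to RGB inside the GStreamer pipeline and send it to a netcat listener: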

gst-launch-1.0 udpsrc port=5555 ! h264parse ! avdec_h264 ! decodebin ! videoconvert ! video/x-raw,format=RGB ! tcpclientsink host=NETCAT_TCPLISTENER port=8888
nc -lnvp 8888 > rgb.txt

But I am having a hard time figuring out what the dump means. The wiki did not help much either.

Here is a snippet of a sample hexdump:

hexdump rgb.txt
.
.
.
0000100 232d 2b4c 4c21 212b 2b4c 4b21 202a 2a4b
0000110 4b20 202a 2a4b 4b20 202a 2a4a 4922 2129
0000120 2a45 4523 232a 2a42 4122 2129 2941 4021
0000130 2028 263e 3e1e 1e26 253d 3d1d 1d25 243c
0000140 3b1c 1b23 233b 3a1b 1a22 2137 3719 1921
0000150 2032 3217 1720 202f 2f16 1620 202f 2f16
0000160 1620 1f2e 2e15 151f 1e2d 2d14 141e 1d2c
0000170 2c13 131d 1d2c 2c13 131d 1d27 2714 141d
0000180 1b20 2014 141b 1c1c 1c13 131c 1c1c 1c13
0000190 131c 1b1b 1b12 121b 1a1a 1a11 111a 1919
00001a0 1910 1019 1919 1910 1019 181a 1a10 1018
00001b0 171c 1c10 1017 171c 1c10 1017 171c 1c10
00001c0 1017 171c 1c10 1017 171c 1c10 1017 171c
00001d0 1c10 1017 171c 1c10 1017 171c 1c10 1017
00001e0 191b 1b11 1119 1a1a 1a11 111a 1a1a 1a11
00001f0 111a 1a1a 1a11 111a 1a1a 1a11 111a 1a1a
0000200 1a11 111a 1a1a 1b11 121b 1b1b 1c12 131c
0000210 1d1d 1f14 161f 1f1f 2016 1720 2020 2017
0000220 1720 2020 2017 1720 2020 2017 1720 2020
0000230 2017 1720 2020 2017 1720 2020 2017 1720
0000240 1f1f 1f16 161f 1f1f 1f16 161f 1f1f 1f16
0000250 161f 1f1f 1f16 161f 1f1f 1f16 161f 1f1f
0000260 1f16 161f 1f1f 1f16 161f 1f1f 1f16 161f
0000270 2020 2017 1720 2020 2017 1720 2020 2017
0000280 1720 2020 2017 1720 2020 2017 1720 2020
0000290 2017 1720 2020 2017 1720 2020 2017 1720
00002a0 1f1f 1f16 161f 1f1f 1f16 161f 1f1f 1f16
00002b0 161f 1f1f 1f16 161f 1f1f 1f16 161f 1f1f
00002c0 1f16 161f 1f1f 1f16 161f 1f1f 1f16 161f
00002d0 1f1f 1f16 161f 1f1f 1f16 161f 1f1f 1f16
00002e0 161f 1f1f 1f16 161f 1f1f 1f16 161f 1f1f
00002f0 1f16 161f 1f1f 1f16 161f 1f1f 1f16 161f
0000300 1d1d 1d14 141d 1d1d 1d14 141d 1d1d 1d14
0000310 141d 1d1d 1d14 141d 1d1d 1d14 141d 1d1d
0000320 1d14 141d 1d1d 1d14 141d 1d1d 1d14 141d
0000330 1f1f 1f16 161f 1f1f 1f16 161f 1f1f 1f16
0000340 161f 1f1f 1f16 161f 1f1f 1f16 161f 1f1f
0000350 1f16 161f 1f1f 1f16 161f 1f1f 1f16 161f
0000360 1f1f 1f16 161f 1f1f 1f16 161f 1f1f 1f16
0000370 161f 1f1f 1f16 161f 1f1f 1f16 161f 1f1f
0000380 1f16 161f 1f1f 1f16 161f 1f1f 1f16 161f
0000390 1f1f 1f16 161f 1f1f 1f16 161f 1f1f 1f16
00003a0 161f 1f1f 1f16 161f 1f1f 1f16 161f 1f1f
00003b0 1f16 161f 1f1f 1f16 161f 1f1f 1f16 161f
00003c0 1f1f 1f16 161f 1f1f 1f16 161f 1f1f 1f16
00003d0 161f 1f1f 1f16 161f 1f1f 1f16 161f 1f1f
00003e0 1f16 161f 1f1f 1f16 161f 1f1f 1f16 161f
00003f0 1d1d 1d14 141d 1d1d 1d14 141d 1d1d 1d14
0000400 141d 1d1d 1f14 161f 1f1f 1f16 161f 1f1f
0000410 2016 1720 2020 2017 1720 2020 2017 1720
0000420 2121 2118 1821 2121 2118 1821 2121 2118
0000430 1821 2121 2118 1821 2121 2118 1821 2121
0000440 1517 ff13 1517 ff13 1517 ff13 1517 ff13
    .
    .
    .

So my question is: how do I interpret the RGB dump produced by video/x-raw,format=RGB? (My goal is to write a parser that returns a 3D RGB array for each frame in the stream.)
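As an illustration of the stated goal, here is a minimal sketch of a parser that replaces the netcat listener and reads frames straight from the TCP connection. The function names (`recv_exact`, `iter_frames`, `serve`) are hypothetical, and the 1280x720 default is an assumption taken from the script further down. It also assumes each frame is exactly width * height * 3 bytes with no row padding, which should hold when width * 3 is a multiple of 4.

```python
import socket

import numpy as np


def recv_exact(conn, n_bytes):
    """Read exactly n_bytes from conn; return None if the stream ends first."""
    chunks = []
    remaining = n_bytes
    while remaining > 0:
        chunk = conn.recv(min(remaining, 65536))
        if not chunk:  # peer closed the connection
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def iter_frames(conn, width, height):
    """Yield one (height, width, 3) uint8 array per complete frame."""
    frame_size = width * height * 3  # assumes tightly packed RGB rows
    while True:
        raw = recv_exact(conn, frame_size)
        if raw is None:
            return
        yield np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 3)


def serve(port=8888, width=1280, height=720):
    """Listen where tcpclientsink connects and print the first pixel of each frame."""
    with socket.create_server(("", port)) as server:
        conn, _addr = server.accept()
        with conn:
            for frame in iter_frames(conn, width, height):
                print(frame.shape, frame[0, 0])
```

Calling `serve()` before starting the pipeline would take the place of `nc -lnvp 8888`. TCP delivers a byte stream without frame boundaries, which is why the sketch loops until a full frame's worth of bytes has arrived.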

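As @MarkSetchell suggested, I have to take two hex digits per colour value, so each pixel is three such pairs. In addition, hexdump on Linux formats the hex values in an unhelpful way by default.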
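The formatting problem is most likely that hexdump, when run without options, prints 16-bit words in host byte order, so on a little-endian machine the two bytes of every four-digit group appear swapped. The sketch below (the helper name `hexdump_words_to_bytes` is hypothetical) undoes that for one line of the dump above, assuming a little-endian host.

```python
def hexdump_words_to_bytes(line: str) -> bytes:
    """Convert one default-format hexdump line back into file-order bytes."""
    _offset, *words = line.split()
    out = bytearray()
    for word in words:
        # Each group is a 16-bit value; on a little-endian host the low
        # byte is the one that comes first in the file.
        out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


line = "0000100 232d 2b4c 4c21 212b 2b4c 4b21 202a 2a4b"
data = hexdump_words_to_bytes(line)
print(data.hex(" "))
# 2d 23 4c 2b 21 4c 2b 21 4c 2b 21 4b 2a 20 4b 2a
```

In file order the repeating triple 4c 2b 21 becomes visible, which is what consecutive similar RGB pixels would look like. Running `hexdump -C rgb.txt` or `xxd rgb.txt` prints the bytes in file order directly and avoids the problem.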

The Python script I put together to render GStreamer's raw RGB output is:

import binascii

import matplotlib.pyplot as plt
import numpy as np

# Read the raw dump and split it into two-character hex strings,
# one per byte (that is, one per colour value).
with open('rgb.txt', 'rb') as f:
    hexdata = binascii.hexlify(f.read())

hexdata = hexdata.decode()
n = 2
hexlist = [hexdata[i:i+n] for i in range(0, len(hexdata), n)]

height = 720
width = 1280
depth = 3
array = np.zeros((height, width, depth), dtype=np.uint8)

# Fill the first frame value by value: rows, then columns, then R, G, B.
counter = 0
for y in range(height):
    for x in range(width):
        for z in range(depth):
            array[y][x][z] = int(hexlist[counter], 16)
            counter = counter + 1

# imshow renders a uint8 array of shape (height, width, 3) as an RGB image.
plt.imshow(array)
plt.show()
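The same result can be had without the hex round trip, and for every frame in the file rather than only the first. This is a hedged sketch, not the original author's code: `split_rgb_frames` is a hypothetical name, and the row-padding rule (each row rounded up to a multiple of 4 bytes) is an assumption about GStreamer's default RGB layout. For 1280x720 the padding is zero, so it makes no difference there.

```python
import numpy as np


def split_rgb_frames(raw: bytes, width: int, height: int) -> np.ndarray:
    """Return a uint8 array of shape (n_frames, height, width, 3)."""
    # Assumption: each row is padded to a multiple of 4 bytes.
    stride = (width * 3 + 3) // 4 * 4
    frame_size = stride * height
    n_frames = len(raw) // frame_size  # a trailing partial frame is ignored
    data = np.frombuffer(raw, dtype=np.uint8, count=n_frames * frame_size)
    rows = data.reshape(n_frames, height, stride)
    # Drop the padding bytes at the end of each row, then group into pixels.
    return rows[:, :, : width * 3].reshape(n_frames, height, width, 3)
```

With the file from the question, usage would look like `frames = split_rgb_frames(open('rgb.txt', 'rb').read(), 1280, 720)` followed by `plt.imshow(frames[0])`.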

Suppose your image is 640x480 pixels. You can probably use ImageMagick to turn it into a JPEG like this:

convert -depth 8 -size 640x480 RGB:rgb.txt -auto-level result.jpg

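It simply means that the first byte is red, the next byte is green, and the next byte is blue. The same then repeats for the second pixel, and so on.

@MarkSetchell Running convert -depth 8 -size 640x480 RGB:rgb.txt -auto-level result.jpg produces pictures, but they are full of noise and make no sense to me. Thanks for your answer, though. I would like to understand your second statement better. Take the last line, 1517 ff13 1517 ff13 1517 ff13 1517 ff13 1517 ff13. These are hexadecimal values, and 1517 in hex is 5399 in decimal, but an RGB value only goes up to 255, as in [255, 255, 255]. How should I handle these large numbers?

Can you add the first few lines of the dump, or share a complete binary file, and tell me your image dimensions?

Take two letters/digits per colour value. 15 => red is 21 out of a maximum of 255, 17 => green is 23 out of 255, ff => blue is 255 out of 255.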
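The arithmetic in that last comment can be sketched in a few lines. The helper name `hex_to_pixels` is hypothetical, and the digits are taken in the order the comment reads them. Note that hexdump's default output may show the two bytes of each four-digit group swapped, so the order in the file itself can differ.

```python
def hex_to_pixels(hex_string: str) -> list:
    """Split a string of hex digits into (R, G, B) tuples of 0-255 values."""
    raw = bytes.fromhex(hex_string)  # two hex digits become one byte
    usable = len(raw) - len(raw) % 3  # ignore an incomplete trailing pixel
    return [tuple(raw[i:i + 3]) for i in range(0, usable, 3)]


print(hex_to_pixels("1517ff"))  # [(21, 23, 255)]
```

Each value stays within 0 to 255 because it is read as a single byte; the figure 5399 only appears when two bytes are read together as one 16-bit number.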
