Extract images from .idx3-ubyte file or GZIP via Python


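I have created a simple face recognition function using the FaceRecognizer from OpenCV. It works fine with images of people.

Now I would like to run a test using handwritten characters instead of people. I came across the MNIST dataset, but it stores the images in a strange file format that I have never seen before.

I simply need to extract a few images from train-images.idx3-ubyte and save them in a folder as .gif files.

Or am I misunderstanding MNIST? If so, where can I get such a dataset?

EDIT

I also have the gzip file: train-images-idx3-ubyte.gz

I am trying to read the contents, but show() does not work, and if I call read() I see random symbols.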

import gzip
images = gzip.open("train-images-idx3-ubyte.gz", 'rb')
print(images.read())  # prints the raw bytes, which look like random symbols

EDIT

Managed to get some useful output by using:

with gzip.open('train-images-idx3-ubyte.gz','r') as fin:
    for line in fin:
        print('got line', line)

Somehow I now have to convert this into an image. Output:

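Download the training/testing images and labels:

train-images-idx3-ubyte.gz: training set images
train-labels-idx1-ubyte.gz: training set labels
t10k-images-idx3-ubyte.gz: test set images
t10k-labels-idx1-ubyte.gz: test set labels

Then uncompress them in a working directory, say samples/.

Get the python-mnist package from PyPI:

pip install python-mnist

Import the mnist package and read the training/testing images: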

from mnist import MNIST

mndata = MNIST('samples')

images, labels = mndata.load_training()
# or
images, labels = mndata.load_testing()

To display an image in the console:

import random

index = random.randrange(0, len(images))  # choose an index ;-)
print(mndata.display(images[index]))

You will get something like this:

............................
............................
............................
............................
............................
.................@@.........
..............@@@@@.........
............@@@@............
..........@@................
..........@.................
...........@................
...........@................
...........@...@............
...........@@@@@.@..........
...........@@@...@@.........
...........@@.....@.........
..................@.........
..................@@........
..................@@........
..................@.........
.................@@.........
...........@.....@..........
...........@....@@..........
............@@@@............
.............@..............
............................
............................
............................

Explanation:

Each image in the images list is a Python list of unsigned bytes.

The labels are a Python array of unsigned bytes.
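
Since each image arrives as one flat list of 784 values, it has to be cut into 28 rows before it looks like a picture. Here is a rough sketch of that step using only the standard library. The helper names to_rows and render_ascii are made up for this example, and the threshold of 200 is an assumption, not necessarily what python-mnist uses internally.

```python
def to_rows(flat_pixels, width=28):
    """Split a flat list of pixel values into rows of `width` pixels."""
    if len(flat_pixels) % width != 0:
        raise ValueError("pixel count is not a multiple of the row width")
    return [flat_pixels[i:i + width] for i in range(0, len(flat_pixels), width)]


def render_ascii(flat_pixels, width=28, threshold=200):
    """Render an image as text: '@' for bright pixels, '.' for the rest.

    The threshold of 200 is an assumption chosen for this sketch.
    """
    rows = to_rows(flat_pixels, width)
    return "\n".join(
        "".join("@" if value > threshold else "." for value in row)
        for row in rows
    )


# Tiny synthetic 4x4 "image" so the sketch runs without the dataset.
demo = [0, 255, 255, 0,
        0, 255, 0, 0,
        0, 255, 0, 0,
        0, 0, 0, 0]
print(render_ascii(demo, width=4))
```

With the real data you would pass one element of the images list and keep the default width of 28.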

(Using only matplotlib, gzip and numpy)

Extract the image data:
import gzip
f = gzip.open('train-images-idx3-ubyte.gz','r')

image_size = 28
num_images = 5

import numpy as np
f.read(16)
buf = f.read(image_size * image_size * num_images)
data = np.frombuffer(buf, dtype=np.uint8).astype(np.float32)
data = data.reshape(num_images, image_size, image_size, 1)

Plot an image:
import matplotlib.pyplot as plt
image = np.asarray(data[2]).squeeze()
plt.imshow(image)
plt.show()


Print the first 50 labels:
f = gzip.open('train-labels-idx1-ubyte.gz','r')
f.read(8)
for i in range(0,50):   
    buf = f.read(1)
    labels = np.frombuffer(buf, dtype=np.uint8).astype(np.int64)
    print(labels)
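
The f.read(16) and f.read(8) calls above skip the file headers, and the image size of 28 is hard-coded. As a possible variation, the header can be parsed instead of skipped, so the counts and dimensions come from the file itself. This is only a sketch: the function names are invented here, and the demo feeds in a tiny synthetic file built in memory rather than the real dataset. The magic numbers 2051 (images) and 2049 (labels) are the values the IDX format uses for these two file types.

```python
import gzip
import io
import struct


def read_idx_images(fileobj):
    """Read an IDX3 image stream; return (count, rows, cols, pixel_bytes)."""
    # Header: four big-endian unsigned 32-bit integers (16 bytes in total).
    magic, count, rows, cols = struct.unpack(">IIII", fileobj.read(16))
    if magic != 2051:
        raise ValueError("not an IDX3 unsigned-byte image file")
    pixels = fileobj.read(count * rows * cols)
    if len(pixels) != count * rows * cols:
        raise ValueError("file is shorter than its header claims")
    return count, rows, cols, pixels


def read_idx_labels(fileobj):
    """Read an IDX1 label stream; return the labels as a list of ints."""
    # Header: two big-endian unsigned 32-bit integers (8 bytes in total).
    magic, count = struct.unpack(">II", fileobj.read(8))
    if magic != 2049:
        raise ValueError("not an IDX1 unsigned-byte label file")
    return list(fileobj.read(count))


# Synthetic stand-in for train-images-idx3-ubyte.gz: two 2x2 images.
fake = struct.pack(">IIII", 2051, 2, 2, 2) + bytes(range(8))
with gzip.open(io.BytesIO(gzip.compress(fake)), "rb") as f:
    count, rows, cols, pixels = read_idx_images(f)
print(count, rows, cols, len(pixels))
```

For the real files, replace the in-memory stream with gzip.open('train-images-idx3-ubyte.gz', 'rb').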

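Actually, you can use the idx2numpy package available on PyPI. It is very simple to use and converts the data directly to numpy arrays.
Here is what you have to do:
Download the data
Download the MNIST dataset from the official website.

If you are using Linux, you can use wget to fetch it from the command line itself. Just run:
wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz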

Decompress the data
Unzip or decompress the data. On Linux, you can do this from the command line.
Ultimately, you should have the following files:

data/train-images-idx3-ubyte
data/train-labels-idx1-ubyte
data/t10k-images-idx3-ubyte
data/t10k-labels-idx1-ubyte

The prefix data/ is only there because I extracted them into a folder named data. Judging by your question, you did fine up to this point, so keep reading.
Using idx2numpy
Here is some simple Python code that reads everything from the decompressed files as numpy arrays:
import idx2numpy
import numpy as np
file = 'data/train-images-idx3-ubyte'
arr = idx2numpy.convert_from_file(file)
# arr is now an np.ndarray of shape (60000, 28, 28)

You can now use it with OpenCV just the same way you display any other image, using something like
cv.imshow("Image", arr[4])

To install idx2numpy, you can use PyPI (the pip package manager). Simply run the command:
pip install idx2numpy

Download the data
Download the MNIST dataset.
Decompress the data
Ultimately, you should have the following files:
train-images-idx3-ubyte
train-labels-idx1-ubyte
t10k-images-idx3-ubyte
t10k-labels-idx1-ubyte
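
If no unzip tool is at hand, Python's own gzip module can do the decompression step. Below is a minimal sketch, assuming the archives follow the usual naming where the decompressed file is the same name without the .gz suffix. The function name gunzip_file and the demo file name are invented for this example.

```python
import gzip
import os
import shutil
import tempfile


def gunzip_file(gz_path):
    """Decompress `gz_path` next to itself, dropping the '.gz' suffix."""
    if not gz_path.endswith(".gz"):
        raise ValueError("expected a path ending in .gz")
    out_path = gz_path[:-3]
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        # Stream the data so large files are not loaded into memory at once.
        shutil.copyfileobj(src, dst)
    return out_path


# Demo with a throwaway file so the sketch runs without the dataset.
workdir = tempfile.mkdtemp()
demo_gz = os.path.join(workdir, "demo-idx3-ubyte.gz")
with gzip.open(demo_gz, "wb") as f:
    f.write(b"\x00\x00\x08\x03")
demo_out = gunzip_file(demo_gz)
print(os.path.basename(demo_out))
```

For the real data you would call gunzip_file once for each of the four downloaded archives.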

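Use idx2numpy
pip install idx2numpy

import numpy as np
import idx2numpy
import matplotlib.pyplot as plt
imagefile = 'train-images.idx3-ubyte'
imagearray = idx2numpy.convert_from_file(imagefile)
plt.imshow(imagearray[4], cmap=plt.cm.binary)
plt.show()  # needed when running as a script rather than in a notebook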

I had the same problem.
Whenever I extracted the file, the extension was not removed, so I still had:
train-images-idx3-ubyte.gz

Once I removed the .gz from the file name, my problem was solved.
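Here is a ready-made function for you! (It reads the file as raw binary data; each pixel comes back as an unsigned byte from 0 to 255, not as 0 or 1.)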
import gzip
import numpy as np


def training_images():
    with gzip.open('data/train-images-idx3-ubyte.gz', 'r') as f:
        # first 4 bytes is a magic number
        magic_number = int.from_bytes(f.read(4), 'big')
        # second 4 bytes is the number of images
        image_count = int.from_bytes(f.read(4), 'big')
        # third 4 bytes is the row count
        row_count = int.from_bytes(f.read(4), 'big')
        # fourth 4 bytes is the column count
        column_count = int.from_bytes(f.read(4), 'big')
        # rest is the image pixel data, each pixel is stored as an unsigned byte
        # pixel values are 0 to 255
        image_data = f.read()
        images = np.frombuffer(image_data, dtype=np.uint8)\
            .reshape((image_count, row_count, column_count))
        return images


def training_labels():
    with gzip.open('data/train-labels-idx1-ubyte.gz', 'r') as f:
        # first 4 bytes is a magic number
        magic_number = int.from_bytes(f.read(4), 'big')
        # second 4 bytes is the number of labels
        label_count = int.from_bytes(f.read(4), 'big')
        # rest is the label data, each label is stored as unsigned byte
        # label values are 0 to 9
        label_data = f.read()
        labels = np.frombuffer(label_data, dtype=np.uint8)
        return labels
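
The question also asked for the images to be saved as files in a folder. One way to do that without any imaging library is the binary PGM format, which is just a short text header followed by the raw bytes. Note that this swaps in PGM for the .gif format mentioned in the question; converting to GIF would need an imaging library. The names save_pgm and save_all and the file naming pattern are made up for this sketch.

```python
import os
import tempfile


def save_pgm(path, pixels, rows, cols):
    """Write one grayscale image as a binary PGM (P5) file."""
    pixels = bytes(pixels)
    if len(pixels) != rows * cols:
        raise ValueError("pixel count does not match rows * cols")
    with open(path, "wb") as f:
        # PGM header: magic, width and height, maximum gray value.
        f.write(b"P5\n%d %d\n255\n" % (cols, rows))
        f.write(pixels)


def save_all(folder, images, rows, cols):
    """Save every image in `images` into `folder`; return the paths."""
    os.makedirs(folder, exist_ok=True)
    paths = []
    for index, image in enumerate(images):
        path = os.path.join(folder, "image_%05d.pgm" % index)
        save_pgm(path, image, rows, cols)
        paths.append(path)
    return paths


# Demo with two synthetic 2x2 images written to a temporary folder.
demo_paths = save_all(tempfile.mkdtemp(),
                      [[0, 255, 255, 0], [255, 0, 0, 255]], 2, 2)
print(len(demo_paths))
```

With numpy arrays such as the ones returned by training_images(), each image can be passed as image.tobytes() together with its row and column counts.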

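The python-mnist package on PyPI has some code to do this.

This page describes the file format of .idx3-ubyte.

In case anyone is wondering where to find all these datasets: here is the link ->

Please note that after extracting the files you must rename the dot to a hyphen (otherwise you get a file-missing error). For example, t10k-images.idx3-ubyte must be renamed to t10k-images-idx3-ubyte.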

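Are f.read(16) and f.read(8) there to skip the non-image information?

Rewrote it now to be easier to understand.

Yes, they skip the header, whose first two bytes are always 0. Read more about the IDX (MNIST) format:

But you wrote 100 labels, then changed it to 50?

Thanks, fixed. I feel there is no added value in showing a lot of data on screen when it is only displayed vertically. It has its uses when it is stacked horizontally and vertically.

Hi, if I want to display the images from train-labels-idx1-ubyte (already without .gz), what do I have to do?

Is there a way to get separate images instead of the mixed ones?

Nice end-to-end tutorial. This utility works not only for the digit MNIST but also for Fashion-MNIST (can be found here), or any other file in idx format.

What does "big" mean in the from_bytes() function?

"big" means big-endian, which defines the byte order. In big-endian, the most significant byte of a word is stored at the smallest memory address.

Very nice. I tried most of the answers and only this one worked perfectly.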
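
On the "big" argument discussed in the comments: a quick way to see the difference is to decode the same four bytes both ways. The bytes below are the magic number at the start of an IDX3 image file; the variable names are just for illustration.

```python
# First four bytes of an IDX3 image file: two zero bytes, then 0x08
# (the code for unsigned-byte data) and 0x03 (the number of dimensions).
magic_bytes = b"\x00\x00\x08\x03"

# Big-endian: the first byte is the most significant one.
as_big = int.from_bytes(magic_bytes, "big")

# Little-endian: the first byte is the least significant one.
as_little = int.from_bytes(magic_bytes, "little")

print(as_big)     # 2051
print(as_little)  # 50855936
```

Only the big-endian reading gives the expected 2051, which is why the functions above pass "big".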