Python: plotting images and labels as a bar chart using matplotlib and numpy


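I have to plot the distribution of images in a dataset as a bar chart. I have looked into several approaches, but none of them worked.

I have two numpy arrays:

X_train - shape (20000, 32, 32, 3)
y_train - shape (20000)

labels - a dictionary mapping label index to label string, with 50 entries.

So X_train contains the images and y_train contains the corresponding label indices.

I need to plot a bar chart of X_train against the 50 labels, showing the distribution of the number of images per label.

Should I first group the images in X_train by their corresponding index in y_train? How does that tie in with the matplotlib.bar API call?

Or should I use the numpy histogram API?

Any help is much appreciated.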

One way is to just use plt.hist with some additional arguments. You could use something like this:

In [53]: y
Out[53]: array([0, 0, 1, 2, 1])

In [54]: plt.hist(y, align='mid', range=(np.min(y), np.max(y)+1), bins=50)
In [55]: plt.xlabel("labels")
In [56]: plt.ylabel("image counts")
In [57]: plt.show()

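The plot shows that labels 0 and 1 each occur twice and label 2 occurs once. In your case, pass y_train as the labels and the plot will show their counts. Feel free to change the number of bins to match your labels.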
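
As an alternative to plt.hist, the counts can be computed explicitly and passed to a bar chart, which also makes it easy to use the label strings as tick names. The following is a minimal sketch, not the answerer's code: the helper name plot_label_counts and the synthetic y_train and labels are assumptions standing in for the data described in the question. Only the label array is needed; the images themselves do not have to be grouped.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_label_counts(y, labels):
    """Draw one bar per label index showing how many entries of y carry it.

    y      : 1-D array of non-negative integer label indices
    labels : dict mapping label index -> label string
    """
    n_labels = len(labels)
    # counts[i] is the number of samples whose label index is i
    counts = np.bincount(y, minlength=n_labels)
    positions = np.arange(n_labels)

    fig, ax = plt.subplots(figsize=(12, 4))
    bars = ax.bar(positions, counts)
    ax.set_xticks(positions)
    ax.set_xticklabels([labels[i] for i in range(n_labels)], rotation=90)
    ax.set_xlabel("labels")
    ax.set_ylabel("image counts")
    fig.tight_layout()
    return counts, bars


# Synthetic stand-ins for the question's data (assumed, not the real dataset)
rng = np.random.default_rng(0)
y_train = rng.integers(0, 50, size=20000)
labels = {i: f"label_{i}" for i in range(50)}

counts, bars = plot_label_counts(y_train, labels)
```

Call plt.show() afterwards to display the figure. If the labels were stored with shape (N, 1), flatten them with y.ravel() before counting.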

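This question was posted 3 years ago, so you may no longer need an answer, but this might help others who are looking for one. This uses the CIFAR10 dataset, with the training data split into training and validation sets: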

import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.1, random_state=1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_val = x_val.astype('float32')


classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# Count the samples of each class without modifying (or reloading) the arrays.
# cifar10.load_data() returns labels with shape (N, 1), so flatten them first.
list_classes_train = np.bincount(y_train.ravel(), minlength=len(classes)).tolist()
list_classes_val = np.bincount(y_val.ravel(), minlength=len(classes)).tolist()

x = np.arange(len(classes))
width = 0.35

fig, ax = plt.subplots()

rects1 = ax.bar(x - width/2, list_classes_train, width, label='train')
rects2 = ax.bar(x + width/2, list_classes_val, width, label='val')

ax.set_ylabel('data points')
ax.set_title('Training and Validation set')
ax.set_xticks(x)
ax.set_xticklabels(classes, rotation = 90)
ax.legend()


def autolabel(rects):
    """Attach a text label above each bar in *rects*, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')


autolabel(rects1)
autolabel(rects2)

fig.tight_layout()

plt.show()
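
The result is a grouped bar chart with a train bar and a val bar for each class, each annotated with its count.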

To plot only the training set:

y_pos = range(len(classes))

# Draw the bars once and keep the returned container for annotating
bars = plt.bar(y_pos, list_classes_train)

# Rotate the class names on the x-axis
plt.xticks(y_pos, classes, rotation=90)

for rect in bars:
    height = rect.get_height()
    plt.text(rect.get_x() + rect.get_width() / 2.0, height, '%d' % int(height), ha = 'center', va = 'bottom')

plt.show()

The result is a bar chart of the training-set counts per class, with each count printed above its bar.
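
On Matplotlib 3.4 or newer, the manual annotation loops above can likely be replaced with Axes.bar_label, which writes each bar's height next to it. The sketch below is an illustration under that version assumption; the three classes and their counts are hypothetical numbers chosen only to keep the example self-contained, not real CIFAR10 statistics.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical per-class counts, used only to make the example self-contained
classes = ['airplane', 'automobile', 'bird']
counts_train = [4500, 4480, 4520]
counts_val = [500, 520, 480]

x = np.arange(len(classes))
width = 0.35

fig, ax = plt.subplots()
rects_train = ax.bar(x - width / 2, counts_train, width, label='train')
rects_val = ax.bar(x + width / 2, counts_val, width, label='val')

# bar_label writes each bar's height above it (requires Matplotlib >= 3.4);
# padding is the gap between bar and text, in points
texts_train = ax.bar_label(rects_train, padding=3)
texts_val = ax.bar_label(rects_val, padding=3)

ax.set_ylabel('data points')
ax.set_xticks(x)
ax.set_xticklabels(classes, rotation=90)
ax.legend()
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. On older Matplotlib versions, the autolabel helper shown earlier remains the way to go.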