Python: plotting images and labels as a bar chart with matplotlib and numpy
I have to plot the distribution of images in a dataset as a bar chart. I have looked into several approaches, but none of them worked.
I have two numpy arrays:
X_train - shape (20000, 32, 32, 3)
y_train - shape (20000)
labels - a dictionary mapping label index to label string, 50 entries
So X_train contains the images and y_train contains the corresponding label indices.
I need to plot a bar chart of X_train by the 50 labels, showing how many images each label has.
Should I first group the images in X_train by their corresponding index in y_train? How does that map onto the matplotlib.bar API call? Or should I use the numpy histogram API?
Any help is much appreciated.

One approach is to just use plt.hist with a few additional parameters. You could use something like this:
In [1]: import numpy as np
In [2]: import matplotlib.pyplot as plt
In [3]: y = np.array([0, 0, 1, 2, 1])
In [4]: plt.hist(y, align='mid', range=(np.min(y), np.max(y)+1), bins=50)
In [5]: plt.xlabel("labels")
In [6]: plt.ylabel("image counts")
In [7]: plt.show()
The plot shows that labels 0 and 1 each occur twice and label 2 occurs once.
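With integer labels, a fixed bins=50 puts the bars at positions that do not line up with the label values. A minimal sketch of one alternative, assuming the labels are the integers 0..n_classes-1: place the bin edges at half-integers so that each label gets exactly one bin. The helper name plot_label_hist and the rwidth value are illustrative choices, not part of the original answer.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_label_hist(y, n_classes):
    """Histogram of integer labels with one bin per label (hypothetical helper)."""
    # Edges at -0.5, 0.5, 1.5, ... so every integer label is centred in its own bin
    edges = np.arange(n_classes + 1) - 0.5
    fig, ax = plt.subplots()
    counts, _, _ = ax.hist(y, bins=edges, rwidth=0.8)
    ax.set_xticks(np.arange(n_classes))
    ax.set_xlabel("labels")
    ax.set_ylabel("image counts")
    return fig, counts


# Same toy labels as in the session above
fig, counts = plot_label_hist(np.array([0, 0, 1, 2, 1]), n_classes=3)
```

Calling plt.show() afterwards displays the figure; counts holds the per-label totals that the bars represent.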
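Take the labels from y_train and plot them by their counts. Feel free to change the number of bins to suit your labels.

Well, this question was posted 3 years ago, so maybe you no longer need an answer, but this might help others who are looking for one. This uses the CIFAR10 dataset, with the training data split into training and validation sets: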
from matplotlib import pyplot
from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.1, random_state=1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_val = x_val.astype('float32')
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
list_classes_train = []
list_classes_val = []
for i in range(len(np.unique(y_train))):
    # y_train has shape (N, 1), so np.where returns (row_idx, col_idx); keep the rows
    idx_train = np.where(y_train == i)[0]
    idx_val = np.where(y_val == i)[0]
    #print("The training samples of class {} -> {} is {}" .format(i, classes[i], len(idx_train)))
    #print("The validation samples of class {} -> {} is {}" .format(i, classes[i], len(idx_val)))
    # Count the samples of class i without overwriting the full arrays,
    # so the dataset does not have to be reloaded afterwards
    list_classes_train.append(len(idx_train))
    list_classes_val.append(len(idx_val))
x = np.arange(len(classes))
width = 0.35
fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, list_classes_train, width, label='train')
rects2 = ax.bar(x + width/2, list_classes_val, width, label='val')
ax.set_ylabel('data points')
ax.set_title('Training and Validation set')
ax.set_xticks(x)
ax.set_xticklabels(classes, rotation = 90)
ax.legend()
def autolabel(rects):
    """Attach a text label above each bar in *rects*, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')
autolabel(rects1)
autolabel(rects2)
fig.tight_layout()
plt.show()
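The result is a grouped bar chart with one train bar and one val bar per class, each labelled with its count.
To plot only the training set: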
y_pos = range(len(classes))
# Draw the bars once and keep the returned container for labelling
bars = plt.bar(y_pos, list_classes_train)
# Class names as rotated x tick labels
plt.xticks(y_pos, classes, rotation=90)
for rect in bars:
    height = rect.get_height()
    # Write the count just above each bar
    plt.text(rect.get_x() + rect.get_width() / 2.0, height, '%d' % int(height), ha='center', va='bottom')
plt.show()
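The result is a single bar chart of the training counts per class, with each bar labelled by its count.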
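For the setup in the original question (a 1-D y_train of label indices plus a labels dictionary), the images in X_train do not need to be grouped at all: the counts come from y_train alone. A minimal sketch, assuming the indices run from 0 to 49; the random y_train and the class_0 ... class_49 names are stand-ins for the real data, which the question does not show.

```python
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data (assumption): 20000 label indices over 50 classes, as described in the question
rng = np.random.default_rng(0)
n_classes = 50
y_train = rng.integers(0, n_classes, size=20000)
labels = {i: f"class_{i}" for i in range(n_classes)}  # hypothetical index -> name mapping

# np.bincount counts how often each non-negative integer occurs;
# minlength keeps a zero entry for any class that has no images
counts = np.bincount(y_train, minlength=n_classes)

fig, ax = plt.subplots(figsize=(12, 4))
positions = np.arange(n_classes)
ax.bar(positions, counts)
ax.set_xticks(positions)
ax.set_xticklabels([labels[i] for i in range(n_classes)], rotation=90)
ax.set_xlabel("labels")
ax.set_ylabel("image counts")
fig.tight_layout()
```

If y_train has shape (N, 1), as with the Keras CIFAR10 loader, flatten it first with y_train.ravel(), since np.bincount expects a 1-D array. plt.show() then displays the chart.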