Python 3.x: classification report using the predict_generator method

python-3.x machine-learning

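I am building a translator for an indigenous script, using a dataset with one class per character. I know very little about machine learning and have only built a 2-class image classifier before. This is my original code. It runs fine, but it only shows me a confusion matrix. I need a classification report (F1 score and so on), and I can't work out how to change the code to get one.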

import numpy as np
from sklearn.linear_model import LogisticRegression
from tensorflow import keras, metrics
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model
from sklearn.metrics import confusion_matrix
import itertools
import matplotlib.pyplot as plt
from webencodings import labels
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

train_path=r'C:\Users\Acer\imagerec\BAYBAYIN\TRAIN'
valid_path=r'C:\Users\Acer\imagerec\BAYBAYIN\VAL'
test_path=r'C:\Users\Acer\imagerec\BAYBAYIN\TEST'

class_labels=['A', 'BA', 'KA', 'GA', 'HA', '1', '2', '3', '4', '5', '6', '7',
              '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19',
              '20', '21', '22', '23', '24', '25', '26', '28', '29', '30', '31', '32',
              '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44']

train_batches=ImageDataGenerator(preprocessing_function=keras.applications.xception.preprocess_input)\
    .flow_from_directory(train_path, target_size=(299,299),classes=class_labels,batch_size=5)
valid_batches=ImageDataGenerator(preprocessing_function=keras.applications.xception.preprocess_input)\
    .flow_from_directory(valid_path, target_size=(299,299),classes=class_labels,batch_size=5)
test_batches=ImageDataGenerator(preprocessing_function=keras.applications.xception.preprocess_input)\
    .flow_from_directory(test_path, target_size=(299,299),classes=class_labels,batch_size=5, shuffle=False)

base_model=keras.applications.vgg19.VGG19(include_top=False)

x=base_model.output
x=GlobalAveragePooling2D()(x)
x=Dense(1024, activation='relu')(x)
x=Dense(48, activation='softmax')(x)
model=Model(inputs=base_model.input, outputs=x)


base_model.trainable = False

N=1

print("HANG ON LEARNING IN PROGRESS...")

model.compile(Adam(lr=.0001),loss='categorical_crossentropy', metrics=['accuracy'])
history=model.fit_generator(train_batches, steps_per_epoch=1290, validation_data=valid_batches,
                            validation_steps=90,epochs=N,verbose=1)

print("[INFO]evaluating model...")

test_labels=test_batches.classes
predictions=model.predict_generator(test_batches, steps=28, verbose=1)


import matplotlib.pyplot as plt
import numpy as np


plt.imshow(np.random.random((48,48)), interpolation='nearest')
plt.xticks(np.arange(0,48), ['A', 'BA', 'KA', 'GA', 'HA', '1', '2', '3', '4', '5', '6', '7',
              '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19',
              '20', '21', '22', '23', '24', '25', '26', '28', '29', '30', '31', '32',
              '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44'])
plt.yticks(np.arange(0,48),['A', 'BA', 'KA', 'GA', 'HA', '1', '2', '3', '4', '5', '6', '7',
              '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19',
              '20', '21', '22', '23', '24', '25', '26', '28', '29', '30', '31', '32',
              '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44'])



plt.show()
model.save("X19baybayin.h5")

How do I use predictions? Can I use it as my y_pred, and what should I use as y_true?

TL;DR

#you can unpack the step of creating the generator 
test_datagen = ImageDataGenerator(preprocessing_function=keras.applications.xception.preprocess_input)
test_generator = test_datagen.flow_from_directory(train_path, target_size=(299,299),classes=class_labels,batch_size=5)
##
##your code goes here
##
predictions=model.predict_generator(test_generator, steps=28, verbose=1)
print(classification_report(test_generator[1], predictions))

TL;DR

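Use test_batches.classes as y_true.

Use np.argmax(predictions, axis=-1) as y_pred.

I assume each sample belongs to exactly one class, because you are using 'softmax' and 'categorical_crossentropy', so you need the single most probable class for each sample (a multi-class classification problem).

Clarification: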

# import classification_report
from sklearn.metrics import classification_report

# get the ground truth of your data. 
test_labels=test_batches.classes 

# predict the probability distribution of the data
predictions=model.predict_generator(test_batches, steps=28, verbose=1)

# get the class with highest probability for each sample
y_pred = np.argmax(predictions, axis=-1)

# get the classification report
print(classification_report(test_labels, y_pred))
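
To make the steps above concrete, here is a small self-contained sketch that runs without the trained model. The three class names, the labels in `y_true`, and the probability rows in `probs` are made-up stand-ins for `class_labels`, `test_batches.classes`, and the output of the predict call; only the `argmax` and `classification_report` steps mirror the snippet above.

```python
import numpy as np
from sklearn.metrics import classification_report, f1_score

# Hypothetical stand-ins: 3 classes and 6 samples instead of 48 classes and the real test set.
class_names = ["A", "BA", "KA"]
y_true = np.array([0, 0, 1, 1, 2, 2])  # plays the role of test_batches.classes

# Plays the role of the model output: one row of softmax probabilities per sample.
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],  # misclassified: true class is 0, highest probability is class 1
    [0.1, 0.8, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
])

# Collapse each probability row to the index of its most probable class.
y_pred = np.argmax(probs, axis=-1)

# Both arrays must have one entry per sample, in the same order.
assert len(y_true) == len(y_pred)

# Per-class precision, recall and F1, labelled with readable class names.
report = classification_report(y_true, y_pred, target_names=class_names)
print(report)

# A single summary number, if that is all you need.
macro_f1 = f1_score(y_true, y_pred, average="macro")
print(f"macro F1: {macro_f1:.3f}")
```

With the real data, passing `target_names=class_labels` should label the rows correctly, assuming the generator was built with `classes=class_labels` (as in the question) so that the class indices follow the order of that list.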

Note: predict_generator is being deprecated; use model.predict instead.
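
The `steps=28` in the snippet follows from the test-set size: the error message further down reports 140 predictions, and 140 images at `batch_size=5` is 28 batches. A hedged sketch of that arithmetic, with `prediction_steps` as a hypothetical helper name (it is not part of Keras):

```python
import math


def prediction_steps(n_samples: int, batch_size: int) -> int:
    """Number of batches needed to cover every sample exactly once."""
    return math.ceil(n_samples / batch_size)


# 140 test images at batch_size=5, as in the question.
steps = prediction_steps(140, 5)
print(steps)

# If the set size is not a multiple of the batch size, the last batch is partial.
print(prediction_steps(143, 5))

# Assumed usage with the objects from the question (not executed here):
# predictions = model.predict(test_batches, steps=steps, verbose=1)
```

Too few steps would leave `predictions` shorter than `test_batches.classes`, and the report would fail on the length check. Keeping `shuffle=False` on the test generator, as the question does, is what keeps the prediction order aligned with `test_batches.classes`.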

Hiya, thanks for the help. Unfortunately, I got this error:

Traceback (most recent call last):
  File "C:/Users/Acer/PycharmProjects/translator/venv/TRY.py", line 65, in <module>
    print(classification_report(test_generator[1], predictions))
NameError: name 'classification_report' is not defined

You should import it from sklearn, using: from sklearn.metrics import classification_report

Traceback (most recent call last):
  File "C:/Users/Acer/PycharmProjects/translator/venv/TRY.py", line 65, in <module>
    print(classification_report(test_generator[1], predictions))
  File "C:\Users\Acer\Anaconda3\envs\TRANSLATOR\lib\site-packages\sklearn\metrics\_classification.py", line 1967, in classification_report
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Users\Acer\Anaconda3\envs\TRANSLATOR\lib\site-packages\sklearn\utils\validation.py", line 212, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [2, 140]
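
The `[2, 140]` in that last error can be read as follows: indexing a Keras directory iterator, as in `test_generator[1]`, returns one batch as an `(images, labels)` tuple, so its length is 2, while `predictions` has one row per test image. The sketch below reproduces the mismatch with plain NumPy arrays and scikit-learn's own length check; `fake_batch`, the 8x8 image size, and the random values are illustrative assumptions, not the real data.

```python
import numpy as np
from sklearn.utils import check_consistent_length

rng = np.random.default_rng(0)
n_samples, n_classes, batch_size = 140, 48, 5

# Stand-in for test_generator[1]: a single batch as (images, one-hot labels).
fake_batch = (
    np.zeros((batch_size, 8, 8, 3)),   # images (tiny placeholder size)
    np.eye(n_classes)[:batch_size],    # one-hot labels for that batch only
)

# Stand-in for the model output: one probability row per test image.
predictions = rng.random((n_samples, n_classes))

print(len(fake_batch), len(predictions))

# The same check that classification_report runs on its inputs.
try:
    check_consistent_length(fake_batch, predictions)
except ValueError as err:
    message = str(err)
    print(message)

# Fix: integer labels for the whole test set plus argmax of the predictions.
y_true = rng.integers(0, n_classes, size=n_samples)  # stands in for test_batches.classes
y_pred = np.argmax(predictions, axis=-1)
check_consistent_length(y_true, y_pred)  # passes silently: both have 140 entries
```

Note also that the `test_generator` snippet reads from `train_path` and leaves shuffling at its default, so even with matching lengths the labels would not line up with the predictions; building the generator from `test_path` with `shuffle=False`, as `test_batches` does in the question, avoids that.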
