Python 3.x: Classification report using the predict_generator method


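I'm building a native-language translator using a dataset for each letter. I know very little about machine learning and have only built a two-class image classifier before. This is my original code. It works, but it only shows me a confusion matrix. I need a classification report with metrics such as the F1 score, and I can't figure out how to change the code to get one.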

import numpy as np
from sklearn.linear_model import LogisticRegression
from tensorflow import keras, metrics
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model
from sklearn.metrics import confusion_matrix
import itertools
import matplotlib.pyplot as plt
from webencodings import labels
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

train_path=r'C:\Users\Acer\imagerec\BAYBAYIN\TRAIN'
valid_path=r'C:\Users\Acer\imagerec\BAYBAYIN\VAL'
test_path=r'C:\Users\Acer\imagerec\BAYBAYIN\TEST'

class_labels=['A', 'BA', 'KA', 'GA', 'HA', '1', '2', '3', '4', '5', '6', '7',
              '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19',
              '20', '21', '22', '23', '24', '25', '26', '28', '29', '30', '31', '32',
              '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44']

train_batches=ImageDataGenerator(preprocessing_function=keras.applications.xception.preprocess_input)\
    .flow_from_directory(train_path, target_size=(299,299),classes=class_labels,batch_size=5)
valid_batches=ImageDataGenerator(preprocessing_function=keras.applications.xception.preprocess_input)\
    .flow_from_directory(valid_path, target_size=(299,299),classes=class_labels,batch_size=5)
test_batches=ImageDataGenerator(preprocessing_function=keras.applications.xception.preprocess_input)\
    .flow_from_directory(test_path, target_size=(299,299),classes=class_labels,batch_size=5, shuffle=False)

base_model=keras.applications.vgg19.VGG19(include_top=False)

x=base_model.output
x=GlobalAveragePooling2D()(x)
x=Dense(1024, activation='relu')(x)
x=Dense(48, activation='softmax')(x)
model=Model(inputs=base_model.input, outputs=x)


base_model.trainable = False

N=1

print("HANG ON LEARNING IN PROGRESS...")

model.compile(Adam(lr=.0001),loss='categorical_crossentropy', metrics=['accuracy'])
history=model.fit_generator(train_batches, steps_per_epoch=1290, validation_data=valid_batches,
                            validation_steps=90,epochs=N,verbose=1)

print("[INFO]evaluating model...")

test_labels=test_batches.classes
predictions=model.predict_generator(test_batches, steps=28, verbose=1)


import matplotlib.pyplot as plt
import numpy as np


plt.imshow(np.random.random((48,48)), interpolation='nearest')
plt.xticks(np.arange(0,48), ['A', 'BA', 'KA', 'GA', 'HA', '1', '2', '3', '4', '5', '6', '7',
              '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19',
              '20', '21', '22', '23', '24', '25', '26', '28', '29', '30', '31', '32',
              '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44'])
plt.yticks(np.arange(0,48),['A', 'BA', 'KA', 'GA', 'HA', '1', '2', '3', '4', '5', '6', '7',
              '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19',
              '20', '21', '22', '23', '24', '25', '26', '28', '29', '30', '31', '32',
              '33', '34', '35', '36', '37', '38', '39', '40', '41', '42', '43', '44'])



plt.show()
model.save("X19baybayin.h5")
How do I use predictions? Can I use it as my y_pred, and what should I use as y_true?

TL;DR

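# You can unpack the step of creating the generator
from sklearn.metrics import classification_report

test_datagen = ImageDataGenerator(preprocessing_function=keras.applications.xception.preprocess_input)
# Read from test_path with shuffle=False so the predictions stay aligned with test_generator.classes
test_generator = test_datagen.flow_from_directory(test_path, target_size=(299,299), classes=class_labels, batch_size=5, shuffle=False)
##
## your code goes here
##
predictions = model.predict_generator(test_generator, steps=28, verbose=1)
# test_generator[1] is one (images, labels) batch, not the label vector,
# so compare the integer labels against the argmax of the predicted probabilities
print(classification_report(test_generator.classes, np.argmax(predictions, axis=-1)))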
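  • test_batches.classes as y_true
  • np.argmax(predictions, axis=-1) as y_pred

  • I'm assuming each sample belongs to exactly one class, since you are using 'softmax' and 'categorical_crossentropy', so you need the single most probable class for each sample (a multi-class classification problem).

    Clarification: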

    # import classification_report
    from sklearn.metrics import classification_report
    
    # get the ground truth of your data. 
    test_labels=test_batches.classes 
    
    # predict the probability distribution of the data
    predictions=model.predict_generator(test_batches, steps=28, verbose=1)
    
    # get the class with highest probability for each sample
    y_pred = np.argmax(predictions, axis=-1)
    
    # get the classification report
    print(classification_report(test_labels, y_pred))
    
    Note: predict_generator is being deprecated; use model.predict instead.
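To make the y_true / y_pred pairing concrete without training anything, here is a minimal sketch on made-up data. The three class names, the six labels, and the probability rows are all assumptions for illustration; in the real script, y_true would come from test_batches.classes and probs from the model's predictions.

```python
import numpy as np
from sklearn.metrics import classification_report, f1_score

# Hypothetical stand-ins: 6 samples, 3 classes (not the real 48-class data).
class_names = ["A", "BA", "KA"]
y_true = np.array([0, 0, 1, 1, 2, 2])   # plays the role of test_batches.classes

# Plays the role of the softmax output: one probability row per sample.
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],   # true class is 0, but class 1 scores highest
    [0.1, 0.8, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
])

# Collapse each probability row to the index of its most probable class.
y_pred = np.argmax(probs, axis=-1)

# Text report with per-class precision, recall and F1.
print(classification_report(y_true, y_pred, target_names=class_names))

# The same numbers as a dictionary, plus the macro-averaged F1 on its own.
report = classification_report(y_true, y_pred, target_names=class_names, output_dict=True)
macro_f1 = f1_score(y_true, y_pred, average="macro")
print(macro_f1)
```

Passing target_names is optional. If you use it with the real data, the list has to be in the same order as the class indices the generator assigned.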

    Hiya, thanks for the help. Unfortunately, I got this error:
    Traceback (most recent call last): File "C:/Users/Acer/PycharmProjects/translator/venv/TRY.py", line 65, in <module> print(classification_report(test_generator[1], predictions)) NameError: name 'classification_report' is not defined
    You should import it from sklearn, using: from sklearn.metrics import classification_report
    Traceback (most recent call last): File "C:/Users/Acer/PycharmProjects/translator/venv/TRY.py", line 65, in <module> print(classification_report(test_generator[1], predictions)) File "C:\Users\Acer\Anaconda3\envs\TRANSLATOR\lib\site-packages\sklearn\metrics\_classification.py", line 1967, in classification_report y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    File "C:\Users\Acer\Anaconda3\envs\TRANSLATOR\lib\site-packages\sklearn\utils\validation.py", line 212, in check_consistent_length " samples: %r" % [int(l) for l in lengths]) ValueError: Found input variables with inconsistent numbers of samples: [2, 140]
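The [2, 140] in that ValueError can be reproduced without a model. The sketch below uses FakeBatches, a hypothetical stand-in written for this illustration (it is not a Keras class). It only imitates the fact that indexing a Keras directory iterator returns one (images, labels) batch. The sample count of 140 is inferred from steps=28 with batch_size=5, and the image size is a dummy value.

```python
import numpy as np

class FakeBatches:
    """Hypothetical stand-in for a Keras directory iterator (illustration only)."""

    def __init__(self, n_samples, n_classes, batch_size):
        self.classes = np.arange(n_samples) % n_classes   # one integer label per sample
        self.n_classes = n_classes
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches needed to cover every sample once.
        return int(np.ceil(len(self.classes) / self.batch_size))

    def __getitem__(self, i):
        labels = self.classes[i * self.batch_size:(i + 1) * self.batch_size]
        images = np.zeros((len(labels), 4, 4, 3))          # dummy pixel data
        one_hot = np.eye(self.n_classes)[labels]           # categorical labels
        return images, one_hot                             # a 2-tuple, like a real batch

gen = FakeBatches(n_samples=140, n_classes=48, batch_size=5)

rng = np.random.default_rng(0)
predictions = rng.random((140, 48))      # stand-in for the model's output
y_pred = np.argmax(predictions, axis=-1)

# gen[1] is a tuple (images, labels), so its length is 2, while there are 140 predictions.
print(len(gen[1]), len(predictions))

# The full label vector and the argmax of the predictions have matching lengths.
print(len(gen.classes), len(y_pred))
```

So the first argument has to be the whole label vector (the classes attribute), and the second has to be class indices rather than probability rows.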