
Python: High false positive rate


I am trying to train a convolutional neural network, but I get quite a lot of false positive classifications. I use two classes, each with 10,000 images, and the differences between them are very obvious. I expected this to be a fairly simple task for a CNN; before this I had also used some handcrafted features with a random forest classifier, and that worked reasonably well.

Here is the model I am using:

    # imports assumed for this snippet (the original post shows only the build method)
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import (Conv2D, Activation, BatchNormalization,
                                         MaxPooling2D, Dropout, Flatten, Dense)
    from tensorflow.keras import backend as K

    def build(width, height, depth, classes):
        model = Sequential()
        inputShape = (height, width, depth)
        chanDim = -1
        # if we are using "channels first", update the input shape
        # and channels dimension
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)
            chanDim = 1

        # CONV => RELU => POOL layer set
        model.add(Conv2D(32, (3, 3), padding="same",
                         input_shape=inputShape))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # (CONV => RELU) * 2 => POOL layer set
        model.add(Conv2D(64, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(64, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # (CONV => RELU) * 3 => POOL layer set
        model.add(Conv2D(128, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(128, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(128, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # first (and only) set of FC => RELU layers
        model.add(Flatten())
        model.add(Dense(512))
        model.add(Activation("relu"))
        model.add(BatchNormalization())
        model.add(Dropout(0.25))
        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))
        # return the constructed network architecture
        return model
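
For reference, a minimal sketch (not part of the original post) of how this builder can be invoked to verify the architecture, using the 350x150x3 input size from the training script further below:

    # quick sanity check: build the network and inspect the layer shapes
    from smallvggnet import SmallVGGNet

    model = SmallVGGNet.build(width=350, height=150, depth=3, classes=2)
    model.summary()  # confirms the Flatten/Dense sizes implied by the input resolution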

After applying data augmentation, the training and validation loss looked better, but I still get many false positives.
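
The loss curves referenced here did not survive the scrape; a hedged sketch of how they could be reproduced from the H history object returned by model.fit in the training script below:

    # plot training vs. validation loss from the fit history
    import matplotlib.pyplot as plt
    import numpy as np

    epochs = np.arange(len(H.history["loss"]))
    plt.plot(epochs, H.history["loss"], label="train_loss")
    plt.plot(epochs, H.history["val_loss"], label="val_loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.savefig("loss_curves.png")  # the script uses the Agg backend, so save to file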

Below is a screenshot of some example images from the validation set (), with the correctly classified ones marked in green; the rest are false positives. Any suggestions on how to improve my model?
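
To put a number on the false positives rather than eyeballing a screenshot, one option is a confusion matrix and per-class report on the held-out split. A minimal sketch, assuming the model, testX, and one-hot testY from the training script below:

    # quantify false positives on the validation split
    import numpy as np
    from sklearn.metrics import confusion_matrix, classification_report

    probs = model.predict(testX)        # shape (n_samples, 2), softmax scores
    predY = np.argmax(probs, axis=1)    # predicted class index
    trueY = np.argmax(testY, axis=1)    # recover labels from the one-hot encoding

    # rows = true class, columns = predicted class; cell [0][1] is the false-positive count
    print(confusion_matrix(trueY, predY))
    print(classification_report(trueY, predY, target_names=["neg", "pos"]))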

Edit: I have also added the code I use for preprocessing the images:

import matplotlib
matplotlib.use("Agg")
from smallvggnet import SmallVGGNet
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import SGD
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse
import random
import pickle
import cv2
import os
from tensorflow.keras.utils import to_categorical  # tf.keras variant; mixing plain keras with tensorflow.keras can break

# initialize the data and labels
print("[INFO] loading images...")
data = []
labels = []
# grab the image paths and randomly shuffle them
imagePaths = sorted(list(paths.list_images("C:/06112020_hyphae/all/")))
random.seed(42)
random.shuffle(imagePaths)
# loop over the input images
for imagePath in imagePaths:
    
    image = cv2.imread(imagePath)
    image = cv2.resize(image, (350, 150))
    data.append(image)
    
    label = imagePath.split(os.path.sep)[-2].split('/')[-1]
    if label == 'pos':
        label = 1
    else:
        label = 0
    labels.append(label)

# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)


# partition the data into training and testing splits using 75% of
# the data for training and the remaining 25% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.25, random_state=42)
unique, counts = np.unique(trainY, return_counts=True)
print (dict(zip(unique, counts)))

trainY = to_categorical(trainY)
testY = to_categorical(testY)

# construct the image generator for data augmentation
#aug = ImageDataGenerator(rotation_range=30, width_shift_range=0.1, height_shift_range=0.1, zoom_range=0.2, horizontal_flip=True, fill_mode="nearest")
aug = ImageDataGenerator()

# initialize our VGG-like Convolutional Neural Network
model = SmallVGGNet.build(width=350, height=150, depth=3,
    classes=2)

# initialize our initial learning rate, # of epochs to train for,
# and batch size
INIT_LR = 0.01
EPOCHS = 20
BS = 32
# initialize the model and optimizer (you'll want to use
# binary_crossentropy for 2-class classification)
print("[INFO] training network...")
opt = SGD(learning_rate=INIT_LR, decay=INIT_LR / EPOCHS)  # "learning_rate" replaces the deprecated "lr" argument
model.compile(loss="binary_crossentropy", optimizer=opt,
    metrics=["accuracy"])
# train the network
H = model.fit(x=aug.flow(trainX, trainY, batch_size=BS),
    validation_data=(testX, testY), steps_per_epoch=len(trainX) // BS,
    epochs=EPOCHS)
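
One detail worth double-checking in the script above (a well-known Keras pitfall, offered as a suggestion rather than something confirmed by the thread): with a 2-unit softmax output and one-hot labels, "binary_crossentropy" scores each output independently and Keras then reports binary accuracy, which can look better than it is. The two self-consistent setups would be:

    # Option A: keep the 2-unit softmax head and match it with a categorical loss
    model.compile(loss="categorical_crossentropy", optimizer=opt,
                  metrics=["accuracy"])

    # Option B: a true binary head -- Dense(1) + Activation("sigmoid") in build(),
    # with trainY/testY kept as plain 0/1 vectors (no to_categorical)
    model.compile(loss="binary_crossentropy", optimizer=opt,
                  metrics=["accuracy"])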

Comments:

Dying ReLUs?

No idea what you are talking about. From your plots the accuracy is nearly 1, i.e. almost perfect. Maybe your labels are wrong, or the data is actually imbalanced?

@CrawlCycle I only showed a small number of images. To be clear: most negatives are correctly classified as negatives, and likewise for the positives. But the number of remaining false positives still seems odd...

@xdurch0 I checked again and everything looks fine and balanced. I also added the code used for preprocessing; maybe there is a bug in it...

How many images labeled 0 and how many labeled 1 are in your dataset?
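
To answer the last comment (and catch label-parsing mistakes at the same time), one could tally the parsed labels over the whole dataset before the split. A hedged sketch reusing the path and parsing logic from the preprocessing script; the negative directory name "neg" is an assumption, since only "pos" appears in the original code:

    # sanity check: count images per parsed class label across the full dataset
    import os
    from collections import Counter
    from imutils import paths

    imagePaths = list(paths.list_images("C:/06112020_hyphae/all/"))
    counts = Counter(p.split(os.path.sep)[-2].split('/')[-1] for p in imagePaths)
    print(counts)  # expect roughly balanced counts for 'pos' and the negative class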