Making predictions with a Keras binary classification model in Python

I am trying to solve the Kaggle Quora Question Pairs challenge. Here is my model code:
import numpy as np
import pandas as pd
import spacy
from tqdm import tqdm
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Embedding, Dropout, LSTM, Dense, BatchNormalization, Add

data = pd.read_csv('train.csv')
targets = data['is_duplicate']
nlp = spacy.load('en_core_web_lg')
x_train, x_test, y_train, y_test = train_test_split(data,targets, test_size=0.25)
questions = list(x_train['question1'].values.astype(str)) + list(x_train['question2'].values.astype(str))
# tokenizer
tokenzr = Tokenizer(num_words=20000)
max_length=100
# generating vocab
tokenzr.fit_on_texts(questions)
#converting text to sequences
q1_train = tokenzr.texts_to_sequences(x_train['question1'].values.astype(str))
q2_train = tokenzr.texts_to_sequences(x_train['question2'].values.astype(str))
#padding
q1_train = pad_sequences(q1_train, maxlen=max_length,padding='post')
q2_train = pad_sequences(q2_train, maxlen=max_length, padding='post')
#converting text to sequences
q1_test = tokenzr.texts_to_sequences(x_test['question1'].values.astype(str))
q2_test = tokenzr.texts_to_sequences(x_test['question2'].values.astype(str))
#padding
q1_test = pad_sequences(q1_test, maxlen=max_length,padding='post')
q2_test = pad_sequences(q2_test, maxlen=max_length,padding='post')
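For reference, texts_to_sequences maps each word to the integer index learned by fit_on_texts, and pad_sequences(..., padding='post') right-pads (or truncates) every sequence to maxlen. A minimal plain-Python sketch of that behavior (the helper functions here are illustrative stand-ins, not the Keras API; the real Tokenizer also orders indices by word frequency rather than first appearance):

```python
# Toy illustration of the tokenize-then-pad pipeline (not the Keras API).
def build_vocab(texts):
    # Mimics Tokenizer.fit_on_texts: indices start at 1, 0 is reserved for padding.
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1
    return vocab

def texts_to_seqs(texts, vocab):
    # Mimics Tokenizer.texts_to_sequences: unknown words are dropped.
    return [[vocab[w] for w in t.lower().split() if w in vocab] for t in texts]

def pad_post(seqs, maxlen):
    # Mimics pad_sequences(..., padding='post'): right-pad with 0, truncate to maxlen.
    return [(s + [0] * maxlen)[:maxlen] for s in seqs]

vocab = build_vocab(["how are you", "how old are you"])
seqs = texts_to_seqs(["how are you"], vocab)
padded = pad_post(seqs, maxlen=5)
print(padded)  # [[1, 2, 3, 0, 0]]
```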
# Embedding matrix
vocab = tokenzr.word_index
embedding_dim = 300
# word_index is 1-based (0 is reserved for padding), so the matrix needs len(vocab) + 1 rows
embedding_matrix = np.zeros((len(vocab) + 1, embedding_dim))
for word, i in tqdm(vocab.items(), total=len(vocab)):
    embedding_matrix[i] = nlp(word).vector
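Note that Tokenizer.word_index maps words to indices starting at 1, with 0 reserved for padding, so the embedding matrix needs len(vocab) + 1 rows; iterating with enumerate(vocab) would pair each word with the wrong row and eventually index out of bounds. A small sketch of the indexing convention (toy dict and stand-in vectors, not a real tokenizer or spaCy):

```python
import numpy as np

# Toy word_index with 1-based indices, as Keras' Tokenizer produces.
word_index = {"how": 1, "are": 2, "you": 3}
embedding_dim = 4

# Row 0 is left as zeros for the padding token, hence len(word_index) + 1 rows.
matrix = np.zeros((len(word_index) + 1, embedding_dim))
for word, i in word_index.items():
    matrix[i] = np.ones(embedding_dim) * i  # stand-in for nlp(word).vector

print(matrix.shape)        # (4, 4)
print(matrix[0].tolist())  # [0.0, 0.0, 0.0, 0.0] -- padding row stays zero
```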
# Keras model
act_q = 'relu'
drop = 0.35
# Model q1
model_q1 = Sequential()
model_q1.add(Embedding(len(vocab) + 1, 300, weights=[embedding_matrix], input_length=100, trainable=True))
model_q1.add(Dropout(drop))
model_q1.add(LSTM(200, return_sequences=True, dropout=drop))
model_q1.add(Dropout(drop))
model_q1.add(Dense(64,activation=act_q))
model_q1.add(BatchNormalization())
# Model q2
model_q2 = Sequential()
model_q2.add(Embedding(len(vocab) + 1, 300, weights=[embedding_matrix], input_length=100, trainable=True))
model_q2.add(Dropout(drop))
model_q2.add(LSTM(200,return_sequences=True,dropout=drop))
model_q2.add(Dropout(drop))
model_q2.add(Dense(64,activation=act_q))
model_q2.add(BatchNormalization())
# Merge
mergedOut = Add()([model_q1.output,model_q2.output])
mergedOut = Dense(32, activation=act_q)(mergedOut)
mergedOut = Dropout(drop)(mergedOut)
mergedOut = BatchNormalization()(mergedOut)
mergedOut = Dense(1, activation='sigmoid')(mergedOut)
model = Model([model_q1.input,model_q2.input], mergedOut)
model.compile(optimizer = "adam", loss = 'binary_crossentropy' ,metrics = ['accuracy'])
model.summary()
model.fit([q1_train,q2_train],y_train,epochs = 10,batch_size=512, validation_split=0.3, shuffle=True)
After training the model, I tried to predict on the 25% test split:
y_pred = model.predict([q1_test,q2_test])
The shape of y_pred I get is
(101073, 100, 1)
而不是
(101073, 1)
What am I doing wrong?
I want to use this to plot a confusion matrix.
How can I convert the (101073, 100, 1) matrix to a (101073, 1) matrix with class labels (0, 1)?

Answer: Remove the return_sequences=True argument from the LSTM layers. With return_sequences=True the LSTM emits its 200-dimensional output at every one of the 100 timesteps, so the Dense layers after it are applied to each timestep independently and the final sigmoid produces one prediction per timestep, giving shape (101073, 100, 1). Without return_sequences the LSTM returns only its output at the last timestep, and the model produces a single probability per question pair, i.e. (101073, 1).