Predicting with a Keras binary classification model in Python


I'm trying to solve the Kaggle Quora Question Pairs challenge. Here is my model code:

data = pd.read_csv('train.csv')
targets = data['is_duplicate']
nlp =  spacy.load('en_core_web_lg')

x_train, x_test, y_train, y_test = train_test_split(data,targets, test_size=0.25)

questions = list(x_train['question1'].values.astype(str)) + list(x_train['question2'].values.astype(str))

# tokenizer
tokenzr = Tokenizer(num_words=20000)
max_length = 100
# generating the vocabulary
tokenzr.fit_on_texts(questions)

#converting text to sequences
q1_train = tokenzr.texts_to_sequences(x_train['question1'].values.astype(str)) 
q2_train = tokenzr.texts_to_sequences(x_train['question2'].values.astype(str))

#padding 
q1_train = pad_sequences(q1_train, maxlen=max_length,padding='post')
q2_train = pad_sequences(q2_train, maxlen=max_length, padding='post')

#converting text to sequences
q1_test = tokenzr.texts_to_sequences(x_test['question1'].values.astype(str)) 
q2_test = tokenzr.texts_to_sequences(x_test['question2'].values.astype(str))

#padding 
q1_test = pad_sequences(q1_test, maxlen=max_length,padding='post')
q2_test = pad_sequences(q2_test, maxlen=max_length,padding='post')

# Embedding matrix
vocab = tokenzr.word_index
embedding_dim = 300 
embedding_matrix = np.zeros((len(vocab), embedding_dim))

# word_index maps word -> index (1-based), so iterate its items rather
# than enumerating the keys, which would assign arbitrary rows
for word, idx in tqdm(vocab.items(), total=len(vocab)):
    if idx < len(vocab):  # keep within the matrix; row 0 stays zero for padding
        embedding_matrix[idx] = nlp(word).vector
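The loop above depends on Tokenizer's `word_index` being a word-to-index dict with 1-based indices. A standalone NumPy sketch of the pattern, using a hypothetical three-word vocabulary and made-up 4-dimensional vectors as stand-ins for `nlp(word).vector` (a common convention sizes the matrix at `len(vocab) + 1` so every 1-based index fits and row 0 is reserved for padding):

```python
import numpy as np

# hypothetical word_index, shaped like what Keras' Tokenizer produces (1-based)
vocab = {'what': 1, 'is': 2, 'python': 3}
embedding_dim = 4

# fake pretrained vectors standing in for nlp(word).vector
fake_vectors = {
    'what':   np.array([0.1, 0.2, 0.3, 0.4]),
    'is':     np.array([0.5, 0.6, 0.7, 0.8]),
    'python': np.array([0.9, 1.0, 1.1, 1.2]),
}

# row 0 stays all-zero: it is the padding index and no word maps to it
embedding_matrix = np.zeros((len(vocab) + 1, embedding_dim))
for word, idx in vocab.items():
    embedding_matrix[idx] = fake_vectors[word]
```

Each word's vector now sits at the row the tokenizer will actually look up, which is what `weights=[embedding_matrix]` in the Embedding layer assumes.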

# Keras model 
act_q = 'relu'
drop = 0.35

# Model q1
model_q1 = Sequential()
model_q1.add(Embedding(len(vocab),300,weights=[embedding_matrix],input_length=100,trainable=True)) 
model_q1.add(Dropout(drop))
model_q1.add(LSTM(200,return_sequences=True,dropout=drop))
model_q1.add(Dropout(drop))
model_q1.add(Dense(64,activation=act_q))
model_q1.add(BatchNormalization())

# Model q2
model_q2 = Sequential()
model_q2.add(Embedding(len(vocab),300,weights=[embedding_matrix],input_length=100,trainable=True)) 
model_q2.add(Dropout(drop))
model_q2.add(LSTM(200,return_sequences=True,dropout=drop))
model_q2.add(Dropout(drop))
model_q2.add(Dense(64,activation=act_q))
model_q2.add(BatchNormalization())

# Merge 
mergedOut = Add()([model_q1.output,model_q2.output])
mergedOut = Dense(32, activation=act_q)(mergedOut)
mergedOut = Dropout(drop)(mergedOut)
mergedOut = BatchNormalization()(mergedOut)

mergedOut = Dense(1, activation='sigmoid')(mergedOut)
model = Model([model_q1.input,model_q2.input], mergedOut)
model.compile(optimizer = "adam", loss = 'binary_crossentropy' ,metrics = ['accuracy']) 
model.summary()

model.fit([q1_train,q2_train],y_train,epochs = 10,batch_size=512, validation_split=0.3, shuffle=True)
After training the model, I try to predict on the 25% test split:

y_pred = model.predict([q1_test,q2_test])
The shape of y_pred that I get is

(101073, 100, 1)
instead of

(101073, 1)
What am I doing wrong? I want to use this to plot a confusion matrix.
How do I convert the (101073, 100, 1) array into a (101073, 1) array with class labels (0, 1)?

Remove the return_sequences=True argument from the LSTM layers (or set it to False, which is the default). With return_sequences=True the LSTM emits one output vector per timestep, so every layer after it, including the final Dense(1), is applied to each of the 100 timesteps, which is why the predictions come out with shape (101073, 100, 1). With return_sequences=False the LSTM returns only its final hidden state, and the model produces the expected (101073, 1).
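Once the model outputs shape (101073, 1), the sigmoid probabilities still need to be thresholded into hard 0/1 labels before building a confusion matrix. A minimal NumPy sketch with toy data; the 0.5 cutoff is just the common default, not anything fixed by the model:

```python
import numpy as np

# pretend these are sigmoid outputs from model.predict(...), shape (n, 1)
y_pred = np.array([[0.1], [0.8], [0.4], [0.9]])
y_true = np.array([0, 1, 1, 1])

# threshold the probabilities into hard 0/1 class labels
y_labels = (y_pred > 0.5).astype(int).ravel()

# 2x2 confusion matrix: rows = true class, columns = predicted class
cm = np.zeros((2, 2), dtype=int)
for t, p in zip(y_true, y_labels):
    cm[t, p] += 1

print(y_labels)  # [0 1 0 1]
print(cm)        # [[1 0]
                 #  [1 2]]
```

With scikit-learn available, the manual loop can be replaced by `sklearn.metrics.confusion_matrix(y_true, y_labels)`, which produces the same row/column convention.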