Python: How do I make predictions with my RNN model?
I created a model based on a dataset containing examples of the form Name-Gender. I read the data from the dataset and split it into training and test data:
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import Dense, Embedding, LSTM
from tensorflow.keras.models import Sequential

# Give the location of the file
loc = "/content/gdrive/My Drive/DatasetProject/name_gender_dataset.csv"
data = pd.read_csv(loc, header=0)
listNames = data['Name'].map(lambda x: x)
listGenders = data['Gender'].map(lambda x: x)
dictionary = dict(zip(listNames, listGenders))
max_rows = 500000  # Reduction due to memory limitations
df = (pd.read_csv(loc, usecols=['Name', 'Gender'])
      .dropna(subset=['Name', 'Gender'])
      .assign(Name=lambda x: x.Name.str.strip())
      .head(max_rows))
names_train, names_test, gen_train, gen_test = train_test_split(listNames, listGenders, test_size=0.25, shuffle=True, random_state=123)
for name, gen in zip(names_train[:20], gen_train[:20]):
    print(name, gen)
After that, I use a tokenizer to create the inputs for my layers:
encoder_train = tf.keras.preprocessing.text.Tokenizer(char_level=True)
encoder_train.fit_on_texts(names_train)
encoder_test = tf.keras.preprocessing.text.Tokenizer(char_level=True)
encoder_test.fit_on_texts(names_test)
sequences = encoder_train.texts_to_sequences(names_train)
sequences = tf.keras.preprocessing.sequence.pad_sequences(sequences)
sequences_test = encoder_test.texts_to_sequences(names_test)
sequences_test = tf.keras.preprocessing.sequence.pad_sequences(sequences_test)
encoder_gen_train = tf.keras.preprocessing.text.Tokenizer(lower=False, char_level=True)
encoder_gen_train.fit_on_texts(gen_train)
encoder_gen_test = tf.keras.preprocessing.text.Tokenizer(lower=False, char_level=True)
encoder_gen_test.fit_on_texts(gen_test)
gender_vec_train = encoder_gen_train.texts_to_sequences(gen_train)
gender_vec_train = np.asarray(gender_vec_train)
gender_vec_test = encoder_gen_test.texts_to_sequences(gen_test)
gender_vec_test = np.asarray(gender_vec_test)
embedding_input_dim = max(encoder_train.index_word) + 1
embedding_output_dim = 32
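The code above fits one tokenizer on the training names and a second one on the test names, so the same character may end up with different integer ids in the two sets, and the padded lengths may differ as well. The sketch below is not the Keras `Tokenizer`; it is a simplified pure-Python stand-in (the helper names `fit_char_vocab` and `encode` are hypothetical) that illustrates the usual alternative: fit the vocabulary once on the training data and reuse it, together with a fixed `maxlen`, for every other input.

```python
from collections import Counter

def fit_char_vocab(names):
    """Map each lowercase character to an integer id starting at 1 (0 is kept for padding)."""
    counts = Counter(ch for name in names for ch in name.lower())
    # Most frequent characters get the smallest ids; ties are broken alphabetically.
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return {ch: i for i, ch in enumerate(ordered, start=1)}

def encode(name, vocab, maxlen):
    """Encode one name with a fixed vocabulary and left-pad with zeros to maxlen."""
    ids = [vocab[ch] for ch in name.lower() if ch in vocab]  # unknown characters are skipped
    ids = ids[-maxlen:]  # keep only the last maxlen ids if the name is too long
    return [0] * (maxlen - len(ids)) + ids

# Toy data, used only for illustration.
train_names = ["Anna", "Bob", "Andrew"]
vocab = fit_char_vocab(train_names)            # fitted on the training names only
maxlen = max(len(n) for n in train_names)

train_ids = [encode(n, vocab, maxlen) for n in train_names]
test_ids = [encode(n, vocab, maxlen) for n in ["Nora", "Ben"]]  # same vocab, same length
print(train_ids)
print(test_ids)
```

With the Keras objects from the question, the corresponding idea would be to call `encoder_train.texts_to_sequences(names_test)` and to pass `maxlen=sequences.shape[1]` to `pad_sequences`, rather than fitting `encoder_test` separately.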
My model is defined as follows:
model = Sequential()
model.add(Embedding(input_dim=embedding_input_dim,
                    output_dim=embedding_output_dim,
                    mask_zero=True))
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(32))
model.add(Dense(64, activation='relu'))
model.add(Dense(2, activation='sigmoid'))
After that, I compiled and trained my model:
model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.075),
              metrics=['accuracy'])
history = model.fit(sequences,
                    gender_vec_train,
                    epochs=5,
                    batch_size=200,
                    validation_data=(sequences_test, gender_vec_test))
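Finally, I want to make some predictions. I would like to write something like:
model.predict("Andrew")
What should I change in my model to be able to do this?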
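One possible direction, offered as a sketch and not as a definitive answer: the model was trained on batches of padded integer sequences, so a raw string most likely has to go through the same encoding before prediction, and the output row has to be mapped back to a label. The helper below shows that wrapping step in plain Python. `predict_label`, `toy_encode`, `toy_predict`, the toy vocabulary, and the labels `["F", "M"]` are all hypothetical stand-ins so that the example runs without TensorFlow.

```python
def predict_label(name, encode_fn, predict_fn, labels):
    """Encode one name, run the model on a batch of size 1, and return (label, probability)."""
    batch = [encode_fn(name)]        # the model expects a batch, not a bare string
    probs = predict_fn(batch)[0]     # one row of class scores for the single input
    best = max(range(len(probs)), key=lambda i: probs[i])  # index of the highest score
    return labels[best], probs[best]

# Hypothetical stand-ins for the fitted tokenizer and the trained model.
toy_vocab = {"a": 1, "n": 2, "d": 3, "r": 4, "e": 5, "w": 6}

def toy_encode(name, maxlen=8):
    """Character ids, left-padded with zeros to a fixed length."""
    ids = [toy_vocab[ch] for ch in name.lower() if ch in toy_vocab][-maxlen:]
    return [0] * (maxlen - len(ids)) + ids

def toy_predict(batch):
    """Pretends to be model.predict: returns the same fixed scores for every row."""
    return [[0.2, 0.8] for _ in batch]

label, prob = predict_label("Andrew", toy_encode, toy_predict, labels=["F", "M"])
print(label, prob)
```

With the objects from the question, `encode_fn` would presumably wrap `encoder_train.texts_to_sequences([name])` followed by `pad_sequences(..., maxlen=sequences.shape[1])`, and `predict_fn` would be `model.predict`. The sketch assumes 0-based class ids; since the Keras `Tokenizer` assigns ids starting at 1, the gender labels may need to be shifted or remapped before they match the two output units.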