Python Keras: InvalidArgumentError from an Embedding layer (index out of range)
I have a DataFrame like this:
user_id anime_id rating_user genre
0 1 20 NaN Action, Comedy, Martial Arts, Shounen, Super P...
1 3 20 8.0 Action, Comedy, Martial Arts, Shounen, Super P...
2 5 20 6.0 Action, Comedy, Martial Arts, Shounen, Super P...
3 6 20 NaN Action, Comedy, Martial Arts, Shounen, Super P...
4 10 20 NaN Action, Comedy, Martial Arts, Shounen, Super P...
5 21 20 8.0 Action, Comedy, Martial Arts, Shounen, Super P...
I am trying to build a recommender system with three embeddings, but I get the following error:
InvalidArgumentError (see above for traceback): indices[0,0] = 31223 is not in [0, 6816)
[[Node: Anime-Embedding_1/Gather = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Anime-Embedding_1/embeddings/read, Anime-Embedding_1/Cast)]]
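The message can be read as an out-of-range table lookup: the embedding matrix has 6816 rows, and the batch contained index 31223. Below is a minimal NumPy analogy, assuming an embedding lookup behaves like row indexing into a 2-D array. The names `table` and `lookup` are illustrative and not part of the Keras API.

```python
import numpy as np

# Stand-in for an embedding matrix: 6816 rows (input_dim), 30 columns (latent factors).
table = np.zeros((6816, 30), dtype=np.float32)


def lookup(index):
    """Return the row for `index`, or None when the index is outside the table."""
    try:
        return table[index]
    except IndexError:
        return None


# A small id such as anime_id 20 fits in the table.
print(lookup(20) is not None)
# The id from the traceback is larger than the number of rows, so the lookup fails.
print(lookup(31223) is None)
```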
My model looks like this:
Inputs:
gender_id_input = Input(shape=[1], name='gender')
anime_input = Input(shape=[1],name='Anime')
user_input = Input(shape=[1],name='User')
My embeddings:
gender_vec = Flatten(name='FlattenGender')(Embedding(n_genres + 1,
n_latent_factors_genre,
name='Gender-Embedding')(gender_id_input))
gender_vec = Dropout(0.2)(gender_vec)
#anime vec
anime_vec = Flatten(name='FlattenAnimes')(Embedding(n_animes + 1,
n_latent_factors_anime,
name='Anime-Embedding')(anime_input))
anime_vec = Dropout(0.2)(anime_vec)
#user vec
user_vec = Flatten(name='FlattenUsers')(Embedding(n_users + 1,
n_latent_factors_user,
name='User-Embedding')(user_input))
user_vec = Dropout(0.2)(user_vec)
# multi-tower dense layers (optional)
gender_vec = Dense(64, activation='relu')(gender_vec)
anime_vec = Dense(64, activation='relu')(anime_vec)
user_vec = Dense(64, activation='relu')(user_vec)
Then I concatenate the anime and gender metadata, and concatenate the result with the user vector:
#concatenate
anime_vecs_complete = Concatenate()([gender_vec,anime_vec])
input_vecs = Concatenate()([user_vec, anime_vecs_complete])
input_vecs = Dropout(0.2)(input_vecs)
x = Dense(128, activation='relu')(input_vecs)
x = Dropout(0.2)(x)
y = Dense(1)(x)
model = Model(inputs=[user_input
,anime_input
,gender_id_input], outputs=y)
Then I compile with the Adam optimizer. Here is the model summary:
Layer (type) Output Shape Param # Connected to
==================================================================================================
gender (InputLayer) (None, 1) 0
__________________________________________________________________________________________________
Anime (InputLayer) (None, 1) 0
__________________________________________________________________________________________________
User (InputLayer) (None, 1) 0
__________________________________________________________________________________________________
Gender-Embedding (Embedding) (None, 1, 30) 79440 gender[0][0]
__________________________________________________________________________________________________
Anime-Embedding (Embedding) (None, 1, 30) 213240 Anime[0][0]
__________________________________________________________________________________________________
User-Embedding (Embedding) (None, 1, 30) 150000 User[0][0]
__________________________________________________________________________________________________
FlattenGender (Flatten) (None, 30) 0 Gender-Embedding[0][0]
__________________________________________________________________________________________________
FlattenAnimes (Flatten) (None, 30) 0 Anime-Embedding[0][0]
__________________________________________________________________________________________________
FlattenUsers (Flatten) (None, 30) 0 User-Embedding[0][0]
__________________________________________________________________________________________________
dropout_151 (Dropout) (None, 30) 0 FlattenGender[0][0]
__________________________________________________________________________________________________
dropout_152 (Dropout) (None, 30) 0 FlattenAnimes[0][0]
__________________________________________________________________________________________________
dropout_153 (Dropout) (None, 30) 0 FlattenUsers[0][0]
__________________________________________________________________________________________________
dense_220 (Dense) (None, 64) 1984 dropout_151[0][0]
__________________________________________________________________________________________________
dense_221 (Dense) (None, 64) 1984 dropout_152[0][0]
__________________________________________________________________________________________________
dense_222 (Dense) (None, 64) 1984 dropout_153[0][0]
__________________________________________________________________________________________________
concatenate_61 (Concatenate) (None, 128) 0 dense_220[0][0]
dense_221[0][0]
__________________________________________________________________________________________________
concatenate_62 (Concatenate) (None, 192) 0 dense_222[0][0]
concatenate_61[0][0]
__________________________________________________________________________________________________
dropout_154 (Dropout) (None, 192) 0 concatenate_62[0][0]
__________________________________________________________________________________________________
dense_223 (Dense) (None, 128) 24704 dropout_154[0][0]
__________________________________________________________________________________________________
dropout_155 (Dropout) (None, 128) 0 dense_223[0][0]
__________________________________________________________________________________________________
dense_224 (Dense) (None, 1) 129 dropout_155[0][0]
==================================================================================================
Total params: 473,465
Trainable params: 473,465
Non-trainable params: 0
When fitting:
history = model.fit([train.user_id, train.anime_id, train.gender_id],
train.rating_user, batch_size=32, epochs=5,
validation_data=([
valid.user_id, valid.anime_id,
valid.gender_id], valid.rating_user))
I get the error above.
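P.S.:
- The number of latent factors is fixed at 30.
- I compute n_users, n_animes, and n_genres like this, with n_users as the example:
n_users = len(df["user_id"].unique())
- gender_id is the result of a mapping:
genres = df.genre.unique()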
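Counting unique values gives a valid table size only if the ids fed to the model run from 0 to n-1. Raw ids such as anime_id 31223 can be far larger than the number of distinct animes, which would explain the error above. One possible fix, sketched here on a made-up toy frame, is to remap each id column to contiguous codes with `pandas.factorize` before sizing the embeddings. The `user_idx` and `anime_idx` column names are hypothetical.

```python
import pandas as pd

# Toy data standing in for the real frame; the values are made up for illustration.
df = pd.DataFrame({
    "user_id": [1, 3, 5, 6, 10, 21],
    "anime_id": [20, 20, 31223, 20, 31223, 154],
})

# factorize returns (codes, uniques): codes are integers in [0, n_unique).
df["user_idx"], user_uniques = pd.factorize(df["user_id"])
df["anime_idx"], anime_uniques = pd.factorize(df["anime_id"])

n_users = len(user_uniques)
n_animes = len(anime_uniques)

# Every code is now a valid row index for an embedding sized by the unique count.
assert df["user_idx"].max() < n_users
assert df["anime_idx"].max() < n_animes
print(n_users, n_animes)
```

The model would then be fed `user_idx` and `anime_idx` instead of the raw ids. The mapping should be computed before the train/validation split so that the same raw id gets the same index in both sets. An alternative, at the cost of unused rows, is to size each embedding by the maximum raw id plus one.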
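If you need any additional information, please ask.

You are dealing with 3 input parameters per sample. You don't need three inputs. Just feed a 1×3 array for each sample.

I'm not sure I understand. Should I replace my inputs with a single array that covers each embedding?

Create a list [gender, anime, user] as the input, so the input shape is [3]. You don't need embeddings or concatenation. This overcomplicates a very simple problem.

Won't I lose the user/item interactions with this approach? With this list, do I skip the embeddings and feed it straight into the model? I'm still a beginner with Keras and deep learning, so it would be easier for me to understand if you could write an example. Thanks.

A neural network can compute the interactions by itself; you don't need to over-engineer it.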
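The layout suggested in the comments, one array of three values per sample instead of three separate inputs, might be prepared as below. This sketch covers only the data side and uses made-up values. Such an array would match an `Input(shape=[3])`; whether a model without embeddings learns useful interactions from raw ids is not settled in this thread.

```python
import numpy as np

# Made-up id columns for four samples, for illustration only.
gender_id = np.array([0, 2, 1, 2])
anime_id = np.array([5, 5, 7, 1])
user_id = np.array([3, 0, 3, 9])

# One row per sample, one column per feature: shape (n_samples, 3).
features = np.column_stack([gender_id, anime_id, user_id])
print(features.shape)
```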