Python: create a data frame with the top 10 recommendations for each piece of content (Python, Pandas, Cosine Similarity)

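I am following a content-based recommender system tutorial here.

After the cosine similarity matrix is computed, a function is created that returns the 10 items most similar to the content we pass in.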
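For context, here is a minimal sketch of how such a cosine similarity matrix might be computed. The question does not show this step, so the toy data, the name `toy_df`, and the helper `cosine_similarity_matrix` are all assumptions made for illustration; the tutorial being followed may well use a library routine instead.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: rows are titles, columns are made-up feature counts.
toy_df = pd.DataFrame(
    {"action": [3, 2, 0, 0], "romance": [0, 1, 3, 2], "space": [2, 3, 0, 1]},
    index=["Movie A", "Movie B", "Movie C", "Movie D"],
)


def cosine_similarity_matrix(features):
    """Return the pairwise cosine similarity between the rows of a 2-D array."""
    features = np.asarray(features, dtype=float)
    # Scale every row to unit length (assumes no row is all zeros).
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit_rows = features / norms
    # The dot product of unit vectors is their cosine similarity.
    return unit_rows @ unit_rows.T


toy_cosine_sim = cosine_similarity_matrix(toy_df.to_numpy())
```

Row `i` of the result holds the similarity of title `i` to every title, including itself (the diagonal is 1), which is why the function below skips the first entry after sorting.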

# creating a Series for the movie titles so they are associated to an ordered numerical
# list I will use in the function to match the indexes

indices = pd.Series(df.index)

#  defining the function that takes in movie title 
# as input and returns the top 10 recommended movies

def recommendations(title, cosine_sim = cosine_sim):

    # initializing the empty list of recommended movies
    recommended_movies = []

    # getting the index of the movie that matches the title
    idx = indices[indices == title].index[0]

    # creating a Series with the similarity scores in descending order
    score_series = pd.Series(cosine_sim[idx]).sort_values(ascending = False)

    # getting the indexes of the 10 most similar movies
    top_10_indexes = list(score_series.iloc[1:11].index)

    # populating the list with the titles of the best 10 matching movies
    for i in top_10_indexes:
        recommended_movies.append(df.index[i])

    return recommended_movies

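The code above gives the top 10 items for each piece of content I enter. I want to create a data frame where column 1 holds every piece of content and columns 2-11 hold its most similar movies, so that each row is the original content followed by its top 10 similar movies, excluding itself. I am new to Python and would really appreciate any help.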

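Consider saving the input titles and their recommendations in a data frame, then running pivot_table with a value column as needed. First, though, adjust the function to return a dictionary, and call it in a list comprehension whose results are passed to the DataFrame constructor: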

indices = pd.Series(df.index)

def recommendations(title, cosine_sim = cosine_sim):    
    ...

    df_dict = {'title': [title] * 10, 
               'recommended': recommended_movies,
               'rank': list(range(1, 11))}

    return  df_dict


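# BUILD DATA FRAME BY STACKING ONE FRAME PER TITLE
# (each dict holds three equal-length lists, so it converts to 10 rows;
#  a new name is used so the original df is not overwritten)
rec_df = pd.concat([pd.DataFrame(recommendations(t)) for t in indices.to_list()],
                   ignore_index = True)

# PIVOT FOR TITLE X OTHERS VIEW 
pd.pivot_table(rec_df, index = 'title',  columns = 'recommended',
               values = 'rank', aggfunc = 'max')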
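The pivot above yields a title-by-title grid of ranks. If the goal is instead the layout described in the question (one row per title, one column per recommendation), a possible sketch is to pivot on the rank rather than on the recommended title. Everything here is illustrative: the toy titles, the random features, the helper `top_n_long`, and the `rec_1 ... rec_n` column names are assumptions, and `top_n` is set to 3 only because the toy data is small. Unlike the original function, this sketch drops the title's own entry explicitly instead of assuming it sorts first.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real df and cosine_sim.
titles = ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"]
rng = np.random.default_rng(0)
features = rng.random((5, 4))
unit = features / np.linalg.norm(features, axis=1, keepdims=True)
toy_sim = unit @ unit.T  # pairwise cosine similarity of the rows


def top_n_long(titles, sim, top_n=3):
    """Build a long frame with one row per (title, rank, recommended)."""
    rows = []
    for idx, title in enumerate(titles):
        # Remove the title itself, then sort the remaining scores.
        scores = pd.Series(sim[idx]).drop(idx).sort_values(ascending=False)
        for rank, j in enumerate(scores.index[:top_n], start=1):
            rows.append({"title": title, "rank": rank, "recommended": titles[j]})
    return pd.DataFrame(rows)


long_df = top_n_long(titles, toy_sim)

# One row per title, one column per rank position.
wide_df = long_df.pivot(index="title", columns="rank", values="recommended")
wide_df.columns = [f"rec_{r}" for r in wide_df.columns]
wide_df = wide_df.reset_index()
```

With the real data, passing `top_n=10` would give the title column plus ten recommendation columns. Plain `pivot` is enough here because each (title, rank) pair occurs exactly once, so no aggregation function is needed.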