
How to build a term-document mapping in Python

Tags: python, machine-learning, information-retrieval, imdb, inverted-index

I have 16,000 records from the IMDb dataset:

Movie_Name         Synops
Alien Predator     ['great', '17th', 'abigail', 'by', 'century', 'is']
Shark Exorcist     ['demonic', 'devil', 'great', 'hell', 'holy', 'nun']
Jurassic Shark     ['abandoned', 'an', 'and', 'beautiful', 'abigail',]

I don't know how to build a term-document mapping like the one below for every word in the Synops column:

"great": Alien Predator, Shark Exorcist
"17th": Alien Predator
"abigail": Alien Predator, Jurassic Shark
.....


First, put the records into a dictionary (or JSON), for example:

dataset = {
"Alien Predator":['great','17th', 'abigail', 'by', 'century', 'is'],
"Shark Exorcist":['demonic', 'devil', 'great', 'hell', 'holy', 'nun'],
"Jurassic Shark":['abandoned', 'an', 'and', 'beautiful', 'abigail',],
}
From there you can easily query for a word:

search_word = "great"
d = [movie for movie, synops in dataset.items() if search_word in synops]
This returns:
['Alien Predator', 'Shark Exorcist']

You can add the matches to a dictionary to build up the full result:

final_dict = {}
final_dict[search_word] = d
This should give you:

>>> final_dict
{'great': ['Alien Predator', 'Shark Exorcist']}
Now you can get the same result for every word you need by looping over a list of keywords, and finish the task yourself.
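
The loop suggested above could be sketched roughly as follows. The `keywords` list is a made-up example, not something from the question; in practice it would hold whichever words you want to look up. Note that this approach rescans every synopsis once per keyword.

```python
# Data copied from the answer above.
dataset = {
    "Alien Predator": ['great', '17th', 'abigail', 'by', 'century', 'is'],
    "Shark Exorcist": ['demonic', 'devil', 'great', 'hell', 'holy', 'nun'],
    "Jurassic Shark": ['abandoned', 'an', 'and', 'beautiful', 'abigail'],
}

# Hypothetical list of words to look up (an assumption for illustration).
keywords = ["great", "17th", "abigail"]

final_dict = {}
for word in keywords:
    # Collect every movie whose synopsis word list contains this word.
    final_dict[word] = [movie for movie, synops in dataset.items() if word in synops]

print(final_dict)
```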

data = {
    "Alien Predator": ['great','17th', 'abigail', 'by', 'century', 'is'],
    "Shark Exorcist": ['demonic', 'devil', 'great', 'hell', 'holy', 'nun'],
    "Jurassic Shark": ['abandoned', 'an', 'and', 'beautiful', 'abigail',]
}

result = {}
for movie_name, keywords in data.items():
    for keyword in keywords:
        result.setdefault(keyword, []).append(movie_name)
print(result)
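Result (line breaks added for clarity):

{'great': ['Alien Predator', 'Shark Exorcist'],
 '17th': ['Alien Predator'],
 'abigail': ['Alien Predator', 'Jurassic Shark'],
 'by': ['Alien Predator'],
 'century': ['Alien Predator'],
 'is': ['Alien Predator'],
 'demonic': ['Shark Exorcist'],
 'devil': ['Shark Exorcist'],
 'hell': ['Shark Exorcist'],
 'holy': ['Shark Exorcist'],
 'nun': ['Shark Exorcist'],
 'abandoned': ['Jurassic Shark'],
 'an': ['Jurassic Shark'],
 'and': ['Jurassic Shark'],
 'beautiful': ['Jurassic Shark']}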
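
One caveat with the `setdefault(...).append(...)` loop: if a word appears more than once in the same synopsis, the movie name is appended more than once. A possible variation, assuming you want each movie listed at most once per word, collects the names in sets. The repeated 'abigail' below is a made-up change to the sample data to show the effect.

```python
from collections import defaultdict

data = {
    "Alien Predator": ['great', '17th', 'abigail', 'by', 'century', 'is'],
    "Shark Exorcist": ['demonic', 'devil', 'great', 'hell', 'holy', 'nun'],
    # 'abigail' is repeated on purpose (not in the original data).
    "Jurassic Shark": ['abandoned', 'an', 'and', 'beautiful', 'abigail', 'abigail'],
}

index = defaultdict(set)
for movie_name, keywords in data.items():
    for keyword in keywords:
        # A set ignores a movie that was already recorded for this word.
        index[keyword].add(movie_name)

# Convert the sets to sorted lists so the output order is stable.
result = {word: sorted(movies) for word, movies in index.items()}
print(result["abigail"])
```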


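What is the representation of the dataset? Is it a dictionary with movie names as keys and synops as values?
It is an Excel file with two columns (movie name, synops).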
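
Since the data lives in a two-column spreadsheet, the Synops cells may well be read back as strings such as "['great', '17th']" rather than as real lists. That is an assumption; the thread does not say how the cells are stored. Under that assumption, a minimal sketch could parse each cell with `ast.literal_eval` before building the index. The helper names `parse_synops` and `build_index` are invented for this example, and the `rows` list stands in for the spreadsheet contents.

```python
import ast
from collections import defaultdict


def parse_synops(cell):
    """Return a list of words from a cell holding a list or its string form."""
    if isinstance(cell, str):
        # literal_eval safely parses a Python literal such as "['a', 'b']".
        return ast.literal_eval(cell)
    return list(cell)


def build_index(rows):
    """Map each word to the movies whose synopsis contains it."""
    index = defaultdict(list)
    for movie_name, synops in rows:
        # dict.fromkeys drops repeated words while keeping their order.
        for word in dict.fromkeys(parse_synops(synops)):
            index[word].append(movie_name)
    return dict(index)


# Stand-in rows. With pandas they might come from something like
# pd.read_excel("movies.xlsx")[["Movie_Name", "Synops"]].itertuples(index=False, name=None)
# where "movies.xlsx" is a hypothetical file name.
rows = [
    ("Alien Predator", "['great', '17th', 'abigail', 'by', 'century', 'is']"),
    ("Shark Exorcist", "['demonic', 'devil', 'great', 'hell', 'holy', 'nun']"),
    ("Jurassic Shark", "['abandoned', 'an', 'and', 'beautiful', 'abigail',]"),
]

index = build_index(rows)
print(index["great"])
print(index["abigail"])
```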