Python: error in fit_transform in scikit-learn for a movie recommendation system

python pandas dataframe

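This is the complete code I am trying to run. It runs fine, except that the second line from the bottom raises an error:

count_matrix = count.fit_transform(df['bag_of_words'])

I also don't know where this bag_of_words comes from. Please suggest how to edit the code.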

import pandas as pd
from rake_nltk import Rake
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.feature_extraction.text import CountVectorizer

df = pd.read_csv('https://query.data.world/s/uikepcpffyo2nhig52xxeevdialfl7')
df = df[['Title', 'Genre', 'Director', 'Actors', 'Plot']]
df.head()

# initializing the new column
df['Keywords'] = ""
for index, row in df.iterrows():
    plot = row['Plot']
    r = Rake()
    r.extract_keywords_from_text(plot)
    keywords_dict_scores = r.get_word_degrees()
    row['Keywords'] = list(keywords_dict_scores.keys())

df.drop(columns=['Plot'], inplace=True)
count = CountVectorizer()
count_matrix = count.fit_transform(df['bag_of_words'])
cosine_sim = cosine_similarity(count_matrix, count_matrix)

The error is as follows:

Traceback (most recent call last):
  File "C:\Python38\lib\site-packages\pandas\core\indexes\base.py", line 2897, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'bag_of_words'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "movie2.py", line 36, in <module>
    count_matrix = count.fit_transform(df['bag_of_words'])
  File "C:\Python38\lib\site-packages\pandas\core\frame.py", line 2995, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\Python38\lib\site-packages\pandas\core\indexes\base.py", line 2899, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'bag_of_words'

Please tell me what I should do.

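If possible, please share a complete example that reproduces the problem. Please don't just append to the question; edit it and apply some formatting.

The error means that your DataFrame does not contain a column named bag_of_words. Run df.head() and check for that column. If the CSV contains this column, try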

df = df[['Title', 'Genre', 'Director', 'Actors', 'Plot', 'bag_of_words']]

As long as the CSV really contains this column, this should work.
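If the CSV does not contain such a column, it has to be created before the call to fit_transform. The following is only a sketch of one way that might look, not the original author's method. It assumes that bag_of_words is meant to be a plain concatenation of the remaining text columns. The toy rows and the helper build_bag are made up for illustration, and a small in-memory DataFrame stands in for the real CSV.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy data standing in for the real CSV (hypothetical rows, for illustration only).
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Genre": ["crime drama", "crime thriller", "romantic comedy"],
    "Director": ["jane doe", "jane doe", "john roe"],
    "Keywords": [["heist", "prison"], ["heist", "detective"], ["wedding", "paris"]],
})

# Indexing a column that does not exist is what raises the KeyError,
# so check for the column first.
has_bag = "bag_of_words" in df.columns


def build_bag(row):
    """Join the text columns of one row into a single space-separated string."""
    return " ".join([row["Genre"], row["Director"], " ".join(row["Keywords"])])


# Assumption: the bag is a simple concatenation of the other text columns.
if not has_bag:
    df["bag_of_words"] = df.apply(build_bag, axis=1)

# Now the column exists, so fit_transform receives one string per movie.
count = CountVectorizer()
count_matrix = count.fit_transform(df["bag_of_words"])
cosine_sim = cosine_similarity(count_matrix, count_matrix)
```

With the column in place, cosine_sim is a square matrix with one row and one column per movie, and movies that share more words in their bag get a higher score.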