ValueError in spaCy when using pytextrank (a Python implementation of TextRank) on Python 2.7
I am trying to extract keywords with pytextrank. I installed pytextrank and spaCy with the following commands:
pip install pytextrank
pip install -U spacy
python -m spacy download en
Here is my code:
import pytextrank
import sys
path_stage0 = jsonPath
path_stage1 = "data/json/temp/o1.json"
with open(path_stage1, 'w') as f:
for graf in pytextrank.parse_doc(pytextrank.json_iter(path_stage0)):
f.write("%s\n" % pytextrank.pretty_print(graf._asdict()))
# to view output in this notebook
print(pytextrank.pretty_print(graf))
When I try to run this, I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-12-07819fc6acea> in <module>()
6
7 with open(path_stage1, 'w') as f:
----> 8 for graf in pytextrank.parse_doc(pytextrank.json_iter(path_stage0)):
9 f.write("%s\n" % pytextrank.pretty_print(graf._asdict()))
10 # to view output in this notebook
/home/sameera/anaconda2/lib/python2.7/site-packages/pytextrank/pytextrank.pyc in parse_doc(json_iter)
259 print("graf_text:", graf_text)
260
--> 261 grafs, new_base_idx = parse_graf(meta["id"], graf_text, base_idx)
262 base_idx = new_base_idx
263
/home/sameera/anaconda2/lib/python2.7/site-packages/pytextrank/pytextrank.pyc in parse_graf(doc_id, graf_text, base_idx, spacy_nlp)
193 doc = spacy_nlp(graf_text, parse=True)
194
--> 195 for span in doc.sents:
196 graf = []
197 digest = hashlib.sha1()
/home/sameera/anaconda2/lib/python2.7/site-packages/spacy/tokens/doc.pyx in __get__ (spacy/tokens/doc.cpp:9664)()
432
433 if not self.is_parsed:
--> 434 raise ValueError(
435 "sentence boundary detection requires the dependency parse, which "
436 "requires data to be installed. If you haven't done so, run: "
ValueError: sentence boundary detection requires the dependency parse, which
requires data to be installed. If you haven't done so, run:
python -m spacy download en
to install the data
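I am using Python 2.7, Anaconda 4.3, Jupyter Notebook, and Ubuntu 14.04.

This may just be a mistake introduced when you copied the code into StackOverflow, but if not: make sure everything below the "with" statement is indented, including the for loop. Basically: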
with open(path_stage1, 'w') as f:
    for graf in pytextrank.parse_doc(pytextrank.json_iter(path_stage0)):
        f.write("%s\n" % pytextrank.pretty_print(graf._asdict()))
        print(pytextrank.pretty_print(graf))
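To see why the indentation matters, here is a minimal sketch that does not use pytextrank at all. The scratch file and the `lines` list are hypothetical stand-ins for the real output path and the parsed paragraphs. It shows that a write issued after the "with" block has ended fails, because the file is already closed.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "o1.json")
lines = ["first", "second"]  # stand-in for the parsed paragraphs

# Correct: the loop is indented under "with", so the file is still open.
with open(path, "w") as f:
    for line in lines:
        f.write("%s\n" % line)

# Incorrect: the write happens after the "with" block has closed the file.
with open(path, "a") as f:
    pass
try:
    f.write("too late\n")
    closed_error = None
except ValueError as err:
    closed_error = str(err)

print(closed_error)
```

Note that a "for" placed directly under "with ...:" at the same indentation level, with no indented body in between, would most likely be rejected by Python with an IndentationError before anything runs.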
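It is better to install from the requirements.txt in the pytextrank package than to run pip install -U spacy. spaCy is developing quickly, and -U installs the latest version. Those updates are not always backward compatible.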
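As a rough sketch of the idea, the snippet below compares pinned versions from a requirements-style text against a set of installed versions. The file contents, package versions, and the helper names `read_pins` and `find_mismatches` are all made up for illustration. They are not taken from pytextrank's actual requirements.txt, and this only handles simple `name==version` lines.

```python
def read_pins(requirements_text):
    """Return {package: version} for lines of the form name==version."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """List (name, pinned, installed) where the installed version differs."""
    return [(name, pinned, installed.get(name))
            for name, pinned in sorted(pins.items())
            if installed.get(name) != pinned]


# Hypothetical file contents and versions, for illustration only.
example_requirements = "spacy==1.0.0  # placeholder pin\notherpkg==1.0\n"
example_installed = {"spacy": "2.0.0", "otherpkg": "1.0"}

mismatches = find_mismatches(read_pins(example_requirements), example_installed)
print(mismatches)
```

A mismatch on spacy here would correspond to the situation described above, where -U pulled in a newer release than the one the package was written against.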
You can also post issues for pytextrank on its GitHub repo.
Glad to hear about the usage :)