Python Goose fails to extract mashable/usatoday/politicalwire articles


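I am using the python-goose extractor on mashable.com and usatoday.com, but it fails on every article I try: the extracted text comes back empty. Can anyone suggest how to solve this?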

For a usatoday.com article:

from goose import Goose

g = Goose()
article = g.extract(url='http://www.usatoday.com/story/tech/columnist/talkingtech/2014/01/25/namm-2014---ik-multimedias-rings-to-make-music/4863193/')
# The assertion passes, i.e. no text was extracted.
assert article.cleaned_text == ''
For a mashable.com article:

g = Goose()
article = g.extract(url='http://mashable.com/2014/01/26/square-cofounder-jim-mckelvey/')
assert article.cleaned_text == ''

For a politicalwire.com article:

g = Goose()
article = g.extract(url='http://politicalwire.com/archives/2014/01/27/some_republicans_go_off_script_in_sotu_response.html')
assert article.cleaned_text == ''

I think these are very important sites for text extraction. Can anyone suggest a workaround? Thanks.

The latest version of Goose can extract text from usatoday.com and mashable.com.

You should also assume that you are violating the ToS of both sites.
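To re-check the URLs from the question after upgrading, something like the following may help. It is only a rough sketch: `find_empty_extractions` and `goose_extract_text` are hypothetical helper names, and treating any exception as a failed extraction is an assumption made for illustration. The only Goose calls used are the ones shown in the question (`Goose()`, `extract(url=...)`, `cleaned_text`).

```python
def find_empty_extractions(urls, extract_text):
    """Return the URLs for which extract_text(url) yields no usable text."""
    failed = []
    for url in urls:
        try:
            text = extract_text(url)
        except Exception:
            # Assumption: a fetch or parse error counts as a failed extraction.
            text = ""
        if not (text or "").strip():
            failed.append(url)
    return failed


def goose_extract_text(url):
    """Hypothetical wrapper around the Goose calls used in the question."""
    from goose import Goose  # imported lazily so the helper above has no dependency
    return Goose().extract(url=url).cleaned_text


if __name__ == "__main__":
    urls = [
        "http://www.usatoday.com/story/tech/columnist/talkingtech/2014/01/25/namm-2014---ik-multimedias-rings-to-make-music/4863193/",
        "http://mashable.com/2014/01/26/square-cofounder-jim-mckelvey/",
        "http://politicalwire.com/archives/2014/01/27/some_republicans_go_off_script_in_sotu_response.html",
    ]
    for url in find_empty_extractions(urls, goose_extract_text):
        print("no text extracted:", url)
```

Passing the extractor in as a function keeps the check independent of the installed Goose version, so the same loop can be run before and after an upgrade.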