Python 3.x: Getting the URLs along with article.text using Python Newspaper3k
I want to get the national news from this website. Here is my code:
from newspaper import Article
url = 'https://www.stuff.co.nz/national'
article = Article(url)
article.download()
article.parse()
data = article.text
data.splitlines()
Besides the headline text, I also need to get the URLs of these articles. For example:
Fish sausages recalled People with egg allergies or an intolerance should not consume these products, MPI says
https://www.stuff.co.nz/business/118673724/fish-sausages-sold-in-auckland-and-hamilton-recalled-due-to-egg-allergy-risk
Try something like this:
import newspaper

url = 'https://www.stuff.co.nz/national'
paper = newspaper.build(url)
for article in paper.articles:
    target = article.title
    if target and len(target.strip()) > 0:
        print(target.strip().replace('\n', ''))
        print(article.url)
Output:
Is a text worth a life?
https://www.stuff.co.nz/national/118315357/is-a-text-worth-a-life-a-message-to-kiwi-drivers-on-the-roads-this-holiday-season
Courtenay Place 'going backwards'
https://www.stuff.co.nz/national/118660168/wellington-police-commander-says-rowdy-courtenay-place-going-backwards
New frontier for energy?
https://www.stuff.co.nz/national/117860027/searching-for-new-zealands-electricity-future-in-the-deep-heat
and so on.
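
If you would rather have the headlines and URLs as data instead of printed lines, the same loop can be wrapped in a small helper. This is only a sketch: `collect_headlines` and `headlines_from_site` are names made up for illustration, not part of newspaper, and the helper assumes each article object exposes `.title` and `.url` as in the loop above. Unlike the loop above, it collapses newlines inside a title into single spaces rather than deleting them.

```python
from typing import Iterable, List, Tuple


def collect_headlines(articles: Iterable) -> List[Tuple[str, str]]:
    """Return (title, url) pairs for articles that have a non-empty title.

    Hypothetical helper: works on any objects with .title and .url.
    """
    pairs = []
    for article in articles:
        title = getattr(article, "title", None)
        # Skip links that carry no headline text.
        if not title or not title.strip():
            continue
        # Collapse newlines and repeated whitespace into single spaces.
        pairs.append((" ".join(title.split()), article.url))
    return pairs


def headlines_from_site(url: str) -> List[Tuple[str, str]]:
    """Build a newspaper source for `url` and collect its headlines."""
    import newspaper  # imported lazily so collect_headlines works without it

    paper = newspaper.build(url)
    return collect_headlines(paper.articles)
```

Calling `headlines_from_site('https://www.stuff.co.nz/national')` should then give a list of `(title, url)` tuples that you can filter or write to a file. If a second run returns fewer articles than the first, newspaper's article caching may be the cause; its documentation describes passing `memoize_articles=False` to `newspaper.build` to turn that off.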