Scraping a section of a web page with Python
I am trying to learn from a news website, but I have run into this situation:
from bs4 import BeautifulSoup
from urllib.request import urlopen
req = urlopen('https://timesofindia.indiatimes.com/india/evidence-of-chidambaram-meeting-mukerjeas-destroyed-cbi/articleshow/71337533.cms')
page_html = req.read()
page_soup = BeautifulSoup(page_html, "html.parser")
section = page_soup.find('section', {'class': '_2suu5 clearfix id-r-component undefined undefined'})
print(section)
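I have already tried this on another website and the code ran fine, but this time I cannot explain the error.
I fixed it for you. I hope you learn something useful.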
import requests
from bs4 import BeautifulSoup
url = 'https://timesofindia.indiatimes.com/india/evidence-of-chidambaram-meeting-mukerjeas-destroyed-cbi/articleshow/71337533.cms'
response = requests.get(url)
bs = BeautifulSoup(response.text,"html.parser")
#this will work too
#section = bs.find_all('section', class_='_2suu5 clearfix id-r-component undefined undefined')
section = bs.find_all('section', attrs={'class': '_2suu5 clearfix id-r-component undefined undefined'})
#print(section)
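To make the class matching in the code above easier to check without depending on the live page, here is a minimal sketch that runs against an inline HTML string. The `html` snippet is a hypothetical stand-in for the article markup, not the real page source, and the CSS-selector variant is an alternative added here for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the article page, so the example runs offline.
html = """
<section class="_2suu5 clearfix id-r-component undefined undefined">
  <p>Article body</p>
</section>
<section class="other">Unrelated</section>
"""

soup = BeautifulSoup(html, "html.parser")

# Match the full class string exactly as it appears in the markup.
exact = soup.find_all(
    "section",
    class_="_2suu5 clearfix id-r-component undefined undefined",
)

# Match on a single class name; the other classes on the tag are ignored.
single = soup.find_all("section", class_="_2suu5")

# CSS selector: requires both classes, regardless of their order in the attribute.
css = soup.select("section._2suu5.clearfix")

print(len(exact), len(single), len(css))
```

Matching on one class name or on a CSS selector is usually less brittle than the full class string, because the exact-string form stops matching as soon as the site reorders or renames any of the classes.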
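I am trying to scrape the section tag in the page. What is the error? No output is generated at all. I would be glad if someone could help me solve this problem, and I really appreciate your help.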