Python AttributeError: 'NoneType' object has no attribute 'get_text' in BeautifulSoup web scraping
I am working on a project that uses BeautifulSoup (web scraping) in Python. Earlier, the program ran fine, but now it gives the error shown below. The website's HTML structure may have changed, but I still cannot find the error and fix it. The website is https://covidindia.org/. Please help me resolve the error.
Tags: python, web-scraping, beautifulsoup, python-requests
Error:
Traceback (most recent call last):
File "t1.py", line 112, in <module>
mainLabel = tk.Label(root, text=get_corona_detail_of_india(), font=f, bg='light blue',fg='red')
File "t1.py", line 23, in get_corona_detail_of_india
total_cases = soup.find("div",class_="elementor-element elementor-element-aceece0 elementor-widget elementor-widget-heading",).get_text()
AttributeError: 'NoneType' object has no attribute 'get_text'
Code:
URL = 'https://covidindia.org/'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
#print(soup)
total_cases = soup.find("div",class_="elementor-element elementor-element-aceece0 elementor-widget elementor-widget-heading",).get_text()
tc = total_cases.strip()
When I print the soup, the output is:
<html><head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
<hr/><center>nginx</center>
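Am I permanently banned from accessing the site???

Add a User-Agent header to your request. When you do not send a User-Agent, the website detects that you are a bot and refuses to serve its content. Here is the complete code: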
from bs4 import BeautifulSoup
import requests
headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/32.0'}
URL = 'https://covidindia.org/'
page = requests.get(URL,headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
#print(soup)
total_cases = soup.find("div",class_="elementor-element elementor-element-aceece0 elementor-widget elementor-widget-heading",).get_text()
tc=(total_cases.strip())
Output:
>>> tc
'Total Cases - 83,14,673 (+46,171)'
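A more defensive variant of the code above, as a minimal sketch: it checks the HTTP status before parsing and checks the result of find() before calling get_text(), so a blocked request or a changed page layout gives a clear result instead of an AttributeError. The helper names fetch_html and extract_heading and the sample HTML are hypothetical; the class string and User-Agent are the ones used in this thread.

```python
import requests
from bs4 import BeautifulSoup

# User-Agent string taken from the answer; any common browser string should do.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) "
                  "Gecko/20100101 Firefox/32.0"
}
# Class attribute from the question, matched as an exact string.
HEADING_CLASS = ("elementor-element elementor-element-aceece0 "
                 "elementor-widget elementor-widget-heading")


def fetch_html(url, headers=HEADERS, timeout=10):
    """Download a page; raises requests.HTTPError on 4xx/5xx such as 403."""
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_heading(html, css_class=HEADING_CLASS):
    """Return the stripped text of the first matching <div>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.find("div", class_=css_class)
    if node is None:  # find() returns None when nothing matches
        return None
    return node.get_text().strip()


# Offline demonstration with hypothetical HTML (no network access needed).
sample_ok = '<div class="%s"><h2> Total Cases - 1 </h2></div>' % HEADING_CLASS
sample_blocked = "<html><head><title>403 Forbidden</title></head></html>"
print(extract_heading(sample_ok))
print(extract_heading(sample_blocked))
```

With the live site, the call would be extract_heading(fetch_html('https://covidindia.org/')); a None result then means the element was not found, which is worth reporting in the GUI label instead of crashing.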
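This problem occurs when the site expects something that your request does not include. Check what the site requires: it may be a User-Agent, as the other answer says, or something else.

headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:32.0) Gecko/20100101 Firefox/32.0'} The line above is configured for your Mac. What should I use for my system? Please help.

Use the same headers.

Thanks!!! Now it shows a further error; please help with this one too.
Code:
URL = ''
html_page = requests.get(URL).text
soup = BeautifulSoup(html_page, 'lxml')
get_table = soup.find("table", id="tablepress-96")
get_table_data = get_table.tbody.find_all("tr")
Error: 'NoneType' object has no attribute 'tbody'

OK, I will look into it. By the way, if my answer helped you, please accept it as the best answer. Thanks!

You left the headers out of your code. Add them and it will work.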
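The follow-up error has the same cause as the original one: find() returned None, here for the table. A minimal sketch of a guarded version follows. The helper name table_rows and the sample HTML are hypothetical, and it uses the built-in html.parser rather than lxml only to avoid an extra dependency; the table id is the one from the comment.

```python
from bs4 import BeautifulSoup


def table_rows(html, table_id):
    """Return the cell text of each row, or [] if the table is absent."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:  # blocked page or changed layout
        return []
    body = table.tbody or table  # some pages omit <tbody>
    return [
        [cell.get_text(strip=True) for cell in row.find_all(["td", "th"])]
        for row in body.find_all("tr")
    ]


# Offline demonstration with hypothetical HTML.
sample_table = (
    '<table id="tablepress-96"><tbody>'
    "<tr><td>A</td><td>1</td></tr>"
    "</tbody></table>"
)
print(table_rows(sample_table, "tablepress-96"))
print(table_rows("<html>403 Forbidden</html>", "tablepress-96"))
```

An empty list here is a hint to print the response status and body first; if the page is again a 403, the request is probably missing the headers argument.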