Python: scraping a website with BeautifulSoup


I get an AttributeError when scraping:

import urllib2
from bs4 import BeautifulSoup

quote_page ='https://www.bloomberg.com/quote/SPX:IND'
page = urllib2.urlopen(quote_page)

soup = BeautifulSoup(page,'html.parser')

name_box = soup.find('h1', attires ={'class': 'name'})

name = name_box.text.strip()
print name

Traceback (most recent call last):
  File "word1.py", line 11, in <module>
    name = name_box.text.strip()
AttributeError: 'NoneType' object has no attribute 'text'

Viveks MacBook Pro:py vivek$

When you run

print(name_box)

you get

 None
Traceback (most recent call last):
  File "C:/Users/devsurya/python/demo programs/b4s.py", line 13, in <module>
    name = name_box.text.strip()
AttributeError: 'NoneType' object has no attribute 'text'

The page returned by the site says:

We've detected unusual activity from your computer network

Also,

soup.find('h1', attires={'class': 'name'})

should be

soup.find('h1', {'class': 'companyName_uu99a4824b'})
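To illustrate why the misspelled keyword returns None instead of raising an error, here is a minimal sketch run against a static snippet rather than the live page. The markup and the "Example Corp" text are made up for illustration. As far as I know, BeautifulSoup treats an unrecognised keyword argument as a filter on an attribute of that name, so `attires=` looks for an attribute called "attires" and matches nothing, while `attrs=` filters on the tag's real attributes.

```python
from bs4 import BeautifulSoup

# Hypothetical static markup standing in for the real page.
html = '<html><body><h1 class="name"> Example Corp </h1></body></html>'
soup = BeautifulSoup(html, 'html.parser')

# Misspelled keyword: interpreted as a filter on an attribute named "attires".
# No tag has such an attribute, so find() returns None.
typo_result = soup.find('h1', attires={'class': 'name'})

# Correct keyword: attrs filters on the attributes the tag actually has.
attrs_result = soup.find('h1', attrs={'class': 'name'})

print(typo_result)
print(attrs_result.text.strip())
```

The second positional argument of find() is attrs, so soup.find('h1', {'class': 'name'}) is equivalent to the corrected call.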

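Assuming the company name is what you want, I would use requests, and two headers are needed (you will have to test whether this stays consistent over time). I use a CSS attribute=value selector to get the right element, with the starts-with operator ^ in case the value is dynamic. That makes it more generic for other requests.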

import requests
from bs4 import BeautifulSoup as bs

quote_page ='https://www.bloomberg.com/quote/SPX:IND'
page = requests.get(quote_page, headers = {'User-Agent':'Mozilla/5.0', 'accept-language':'en-US,en;q=0.9'})
soup = bs(page.content,'lxml')
name_box = soup.select_one('[class^=companyName]')
name = name_box.text.strip()
print(name)
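Since the site may still serve a block page, the starts-with selector can be paired with a None check so a missing element does not raise AttributeError. This is only a sketch on static snippets: extract_name, the sample markup, and the class suffix "abc123" are hypothetical and not taken from the real page.

```python
from bs4 import BeautifulSoup


def extract_name(html):
    """Return the text of the first element whose class attribute starts
    with 'companyName', or None if no such element exists."""
    soup = BeautifulSoup(html, 'html.parser')
    name_box = soup.select_one('[class^=companyName]')
    if name_box is None:
        # Element missing, e.g. the site returned a block page.
        return None
    return name_box.get_text(strip=True)


# Hypothetical page fragment with a generated class suffix.
sample = '<h1 class="companyName__abc123"> Example Corp </h1>'
# Hypothetical block page without the expected element.
blocked = '<p>Unusual activity detected</p>'

print(extract_name(sample))
print(extract_name(blocked))
```

Checking for None before touching .text turns the original crash into a value the caller can test and handle.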

What is "attires" a typo for?
