Python Beautiful Soup Web Scraping

Returns an AttributeError

import bs4 as bs
import requests
url = 'https://hotcopper.com.au/postview/'
header = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36","X-Requested-With": "XMLHttpRequest"}
#Get Raw HTML content
r = requests.get(url, headers=header)
soup = bs.BeautifulSoup(r.text, 'html5lib')

stocks =[]
rows = soup.find_all('td',attrs={'class':'stock-pill-td alt-tr'})
for row in rows:
  name = row.find('a').text.strip()
  stocks.append(name)
print(stocks)

I am trying to extract the stock ticker text from the HTML, for example "NVA". Running

soup.find_all('td',attrs={'class':'stock-pill-td alt-tr'})

returns a list, but when I try to iterate over that list, it throws an AttributeError:

---> 12   name = row.find('a').text.strip()
     13   stocks.append(name)
     14 print(stocks)

AttributeError: 'NoneType' object has no attribute 'text'

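The problem seems to be some empty rows in the table. For those cells, row.find('a') returns None, so accessing .text on the result raises the AttributeError.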

Adding a try-except block handles this:

import bs4 as bs
import requests
url = 'https://hotcopper.com.au/postview/'
header = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36","X-Requested-With": "XMLHttpRequest"}
#Get Raw HTML content
r = requests.get(url, headers=header)
soup = bs.BeautifulSoup(r.text, 'html5lib')

stocks = []
rows = soup.find_all('td', attrs={'class': 'stock-pill-td alt-tr'})
for row in rows:
  try:
    name = row.find('a').get_text()
    stocks.append(name)
  except AttributeError:
    # row.find('a') returned None: this cell has no link, so print it for inspection
    print(row)
print(stocks)
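
If you would rather not rely on catching the exception, an explicit None check does the same job. Below is a minimal sketch under some assumptions: SAMPLE_HTML is made-up markup standing in for the real page (I have not verified the live site's structure), extract_tickers is a hypothetical helper name, and it uses the built-in 'html.parser' so it runs without html5lib.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
# The second cell is deliberately empty, like the blank rows in the table.
SAMPLE_HTML = """
<table>
  <tr><td class="stock-pill-td alt-tr"><a href="/asx/nva/">NVA</a></td></tr>
  <tr><td class="stock-pill-td alt-tr"></td></tr>
  <tr><td class="stock-pill-td alt-tr"><a href="/asx/abc/"> ABC </a></td></tr>
</table>
"""


def extract_tickers(html):
    """Return the link text of every matching cell, skipping cells with no <a>."""
    soup = BeautifulSoup(html, "html.parser")
    tickers = []
    for cell in soup.find_all("td", attrs={"class": "stock-pill-td alt-tr"}):
        link = cell.find("a")
        if link is None:
            # Empty cell: nothing to extract, so skip it instead of raising.
            continue
        tickers.append(link.get_text(strip=True))
    return tickers


print(extract_tickers(SAMPLE_HTML))
```

On the sample markup this prints ['NVA', 'ABC']. With the real page you would pass r.text instead of SAMPLE_HTML.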

Thanks Atharva, yes, that was the problem. When I started inspecting the site's table, I noticed that it has empty values.