Python: how to fix "'NoneType' object is not subscriptable"
I'm writing a small scraping script and ran into the error "TypeError: 'NoneType' object is not subscriptable". I have never seen this error before, so I don't know what it means.
import bs4
import requests

myUrl = "https://www.houzz.com/professionals/searchDirectory? topicId=11785&query=Interior+Designers+%26+Decorators&location=Texas&distance=0&sort=4"
data = requests.get(myUrl)
soup = bs4.BeautifulSoup(data.text, 'html.parser')
listing = soup.find_all('div', class_="hz-pro-search-result__profile-desc")
for li in listing:
    myurl = li
    res = myurl.a['href']
    print(res)
Error:
  File "C:\Users\Hp\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/Hp/.spyder-py3/houzz.py", line 20, in <module>
    res = myurl.a['href']
TypeError: 'NoneType' object is not subscriptable
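myurl.a may be returning None, so you cannot do anything with it. Inspect the HTML of each li object in the divs with this class. I think the anchor does not exist in some of them.

One of the divs does not contain an anchor link; it contains a paragraph tag instead. The code below may be what you need: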
import bs4
import requests

myUrl = "https://www.houzz.com/professionals/searchDirectory?topicId=11785&query=Interior+Designers+%26+Decorators&location=Texas&distance=0&sort=4"
data = requests.get(myUrl)
soup = bs4.BeautifulSoup(data.text, 'html.parser')
listing = soup.find_all('div', class_="hz-pro-search-result__profile-desc")
listing = [i for i in listing if i]
for li in listing:
    # li.a is None when the div has no <a> child, so check before subscripting
    if li.a:
        res = li.a['href']
        print(res)
    else:
        # li is a Tag, not a str, so convert it before concatenating
        print("Error: " + str(li))
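The same guard can be tried without hitting the network. The following is only a sketch: SAMPLE_HTML and extract_links are made-up names for illustration, and the markup is a hypothetical stand-in for the real page, which may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real search results page
SAMPLE_HTML = """
<div class="hz-pro-search-result__profile-desc"><a href="/pro/one">One</a></div>
<div class="hz-pro-search-result__profile-desc"><p>No link here</p></div>
<div class="hz-pro-search-result__profile-desc"><a href="/pro/two">Two</a></div>
"""


def extract_links(html):
    """Return the href of the first <a> in each matching div, skipping divs without one."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for div in soup.find_all("div", class_="hz-pro-search-result__profile-desc"):
        anchor = div.find("a")  # None when the div has no <a> descendant
        if anchor is None:
            continue
        href = anchor.get("href")  # .get returns None instead of raising KeyError
        if href is not None:
            links.append(href)
    return links


links = extract_links(SAMPLE_HTML)
print(links)
```

Using anchor.get("href") rather than anchor["href"] also covers an a tag that has no href attribute.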
Try the following CSS selector:
import bs4
import requests
myUrl = "https://www.houzz.com/professionals/searchDirectory? topicId=11785&query=Interior+Designers+%26+Decorators&location=Texas&distance=0&sort=4"
data=requests.get(myUrl)
soup=bs4.BeautifulSoup(data.text,'html.parser')
res=[a['href'] for a in soup.select('div.hz-pro-search-result__profile-desc > a')]
print(res)
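To see why the selector avoids the error, here is a small offline sketch. The HTML is invented for illustration and is not the actual houzz.com markup. The child combinator (>) only matches a tags that are direct children of a matching div, so divs without an anchor simply contribute nothing.

```python
from bs4 import BeautifulSoup

# Invented markup: one div with a link, one with only a paragraph, one unrelated div
html = """
<div class="hz-pro-search-result__profile-desc"><a href="/pro/one">One</a></div>
<div class="hz-pro-search-result__profile-desc"><p>Text only</p></div>
<div class="other"><a href="/ignored">Ignored</a></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Only <a> elements directly under the target div are selected; no None ever appears
selected_hrefs = [a["href"] for a in soup.select("div.hz-pro-search-result__profile-desc > a")]
print(selected_hrefs)
```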
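If you specifically want the link to each listing's houzz.com page, then you need the following (bs4 4.7.1). We target the class hz-pro-search-result__name-rating to get each listing's title url, then restrict to the first a tag. Every a tag has an href, so there is no risk of None here.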
import requests
from bs4 import BeautifulSoup as bs
r = requests.get('https://www.houzz.com/professionals/searchDirectory?%20topicId=11785&query=Interior+Designers+%26+Decorators&location=Texas&distance=0&sort=')
soup = bs(r.content, 'lxml')
listings = [i['href'] for i in soup.select('.hz-pro-search-result__name-rating > a:first-child')]
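If you would rather not rely on every a tag carrying an href, the attribute selector a[href] can be combined with :first-child. This is a sketch on invented markup, not the real page, and it uses html.parser so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Invented markup: the second div's anchor deliberately lacks an href
html = (
    '<div class="hz-pro-search-result__name-rating">'
    '<a href="/pro/one">One</a><a href="/reviews/one">Reviews</a></div>'
    '<div class="hz-pro-search-result__name-rating"><a>No href</a></div>'
)

soup = BeautifulSoup(html, "html.parser")

# a[href] keeps only anchors that have an href; :first-child keeps only the first child element
first_links = [
    a["href"]
    for a in soup.select(".hz-pro-search-result__name-rating > a[href]:first-child")
]
print(first_links)
```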
It means myurl.a is None.
It shouldn't be None. @deceze
Well, it is. Check what you are actually dealing with: print(li).
There is an array of data in listing. If listing has data, then it should have data too.
Thanks! I finally got it.
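The point made in the comments can be reproduced in a few lines. This is a minimal sketch on a made-up one-div document: accessing .a on a tag that has no a descendant gives None, and subscripting None raises exactly the TypeError from the question.

```python
from bs4 import BeautifulSoup

# Made-up document: a div that contains a paragraph but no anchor
soup = BeautifulSoup("<div><p>No anchor in this div</p></div>", "html.parser")
div = soup.div

anchor = div.a  # attribute-style lookup returns None when no <a> is found
print(anchor is None)

try:
    anchor["href"]  # subscripting None is what triggers the error
except TypeError as exc:
    message = str(exc)
    print(message)
```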