How to use find() on list elements in Python/BeautifulSoup: I'm getting a NoneType error


python for-loop web-scraping


OK, this code works:

from bs4 import BeautifulSoup
import urllib
import re

htmlfile = urllib.urlopen(MY SITE URL SITS HERE)
soup = BeautifulSoup(htmlfile.read())

title = soup.find('p', {'class': 'deal-title should-truncate'}).getText()  
print "Title: " + str(title)

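But the code above only returns the first result. I want to loop over every occurrence on the page. To do that, I tried using a list comprehension to collect every figure tag, because this paragraph tag always sits inside a figure tag. That way I can focus only on the contents of each figure. But when I try the following: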

from bs4 import BeautifulSoup
import urllib
import re

htmlfile = urllib.urlopen(MY WEBSITE URL SITS HERE)
soup = BeautifulSoup(htmlfile.read())

deals = [figure for figure in soup.findAll('figure')]

for i in deals:
    title = i.find('p', {'class': 'deal-title should-truncate'}).getText()  
    print "Title: " + str(title)

I get this error:

Traceback (most recent call last):
  File "C:\Python27\blah.py", line 11, in <module>
    title = i.find('p', {'class': 'deal-title should-truncate'}).getText()
AttributeError: 'NoneType' object has no attribute 'getText'

Now I am trying:

from bs4 import BeautifulSoup
import urllib
import re

htmlfile = urllib.urlopen(MY SITE SITS HERE)
soup = BeautifulSoup(htmlfile.read())

deals = soup.findAll('figure')

for i in deals:
    title = i.find('p', {'class': 'deal-title should-truncate'})
    if title is None:
        title = "NONE"
    else:
        title = title.getText()
    print "Title: " + str(title)

The error now is:

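Traceback (most recent call last):
  File "C:\Python27\blah.py", line 16, in <module>
    print "Title: " + str(title)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 12: ordinal not in range(128)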

Final answer, with a special shout-out to BlackJack for the help:

from bs4 import BeautifulSoup
import urllib
import re

htmlfile = urllib.urlopen(MY SITE SITS HERE)
soup = BeautifulSoup(htmlfile.read())

deals = soup.findAll('figure')

for i in deals:
    title = i.find('p', {'class': 'deal-title should-truncate'})
    if title is None:
        title = "NONE"
    else:
        title = title.getText()
    print "Title: " + title
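
The only change from the previous attempt is that str(title) became title. In Python 2, calling str() on a unicode object encodes it with the ASCII codec, which cannot represent U+2013 (an en dash), hence the UnicodeEncodeError. The sketch below reproduces the same failure explicitly in Python 3, where the encoding step has to be written out. The sample title and the helper names are made up for illustration; they are not from the original page.

```python
# Hypothetical title containing an en dash (U+2013), the character named in the traceback.
title = "Spa Day \u2013 50% off"


def to_ascii_strict(text):
    """Encode to ASCII strictly, which is what Python 2's str() did implicitly."""
    return text.encode("ascii")


def to_ascii_lossy(text):
    """Encode to ASCII, replacing characters that cannot be encoded with '?'."""
    return text.encode("ascii", errors="replace")


try:
    to_ascii_strict(title)
except UnicodeEncodeError as exc:
    # exc.reason carries the "ordinal not in range(128)" part of the message.
    print("strict encoding failed:", exc.reason)

# Two ways to get bytes without raising: lose the character, or use UTF-8.
print(to_ascii_lossy(title))
print(title.encode("utf-8"))
```

Keeping the text as unicode and encoding only at the output boundary, if at all, avoids the problem. Whether a plain print of the unicode string succeeds in Python 2 still depends on the console encoding.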

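That means at least one of the figures has no p with the given class in its subtree. By the way, the list comprehension for deals is unnecessary. deals = soup.findAll('figure') gives you the same result.

@BlackJack you are right that the list comprehension is unnecessary, thank you. When I remove .getText(), the first result in the list comes back as None... but I am left with a whole lot of text. Is there a way to make sure I only get the text each time?

@BlackJack something like this does not work: for i in deals: title = i.find('p', {'class': 'deal-title should-truncate'}).getText() if title == None: title = "Hello" else: print "title: " + str(title)

You have to add a step to get the title. First search for the element and check whether it is None, and only then try to get the text from it. Maybe showing an example of the document structure would make it easier for us to give advice.

Updated with the step we mentioned earlier. It works for the first None, then the second one has a value, but then it throws an error @BlackJack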
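
The "search first, check for None, then get the text" step from the comments can be sketched as follows in Python 3 with bs4. The real page structure was never posted, so SAMPLE_HTML is invented for illustration, as are the function names extract_titles and extract_titles_css and the "NONE" placeholder default. The second function is an alternative, not what the thread used: a CSS selector returns only the paragraphs that exist, so there is nothing to check for None.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second <figure> has no matching <p>.
SAMPLE_HTML = """
<figure><p class="deal-title should-truncate">Spa day</p></figure>
<figure><span>Advert</span></figure>
<figure><p class="deal-title should-truncate">Pizza for two</p></figure>
"""


def extract_titles(html, placeholder="NONE"):
    """Return one title per <figure>, using a placeholder when no title is found."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for figure in soup.find_all("figure"):
        tag = figure.find("p", class_="deal-title should-truncate")
        # find() returns None when nothing matches, so check before get_text().
        if tag is None:
            titles.append(placeholder)
        else:
            titles.append(tag.get_text(strip=True))
    return titles


def extract_titles_css(html):
    """Return only the titles that exist, skipping figures without one."""
    soup = BeautifulSoup(html, "html.parser")
    matches = soup.select("figure p.deal-title.should-truncate")
    return [p.get_text(strip=True) for p in matches]


print(extract_titles(SAMPLE_HTML))      # ['Spa day', 'NONE', 'Pizza for two']
print(extract_titles_css(SAMPLE_HTML))  # ['Spa day', 'Pizza for two']
```

The first version keeps one entry per figure, which helps if titles must stay aligned with other per-figure fields. The selector version is shorter when missing titles can simply be skipped.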