Python: What does this error in Beautiful Soup mean?
Tags: python, pyqt, beautifulsoup


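I'm writing a small script using PyQt4 and BeautifulSoup. Basically, you specify a URL and the script is supposed to download all the pictures from that web page.

When I give it a URL, the output shows that it downloads every picture except one: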

...
Download Complete
Download Complete
File name is wrong 
Traceback (most recent call last):
  File "./picture_downloader.py", line 41, in loadComplete
    self.download_image()
  File "./picture_downloader.py", line 58, in download_image
    print 'File name is wrong ',image['src']
  File "/usr/local/lib/python2.7/dist-packages/beautifulsoup4-4.1.3-py2.7.egg/bs4/element.py", line 879, in __getitem__
    return self.attrs[key]
KeyError: 'src'

Here is the relevant part of the code:

# SLOT for loadFinished
def loadComplete(self): 
    self.download_image()

def download_image(self):
    html = unicode(self.frame.toHtml()).encode('utf-8')
    soup = bs(html)

    for image in soup.findAll('img'):
        try:
            file_name = image['src'].split('/')[-1]
            cur_path = os.path.abspath(os.curdir)
            if not os.path.exists(os.path.join(cur_path, 'images/')):
                os.makedirs(os.path.join(cur_path, 'images/'))
            f_path = os.path.join(cur_path, 'images/%s' % file_name)
            urlretrieve(image['src'], f_path)
            print "Download Complete"
        except:
            print 'File name is wrong ',image['src']
    print "No more pictures on the page"

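It means that the image element has no 'src' attribute, and you hit the same error twice: once in

file_name = image['src'].split('/')[-1]

and then again inside the except block, in

print 'File name is wrong ',image['src']

The second KeyError is raised inside the handler itself, so nothing catches it, and that is the traceback you see.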
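
The double failure can be reproduced without BeautifulSoup. The sketch below is only an illustration: it uses Python 3 syntax (the question's code is Python 2), a plain dict stands in for the tag's attributes, and the names attrs_without_src and try_to_name_file are made up for this example.

```python
# A plain dict stands in for the attributes of an <img> tag without "src".
attrs_without_src = {"alt": "logo", "dfr-src": "http://example.com/logo.png"}


def try_to_name_file(attrs):
    """Mimic the question's try/except and report what escapes from it."""
    try:
        try:
            # First KeyError: the lookup inside the try block fails.
            return attrs["src"].split("/")[-1]
        except Exception:
            # Second KeyError: the handler repeats the same lookup,
            # so the handler itself raises and nothing here catches it.
            print("File name is wrong", attrs["src"])
    except KeyError as error:
        # This outer handler exists only so the demo can show the escape.
        return "escaped KeyError: %s" % error


print(try_to_name_file(attrs_without_src))
```

The outer try is not part of the original script; without it, the second KeyError would propagate up to the caller, which matches the traceback in the question.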

The easiest way to avoid the problem is to replace

soup.findAll('img')

with

soup.findAll('img', {"src": True})

so that it only finds elements that actually have a src attribute.
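
As a small illustration of that filter, here is a sketch in Python 3 with Beautiful Soup 4. The HTML snippet and URLs are invented for the example, find_all is the current spelling of findAll, and "html.parser" is the parser from the standard library (the question's code does not name one).

```python
from bs4 import BeautifulSoup

# Hypothetical page: the second <img> has no "src" attribute.
html = """
<img src="http://example.com/a.png">
<img dfr-src="http://example.com/b.png">
<img src="/images/c.png">
"""

soup = BeautifulSoup(html, "html.parser")

# Passing True as the attribute value matches any tag that has the
# attribute at all, whatever its value is.
with_src = soup.find_all("img", {"src": True})
sources = [tag["src"] for tag in with_src]
print(sources)
```

Only the first and third images are returned, so image['src'] can no longer raise KeyError inside the loop. The image that only has dfr-src is skipped silently.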

If the URL can be in either of two attributes, try something like this:

for image in soup.findAll('img'):
    v = image.get('src', image.get('dfr-src'))  # gets "src", else "dfr-src"
                                                # if both are missing: None
    if v is None:
        continue  # continue loop with the next image
    # do your stuff

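OK, here is what's happening. Inside your try/except, you get a KeyError from

file_name = image['src'].split('/')[-1]

because that object has no src attribute.

Then, in your except clause, you try to access the very same attribute that caused the error:

print 'File name is wrong ',image['src']

Inspect the img tags that cause the error and re-evaluate your logic for those cases.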
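
One possible way to restructure that logic is sketched below, in Python 3 syntax. It is an assumption-laden illustration, not the asker's final code: plan_downloads is a hypothetical helper, it only decides what to download and where (the actual urlretrieve call is left out), and it accepts anything with a dict-style .get(), such as BeautifulSoup tags or the plain dicts used here.

```python
import os


def plan_downloads(tags, target_dir="images"):
    """Return (url, path) pairs plus the tags that carry no image URL."""
    plan, skipped = [], []
    for tag in tags:
        # Prefer "src"; fall back to "dfr-src"; None if both are missing.
        source = tag.get("src") or tag.get("dfr-src")
        if not source:
            # Nothing to download, and no exception either.
            skipped.append(tag)
            continue
        file_name = source.split("/")[-1]
        plan.append((source, os.path.join(target_dir, file_name)))
    return plan, skipped


# Plain dicts stand in for <img> tags in this example.
tags = [
    {"src": "http://example.com/img/a.png"},
    {"dfr-src": "http://example.com/b.png"},
    {"alt": "no url at all"},
]
plan, skipped = plan_downloads(tags)
for url, path in plan:
    print(url, "->", path)
print("skipped:", len(skipped))
```

Using .get() instead of indexing means a missing attribute yields None rather than an exception, so the error message for skipped tags can be printed without touching image['src'] again.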

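Comments:

The error occurs on file_name = image['src'].split('/')[-1]

@That1Guy, you're right, I fixed it 3 seconds before you commented :P

@Vor, to avoid the problem, try: soup.findAll('img', {"src": True})

Thanks, that's what I'm going to do now. The problem is that I just checked the HTML, and some images have a dfr-src= attribute instead of the normal src attribute.

@Vor, added another option.

So if I check at the start whether the img has a src attribute, it should work, right?

That will get rid of the error, but it won't download those images. Inspect the elements that cause the error to find another way to download them.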
