Python StringIO memory leak

Tags: python, urllib, cstringio

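I have a Python program that slows to a crawl over time. I have tested it thoroughly and narrowed the problem down to the method that downloads images. That method uses cStringIO and urllib. The problem could also be some kind of never-ending download in urllib (the program freezes after a few hundred downloads).

Any idea where the problem might be?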

        foundImages = []

        images = soup.find_all('img')
        print('downloading Images')

        for imageTag in images:
            gc.collect()

            url = None
            try:

                #load image into a file to determine size and width
                url = imageTag.attrs['src']
                imgFile = StringIO(urllib.urlopen(url).read())
                im = Image.open(imgFile)
                width, height = im.size

                #if width and height are both above a threshold, it is a valid image
                #so add to recipe images
                if width > self.minOptimalWidth and height > self.minOptimaHeight:
                    image = MIImage({})
                    image.originalUrl = url.encode('ascii', 'ignore')
                    image.width = width
                    image.height = height

                    foundImages.append(image)

                imgFile = None
                im = None
            except Exception:
                print('failed image download url: ' + url)
                traceback.print_exc()
                continue

        #set the main image to be the first in the array
        if len(foundImages) > 0:
            first = foundImages[0]
            recipe.imageUrl = first.originalUrl

        return foundImages

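How thorough was the testing? Have you used a memory profiler?

Interesting that you never close your file.

I used objgraph. I'll give the memory profiler a try. Thanks for the pointer, I'm still fairly new to Python. What are you referring to?

@WilliamFalcon To add to dstromberg's comment: you never call, for example,
imgFile.close()
Setting
imgFile=None
is not the same thing.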
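To illustrate the point about closing, here is a minimal sketch. It is written for Python 3 (io.BytesIO and urllib.request in place of cStringIO and urllib.urlopen), which is an assumption, since the question's code is Python 2. The helper names fetch_bytes and rebinding_vs_closing, the opener parameter, and the 10-second timeout are all hypothetical and not from the thread. Nothing in the thread confirms that the missing close() calls are the root cause, so treat this as one thing to rule out rather than a verified fix.

```python
import io
import urllib.request
from contextlib import closing


def fetch_bytes(url, timeout=10, opener=urllib.request.urlopen):
    """Download url into memory and always close the response.

    The timeout (an assumed value) makes a stalled server raise an
    exception instead of blocking the download loop indefinitely.
    `opener` is a hypothetical hook so the function can be exercised
    without network access.
    """
    with closing(opener(url, timeout=timeout)) as response:
        return response.read()


def rebinding_vs_closing(data):
    """Show that rebinding a name to None does not close the buffer."""
    buf = io.BytesIO(data)
    alias = buf                 # a second reference, e.g. held by another object
    buf = None                  # drops one name only; the buffer stays open
    still_open = not alias.closed
    alias.close()               # explicitly releases the buffer
    return still_open, alias.closed


def first_bytes(data, n=4):
    """Read from an in-memory file that is closed on leaving the block."""
    with io.BytesIO(data) as img_file:
        # Image.open(img_file) and the size check would go here
        head = img_file.read(n)
    return head, img_file.closed
```

In the loop from the question, the equivalent change would be to wrap both the urlopen response and the in-memory file in with blocks (or call close() in a finally clause), so they are released even when Image.open raises.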