Python memory error: optimizing part of a scraper's code to reduce memory usage


I am using the following code to scrape specific data from two sites.

firstclass = input("First class: ")
nestedclass = input("Nested class: ")
classend = input("Class close tag: ")
exportlist = []

def getNames(i):     #i is the html string.
    i=str(i)
    check = i.find(firstclass)
    while check != -1:
        logging("Making new loop...")  #function to show the message together with time in the console
        i = str(i)
        i = i.replace(firstclass, '\n', 1)
        logging("progress = 25%")
        i = i.split('\n')
        i = str(i[1])
        logging("progress = 50%")
        i = i.replace(nestedclass, '\n', 1) 
        i = i.split('\n')
        logging("progress = 75%")
        i = str(i[1])
        i = i.replace(classend, '\n', 1)
        logging("Loop done ! ")
        i = i.split('\n')
        exportlist.append(i[0])
        i = str(i[1])
        check = i.find(firstclass)


        if check < 500 and check != -1:             #This part removes the next data piece,
            logging("In short Check")               #if it's very close to the previous one.
            i = str(i)                              #In case of double data in short distance
            i = i.replace(firstclass, '\n', 1)
            i = i.split('\n')
            i = str(i[1])
            i = i.replace(nestedclass, '\n', 1)
            i = i.split('\n')
            i = str(i[1])
            i = i.replace(classend, '\n', 1)
            i = i.split('\n')
            i = str(i[1])
            check = i.find(firstclass)
          
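I tried removing the first class, but the code does not work properly without it, because the nested class also appears outside the first class and would produce wrong results. So, are there any suggestions on how to write this code better?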


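Also, I am using 64-bit Python on a machine with 64 GB of RAM, which I think should be enough in this case. If there is a way to increase the memory Python can use, I am ready to try it.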
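
For illustration, here is a minimal sketch of the same extraction logic written with `str.find` offsets instead of repeated `replace()` and `split()` calls. This is not the code from the question: the function name `extract_names`, its parameters, and the `min_gap` argument are hypothetical, and the sketch assumes the three markers are plain substrings of the HTML. Because it never rebuilds the large string, only the small extracted pieces are copied, which may reduce temporary memory use.

```python
def extract_names(html, first_class, nested_class, class_end, min_gap=500):
    """Return the text between nested_class and class_end for each first_class hit.

    Hypothetical helper: it walks the string with offsets, so the full
    HTML string is never copied or split.
    """
    results = []
    pos = html.find(first_class)
    while pos != -1:
        # Look for the nested marker only after the current first_class hit.
        nested = html.find(nested_class, pos + len(first_class))
        if nested == -1:
            break
        start = nested + len(nested_class)
        end = html.find(class_end, start)
        if end == -1:
            break
        results.append(html[start:end])  # copies only the small piece

        resume = end + len(class_end)
        nxt = html.find(first_class, resume)
        # Mirror of the "short check": if the next hit starts less than
        # min_gap characters after the previous item, skip that entry.
        if nxt != -1 and nxt - resume < min_gap:
            skip_nested = html.find(nested_class, nxt + len(first_class))
            if skip_nested == -1:
                break
            skip_end = html.find(class_end, skip_nested + len(nested_class))
            if skip_end == -1:
                break
            nxt = html.find(first_class, skip_end + len(class_end))
        pos = nxt
    return results
```

Unlike the original, the sketch returns a list rather than appending to a global, and it stops quietly when a marker is missing instead of raising `IndexError`.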

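Please share all the relevant code; it is hard to learn much from the loop alone.
Hey, @AMC. The whole code is about 600 lines... Also, you only need to open an html file into a variable, and then you can call the function with that variable as the argument.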
Memory error.