Python: scraping several pages with PyQt4 + Beautiful Soup


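I am trying to scrape several web pages using Python with PyQt4 and Beautiful Soup.

Because of how my overall program is organised, a main script, "program.py", calls functions from other scripts, each of which runs a different analysis with BeautifulSoup.

So the simplified architecture of my main program.py looks like this: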

program.py :

import script1
import script2

script1.function1(urlA)
script2.function2(urlB)
script1.py and script2.py look like this:

script1.py :

import sys
import requests
import re
from bs4 import BeautifulSoup
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *

class Render(QWebPage):
    def __init__(self, url):
        # A new QApplication is created for every Render instance
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()  # blocks until _loadFinished calls quit()

    def _loadFinished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()


def function1(url):
    r = Render(url)
    soup = BeautifulSoup(unicode(r.frame.toHtml()))

    # Do many things with soup.
    # Nothing related to PyQt4 further in this script
My script2 has exactly the same structure, but does other things on a different URL.

script2.py :

import sys
import requests
import re
from bs4 import BeautifulSoup
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *

class Render(QWebPage):
    def __init__(self, url):
        # A new QApplication is created for every Render instance
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()  # blocks until _loadFinished calls quit()

    def _loadFinished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()


def function2(url):
    r = Render(url)
    soup = BeautifulSoup(unicode(r.frame.toHtml()))

    # Do many other things with soup
    # Nothing related to PyQt4 further in this script
With script1.py everything works: function1 and its analysis run successfully.

But script2.py fails, and I get the following errors:

QObject::connect: Cannot connect (null)::configurationAdded(QNetworkConfiguration) to QNetworkConfigurationManager::configurationAdded(QNetworkConfiguration)
QObject::connect: Cannot connect (null)::configurationRemoved(QNetworkConfiguration) to QNetworkConfigurationManager::configurationRemoved(QNetworkConfiguration)
QObject::connect: Cannot connect (null)::configurationChanged(QNetworkConfiguration) to QNetworkConfigurationManager::configurationChanged(QNetworkConfiguration)
QObject::connect: Cannot connect (null)::onlineStateChanged(bool) to QNetworkConfigurationManager::onlineStateChanged(bool)
QObject::connect: Cannot connect (null)::configurationUpdateComplete() to QNetworkConfigurationManager::updateCompleted()
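I spent some time searching on this issue and found that PyQt4 cannot load multiple pages in the same instance.

The problem is that I need PyQt4 to render the JavaScript before the page content is loaded into Beautiful Soup.

So I think I need to add some kind of "self.app.quit()" at the end of function1 in script1, so that function2 in script2 can also render a page with PyQt4. But I have not managed to get it working.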

How about this:

r = Render(url)
soup = BeautifulSoup(unicode(r.frame.toHtml()))

r.app.quit()

Hi furas, thanks, but it doesn't work: I still get the same error when I add "r.app.quit()" at the end of my script…

This may be a duplicate question, as I couldn't get it to work for me…?

I extended the example code in my answer to that question to make it a little more flexible.