Is Python query.next() slow?


I am using PyQt for a GUI application. I also use an SQLite database to feed data to the software.

In my code, I have the following method:

def loadNotifications(self):

    """Method to find the number of unread articles,
    for each search. Load a list of id, for the unread articles,
    in each table. And a list of id, for the concerned articles, for
    each table"""

    count_query = QtSql.QSqlQuery(self.bdd)
    count_query.setForwardOnly(True)

    # Don't treat the articles if it's the main tab, it's
    # useless because the article will be concerned for sure
    for table in self.list_tables_in_tabs[1:]:

        # Empty these lists, because when loadNotifications is called
        # several times during the use, the nbr of unread articles is
        # added to the nbr of notifications
        table.list_new_ids = []
        table.list_id_articles = []

        # Try to speed things up
        append_new = table.list_new_ids.append
        append_articles = table.list_id_articles.append

        req_str = self.refineBaseQuery(table.base_query, table.topic_entries, table.author_entries)
        print(req_str)
        count_query.exec_(req_str)

        start_time = datetime.datetime.now()
        i = 0

        while count_query.next():
            i += 1
            record = count_query.record()

            append_articles(record.value('id'))

            if record.value('new') == 1:
                append_new(record.value('id'))

        print(datetime.datetime.now() - start_time)
        print("Nbr of entries processed: {}".format(i))
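Suppose this loop has about 400 entries to process. It takes about one second, which I think is too long. I have optimized the process as much as I could, but it still takes too much time.

Here is what the method above typically prints: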

SELECT * FROM papers WHERE id IN(320, 1320, 5648, 17589, 20092, 20990, 49439, 58378, 65251, 68772, 73509, 86859, 90594)
0:00:00.001403
Nbr of entries processed: 13
SELECT * FROM papers WHERE topic_simple LIKE '% 3D print%'
0:00:00.591745
Nbr of entries processed: 81
SELECT * FROM papers WHERE id IN (5648, 11903, 14258, 30587, 40339, 55691, 57383, 58378, 62951, 65251, 68772, 87295)
0:00:00.000478
Nbr of entries processed: 12
SELECT * FROM papers WHERE topic_simple LIKE '% Python %'
0:00:00.596490
Nbr of entries processed: 9
SELECT * FROM papers WHERE topic_simple LIKE '% Raspberry pi %' OR topic_simple LIKE '% arduino %'
0:00:00.988276
Nbr of entries processed: 5
SELECT * FROM papers WHERE topic_simple LIKE '% sensor array%' OR topic_simple LIKE '% biosensor %'
0:00:00.996164
Nbr of entries processed: 433
SELECT * FROM papers WHERE id IN (320, 540, 1320, 1434, 1860, 4527, 5989, 6022, 6725, 6978, 7268, 8625, 9410, 9814, 9850, 10608, 13219, 15572, 15794, 19345, 19674, 19899, 20990, 22530, 26443, 26535, 28721, 29089, 30923, 31145, 31458, 31598, 32069, 34129, 35820, 36142, 36435, 37546, 39188, 39952, 40949, 41764, 43529, 43610, 44184, 45206, 49210, 49807, 50279, 50943, 51536, 51549, 52921, 52967, 54610, 56036, 58087, 60490, 62133, 63051, 63480, 63535, 64861, 66906, 68107, 68328, 69021, 71797, 73058, 74974, 75331, 77697, 78138, 80152, 80539, 82172, 82370, 82840, 86859, 87467, 91528, 92167)
0:00:00.002891
Nbr of entries processed: 82
SELECT * FROM papers WHERE id IN (7043, 41643, 44688, 50447, 64723, 72601, 81006, 82380, 84285)
0:00:00.000348
Nbr of entries processed: 9
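Is there a better way to do this? Can I get better results?

Note: the times shown are the time taken to run the loop, not the time taken to run the query.

I tried count_query.setForwardOnly(True), as mentioned in the documentation, but it had no effect on performance.

EDIT: Here is a test database containing about 600 entries: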

Obviously, I can't test this, so I don't know whether it will make a significant difference, but you could try using index-based lookups:

id_index = count_query.record().indexOf('id')
new_index = count_query.record().indexOf('new')
while count_query.next():
    record = count_query.record()
    id_value = record.value(id_index)
    append_articles(id_value)
    if record.value(new_index) == 1:
        append_new(id_value)
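
The same idea (resolve the column positions once, then read each row by position and reuse the id value) can be sketched without Qt. This is only an analogue using the standard-library sqlite3 module, not the QSqlQuery API; the in-memory papers table, its sample rows, and the collect_ids helper are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical in-memory table standing in for the real `papers` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT, new INTEGER)"
)
conn.executemany(
    "INSERT INTO papers (id, title, new) VALUES (?, ?, ?)",
    [(i, "paper %d" % i, int(i % 3 == 0)) for i in range(1, 11)],
)


def collect_ids(connection):
    """Return (all ids, ids flagged as new) using positional column access."""
    cursor = connection.execute("SELECT * FROM papers")

    # Resolve the column positions once, outside the loop.
    names = [column[0] for column in cursor.description]
    id_index = names.index("id")
    new_index = names.index("new")

    list_id_articles = []
    list_new_ids = []
    for row in cursor:
        id_value = row[id_index]  # read the id once and reuse it
        list_id_articles.append(id_value)
        if row[new_index] == 1:
            list_new_ids.append(id_value)
    return list_id_articles, list_new_ids


articles, new_ids = collect_ids(conn)
print(len(articles), len(new_ids))
```

Whether this kind of change gives a measurable gain depends on the driver and on how much of the loop time is spent in column lookup rather than in fetching rows.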
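UPDATE

Using your sample database, I could not reproduce the problem you are seeing, and I also found that my approach above is about twice as fast as your original. Here is some sample output: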

IDs: 660, Articles: 666
IDs: 660, Articles: 666
IDs: 660, Articles: 666
test(index=False): 0.19050272400090762
IDs: 660, Articles: 666
IDs: 660, Articles: 666
IDs: 660, Articles: 666
test(index=True): 0.09384496400161879
Test case:

import sys, os, timeit
from PyQt4 import QtCore, QtGui
from PyQt4.QtSql import QSqlDatabase, QSqlQuery

def test(index=False):
    count_query = QSqlQuery('select * from results')
    list_new_ids = []
    list_id_articles = []
    append_new = list_new_ids.append
    append_articles = list_id_articles.append
    if index:
        id_index = count_query.record().indexOf('id')
        new_index = count_query.record().indexOf('new')
        while count_query.next():
            record = count_query.record()
            id_value = record.value(id_index)
            append_articles(id_value)
            if record.value(new_index) == 1:
                append_new(id_value)
    else:
        while count_query.next():
            record = count_query.record()
            append_articles(record.value('id'))
            if record.value('new') == 1:
                append_new(record.value('id'))
    print('IDs: %d, Articles: %d' % (
        len(list_new_ids), len(list_id_articles)))

class Window(QtGui.QWidget):
    def __init__(self):
        super(Window, self).__init__()
        self.button = QtGui.QPushButton('Test', self)
        self.button.clicked.connect(self.handleButton)
        layout = QtGui.QVBoxLayout(self)
        layout.addWidget(self.button)
        self.database = QSqlDatabase.addDatabase("QSQLITE")
        path = os.path.join(os.path.dirname(__file__), 'tmp/perf-test.db')
        self.database.setDatabaseName(path)
        self.database.open()

    def handleButton(self):
        for stmt in 'test(index=False)', 'test(index=True)':
            print('%s: %s' % (stmt, timeit.timeit(
                stmt, 'from __main__ import test', number=3)))

if __name__ == '__main__':
    app = QtGui.QApplication(sys.argv)
    window = Window()
    window.setGeometry(600, 300, 200, 100)
    window.show()
    sys.exit(app.exec_())

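No, sorry, nothing changed. The performance is the same: the loop takes almost exactly the same time (maybe slightly worse, but I don't think that is statistically significant).

@Rififi. I thought you might say that. Without a real example database to test against, it is hard to make useful suggestions.

Yes, I know. I have prepared a test database; see my edit.

@Rififi. See my updated answer. If you don't get similar results with my test case, you will clearly need to provide more information about your actual code and system setup.

Thank you very much for your hard work. I get results very similar to yours with your test case, so I edited my question to include the full code of my method along with some output. From the results, the problem seems to come from the "LIKE" queries. Don't you agree? Note: the times shown are not the time taken to execute the query, but the time taken to run the loop.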
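
One possible reading of the timings in the question is that SQLite produces rows on demand, so the table scan needed by a LIKE '%...%' filter happens while the rows are being fetched rather than when the statement is executed. The sketch below illustrates that behaviour with the standard-library sqlite3 module, not with QSqlQuery; whether the Qt SQLite driver behaves the same way in this application is an assumption. The probe function, the table contents, and the row count are all made up for illustration; probe stands in for the LIKE test and simply counts how many rows SQLite has examined.

```python
import sqlite3

calls = {"count": 0}


def probe(text):
    """Stand-in for the LIKE filter: counts every row SQLite examines."""
    calls["count"] += 1
    return 1 if "sensor" in text else 0


conn = sqlite3.connect(":memory:")
conn.create_function("probe", 1, probe)
conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, topic_simple TEXT)")

total_rows = 1000
conn.executemany(
    "INSERT INTO papers (id, topic_simple) VALUES (?, ?)",
    [
        (i, " biosensor " if i % 100 == 0 else " other ")
        for i in range(1, total_rows + 1)
    ],
)

# Executing the statement does not scan the whole table.
cursor = conn.execute("SELECT id FROM papers WHERE probe(topic_simple) = 1")
examined_after_execute = calls["count"]

# The rest of the scan happens while the result rows are fetched.
matching_ids = [row[0] for row in cursor]
examined_after_loop = calls["count"]

print("rows examined after execute():", examined_after_execute)
print("rows examined after the loop: ", examined_after_loop)
print("matching rows:", len(matching_ids))
```

If the same thing happens with QSqlQuery, a loop that returns only a handful of rows can still take as long as a full scan of the table, which would be consistent with the LIKE queries being slow while the id IN (...) queries are fast.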