Python: disabling all caching to iterate over a large queryset


Here is my code:

import time

def new_badge(badge_classes=None, users=None):
    """
    Utility function for awarding a badge when a new one is added.
    """

    badge_classes = get_badges_classes() if badge_classes is None else badge_classes
    total = users.count()

    for i, user in enumerate(users.iterator()):
        t0 = time.time()
        flight_ids = []
        for flight in user.flight_set.order_by('date').iterator():
            flight_ids.append(flight.id)
            flights_before = Flight.objects.filter(id__in=flight_ids)
            for BadgeClass in badge_classes:
                badge = BadgeClass(all_flights=flights_before, new_flight=flight)
                badge.grant_if_eligible()

        print "-- %s" % user.username
        print "-- %.2f s" % (time.time() - t0)
        print "-- %.2f%% done" %(float(i) / total * 100)
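
The users queryset contains 2,700 objects, with more than 500,000 flights between them. I expected this script to take a long time to run, but the problem is that it uses more and more memory as it goes. After a few hours the script stops with the following error: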

File "/Users/chris/Documents/flightloggin2/badges/models.py", line 361, in eligible
  c = countries.count()
File "/Library/Python/2.7/site-packages/django/db/models/query.py", line 351, in count
  return self.query.get_count(using=self.db)
File "/Library/Python/2.7/site-packages/django/db/models/sql/query.py", line 418, in get_count
  number = obj.get_aggregation(using=using)[None]
File "/Library/Python/2.7/site-packages/django/contrib/gis/db/models/sql/query.py", line 85, in get_aggregation
  return super(GeoQuery, self).get_aggregation(using)
File "/Library/Python/2.7/site-packages/django/db/models/sql/query.py", line 384, in get_aggregation
  result = query.get_compiler(using).execute_sql(SINGLE)
File "/Library/Python/2.7/site-packages/django/db/models/sql/compiler.py", line 818, in execute_sql
  cursor.execute(sql, params)
File "/Library/Python/2.7/site-packages/django/db/backends/util.py", line 40, in execute
  return self.cursor.execute(sql, params)
File "/Library/Python/2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 52, in execute
  return self.cursor.execute(query, args)

django.db.utils.DatabaseError: could not create temporary file "base/pgsql_tmp/pgsql_tmp98246.932828": No space left on device

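How can I fix this? While the script is running, top tells me that the python process running it is using 10 GB of memory, so I think the problem is on the Python side rather than in Postgres, but I'm not entirely sure.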

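This doesn't look like a memory problem (to me). It looks like you have run out of disk space.

If you wrote this as a single SQL SELECT statement, it would probably run several orders of magnitude faster and with fewer resource problems. Doing it with nested cursors is going to be slow and fragile.

This actually is a memory problem of sorts, and it is definitely on the PostgreSQL side. The server ran out of disk space while trying to create a temporary file. Temporary files are used to cache large amounts of data, and in this case it is probably caching a very large result set. Temporary files are created under the database's main tablespace, so freeing up extra space on that partition may help. Finding a way to reduce the size of the result set being cached would help as well. Also, the error does not appear to occur in the code given here; it occurs inside badge.grant_if_eligible(). The eligible method seems to go back to the database for more information. Without seeing that code, it is unclear how much extra work it asks the database to do.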
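
To get a feel for why shrinking what is queried matters, here is a rough, database-free sketch of the inner loop in the question. It only counts how many IDs the growing flight_ids list would place into id__in filters. The helper names ids_sent_for_user and ids_sent_total are made up for illustration, and the even split of flights across users is an assumption, not something the question states. Since Django querysets are lazy, the figure is an upper bound that applies only to the filters the badge classes actually evaluate.

```python
def ids_sent_for_user(flight_count):
    """Mimic the inner loop: the id list grows by one before each filter is built."""
    flight_ids = []
    ids_sent = 0
    for flight_id in range(flight_count):
        flight_ids.append(flight_id)
        # Flight.objects.filter(id__in=flight_ids) would carry this many IDs
        ids_sent += len(flight_ids)
    return ids_sent


def ids_sent_total(flight_counts):
    """Closed form of the same count, n * (n + 1) / 2 per user, summed over users."""
    return sum(n * (n + 1) // 2 for n in flight_counts)


# Assumption: 500,000 flights spread evenly over 2,700 users (about 185 each).
flights_per_user = 500000 // 2700
estimate = ids_sent_total([flights_per_user] * 2700)
```

The count grows quadratically with a user's number of flights, so a few very active users would dominate the total. Filtering on something bounded, such as the flight date, rather than on an ever-growing ID list might keep each query small, though whether that fits depends on the badge code that is not shown.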