MySQL: How do I evaluate a queryset in batches?

Tags: mysql, django, django-models, django-queryset

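I have a model with more than 100,000 rows. I want to run some operations on it, but I can't do it all in one go because the table is too large. So I thought of using Paginator like this: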

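from django.core.paginator import Paginator

def fun():
    # Order by primary key so pages are stable; Paginator warns on unordered querysets
    queryset = Model.objects.filter(**some_filter).order_by("pk")
    paginator = Paginator(queryset, 10000)
    for page_no in paginator.page_range:
        page = paginator.get_page(page_no)
        queryset = page.object_list
        # Do some operation on queryset

    # Check if new records are added in the Model (if yes, then do the operation on new records only)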
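The last comment in the code means that if new records are added while the code above is running (this is a live application), we have to perform the same operation on those records too.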


So my question is: how can I pick up the remaining (new) records just by running the same code?

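This is simple. If your model has a datetime field, store the datetime value of the last item handled in the "for" loop in a variable. After the loop, check whether any objects have a datetime value greater than the one you stored, and run the operation only on those. This prevents the operation from being applied to the same object twice.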


Note: if your objects don't have a datetime field, add one.
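
The approach described in this answer can be sketched roughly as follows. This is a minimal illustration, not Django code: `fetch_after`, `handle_batch`, and `process_all` are hypothetical names, and an in-memory list of dicts with a `created_at` key stands in for the table. In a real project `fetch_after` would presumably wrap something like a `created_at__gt` filter ordered by `created_at`.

```python
from datetime import datetime, timedelta
from itertools import islice


def batched(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_all(fetch_after, handle_batch, batch_size=10000):
    """Process rows in batches, then re-check for rows newer than the last one seen.

    fetch_after(ts) is a hypothetical callable returning rows whose
    created_at is greater than ts (all rows when ts is None),
    ordered by created_at.
    """
    last_seen = None
    processed = 0
    while True:
        rows = fetch_after(last_seen)
        if not rows:
            return processed
        for batch in batched(rows, batch_size):
            handle_batch(batch)
            processed += len(batch)
            # Remember the newest timestamp handled so far.
            last_seen = batch[-1]["created_at"]


# In-memory stand-in for the table (assumption for the demo).
base = datetime(2024, 1, 1)
table = [{"id": i, "created_at": base + timedelta(seconds=i)} for i in range(5)]


def fetch_after(ts):
    rows = [r for r in table if ts is None or r["created_at"] > ts]
    return sorted(rows, key=lambda r: r["created_at"])


seen_ids = []


def handle_batch(batch):
    seen_ids.extend(r["id"] for r in batch)
    # Simulate one new row arriving while the job is still running.
    if len(table) == 5:
        table.append({"id": 5, "created_at": base + timedelta(seconds=5)})


total = process_all(fetch_after, handle_batch, batch_size=2)
```

One caveat with this sketch: a strict "greater than" comparison would skip rows that share exactly the same timestamp as the last processed row, so a tie-breaker such as the primary key may be needed in practice.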

As the other answer suggests, you can always use the created_at field to fetch the latest records, like this:

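import datetime

queryset = Model.objects.filter(**some_filter)
while queryset.exists():
    # Take the timestamp before processing so rows created during this pass are caught by the next one
    timestamp = datetime.datetime.now()
    # Do your batching and other operations
    # Lookup names are keyword arguments, not quoted strings
    queryset = queryset.filter(created_at__gt=timestamp)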