Python: freeing memory when processing a large number of images

Tags: python, image, image-processing, airflow

I have a DAG in Airflow that processes documents containing image information. I run this DAG over 500K documents. Each document contains an image URL; the image is downloaded from that URL (using multiprocessing.pool.ThreadPool), opened with PIL.Image, and classified by a neural network. I process the images in batches of 5K images each. How can I free the memory when the next batch starts processing? In the memory profiler, memory grows sharply at the generator function call (the line for img_batch, url_to_id_map in self.images_generator(img_index_docs_list)).

Here is the memory profiler result:

Line #     Mem usage     Increment   Line Contents
==================================================
    54   319.652 MiB   319.652 MiB   @profile
    55                               def execute(self):
    56   319.652 MiB     0.000 MiB       count = 0
    57   319.957 MiB     0.305 MiB       models_collection = self.init_mng_collection(self.mng_models_collection)
    58   319.957 MiB     0.000 MiB       img_index_collection = self.init_mng_collection(self.mng_image_index_collection)
    59   319.957 MiB     0.000 MiB       model_id_list = ['5d9de619929ea61c40ae6267']
    60  2143.152 MiB  1823.195 MiB       models_objects = self.get_trained_models_objects(model_id_list, models_collection)
    61  5196.199 MiB  3053.047 MiB       img_index_docs_list = self.get_mng_images_doc(img_index_collection)
    62 20555.500 MiB  7782.336 MiB       for img_batch, url_to_id_map in self.images_generator(img_index_docs_list):
    63 20555.500 MiB     0.000 MiB           count += 1
    64 20605.824 MiB     0.000 MiB           for prediction_model in models_objects:
    65 20606.246 MiB    90.785 MiB               prediction_result = prediction_model.work(img_batch)
    66 20605.824 MiB     0.000 MiB               self.put_to_img_index(prediction_result, prediction_model.name, url_to_id_map, img_index_collection)
    67 20605.824 MiB     0.000 MiB           if count == 2:
    68 20605.824 MiB     0.000 MiB               break
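
One possible direction, sketched under assumptions: the profile shows the big jumps at the line that builds img_index_docs_list and at the generator call, so it may help to feed the documents lazily, build one batch at a time, and drop every reference to a batch before the next one is built. The names iter_batches, process_in_batches, load and predict below are hypothetical and are not taken from the code in the question; load stands in for "download and open with PIL.Image" and predict for the model call.

```python
import gc
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most batch_size items without materializing the input."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
        # Drop the generator's own reference before the next batch is built,
        # so the old batch can be freed once the consumer lets go of it too.
        del batch


def process_in_batches(
    docs: Iterable[T],
    load: Callable[[T], R],            # hypothetical: fetch + open one image
    predict: Callable[[List[R]], object],  # hypothetical: run the model on a batch
    batch_size: int = 5000,
) -> int:
    """Load and predict one batch at a time; return the number of items processed."""
    processed = 0
    for doc_batch in iter_batches(docs, batch_size):
        images = [load(doc) for doc in doc_batch]
        try:
            predict(images)
            processed += len(images)
        finally:
            # Release per-batch resources explicitly. PIL images expose close();
            # objects without a close() method are simply skipped.
            for image in images:
                close = getattr(image, "close", None)
                if callable(close):
                    close()
            # Remove the last references held by this frame. gc.collect() only
            # matters for reference cycles; plain objects are freed by refcount.
            del images, doc_batch
            gc.collect()
    return processed
```

If docs is a database cursor or another lazy iterable rather than a fully built list, only about one batch of documents and images should be alive at a time. Whether the process's resident memory actually shrinks afterwards also depends on the allocator and on what the model or the thread pool keeps referenced, so it is worth re-running the profiler after a change like this.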