Python: Combine two for loops for efficiency

python csv pandas

I make a set of plots by looping over a CSV file, producing one plot per group from a groupby. The code is as follows:

import pandas as pd   
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages


frame=pd.read_csv('C:\\')

pdf_files = {}
for group_name, group in frame.groupby(['Allotment','Year','Month','Day']):
    allotment,year,month,day = group_name
    if month not in pdf_files:
        pdf_files[allotment,month] = PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32' + '_' + allotment + '_'+ month + '.pdf') 
    plot=group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
    pdf_files[allotment,month].savefig(plot)
    plt.close(plot)

for key in pdf_files:
    pdf_files[key].close()

print "Done"

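But this returns an error saying that too many files are open. I figure that combining the two for loops into one might solve the problem, but I'm not sure how to do it.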
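One thing that may contribute to the error: the check `if month not in pdf_files` tests a bare month, while the dict is keyed by `(allotment, month)` tuples, so the check is always true and a new `PdfPages` is opened for every daily group. The sketch below illustrates this with a hypothetical stand-in class, `FakePdf`, which is not part of matplotlib and only counts open handles. The group values are made up for illustration.

```python
class FakePdf:
    """Hypothetical stand-in for PdfPages that only counts open handles."""
    open_count = 0

    def __init__(self, name):
        self.name = name
        FakePdf.open_count += 1

    def close(self):
        FakePdf.open_count -= 1


def peak_open_handles(groups, buggy):
    """Mimic the question's loop and return the peak number of open handles."""
    FakePdf.open_count = 0
    pdf_files = {}
    peak = 0
    for allotment, month in groups:
        if buggy:
            # Check from the question: a bare month never equals a tuple key.
            missing = month not in pdf_files
        else:
            # Check that matches how the dict is actually keyed.
            missing = (allotment, month) not in pdf_files
        if missing:
            pdf_files[allotment, month] = FakePdf("{}_{}".format(allotment, month))
        peak = max(peak, FakePdf.open_count)
    for key in pdf_files:
        pdf_files[key].close()
    return peak


# Ten daily groups that all belong to the same allotment and month.
groups = [("A", "May")] * 10
buggy_peak = peak_open_handles(groups, buggy=True)
fixed_peak = peak_open_handles(groups, buggy=False)
print(buggy_peak, fixed_peak)  # prints "10 1"
```

Even with the corrected check, every PDF stays open until the final loop, so a large number of allotment and month combinations could still hit the operating system's limit.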

Would this work?

import pandas as pd   
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

frame=pd.read_csv('C:\\')

pdf_files = {}
for group_name, group in frame.groupby(['Allotment','Year','Month','Day']):
    allotment,year,month,day = group_name
    if month not in pdf_files:
        pdf_files[allotment,month] = PdfPages(r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32' + '_' + allotment + '_'+ month + '.pdf') 
    plot=group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
    pdf_files[allotment,month].savefig(plot)
    pdf_files[allotment,month].close()
    plt.close(plot)

print "Done"

Basically, just make sure you close the file once you are done writing to it.

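Any reason not to group by ['Allotment', 'Month'] first? Then each iteration of the outer loop handles just one PDF file (best to use with PdfPages(...) as pdf_file: so the file is closed automatically).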

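As written, this will not work, because it will sometimes access files that have already been closed. I would simplify it and not try to cache file objects at all: open one file at a time and close it when done. If that is too slow, I would cache the output plots in memory and write the files at the end. I would avoid the middle ground of keeping lots of file objects open.

Ah, you're totally right. Either way, it makes sense to break up the logic like this.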
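The "cache the plots in memory, then write the files at the end" idea from the comment above could look roughly like the following sketch. It is not the code from any answer here. The synthetic `frame`, the temporary output directory, and the `figures` and `page_counts` names are assumptions made so the example runs on its own; the real script would read the CSV and use the `F:\...` base path instead.

```python
import os
import tempfile
from collections import defaultdict

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Small synthetic stand-in for the real CSV (all values are made up).
rows = []
for allotment, month, day in [("A", 5, 1), ("A", 5, 2), ("A", 6, 1), ("B", 5, 1)]:
    for percent in (10, 50, 90):
        rows.append({"Allotment": allotment, "Year": 2014, "Month": month,
                     "Day": day, "Percent": percent, "SWIR32": percent / 100.0})
frame = pd.DataFrame(rows)

# Pass 1: build every figure and file it under its (allotment, month) key.
figures = defaultdict(list)
for group_name, group in frame.groupby(["Allotment", "Year", "Month", "Day"]):
    allotment, year, month, day = group_name
    fig = group.plot(x="Percent", y="SWIR32", title=str(group_name)).get_figure()
    figures[(allotment, month)].append(fig)

# Pass 2: write one PDF per key; only one file is open at any moment.
out_dir = tempfile.mkdtemp()  # assumption: temporary directory instead of F:\...
page_counts = {}
for (allotment, month), figs in figures.items():
    path = os.path.join(out_dir, "SWIR32_{}_{}.pdf".format(allotment, month))
    with PdfPages(path) as pdf_file:
        for fig in figs:
            pdf_file.savefig(fig)
            plt.close(fig)  # release the figure once its page is written
        page_counts[(allotment, month)] = pdf_file.get_pagecount()
```

The trade-off is memory: every figure stays alive until the second pass, and matplotlib warns once more than 20 pyplot figures are open by default. For a large CSV, opening one PDF at a time inside a nested groupby is probably the safer choice.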

with PdfPages('{}_{}_{}.pdf'.format(basename, allotment, month) as pdf_file: returns a syntax error.

Sorry, missing paren. Fixed.

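import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

frame = pd.read_csv('C:\\')

basename = r'F:\Sheyenne\Statistics\IDL_stats\Allotment_histos\Month\SWIR32'
# Outer loop: one PDF per (allotment, month), so only one file is open at a time.
for file_name, file in frame.groupby(['Allotment','Month']):
    allotment, month = file_name
    with PdfPages('{}_{}_{}.pdf'.format(basename, allotment, month)) as pdf_file:
        # Inner loop: one page per day within this allotment and month.
        for group_name, group in file.groupby(['Allotment','Month','Year', 'Day']):
            plot = group.plot(x='Percent', y='SWIR32', title=str(group_name)).get_figure()
            pdf_file.savefig(plot)
            plt.close(plot)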