Python: updating an existing Excel template in a loop

python pandas

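I'm trying to write a script. I currently have an Excel VBA workbook with two tabs: the first is a graph and the second is the backend data. The backend is updated from a master file. The master file has a "City" column; I want to loop over every unique city, write that city's rows into the VBA file, and save the VBA file under the city's name.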

import pandas as pd

master_backend = pd.read_excel(path)
city = master_backend[(master_backend["City"]=="NY")]

def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                      truncate_sheet=False, 
                      **to_excel_kwargs):
   from openpyxl import load_workbook
   import pandas as pd
   if 'engine' in to_excel_kwargs:
       to_excel_kwargs.pop('engine')
   writer = pd.ExcelWriter(filename, engine='openpyxl') 
   try:
       FileNotFoundError
   except NameError:
       FileNotFoundError = IOError
   try:        
       writer.book = load_workbook(filename, keep_vba = True)
       if startrow is None and sheet_name in writer.book.sheetnames:
           startrow = writer.book[sheet_name].max_row
       if truncate_sheet and sheet_name in writer.book.sheetnames:
           idx = writer.book.sheetnames.index(sheet_name)
           writer.book.remove(writer.book.worksheets[idx])
           writer.book.create_sheet(sheet_name, idx)
       writer.sheets = {ws.title:ws for ws in writer.book.worksheets}
   except FileNotFoundError:
       pass
   if startrow is None:
       startrow = 0
   df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs)
   writer.save()

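Basically, I want five files, one per city (there are five cities), each named after its city.

Since I don't know VBA and you posted this under the python tag, I'll offer my take on it.

Assuming your workbook's path is stored in file, you can try the following: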

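import shutil

# One copy of the template per city, then fill that copy's backend sheet
for city in master_backend.City.unique():
    df = master_backend.loc[master_backend.City == city]
    shutil.copy(file, f"{city}.xlsx")  # keep the template's own extension (e.g. .xlsm) if it differs
    append_df_to_excel(f"{city}.xlsx", df, sheet_name='Backend')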

Nice function, by the way. For ease of use, I'd add a docstring to it :)
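
Following up on the docstring suggestion: as far as I know, recent pandas releases (2.x) no longer allow assigning to `writer.book` or calling `writer.save()`, so the helper from the question may need a different shape there. The sketch below is one possible equivalent, assuming pandas 1.4 or newer with openpyxl installed. The name `append_df_to_sheet` is hypothetical, and this version makes no attempt to preserve macros, so it is only suitable for plain .xlsx files.

```python
import pandas as pd


def append_df_to_sheet(path, df, sheet_name="Backend", startrow=None, **to_excel_kwargs):
    """Write df into an existing workbook, keeping the workbook's other sheets.

    Hypothetical helper (not from the original post). If startrow is None, the
    data is written directly below the last used row of the target sheet.
    Extra keyword arguments are forwarded to DataFrame.to_excel.
    """
    to_excel_kwargs.pop("engine", None)  # the engine is fixed to openpyxl below
    with pd.ExcelWriter(
        path,
        engine="openpyxl",
        mode="a",                   # open the existing file instead of replacing it
        if_sheet_exists="overlay",  # write into the existing sheet's cells
    ) as writer:
        if startrow is None:
            if sheet_name in writer.book.sheetnames:
                # max_row is 1-based, startrow is 0-based, so this is the next free row
                startrow = writer.book[sheet_name].max_row
            else:
                startrow = 0
        df.to_excel(writer, sheet_name=sheet_name, startrow=startrow, **to_excel_kwargs)
```

Calling it with `index=False, header=False` appends only the data rows; without `header=False` the column names are written again above the appended rows.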

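I think you can simplify this script significantly once you realize that pandas creates a DataFrame for you when you read the Excel file. Then you only need to pull the information you want out of the DataFrame and write it back to a file. It's not clear what you want in the new files, but assuming you just want to filter the second sheet and keep everything on the first sheet as it is, it could look like this: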

import pandas as pd

# Open the file.
# NOTE: with sheet_name=None, read_excel returns a dictionary of DataFrames
#   keyed on the sheet name (by default it reads only the first sheet)
master_data = pd.read_excel(file_path, sheet_name=None)

# Assuming the second sheet is named 'City'
city_df = master_data['City']

# Replace 'columnName' with the name of the column (if the sheet has headers) or the column number
for city in pd.unique(city_df['columnName']):
    with pd.ExcelWriter(city + '.xlsx') as writer:
        master_data['Sheet1'].to_excel(writer, sheet_name='Sheet1')
        city_df[city_df['columnName'] == city].to_excel(writer, sheet_name='City')
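
To make the note about multiple sheets concrete, here is a small self-contained check of what `pd.read_excel` returns. The demo file, its sheet names and its columns are made up for illustration only; an Excel engine such as openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Build a throwaway two-sheet workbook (illustrative data only)
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"note": ["chart"]}).to_excel(writer, sheet_name="Sheet1", index=False)
    pd.DataFrame({"City": ["NY", "LA", "NY"], "value": [1, 2, 3]}).to_excel(
        writer, sheet_name="City", index=False
    )

first_only = pd.read_excel(path)                   # default: first sheet, a single DataFrame
all_sheets = pd.read_excel(path, sheet_name=None)  # every sheet, a dict keyed by sheet name

# Split the 'City' sheet into one DataFrame per unique city
per_city = {city: rows for city, rows in all_sheets["City"].groupby("City")}
```

So the dictionary form only appears when `sheet_name=None` (or a list of sheets) is passed; a plain `pd.read_excel(path)` call gives back just the first sheet.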

I think the problem with doing this in pure pandas is that it rewrites the data sheet and loses all the connections and formulas, which is why he has the custom function above. Nice answer though; you could use an f-string for the city name to make it more readable, imo (:

So this would ideally work, but I don't control the VBA file; I'm only updating its backend. It's also an .xlsm file, so I have to use openpyxl, and the backend data starts at row 4... so there are a lot of nuances I'm trying to work through, haha.

Hmm, I'm not clear on what you want. Do you want to replace the data sheet in the VBA file, which is an .xlsm? You can still do that with your append_df_to_excel function call, can't you?
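
Pulling the constraints from this thread together (a macro-enabled template, openpyxl, and backend data that starts at row 4), one way to sketch the whole loop is to skip `ExcelWriter` and write the cells with openpyxl directly. Everything named here is an assumption for illustration: the function `write_city_workbooks`, the sheet name `Backend`, and the idea that the template already holds its own headers above row 4. Rows left over from a longer previous fill are not cleared.

```python
import shutil
from pathlib import Path

import pandas as pd
from openpyxl import load_workbook


def write_city_workbooks(master_df, template_path, out_dir,
                         sheet_name="Backend", first_row=4, city_col="City"):
    """Copy the template once per city and fill its backend sheet from first_row down.

    Hypothetical helper; returns the list of files it wrote.
    """
    template_path = Path(template_path)
    out_dir = Path(out_dir)
    is_macro_file = template_path.suffix.lower() == ".xlsm"
    written = []
    for city, rows in master_df.groupby(city_col, sort=False):
        # Keep the template's extension so a .xlsm stays a .xlsm
        target = out_dir / f"{city}{template_path.suffix}"
        shutil.copy(template_path, target)
        # keep_vba=True asks openpyxl to carry the macros through on save
        wb = load_workbook(str(target), keep_vba=is_macro_file)
        ws = wb[sheet_name]
        # Write values only (no header), one DataFrame row per worksheet row
        for r, record in enumerate(rows.itertuples(index=False), start=first_row):
            for c, value in enumerate(record, start=1):
                ws.cell(row=r, column=c, value=value)
        wb.save(str(target))
        written.append(target)
    return written
```

A call such as `write_city_workbooks(master_backend, file, ".")` would then produce one workbook per city next to the script. Because only cell values are touched, the first tab is left as it was in the template, but it is worth opening one output file to confirm the chart and macros survived the round trip.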