Python: How to write to an existing Excel file without overwriting data (using pandas)?
Tags: python, excel, python-2.7, pandas
I use pandas to write to an Excel file in the following way:
import pandas
writer = pandas.ExcelWriter('Masterfile.xlsx')
data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
writer.save()
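Masterfile.xlsx already consists of a number of different tabs. However, it does not yet contain "Main".
Pandas correctly writes to the "Main" sheet; unfortunately, it also deletes all the other tabs.
The pandas docs say it uses openpyxl for xlsx files. A quick look through the code in ExcelWriter suggests that something like this might work: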
import pandas
from openpyxl import load_workbook
book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl')
writer.book = book
## ExcelWriter for some reason uses writer.sheets to access the sheet.
## If you leave it empty it will not know that sheet Main is already there
## and will create a new sheet.
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
writer.save()
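This works very well; the only problem is that the formatting of the master file (the file to which we add the new sheet) is lost.
With openpyxl version 2.4.0 and pandas version 0.19.2, the process @ski came up with gets a bit simpler: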
import pandas
from openpyxl import load_workbook
with pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') as writer:
    writer.book = load_workbook('Masterfile.xlsx')
    data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
# That's it!
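The idea behind the snippets above (load the existing workbook, add a sheet, save) can also be sketched with openpyxl alone, which avoids depending on how a particular pandas version wires up `writer.book` and `writer.sheets`. This is only an illustrative sketch: the helper name `add_sheet_with_openpyxl`, the temporary file, and the sheet names `Existing1`/`Existing2` are assumptions made for the example, not part of the original answers.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook


def add_sheet_with_openpyxl(path, df, sheet_name):
    """Hypothetical helper: write df into a new sheet, keeping the other sheets."""
    book = load_workbook(path)
    ws = book.create_sheet(sheet_name)
    ws.append([str(c) for c in df.columns])  # header row
    for row in df.itertuples(index=False, name=None):
        ws.append(list(row))  # one worksheet row per DataFrame row
    book.save(path)


# Build a throwaway workbook that already has two tabs
path = os.path.join(tempfile.mkdtemp(), "Masterfile.xlsx")
book = Workbook()
book.active.title = "Existing1"
book.create_sheet("Existing2")
book.save(path)

data_filtered = pd.DataFrame({"Diff1": [1, 2], "Diff2": [3, 4]})
add_sheet_with_openpyxl(path, data_filtered, "Main")

sheet_names = load_workbook(path).sheetnames
print(sheet_names)
```

The existing tabs should still be listed next to "Main" afterwards.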
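Old question, but I am guessing some people still search for this, so here goes.
I find this method nice because all worksheets are loaded into a dictionary of sheet name and dataframe pairs, created by pandas with the sheetname=None option. It is simple to add, delete or modify worksheets between reading the spreadsheet into the dict format and writing it back from the dict. For me, xlsxwriter works better than openpyxl for this task in terms of speed and format.
Note: future versions of pandas (0.21.0+) will change the "sheetname" parameter to "sheet_name".
For the example in the 2013 question: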
import pandas as pd
# on pandas 0.21.0+ pass sheet_name=None instead of sheetname=None
ws_dict = pd.read_excel('Masterfile.xlsx',
                        sheetname=None)
ws_dict['Main'] = data_filtered[['Diff1', 'Diff2']]
with pd.ExcelWriter('Masterfile.xlsx',
                    engine='xlsxwriter') as writer:
    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)
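A self-contained version of this read-everything, write-everything round trip might look like the sketch below. It assumes pandas 0.21.0 or later (hence `sheet_name`), builds a throwaway workbook in a temporary directory, and lets pandas pick whichever xlsx writer engine is installed; the sheet names and data are made up for illustration. It passes `index=False` so that repeated round trips do not keep adding an extra index column. Because only cell values survive `read_excel`, formatting, formulas and charts in the original file are not carried over by this approach.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "Masterfile.xlsx")

# Throwaway workbook with two existing sheets
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"x": [1, 2]}).to_excel(writer, sheet_name="Existing1", index=False)
    pd.DataFrame({"y": [3, 4]}).to_excel(writer, sheet_name="Existing2", index=False)

# sheet_name=None returns a {sheet name: DataFrame} dict covering every sheet
ws_dict = pd.read_excel(path, sheet_name=None)
ws_dict["Main"] = pd.DataFrame({"Diff1": [10], "Diff2": [20]})

# Rewrite the whole file from the dict
with pd.ExcelWriter(path) as writer:
    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name, index=False)

result = pd.read_excel(path, sheet_name=None)
print(list(result))
```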
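I know this is an older thread, but it is the first item you find when searching, and the solutions above do not work if you need to retain charts in a workbook that you have already created. In that case, xlwings is a better option: it lets you write to the Excel book and keeps the charts and chart data.
Simple example: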
import xlwings as xw
import pandas as pd
#create DF
months = ['2017-01','2017-02','2017-03','2017-04','2017-05','2017-06','2017-07','2017-08','2017-09','2017-10','2017-11','2017-12']
value1 = [x * 5+5 for x in range(len(months))]
df = pd.DataFrame(value1, index = months, columns = ['value1'])
df['value2'] = df['value1']+5
df['value3'] = df['value2']+5
#load workbook that has a chart in it
wb = xw.Book('C:\\data\\bookwithChart.xlsx')
ws = wb.sheets['chartData']
ws.range('A1').options(index=False).value = df
wb.save('C:\\data\\bookwithChart_updated.xlsx')  # save the updated workbook under a new name
xw.apps[0].quit()
"keep_date_col" (hope this helps)
Here is a helper function:
import os
import pandas as pd
from openpyxl import load_workbook
def append_df_to_excel(filename, df, sheet_name='Sheet1', startrow=None,
                       truncate_sheet=False,
                       **to_excel_kwargs):
    """
    Append a DataFrame [df] to existing Excel file [filename]
    into [sheet_name] Sheet.
    If [filename] doesn't exist, then this function will create it.
    @param filename: File path or existing ExcelWriter
                     (Example: '/path/to/file.xlsx')
    @param df: DataFrame to save to workbook
    @param sheet_name: Name of sheet which will contain DataFrame.
                       (default: 'Sheet1')
    @param startrow: upper left cell row to dump data frame.
                     Per default (startrow=None) calculate the last row
                     in the existing DF and write to the next row...
    @param truncate_sheet: truncate (remove and recreate) [sheet_name]
                           before writing DataFrame to Excel file
    @param to_excel_kwargs: arguments which will be passed to `DataFrame.to_excel()`
                            [can be a dictionary]
    @return: None
    Usage examples:
    >>> append_df_to_excel('d:/temp/test.xlsx', df)
    >>> append_df_to_excel('d:/temp/test.xlsx', df, header=None, index=False)
    >>> append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2',
                           index=False)
    >>> append_df_to_excel('d:/temp/test.xlsx', df, sheet_name='Sheet2',
                           index=False, startrow=25)
    (c) [MaxU](https://stackoverflow.com/users/5741205/maxu?tab=profile)
    """
    # Excel file doesn't exist - saving and exiting
    if not os.path.isfile(filename):
        df.to_excel(
            filename,
            sheet_name=sheet_name,
            startrow=startrow if startrow is not None else 0,
            **to_excel_kwargs)
        return
    # ignore [engine] parameter if it was passed
    if 'engine' in to_excel_kwargs:
        to_excel_kwargs.pop('engine')
    writer = pd.ExcelWriter(filename, engine='openpyxl', mode='a')
    # try to open an existing workbook
    writer.book = load_workbook(filename)
    # get the last row in the existing Excel sheet
    # if it was not specified explicitly
    if startrow is None and sheet_name in writer.book.sheetnames:
        startrow = writer.book[sheet_name].max_row
    # truncate sheet
    if truncate_sheet and sheet_name in writer.book.sheetnames:
        # index of [sheet_name] sheet
        idx = writer.book.sheetnames.index(sheet_name)
        # remove [sheet_name]
        writer.book.remove(writer.book.worksheets[idx])
        # create an empty sheet [sheet_name] using old index
        writer.book.create_sheet(sheet_name, idx)
    # copy existing sheets
    writer.sheets = {ws.title: ws for ws in writer.book.worksheets}
    if startrow is None:
        startrow = 0
    # write out the new sheet
    df.to_excel(writer, sheet_name, startrow=startrow, **to_excel_kwargs)
    # save the workbook
    writer.save()
Tested with the following versions:
- Pandas 1.2.3
- Openpyxl 3.0.5
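On newer pandas releases the same append-below-existing-rows behaviour can probably be had with less code. The sketch below assumes pandas 1.4 or later, where `ExcelWriter` accepts `if_sheet_exists="overlay"`; as far as I know, pandas 2.x no longer offers `writer.save()` or assignment to `writer.book`, so the helper above would need changes there anyway. The function name `append_rows` and the sample data are hypothetical, and this is a reduced illustration rather than a drop-in replacement (no `truncate_sheet`, no extra keyword arguments).

```python
import os
import tempfile

import pandas as pd


def append_rows(path, df, sheet_name="Sheet1"):
    """Hypothetical helper: append df below the existing rows of sheet_name."""
    if not os.path.isfile(path):
        # No workbook yet: create it with a header row
        df.to_excel(path, sheet_name=sheet_name, index=False)
        return
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        if sheet_name in writer.book.sheetnames:
            # Continue right after the last used row and skip the header
            startrow = writer.book[sheet_name].max_row
            header = False
        else:
            startrow, header = 0, True
        df.to_excel(writer, sheet_name=sheet_name, startrow=startrow,
                    header=header, index=False)


path = os.path.join(tempfile.mkdtemp(), "test.xlsx")
df = pd.DataFrame({"numbers": [1, 2, 3], "colors": ["red", "white", "blue"]})
append_rows(path, df)   # creates the file
append_rows(path, df)   # appends three more rows below the first three
combined = pd.read_excel(path, sheet_name="Sheet1")
print(len(combined))
```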
Starting with pandas 0.24, you can simplify this with the mode keyword argument of ExcelWriter:
import pandas as pd
with pd.ExcelWriter('the_file.xlsx', engine='openpyxl', mode='a') as writer:
    data_filtered.to_excel(writer)
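To see the effect of `mode='a'` end to end, here is a small sketch that first creates a workbook with two sheets and then appends a third. The temporary path, the sheet names and the contents of `data_filtered` are assumptions made up for the demonstration.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

path = os.path.join(tempfile.mkdtemp(), "the_file.xlsx")

# Create a workbook that already has two sheets (default mode is 'w')
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"x": [1, 2]}).to_excel(writer, sheet_name="Existing1", index=False)
    pd.DataFrame({"y": [3, 4]}).to_excel(writer, sheet_name="Existing2", index=False)

data_filtered = pd.DataFrame({"Diff1": [1, 2], "Diff2": [3, 4]})

# mode='a' opens the existing workbook instead of starting a new one
with pd.ExcelWriter(path, engine="openpyxl", mode="a") as writer:
    data_filtered.to_excel(writer, sheet_name="Main")

sheet_names = load_workbook(path).sheetnames
print(sheet_names)
```

Running the second block with the default write mode instead would leave only "Main" in the file, which is the behaviour described in the question.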
There is a better solution in pandas 0.24:
with pd.ExcelWriter(path, mode='a') as writer:
    s.to_excel(writer, sheet_name='another sheet', index=False)
Upgrade your pandas now:
pip install --upgrade pandas
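One case the append mode does not cover by itself is a sheet that already exists under the target name. A possible way to handle it, assuming pandas 1.3 or later where `ExcelWriter` takes an `if_sheet_exists` argument, is sketched below; the file, the sheet called "Other" and the Series contents are invented for the example.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "book.xlsx")

# Workbook that already contains the sheet we want to rewrite
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    pd.DataFrame({"x": [1, 2]}).to_excel(writer, sheet_name="Other", index=False)
    pd.DataFrame({"old": [0]}).to_excel(writer, sheet_name="another sheet", index=False)

s = pd.Series([10, 20, 30], name="new")

# 'replace' drops the old sheet and writes a fresh one; other sheets are kept
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="replace") as writer:
    s.to_excel(writer, sheet_name="another sheet", index=False)

result = pd.read_excel(path, sheet_name=None)
print(sorted(result))
```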
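Method:
- Can create the file if it does not exist
- Appends to the existing Excel file according to sheet name
import pandas as pd
from openpyxl import load_workbook
def write_to_excel(df, file, **kwds):
    try:
        book = load_workbook(file)
        writer = pd.ExcelWriter(file, engine='openpyxl')
        writer.book = book
        writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
        df.to_excel(writer, **kwds)
        writer.save()
    except FileNotFoundError as e:
        # the workbook does not exist yet, so create it
        df.to_excel(file, **kwds)
Usage:
df_a = pd.DataFrame(range(10), columns=["a"])
df_b = pd.DataFrame(range(10, 20), columns=["b"])
write_to_excel(df_a, "test.xlsx", sheet_name="sheet a", columns=['a'], index=False)
write_to_excel(df_b, "test.xlsx", sheet_name="sheet b", columns=['b'])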
The solution provided by @MaxU works; example calls:
# Create a sample dataframe
df = pd.DataFrame({'numbers': [1, 2, 3],
                   'colors': ['red', 'white', 'blue'],
                   'colorsTwo': ['yellow', 'white', 'blue'],
                   'NaNcheck': [float('NaN'), 1, float('NaN')],
                   })
# EDIT YOUR PATH FOR THE EXPORT
filename = r"C:\DataScience\df.xlsx"
# RUN ONE BY ONE IN ROW THE FOLLOWING LINES, TO SEE THE DIFFERENT UPDATES TO THE EXCELFILE
append_df_to_excel(filename, df, index=False, startrow=0) # Basic Export of df in default sheet (Sheet1)
append_df_to_excel(filename, df, sheet_name="Cool", index=False, startrow=0) # Append the sheet "Cool" where "df" is written
append_df_to_excel(filename, df, sheet_name="Cool", index=False) # Append another "df" to the sheet "Cool", just below the other "df" instance
append_df_to_excel(filename, df, sheet_name="Cool", index=False, startrow=0, startcol=5) # Append another "df" to the sheet "Cool" starting from col 5
append_df_to_excel(filename, df, index=False, truncate_sheet=True, startrow=10, na_rep = '') # Override (truncate) the "Sheet1", writing the df from row 10, and showing blank cells instead of NaN