Python: How to save a new sheet in an existing Excel file using Pandas?


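I want to use Excel files to store data produced with Python. My problem is that I cannot add sheets to an existing Excel file. Here is some sample code that illustrates the issue: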

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
writer.close()  # close() also saves the file; a separate save() call is not needed
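This code saves two DataFrames to two sheets, named "x1" and "x2" respectively. If I create two new DataFrames and try to use the same code to add two new sheets, "x3" and "x4", the original data is lost.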

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df3.to_excel(writer, sheet_name = 'x3')
df4.to_excel(writer, sheet_name = 'x4')
writer.close()  # close() also saves the file; a separate save() call is not needed
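I want an Excel file with four sheets: "x1", "x2", "x3", "x4". I know that "xlsxwriter" is not the only "engine"; there is also "openpyxl". I have also seen that other people have already written about this issue, but I still do not understand how to do it.

Here is a piece of code taken from another post.

They say it works, but it is hard to figure out how. I do not understand what "ws.title", "ws" and "dict" are in this context.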


What is the best way to save "x1" and "x2", then close the file, open it again and add "x3" and "x4"?
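To make the problem concrete, here is a minimal sketch that reproduces the sheet loss. It assumes the openpyxl engine and a scratch file in a temporary directory (the path and the helper name write_sheets are made up for illustration, they are not from the question):

```python
import os
import tempfile

import numpy as np
import pandas as pd
from openpyxl import load_workbook

# Hypothetical scratch file; the question uses PhD_data.xlsx instead.
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")


def write_sheets(names):
    """Write one random 100x2 frame per sheet name, using the default write mode."""
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        for name in names:
            pd.DataFrame(np.random.randn(100, 2)).to_excel(writer, sheet_name=name)


write_sheets(["x1", "x2"])
first_run = load_workbook(path).sheetnames

# A new writer in the default mode starts from an empty workbook,
# so the file written above is replaced rather than extended.
write_sheets(["x3", "x4"])
second_run = load_workbook(path).sheetnames

print(first_run, second_run)
```

After the second call only "x3" and "x4" remain, which matches the behaviour described above.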

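In the example you shared, you are loading the existing file into book and setting the writer.book value to book. In the line writer.sheets = dict((ws.title, ws) for ws in book.worksheets) you are accessing each sheet in the workbook as ws. The sheet title is then ws.title, so you are creating a dictionary of {sheet_title: sheet} key-value pairs. This dictionary is then assigned to writer.sheets. Essentially, these steps just load the existing data from 'Masterfile.xlsx' and populate your writer with it.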
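The dictionary expression can be tried on its own, without pandas. This is only a sketch: the workbook is built in memory with openpyxl, and the sheet names "x1" and "x2" stand in for whatever the real file contains.

```python
from openpyxl import Workbook

# Build a small in-memory workbook standing in for the existing file.
book = Workbook()
book.active.title = "x1"   # rename the default sheet
book.create_sheet("x2")

# Same expression as in the snippet being discussed:
# map each sheet title to its worksheet object.
sheets = dict((ws.title, ws) for ws in book.worksheets)

print(list(sheets))                 # the sheet titles, in workbook order
print(sheets["x2"] is book["x2"])   # the values are the worksheet objects themselves
```

So ws is a worksheet object, ws.title is its name, and dict(...) simply turns the (title, worksheet) pairs into a lookup table.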

Now let's say you already have a file with x1 and x2 as sheets. You can load the file with the example code and then do something like this to add x3 and x4.

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"
# mode='a' opens the existing workbook instead of overwriting it
with pd.ExcelWriter(path, engine='openpyxl', mode='a') as writer:
    df3.to_excel(writer, sheet_name='x3', index=False)
    df4.to_excel(writer, sheet_name='x4', index=False)

This should do what you are looking for.

I would strongly recommend that you work directly with openpyxl.

This lets you concentrate on the relevant Excel and Pandas code.

Thank you. I believe a complete example could be useful for anyone else who has the same issue:

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)

writer = pd.ExcelWriter(path, engine = 'xlsxwriter')
df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
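writer.close()  # close() also saves the file; a separate save() call is not needed
Here I generate an Excel file. As far as I understand, it does not really matter whether it is generated via the "xlsxwriter" or the "openpyxl" engine.

When I want to write without losing the original data: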

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

# mode='a' opens the existing workbook and keeps its sheets.
# (On older pandas versions the same effect was obtained by assigning
# writer.book = load_workbook(path) before writing.)
with pd.ExcelWriter(path, engine='openpyxl', mode='a') as writer:
    df3.to_excel(writer, sheet_name='x3')
    df4.to_excel(writer, sheet_name='x4')

This code does the job.

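A simple example of writing multiple DataFrames to Excel at once, and of appending data to a new sheet of an Excel file that has already been written (and closed).

When you write to the Excel file for the first time (writing "df1" and "df2" to a "first sheet" and a "second sheet").

After closing the Excel file, if you want to "append" data to the same file but on another sheet, for example "df3" on a sheet named "third sheet".

Note that the Excel format cannot be xls; use the xlsx format instead.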

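You can read the existing sheets you are interested in (for example "x1" and "x2") into memory and "write" them back before adding more new sheets. Keep in mind that the sheets in the file and the sheets in memory are two different things: if you do not read them, they will be lost. This approach writes with "xlsxwriter" only; openpyxl is not used for writing.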

import pandas as pd
import numpy as np

path = r"C:\Users\fedel\Desktop\excelData\PhD_data.xlsx"

# begin <== read selected sheets and write them back
df1 = pd.read_excel(path, sheet_name='x1', index_col=0) # or sheet_name=0
df2 = pd.read_excel(path, sheet_name='x2', index_col=0) # or sheet_name=1
writer = pd.ExcelWriter(path, engine='xlsxwriter')
df1.to_excel(writer, sheet_name='x1')
df2.to_excel(writer, sheet_name='x2')
# end ==>

# now create more new sheets
x3 = np.random.randn(100, 2)
df3 = pd.DataFrame(x3)

x4 = np.random.randn(100, 2)
df4 = pd.DataFrame(x4)

df3.to_excel(writer, sheet_name='x3')
df4.to_excel(writer, sheet_name='x4')
writer.close()  # close() also saves the file; a separate save() call is not needed
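If the file has many sheets, listing them one by one gets tedious. A variation on the same idea, sketched here under the assumption that every sheet can be round-tripped through a DataFrame (cell formatting and formulas would not survive this), is to pass sheet_name=None so that read_excel returns all sheets as a dict. The scratch path and the 5x2 frames are made up for the example:

```python
import os
import tempfile

import numpy as np
import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")  # hypothetical scratch file

# Create the starting file with sheets x1 and x2.
with pd.ExcelWriter(path) as writer:
    pd.DataFrame(np.random.randn(5, 2)).to_excel(writer, sheet_name="x1")
    pd.DataFrame(np.random.randn(5, 2)).to_excel(writer, sheet_name="x2")

# sheet_name=None returns a dict {sheet name: DataFrame} holding every sheet.
existing = pd.read_excel(path, sheet_name=None, index_col=0)

new_frames = {
    "x3": pd.DataFrame(np.random.randn(5, 2)),
    "x4": pd.DataFrame(np.random.randn(5, 2)),
}

# Rewrite the whole file: the old sheets first, then the new ones.
with pd.ExcelWriter(path) as writer:
    for name, frame in {**existing, **new_frames}.items():
        frame.to_excel(writer, sheet_name=name)

result = pd.read_excel(path, sheet_name=None, index_col=0)
print(list(result))
```

No engine is passed here, so pandas picks whichever writer is installed; the read step needs an xlsx reader such as openpyxl to be available.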

Another fairly simple way to go about this is to write a function like this:

import logging
import os

import pandas as pd

def _write_frame_to_new_sheet(path_to_file=None, sheet_name='sheet', data_frame=None):
    # Append if the workbook already exists, otherwise create it.
    # (Older pandas versions did this by assigning writer.book = load_workbook(path_to_file).)
    mode = 'a' if os.path.exists(path_to_file) else 'w'
    if mode == 'w':
        logging.debug('Creating new workbook at %s', path_to_file)
    with pd.ExcelWriter(path_to_file, engine='openpyxl', mode=mode) as writer:
        data_frame.to_excel(writer, sheet_name=sheet_name, index=False)

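The idea here is to load the workbook at path_to_file if it exists and then append data_frame as a new sheet named sheet_name. If the workbook does not exist, it is created. It seems that neither openpyxl nor xlsxwriter can append in place, so, as in @Stefano's example above, you really have to load and then rewrite in order to append.

This can be done without ExcelWriter, using the tools in openpyxl. The new sheet can then also be styled using openpyxl.styles.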
import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

#Location of original excel sheet
fileLocation =r'C:\workspace\data.xlsx'

#Location of new file which can be the same as original file
writeLocation=r'C:\workspace\dataNew.xlsx'

data = {'Name':['Tom','Paul','Jeremy'],'Age':[32,43,34],'Salary':[20000,34000,32000]}

#The dataframe you want to add
df = pd.DataFrame(data)

#Load existing sheet as it is
book = load_workbook(fileLocation)
#create a new sheet
sheet = book.create_sheet("Sheet Name")

#Load dataframe into new sheet
for row in dataframe_to_rows(df, index=False, header=True):
    sheet.append(row)

#Save the modified excel at desired location    
book.save(writeLocation)

To create a new file:

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)
with pd.ExcelWriter('sample.xlsx') as writer:  
    df1.to_excel(writer, sheet_name='x1')
To append to the file, use the argument mode='a' in pd.ExcelWriter.

x2 = np.random.randn(100, 2)
df2 = pd.DataFrame(x2)
with pd.ExcelWriter('sample.xlsx', engine='openpyxl', mode='a') as writer:  
    df2.to_excel(writer, sheet_name='x2')
The default is mode='w'.

See the pandas documentation.
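One related detail when appending: if the sheet name already exists, recent pandas versions let you choose what happens through the if_sheet_exists argument of pd.ExcelWriter (as far as I know it needs pandas 1.3 or later, and without it an existing name raises an error). A small sketch, with a made-up scratch file and data:

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

path = os.path.join(tempfile.mkdtemp(), "sample.xlsx")  # hypothetical scratch file

# Start with a single sheet named x1.
pd.DataFrame({"v": [1, 2, 3]}).to_excel(path, sheet_name="x1", index=False)

# Append mode: replace x1 with new data and add x2 next to it.
with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                    if_sheet_exists="replace") as writer:
    pd.DataFrame({"v": [10, 20]}).to_excel(writer, sheet_name="x1", index=False)
    pd.DataFrame({"v": [7]}).to_excel(writer, sheet_name="x2", index=False)

names = load_workbook(path).sheetnames
x1 = pd.read_excel(path, sheet_name="x1")
print(names)
print(x1)
```

With "replace" the old x1 is dropped and rewritten; other accepted values include "error" and "new", which keeps the old sheet and writes the new data under a different name.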

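If you could add more "Pandas" examples along those lines, that would be very helpful.
I do not do much work with Pandas myself, so I cannot provide that many examples, but I would welcome improvements to the documentation.
I fail to see what this answer adds. In fact, repeated use of a context manager like this will involve more I/O.
Any idea why, when I try this, I get: ValueError: No Excel writer 'Sales Leads Calculations.xlsx'?
This is deleting the pre-existing sheets.
Yes, this adds the sheet to the Excel file without wiping out the pre-existing sheets. Thanks.
When saving the Excel file, how do I keep the formatting of the existing sheets?
If anyone reads this and wonders how to overwrite an existing sheet with the same name instead of having the new one renamed: add the line writer.sheets = dict((ws.title, ws) for ws in book.worksheets) after writer.book = book.
@Stefano Fedele, can you do the same update of an existing Excel file using "xlsxwriter" instead of "openpyxl"?
I do not see how this relates to the question, other than that it is about Excel.
I was struggling to find a complete solution for reading and writing an existing workbook and could not find one. Here I found a hint on how to write to an existing workbook, so I thought I would post a complete solution for my problem. I hope it is clear.
This is a nice solution, but I am not sure what it implies. Do you mean that you cannot do this with ExcelWriter, or that you just do not need to?
You can do this with ExcelWriter, but I find it easier with openpyxl.
Please do not post only code as an answer; also explain what your code does and how it solves the problem. Answers with an explanation are usually more helpful and of better quality.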
# This program reads an Excel workbook, extracts only the URL domain names,
# and writes them to a different sheet of the same workbook.
# Developer - Nilesh K
import pandas as pd

df = pd.read_excel("urlsearch_test.xlsx")

# You can use the below for the path.
# r"C:\Users\xyz\Desktop\Python\

domains = []  # domain names collected in the loop

# begin
# Loop over every row, locate 'http' in the text and cut out the domain. You can put your own logic here.
for index, row in df.iterrows():
    try:
        text = row['TEXT']  # string to search
        start = text.index('http')  # position where the URL starts
        # position of the third '/' of the URL, i.e. the first one after the domain
        end = text.index('/', text.index('/', start) + 2)
        domains.append(text[start:end])  # append the domain name to the list

    # Error handling: skip rows without a usable URL and continue.
    except ValueError:
        print('Error!')
print(domains)
domains = list(dict.fromkeys(domains))  # keep distinct values; comment this line out to keep all values
df1 = pd.DataFrame(domains, columns=['URL'])  # create a DataFrame from the list
# end

# Append with the openpyxl engine so the sheet is added to the same workbook.
# (Older pandas versions did this by assigning writer.book = load_workbook(...).)
with pd.ExcelWriter('urlsearch_test.xlsx', engine='openpyxl', mode='a') as writer:
    df1.to_excel(writer, sheet_name='Sheet3')

# The line below can be used to write to a different workbook instead
# df1.to_excel(r"C:\Users\xyz\Desktop\Python\urlsearch1_test.xlsx", index=False, sheet_name='sheet1')
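import pandas as pd

data_df = pd.DataFrame({'a': [1, 2, 3]})  # placeholder data, for illustration only

# The context manager saves and closes the file on exit.
with pd.ExcelWriter('test.xlsx', engine='openpyxl') as writer:
    data_df.to_excel(writer, sheet_name='sheet_name')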