How do I merge data from multiple Excel files into a single Excel file using Python?
This is my code so far:
import glob
import pandas as pd
import numpy as np
import openpyxl
log = 'G:\Data\Hotels\hotel.txt' #text file with my long list of hotels
file = open(log, 'r')
hotels = []
line = file.readlines()
for a in line:
    hotels.append(a.rstrip('\n'))
for hotel in hotels :
    path = "G:\\Data\\Hotels\\"+hotel+"\\"+hotel+" - Meetings"
    file = hotel+"_Action_Log.xlsx"
    print(file)
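So far, all this code does is print the names of all the hotel files (as strings, I assume?). I now want to copy and paste the contents of those files into a single "master" Excel file. I only need one worksheet from each Excel file, and I don't need the headers (the first 4 rows are oddly formatted, so the headers sit on row 5).
What are my next steps? I'm new to Python.
Based on your description of the problem, I assume you mean opening and appending multiple files that share the same format and structure (that is, the same columns in the same order). In other words, you want to do the following:
Excel worksheet 1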
Col1 Col2
a b
Excel worksheet 2
Col1 Col2
c d
Merged (appended) Excel worksheet
Col1 Col2
a b
c d
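The append illustrated above can be sketched with two small in-memory DataFrames. The names sheet1, sheet2 and merged are hypothetical, and ignore_index=True is an optional choice that the answer's own code does not use.

```python
import pandas as pd

# Two small frames standing in for "Excel worksheet 1" and "Excel worksheet 2".
sheet1 = pd.DataFrame({"Col1": ["a"], "Col2": ["b"]})
sheet2 = pd.DataFrame({"Col1": ["c"], "Col2": ["d"]})

# pd.concat stacks the rows; ignore_index=True renumbers them 0..n-1
# instead of repeating each frame's own index.
merged = pd.concat([sheet1, sheet2], ignore_index=True)
print(merged)
```

Because both frames have the same columns in the same order, the result has those two columns and one row per input row.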
If my assumption about your question is correct, you can try the following:
import glob
import pandas as pd
import numpy as np
import openpyxl
# This is your code (the path is now a raw string so the backslashes are not read as escape sequences)
log = r'G:\Data\Hotels\hotel.txt' #text file with my long list of hotels
file = open(log, 'r')
hotels = []
line = file.readlines()
for a in line:
    hotels.append(a.rstrip('\n'))
# We'll use this list to keep track of all your filepaths
filepaths = []
# I merged your 'path' and 'file' vars into a single variable ('fp')
for hotel in hotels :
    # path = "G:\\Data\\Hotels\\"+hotel+"\\"+hotel+" - Meetings"
    # file = hotel+"_Action_Log.xlsx"
    fp = "G:\\Data\\Hotels\\"+hotel+"\\"+hotel+" - Meetings\\"+hotel+"_Action_Log.xlsx"
    # print(file)
    filepaths.append(fp)
# This list stores all of your worksheets (as dataframes)
worksheets = []
# Open all of your Excel worksheets as Pandas dataframes and store them in 'worksheets' to concatenate later
for filepath in filepaths:
    # You may need to adjust the `skiprows` parameter; right now it skips (does not read) the first row of each
    # Excel worksheet, and the next row that is read supplies the column names. With the headers on row 5, as in
    # the question, skiprows=4 would make that row the column names.
    df = pd.read_excel(filepath, skiprows=1)
    worksheets.append(df)
# Append all worksheets together
append = pd.concat(worksheets)
# Change 'header' to True if you want to write out column headers
append.to_excel('G:\\Data\\Hotels\\merged.xlsx', header=False)
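For the layout described in the question (one worksheet per file, column names on row 5), a minimal sketch might look like the following. The function name merge_action_logs and its parameters are hypothetical, and the defaults (first worksheet, header on row 5) are assumptions taken from the question's description, not something confirmed against the real files.

```python
import pandas as pd


def merge_action_logs(filepaths, header_row=4, sheet_name=0):
    """Read one worksheet from each workbook and stack the rows.

    header_row is 0-based, so 4 means "row 5 holds the column names";
    the rows above it are discarded. Both defaults are assumptions
    based on the question's description.
    """
    frames = []
    for filepath in filepaths:
        # sheet_name=0 selects the first worksheet in the workbook.
        df = pd.read_excel(filepath, sheet_name=sheet_name, header=header_row)
        frames.append(df)
    # ignore_index=True gives the combined rows one continuous index.
    return pd.concat(frames, ignore_index=True)


# Example call (hypothetical, left commented out):
# merged = merge_action_logs(filepaths)
# merged.to_excel("G:\\Data\\Hotels\\merged.xlsx", index=False, header=False)
```

Passing index=False to to_excel leaves out the row-number column that pandas would otherwise write as the first column of the output. Note that pd.concat raises a ValueError if the list of frames is empty.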
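You can learn more about the pd.concat() method in the pandas documentation.
It looks like you aren't merging the files. Here you are only printing each hotel's file name, but your goal is to merge the data from multiple Excel files into one file.
@Saurabhkukade Oh, right. I'm new to Python. How would I do that?
Hi, friend. I've updated my answer based on your feedback.
That certainly makes sense. But is there a way for me to loop through all the hotel files first and then append?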
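On the follow-up about looping over every hotel file first and appending afterwards, one possible sketch is to build and check all the paths in a first pass. The helper name collect_action_logs is hypothetical, and the folder layout is assumed to match the path pattern used in the question's code.

```python
from pathlib import Path


def collect_action_logs(root, hotels):
    """Return (found, missing) lists of the expected Action Log paths."""
    found, missing = [], []
    for hotel in hotels:
        # Assumed layout: <root>/<hotel>/<hotel> - Meetings/<hotel>_Action_Log.xlsx
        fp = Path(root) / hotel / f"{hotel} - Meetings" / f"{hotel}_Action_Log.xlsx"
        # Keep the two groups apart so missing workbooks can be reported
        # before any file is opened.
        if fp.is_file():
            found.append(fp)
        else:
            missing.append(fp)
    return found, missing


# Example call (hypothetical, left commented out):
# found, missing = collect_action_logs("G:\\Data\\Hotels", hotels)
```

The found list can then be read with pd.read_excel in a second loop and combined with pd.concat, while missing shows which hotels have no workbook at the expected location.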