Python 3.x: Merging on multiple columns with all Excel files in a directory
Tags: python-3.x, pandas, dataframe, merge

Suppose I have a DataFrame df and a directory ./ that contains the following Excel files:
import os

path = './'
# Walk the directory tree and print the path of every Excel file found
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(('.xls', '.xlsx')):
            print(os.path.join(root, file))
            # dfs.append(read_dfs(os.path.join(root, file)))
# df = reduce(lambda left, right: pd.concat([left, right], axis = 0), dfs)
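I want to merge df with all the files in path on the common columns date and city. The following code works, but it is not concise enough, so I am asking how to improve it. Thanks.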
df = pd.merge(df, df1, on = ['date', 'city'], how='left')
df = pd.merge(df, df2, on = ['date', 'city'], how='left')
df = pd.merge(df, df3, on = ['date', 'city'], how='left')
...
The following code may work:
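from functools import reduce
import pandas as pd

dfs = [df0, df1, df2, dfN]
# pd.merge defaults to an inner join; pass how='left' to reproduce the left joins in the question
df_final = reduce(lambda left, right: pd.merge(left, right, on=['date', 'city']), dfs)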
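To connect the reduce approach back to the directory walk in the question, here is a minimal sketch that may help. The helper names find_excel_files and merge_all are hypothetical (they do not appear in the original post), and the small DataFrames are made-up stand-ins for df and the Excel files. It assumes every file has the date and city columns and that the left joins from the question are what is wanted.

```python
import os
from functools import reduce

import pandas as pd


def find_excel_files(path):
    """Return the sorted paths of all .xls/.xlsx files under path (hypothetical helper)."""
    found = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            if name.endswith(('.xls', '.xlsx')):
                found.append(os.path.join(root, name))
    return sorted(found)


def merge_all(base, frames, keys=('date', 'city')):
    """Left-merge every frame in frames onto base, one after another (hypothetical helper)."""
    return reduce(
        lambda left, right: pd.merge(left, right, on=list(keys), how='left'),
        frames,
        base,  # initial value: the first merge is base with frames[0]
    )


# Made-up in-memory stand-ins for df and two of the Excel files.
df = pd.DataFrame({'date': ['2020-01', '2020-02'], 'city': ['bj', 'sh'], 'a': [1, 2]})
df1 = pd.DataFrame({'date': ['2020-01'], 'city': ['bj'], 'b': [10]})
df2 = pd.DataFrame({'date': ['2020-02'], 'city': ['sh'], 'c': [20]})

merged = merge_all(df, [df1, df2])
print(merged)

# With real files, the frames would come from disk instead, for example:
# frames = [pd.read_excel(p) for p in find_excel_files('./')]
# merged = merge_all(df, frames)
```

Passing df as the initial value of reduce keeps it as the left side of every merge, so its rows are preserved. If several files share a non-key column name, pd.merge appends the suffixes _x and _y to the clashing columns, so renaming those columns before merging may be worth considering.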