Python: How to map file names to columns extracted from multiple Excel files using pandas
I am trying to extract all the columns from multiple Excel files and then map the file name to each extracted column, but I am struggling to get past the error "TypeError: Index does not support mutable operations". Here are my two files:
Fund_Data.xlsx:
   FUND ID           FUND NAME   AMOUNT  client code  Price  description  Trade Date       Trade Datetime
0    10101  Holdings company A  10000.5         1234  124.3         abcd  2020-08-19  2020-08-19 12:30:00
1    20202  Holdings company B  -2000.5          192  -24.2         abcd  2020-08-20  2020-08-20 12:30:00
2    30303  Holdings company C   3000.5          123    192          NaN  2020-08-21  2020-08-21 12:30:00
3    10101  Holdings company A    10000      1234567    5.5          NaN  2020-08-22  2020-08-22 12:30:00
4    20202  Holdings company B  10000.5         9999  3.887         abcd  2020-08-23  2020-08-23 12:30:00
Here is my code so far:
import os
import pandas as pd

f = []
directory = 'C:/Users/rrai020/Documents/Python Scripts/DD'
for (dirpath, dirnames, filenames) in os.walk(directory):
    for x in filenames:
        if x.endswith('xlsx'):
            f.append(x)
# f = ['Fund_Data.xlsx', 'Stocks.xlsx'] created a list from filenames in directory ^^^

data = pd.DataFrame()  # initialize empty df
for filename in f:
    # .columns returns the column labels (an Index), not a dataframe
    df = pd.read_excel(filename, dtype=object, ignore_index=True).columns
    df['filename'] = filename  # add a column with the filename (the TypeError is raised here)
    data = data.append(df)  # add all small df's to big df
print(data)
I am trying to achieve the following output (or something similar):
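I would like the code to be flexible so that it works for more than the two files I have here. Apologies if this is trivial, I am still learning.

The problem is with the dataframe being appended. Calling .columns returns an Index rather than a dataframe, and an Index cannot be modified by item assignment, which is what raises the TypeError. For each file in the loop, we need to create a dataframe with Field Name and Filename columns, and then append it to data.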
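To see where the error comes from, here is a minimal sketch that reproduces it without any Excel files. The small dataframe and the variable names are made up for illustration; only the assignment pattern matches the code in the question.

```python
import pandas as pd

# Hypothetical stand-in for one of the Excel files.
df = pd.DataFrame({"FUND ID": [10101], "AMOUNT": [10000.5]})

cols = df.columns  # a pandas Index holding the column labels

try:
    # Same pattern as in the question: item assignment on an Index.
    cols["filename"] = "Fund_Data.xlsx"
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Item assignment on the dataframe itself is allowed and adds a column.
df["filename"] = "Fund_Data.xlsx"
print(list(df.columns))
```

The first assignment fails because it targets the Index, while the second succeeds because it targets the dataframe.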
Here is one option:
frames = []
for filename in f:
    # read in each excel to df
    df = pd.read_excel(filename, dtype=object)
    # create a dataframe with (Field Name, Filename) columns for current file
    x = pd.DataFrame({'Field Name': df.columns, 'Filename': filename})
    frames.append(x)
# combine the per-file dataframes (DataFrame.append was removed in pandas 2.0)
data = pd.concat(frames, ignore_index=True)
data
Output:
          Field Name        Filename
0            FUND ID  Fund_Data.xlsx
1          FUND NAME  Fund_Data.xlsx
2             AMOUNT  Fund_Data.xlsx
3        client code  Fund_Data.xlsx
4  Price description  Fund_Data.xlsx
5         Trade Date  Fund_Data.xlsx
6     Trade Datetime  Fund_Data.xlsx
7                 ID    Stocks.xlsx
8              STOCK    Stocks.xlsx
9              VALUE    Stocks.xlsx
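Since the question asks for something that works with any number of files, the same idea can be wrapped in two small helpers. This is only a sketch under a few assumptions: the names build_field_map and read_headers are hypothetical, the .xlsx files are assumed to sit directly inside one folder, and an Excel engine such as openpyxl is assumed to be installed. Passing nrows=0 to pd.read_excel is meant to load just the header row, so the data rows never need to be parsed.

```python
from pathlib import Path

import pandas as pd


def build_field_map(columns_by_file):
    """Return one row per (column name, file name) pair.

    `columns_by_file` maps a file name to an iterable of column labels.
    """
    frames = [
        pd.DataFrame({"Field Name": list(columns), "Filename": filename})
        for filename, columns in columns_by_file.items()
    ]
    if not frames:
        # No files found: return an empty frame with the expected columns.
        return pd.DataFrame(columns=["Field Name", "Filename"])
    # ignore_index=True gives one continuous 0..n-1 index across all files.
    return pd.concat(frames, ignore_index=True)


def read_headers(directory):
    """Collect the column labels of every .xlsx file directly inside `directory`."""
    return {
        path.name: pd.read_excel(path, nrows=0).columns
        for path in sorted(Path(directory).glob("*.xlsx"))
    }


# Usage with in-memory column lists, so no Excel files are needed to try it.
example = build_field_map({
    "Fund_Data.xlsx": ["FUND ID", "FUND NAME"],
    "Stocks.xlsx": ["ID", "STOCK", "VALUE"],
})
print(example)
```

With real files, the call would be build_field_map(read_headers(directory)). Keeping the file reading separate from the table building makes the second part easy to check on its own.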