Python: how to map file names to the columns extracted from multiple Excel files with pandas

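I am trying to extract all the columns from multiple Excel files and then map the file name to each extracted column, but I am stuck on "TypeError: Index does not support mutable operations".

Here are my two files: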

Fund_Data.xlsx:
    
    FUND ID  FUND NAME           AMOUNT   client code  Price description   Trade Date    Trade Datetime  
0   10101   Holdings company A  10000.5   1234        124.3   abcd        2020-08-19    2020-08-19 12:30:00 
1   20202   Holdings company B  -2000.5   192         -24.2   abcd        2020-08-20    2020-08-20 12:30:00 
2   30303   Holdings company C  3000.5    123         192     NaN         2020-08-21    2020-08-21 12:30:00 
3   10101   Holdings company A  10000     1234567     5.5     NaN         2020-08-22    2020-08-22 12:30:00
4   20202   Holdings company B  10000.5   9999        3.887   abcd        2020-08-23    2020-08-23 12:30:00
Here is the code I have so far:

import pandas as pd
import os

f = []
directory = 'C:/Users/rrai020/Documents/Python Scripts/DD'
for (dirpath, dirnames, filenames) in os.walk(directory):
    for x in filenames:
        if x.endswith('xlsx'):
            f.append(x)
#f = ['Fund_Data.xlsx', 'Stocks.xlsx'] created a list from filenames in directory ^^^

data = pd.DataFrame() # initialize empty df
for filename in f:
    df = pd.read_excel(filename, dtype = object, ignore_index=True).columns # read in each excel to df
    df['filename'] = filename # add a column with the filename
    data = data.append(df) # add all small df's to big df 

print(data)
I am trying to achieve the following output (or something similar):


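I would like the code to be flexible, so that it works for more than the 2 files I have here. Apologies if this is trivial, I am still learning.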

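The problem is the dataframe being appended. Calling .columns on the result of read_excel returns an Index rather than a DataFrame, and an Index is immutable, so assigning df['filename'] raises the TypeError. For each file in the loop we need to create a dataframe with Field Name and Filename columns, and then append it to data.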
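
To see where the TypeError comes from, here is a minimal sketch that reproduces it without any Excel files. The small in-memory dataframe is a made-up stand-in for one workbook; only the column labels matter.

```python
import pandas as pd

# Hypothetical stand-in for a dataframe returned by pd.read_excel
df = pd.DataFrame({'ID': [1], 'STOCK': ['abc'], 'VALUE': [10.5]})

cols = df.columns  # an Index of column labels, not a DataFrame

message = None
try:
    # Item assignment is not allowed on an Index
    cols['filename'] = 'Stocks.xlsx'
except TypeError as exc:
    message = str(exc)

print(message)
```

The assignment fails because it targets the Index itself. Building a new dataframe from the labels avoids mutating it.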

Here is one option:

data = pd.DataFrame()
for filename in f:
    # read each Excel file into a dataframe (only its column labels are used)
    df = pd.read_excel(filename, dtype=object)

    # create a dataframe with (Field Name, Filename) columns for the current file
    x = pd.DataFrame({'Field Name': df.columns, 'Filename': filename})

    # add it to the global dataframe; DataFrame.append was removed in pandas 2.0,
    # so pd.concat is used here, with ignore_index=True for a continuous index
    data = pd.concat([data, x], ignore_index=True)

data
Output:

          Field Name        Filename
0            FUND ID  Fund_Data.xlsx
1          FUND NAME  Fund_Data.xlsx
2             AMOUNT  Fund_Data.xlsx
3        client code  Fund_Data.xlsx
4  Price description  Fund_Data.xlsx
5         Trade Date  Fund_Data.xlsx
6     Trade Datetime  Fund_Data.xlsx
7                 ID     Stocks.xlsx
8              STOCK     Stocks.xlsx
9              VALUE     Stocks.xlsx
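
As a variation on the loop, the per-file pieces can be collected in a list and concatenated once at the end, which avoids rebuilding the combined dataframe on every iteration. The sketch below is only an illustration: the function name map_columns_to_files is hypothetical, and the two empty dataframes are in-memory stand-ins carrying a few of the column labels shown in the question, not the real workbooks.

```python
import pandas as pd


def map_columns_to_files(frames):
    """Return one row per (column label, file name) pair.

    `frames` maps a file name to the dataframe read from that file.
    """
    parts = [
        pd.DataFrame({'Field Name': df.columns, 'Filename': name})
        for name, df in frames.items()
    ]
    if not parts:
        # pd.concat raises ValueError when given an empty list
        return pd.DataFrame(columns=['Field Name', 'Filename'])
    # A single concat at the end; ignore_index gives a continuous 0..n-1 index
    return pd.concat(parts, ignore_index=True)


# In-memory stand-ins for the two workbooks (labels taken from the question)
frames = {
    'Fund_Data.xlsx': pd.DataFrame(columns=['FUND ID', 'FUND NAME', 'AMOUNT']),
    'Stocks.xlsx': pd.DataFrame(columns=['ID', 'STOCK', 'VALUE']),
}

result = map_columns_to_files(frames)
print(result)
```

With real files, the dictionary could be filled by reading each workbook, for example with pd.read_excel(path, nrows=0) so that only the header row is parsed. That part is not exercised here and assumes an Excel engine such as openpyxl is installed.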