Python 循环浏览多个Excel文件,并将每个文件中的数据写入新的数据框
我试图循环浏览文件夹中的多个excel文件,写入新的数据框,然后按列名筛选。它应该只在第一列中显示属性地址,然后在标题下方显示其他三列,并带有相应的美元值。问题是每个文件可能没有相同的列,或者它们是重复的。这是我的密码:Python 循环浏览多个Excel文件,并将每个文件中的数据写入新的数据框,python,pandas,Python,Pandas,我试图循环浏览文件夹中的多个excel文件,写入新的数据框,然后按列名筛选。它应该只在第一列中显示属性地址,然后在标题下方显示其他三列,并带有相应的美元值。问题是每个文件可能没有相同的列,或者它们是重复的。这是我的密码: read_files = glob.glob('C:/Users/watso/Desktop/yardi_testing_desktop/yardi_downloads/*.xlsx') files_xlsx = [f for f in read_files if f[-4:]
read_files = glob.glob('C:/Users/watso/Desktop/yardi_testing_desktop/yardi_downloads/*.xlsx')
files_xlsx = [f for f in read_files if f[-4:] == "xlsx"]
all_data = pd.DataFrame()
for f in files_xlsx:
df = pd.read_excel(f)
df = df.transpose()
new_header = df.iloc[0] # grab the first row for the header
df = df[1:] # take the data less the header row
df.columns = new_header # set the header row as the df header
df.columns = df.columns.astype(str).str.strip()
df = df.iloc[:2, 3:]
all_data = all_data.append(df, sort=True)
all_data.to_csv('C:/Users/watso/PycharmProjects/yardi_noi/yyy.csv', index=False, header=0)
以下是它创建的数据帧:
read_files = glob.glob('C:/Users/watso/Desktop/yardi_testing_desktop/yardi_downloads/*.xlsx')
files_xlsx = [f for f in read_files if f[-4:] == "xlsx"]
all_data = pd.DataFrame()
for f in files_xlsx:
df = pd.read_excel(f)
df = df.transpose()
new_header = df.iloc[0] # grab the first row for the header
df = df[1:] # take the data less the header row
df.columns = new_header # set the header row as the df header
df.columns = df.columns.astype(str).str.strip()
df = df.iloc[:2, 3:]
all_data = all_data.append(df, sort=True)
all_data.to_csv('C:/Users/watso/PycharmProjects/yardi_noi/yyy.csv', index=False, header=0)
----------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| | | ASSETS | | RENTAL REVENUES | | RENTAL INCOME | Rental Inc: Base Rent | Rental Inc: Telecom Transmit | Total Rental Revenue | | TENANT FUNDED IMPROVEMENTS | Tenant Funded Improvements | Total Tenant Funded Improvements | | STRAIGHT LINE RENT | Straight Line Rent |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| 64 Fake Street - Stony Village(6546-01) | Actual | | | | | | 208598.27 | 2519.7 | 211117.97 | | | 4963.73 | 4963.73 | | | -18726.86 |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| | | ASSETS | | RENTAL REVENUES | | PROPERTY OPERATING EXPENSES | | REIMBURSABLE EXPENSES | | OTHER (INCOME) / EXPENSE | | OTHER (INCOME) | | | | |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| 234 Fake Street - Santa Wilshire(6668-01) | Actual | | | | | | | | | | | | | | | |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| | | ASSETS | | RENTAL REVENUES | | RENTAL INCOME | Rental Inc: Base Rent | Rental Inc: Incidental | Total Rental Revenue | | Total Rental Revenues | | OTHER INCOME | Oth Inc: Miscellaneous | Total Other Income | |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| 255 Fake Street - Minneapolis(9879-02) | Actual | | | | | | -12490.27 | -1060 | -13550.27 | | -13550.27 | | | 100 | 100 | |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| | | ASSETS | | RENTAL REVENUES | | RENTAL INCOME | Rental Inc: Base Rent | Total Rental Revenue | | TENANT FUNDED IMPROVEMENTS | Tenant Funded Improvements | Total Tenant Funded Improvements | | STRAIGHT LINE RENT | Straight Line Rent | Total Straight-Line Rent |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
| 543 Fake Street - Seattle(6684-01) | Actual | | | | | | 339802.38 | 339802.38 | | | 39841.64 | 39841.64 | | | 1645.96 | 1645.96 |
+-------------------------------------------+--------+--------+--+-----------------+--+-----------------------------+------------------------+-------------------------------+----------------------+----------------------------+----------------------------+----------------------------------+----------------------------------+-------------------------+--------------------+--------------------------+
问题是,当我尝试按loc()选择列时,出现一个错误,指出我选择的列不在索引中。以下是我希望表格的外观:
read_files = glob.glob('C:/Users/watso/Desktop/yardi_testing_desktop/yardi_downloads/*.xlsx')
files_xlsx = [f for f in read_files if f[-4:] == "xlsx"]
all_data = pd.DataFrame()
for f in files_xlsx:
df = pd.read_excel(f)
df = df.transpose()
new_header = df.iloc[0] # grab the first row for the header
df = df[1:] # take the data less the header row
df.columns = new_header # set the header row as the df header
df.columns = df.columns.astype(str).str.strip()
df = df.iloc[:2, 3:]
all_data = all_data.append(df, sort=True)
all_data.to_csv('C:/Users/watso/PycharmProjects/yardi_noi/yyy.csv', index=False, header=0)
+-------------------------------------------+------------------------+----------------------+----------------------------+
| | Rental Inc: Base Rent | Total Rental Revenue | Tenant Funded Improvements |
+-------------------------------------------+------------------------+----------------------+----------------------------+
| 64 Fake Street - Stony Village(6546-01) | 208598.27 | 211117.97 | 4963.73 |
+-------------------------------------------+------------------------+----------------------+----------------------------+
| 234 Fake Street - Santa Wilshire(6668-01) | Nan | Nan | Nan |
+-------------------------------------------+------------------------+----------------------+----------------------------+
| 255 Fake Street - Minneapolis(9879-02) | -12490.27 | -13550.27 | Nan |
+-------------------------------------------+------------------------+----------------------+----------------------------+
| 543 Fake Street - Seattle(6684-01) | 339802.38 | Nan | 39841.64 |
+-------------------------------------------+------------------------+----------------------+----------------------------+
在我寻求帮助之前,我已经试着弄清楚这一点将近两周了。如果你需要更多信息,请告诉我。我非常感谢您的帮助 如果不知道excel文件的外观以及您得到的确切输出,就有点难以提供帮助。实际收入、资产和租金收入是多少?它们是值还是代码读取字符串?您对.loc[]使用了哪种输入?将.to_csv命令移出循环,除非您确实计划在每个循环期间保存csv。您好,欢迎:))请在上参考此命令,并相应地修改您的问题,以便社区中的人可以轻松地帮助您。
read_files = glob.glob('C:/Users/watso/Desktop/yardi_testing_desktop/yardi_downloads/*.xlsx')
files_xlsx = [f for f in read_files if f[-4:] == "xlsx"]
all_data = pd.DataFrame()
for f in files_xlsx:
df = pd.read_excel(f)
df = df.transpose()
new_header = df.iloc[0] # grab the first row for the header
df = df[1:] # take the data less the header row
df.columns = new_header # set the header row as the df header
df.columns = df.columns.astype(str).str.strip()
df = df.iloc[:2, 3:]
all_data = all_data.append(df, sort=True)
all_data.to_csv('C:/Users/watso/PycharmProjects/yardi_noi/yyy.csv', index=False, header=0)