Pandas: pivoting and rearranging a table with multiple labels in the same header
I have an xlsx file that contains tabs for multiple years of data. Each tab contains a table with many columns, structured as follows:
+-----------+-------+-------------------------+----------------------+
| City | State | Number of Drivers, 2019 | Number of Cars, 2019 |
+-----------+-------+-------------------------+----------------------+
| LA | CA | 123 | 10.0 |
| San Diego | CA | 456 | 2345 |
+-----------+-------+-------------------------+----------------------+
I would like to rearrange the table so that it looks like this, and do so for every tab in the xlsx file:
+-----------+-------+------+-------------------+---------------+
| City | State | Year | Measure Name | Measure Value |
+-----------+-------+------+-------------------+---------------+
| LA | CA | 2019 | Number of Drivers | 123 |
| San Diego | CA | 2019 | Number of Drivers | 456 |
| LA | CA | 2019 | Number of Cars | 10 |
| San Diego | CA | 2019 | Number of Cars | 2345 |
+-----------+-------+------+-------------------+---------------+
There are a lot of moving parts here, and getting the final format right is a bit tricky.
We can use melt, then join with str.split:
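# melt keeps City and State and stacks the remaining headers into 'variable'
s = df.melt(['City', 'State'])
# split each header on the comma into two new columns (0 and 1) and join them back
s = s.join(s.variable.str.split(',', expand=True))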
Out[120]:
       City State              variable   value                0     1
0        LA    CA  NumberofDrivers,2019   123.0  NumberofDrivers  2019
1  SanDiego    CA  NumberofDrivers,2019   456.0  NumberofDrivers  2019
2        LA    CA     NumberofCars,2019    10.0     NumberofCars  2019
3  SanDiego    CA     NumberofCars,2019  2345.0     NumberofCars  2019
# if you need to change the column names, add .rename(columns={}) at the end
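To get all the way to the layout requested in the question, the split columns can be given their final names and the original header dropped. The following is only a minimal sketch: it assumes every measure header follows the "Measure Name, Year" pattern, and the sample frame is rebuilt by hand from the table in the question rather than read from the real file.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "City": ["LA", "San Diego"],
    "State": ["CA", "CA"],
    "Number of Drivers, 2019": [123, 456],
    "Number of Cars, 2019": [10.0, 2345.0],
})

# Wide to long: one row per (City, State, original header).
long_df = df.melt(id_vars=["City", "State"], value_name="Measure Value")

# Split each header once on the comma; assumes "Measure Name, Year".
parts = long_df["variable"].str.split(",", n=1, expand=True)
long_df["Measure Name"] = parts[0].str.strip()
long_df["Year"] = parts[1].str.strip().astype(int)

# Keep only the columns from the target layout, in that order.
result = long_df[["City", "State", "Year", "Measure Name", "Measure Value"]]
print(result)
```

Stripping whitespace after the split matters here because the headers contain a space after the comma, which would otherwise end up at the front of the year.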
Here is how I applied Yoben's solution to every tab in the xlsx file, appended the results together, and wrote the full table to a .csv:
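import pandas as pd

# sheet_name=None loads every tab into a dict of {tab name: DataFrame}
sheets_dict = pd.read_excel(r'file.xlsx', sheet_name=None)
frames = []
for name, sheet in sheets_dict.items():
    sheet = sheet.melt(['City', 'State'])
    sheet = sheet.join(sheet.variable.str.split(',', expand=True))
    # tag rows after melting; a 'sheet' column added earlier would be melted too
    sheet['sheet'] = name
    frames.append(sheet)
# DataFrame.append was removed in pandas 2.0; pd.concat replaces it
full_table = pd.concat(frames, ignore_index=True)
full_table.to_csv('Full Table.csv')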
This works great, thanks a lot! Below I will add the code that performs the table rearrangement on all tabs in the xlsx file.
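The per-tab loop can also be wrapped in small functions, which makes it easy to check the logic without an Excel file on disk. This is a sketch under stated assumptions: `reshape_sheet` and `combine_sheets` are hypothetical helper names, the headers are assumed to follow the "Measure Name, Year" pattern, and the in-memory dict stands in for what `pd.read_excel(..., sheet_name=None)` returns (the 2020 numbers are made up for illustration).

```python
import pandas as pd

def reshape_sheet(sheet, id_cols=("City", "State")):
    """Melt one tab and split its "Measure Name, Year" headers (hypothetical helper)."""
    long_df = sheet.melt(id_vars=list(id_cols), value_name="Measure Value")
    parts = long_df["variable"].str.split(",", n=1, expand=True)
    long_df["Measure Name"] = parts[0].str.strip()
    long_df["Year"] = parts[1].str.strip().astype(int)
    return long_df[[*id_cols, "Year", "Measure Name", "Measure Value"]]

def combine_sheets(sheets):
    """Reshape every tab in a {tab name: DataFrame} dict and stack the results."""
    frames = [reshape_sheet(sheet).assign(sheet=name) for name, sheet in sheets.items()]
    return pd.concat(frames, ignore_index=True)

# Stand-in for pd.read_excel("file.xlsx", sheet_name=None); the 2020 values are made up.
sheets = {
    "2019": pd.DataFrame({"City": ["LA"], "State": ["CA"],
                          "Number of Drivers, 2019": [123], "Number of Cars, 2019": [10]}),
    "2020": pd.DataFrame({"City": ["LA"], "State": ["CA"],
                          "Number of Drivers, 2020": [130], "Number of Cars, 2020": [12]}),
}
full_table = combine_sheets(sheets)
print(full_table)
```

With a real workbook, the same call would be `combine_sheets(pd.read_excel("file.xlsx", sheet_name=None))`, followed by `to_csv` as in the loop version.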