Python: converting repeated rows into multiple columns with headers
Input DataFrame:
case constant number code
761e7 C20 3570 A
761e7 C20 2780 A
761e7 C20 7150 A
761e7 C20 2950 A
761e7 C20 3570 B
761e7 C20 2780 B
761e7 C20 7150 B
761e7 C20 2950 B
761e7 C21 3000 A
761e8 C20 3570 A
761e8 C20 2780 A
761e8 C20 7150 A
761e8 C20 2950 A
761e8 C14 3570 B
761e8 C14 2780 B
761e8 C14 7150 B
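I am trying to turn the repeated values in the number column into multiple columns, grouped by the other columns.
Using pivot gives me a ValueError, as shown: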
df = final_df.pivot(index='case', columns='number')
ValueError: Index contains duplicate entries, cannot reshape
Expected output:
case constant code number1 number2 number3 number4 number5
761e7 C20 A 3570 2780 7150 2950 0
761e7 C21 A 0 0 0 0 3000
761e7 C20 B 3570 2780 7150 2950 0
761e8 C20 A 3570 2780 7150 2950 0
761e8 C14 B 3570 2780 7150 0 0
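A more common approach is to use the number values as the column names, with the rows holding counts, for example:
df.pivot_table(index=['case', 'constant', 'code'],
               columns='number', aggfunc=len, fill_value=0).reset_index()
which yields: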
number case constant code 2780 2950 3000 3570 7150
0 7.610000e+09 C20 A 1 1 0 1 1
1 7.610000e+09 C20 B 1 1 0 1 1
2 7.610000e+09 C21 A 0 0 1 0 0
3 7.610000e+10 C14 B 1 0 0 1 1
4 7.610000e+10 C20 A 1 1 0 1 1
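The same kind of count table can also be sketched with pd.crosstab, which counts occurrences directly. This is a minimal sketch on a hand-picked subset of the question's rows, not the code that produced the output above; the name counts is illustrative, and case is kept as a string here so that values such as 761e7 are not read as floats.

```python
import pandas as pd

# Illustrative subset of the question's data (assumption: 'case' is a string).
df = pd.DataFrame({
    'case':     ['761e7', '761e7', '761e7', '761e7', '761e8'],
    'constant': ['C20', 'C20', 'C20', 'C21', 'C14'],
    'number':   [3570, 2780, 3570, 3000, 3570],
    'code':     ['A', 'A', 'B', 'A', 'B'],
})

# Count how often each number occurs per (case, constant, code) group.
# Combinations that never occur are filled with 0 by crosstab.
counts = pd.crosstab(
    index=[df['case'], df['constant'], df['code']],
    columns=df['number'],
).reset_index()

print(counts)
```

Each distinct number becomes its own column, so the table grows wider with every new value in the data.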
IIUC, try:
g = df.groupby(['case','constant','code'])
df_out = df.set_index(['case','constant','code',g.cumcount()+1]).unstack(fill_value=0)
df_out.columns = [f'{i}{j}' for i, j in df_out.columns]
df_out.reset_index()
Output:
case constant code number1 number2 number3 number4
0 761e7 C20 A 3570 2780 7150 2950
1 761e7 C20 B 3570 2780 7150 2950
2 761e7 C21 A 3000 0 0 0
3 761e8 C14 B 3570 2780 7150 0
4 761e8 C20 A 3570 2780 7150 2950
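Your 3000 value seems to be off: it is the only number in its (case, constant, code) group, so it lands in number1 rather than in a separate number5 column.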
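For reference, here is a self-contained sketch of the cumcount approach using the question's data. Loading through io.StringIO and forcing case to a string dtype are assumptions made for this example, so that 761e7 is not parsed as a float.

```python
import io
import pandas as pd

text = """case constant number code
761e7 C20 3570 A
761e7 C20 2780 A
761e7 C20 7150 A
761e7 C20 2950 A
761e7 C20 3570 B
761e7 C20 2780 B
761e7 C20 7150 B
761e7 C20 2950 B
761e7 C21 3000 A
761e8 C20 3570 A
761e8 C20 2780 A
761e8 C20 7150 A
761e8 C20 2950 A
761e8 C14 3570 B
761e8 C14 2780 B
761e8 C14 7150 B
"""

# Read whitespace-separated data; keep 'case' as text (assumption, see above).
df = pd.read_csv(io.StringIO(text), sep=r'\s+', dtype={'case': str})

# Number the rows within each group: 1, 2, 3, ...
g = df.groupby(['case', 'constant', 'code'])
position = g.cumcount() + 1

# Use the within-group position as an extra index level, then pivot it
# into columns; groups with fewer rows are padded with 0.
df_out = (
    df.set_index(['case', 'constant', 'code', position])
      .unstack(fill_value=0)
)

# Flatten the ('number', 1) style column labels into 'number1', 'number2', ...
df_out.columns = [f'{name}{pos}' for name, pos in df_out.columns]
df_out = df_out.reset_index()

print(df_out)
```

The within-group counter makes every index entry unique, which is what avoids the "Index contains duplicate entries" error from the plain pivot call.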