Python: convert repeated rows into multiple columns with headers


Input DataFrame:

case    constant    number  code        
761e7   C20         3570    A   
761e7   C20         2780    A   
761e7   C20         7150    A   
761e7   C20         2950    A   
761e7   C20         3570    B   
761e7   C20         2780    B   
761e7   C20         7150    B   
761e7   C20         2950    B
761e7   C21         3000    A   
761e8   C20         3570    A   
761e8   C20         2780    A   
761e8   C20         7150    A   
761e8   C20         2950    A   
761e8   C14         3570    B   
761e8   C14         2780    B   
761e8   C14         7150    B
I am trying to convert the repeated number column into multiple columns, grouped by the other columns.

Using pivot gives me a ValueError, as shown:

df = final_df.pivot(index='case', columns='number')

ValueError: Index contains duplicate entries, cannot reshape
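
The error appears to come from the fact that several rows share the same (case, number) pair, so pivot cannot decide which row should fill a given cell. The following is a minimal sketch that reproduces it on a three-row sample; the small frame and the names `dupes` and `error_message` are made up for illustration and are not part of the original data pipeline.

```python
import pandas as pd

# Tiny illustrative sample: (761e7, 3570) occurs twice, once per code.
df = pd.DataFrame({
    "case": ["761e7", "761e7", "761e7"],
    "number": [3570, 2780, 3570],
    "code": ["A", "A", "B"],
})

# Flag every row whose (case, number) pair occurs more than once.
dupes = df.duplicated(subset=["case", "number"], keep=False)

# pivot needs each (index, columns) pair to be unique, so this raises.
try:
    df.pivot(index="case", columns="number")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(dupes.tolist())
print(error_message)
```

Checking `duplicated` on the intended index and column keys before calling pivot is one way to see which rows collide.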
Expected output:

case    constant    code    number1 number2 number3 number4 number5
761e7   C20         A       3570    2780    7150    2950    0
761e7   C21         A       0       0       0       0       3000
761e7   C20         B       3570    2780    7150    2950    0
761e8   C20         A       3570    2780    7150    2950    0
761e8   C14         B       3570    2780    7150    0       0

A more common approach is to use the values themselves as column names, with counts in the rows, for example:

df.pivot_table(index=['case', 'constant', 'code'],
               columns='number', aggfunc=len, fill_value=0).reset_index()
which yields

number          case constant code  2780  2950  3000  3570  7150
0       7.610000e+09      C20    A     1     1     0     1     1
1       7.610000e+09      C20    B     1     1     0     1     1
2       7.610000e+09      C21    A     0     0     1     0     0
3       7.610000e+10      C14    B     1     0     0     1     1
4       7.610000e+10      C20    A     1     1     0     1     1
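
A similar count table can also be built with `pd.crosstab`, which may be easier to run when no separate value column exists. This is only a sketch on a four-row subset of the question's data, with `case` kept as a string so it is not parsed as a float; the variable name `counts` is an assumption for illustration.

```python
import pandas as pd

# Four rows taken from the question's input, case stored as text.
df = pd.DataFrame({
    "case": ["761e7", "761e7", "761e7", "761e7"],
    "constant": ["C20", "C20", "C20", "C21"],
    "number": [3570, 2780, 3570, 3000],
    "code": ["A", "A", "B", "A"],
})

# Rows are the (case, constant, code) groups, columns are the distinct
# numbers, and each cell counts how often that number occurs in the group.
counts = pd.crosstab(
    [df["case"], df["constant"], df["code"]],
    df["number"],
).reset_index()

print(counts)
```

Missing combinations come out as 0 rather than NaN, since crosstab counts occurrences by default.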
IIUC, try:

g = df.groupby(['case','constant','code'])

df_out = df.set_index(['case','constant','code',g.cumcount()+1]).unstack(fill_value=0)
df_out.columns = [f'{i}{j}' for i, j in df_out.columns]
df_out.reset_index()
Output:

    case constant code  number1  number2  number3  number4
0  761e7      C20    A     3570     2780     7150     2950
1  761e7      C20    B     3570     2780     7150     2950
2  761e7      C21    A     3000        0        0        0
3  761e8      C14    B     3570     2780     7150        0
4  761e8      C20    A     3570     2780     7150     2950

Your 3000 number seems to be off.
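
For anyone who wants to run the cumcount approach end to end, here is a self-contained sketch that rebuilds the question's input and applies the same steps. The `rows` list and the choice to store `case` as a string are assumptions made for the example; the reshaping calls are the ones used in the answer.

```python
import pandas as pd

# The 16 input rows from the question: (case, constant, number, code).
rows = [
    ("761e7", "C20", 3570, "A"), ("761e7", "C20", 2780, "A"),
    ("761e7", "C20", 7150, "A"), ("761e7", "C20", 2950, "A"),
    ("761e7", "C20", 3570, "B"), ("761e7", "C20", 2780, "B"),
    ("761e7", "C20", 7150, "B"), ("761e7", "C20", 2950, "B"),
    ("761e7", "C21", 3000, "A"),
    ("761e8", "C20", 3570, "A"), ("761e8", "C20", 2780, "A"),
    ("761e8", "C20", 7150, "A"), ("761e8", "C20", 2950, "A"),
    ("761e8", "C14", 3570, "B"), ("761e8", "C14", 2780, "B"),
    ("761e8", "C14", 7150, "B"),
]
df = pd.DataFrame(rows, columns=["case", "constant", "number", "code"])

# Number each row within its (case, constant, code) group: 1, 2, 3, ...
g = df.groupby(["case", "constant", "code"])
position = g.cumcount() + 1

# Use the group keys plus the position as the index, then move the
# position level into the columns, filling missing slots with 0.
df_out = df.set_index(["case", "constant", "code", position]).unstack(fill_value=0)

# Flatten the ('number', 1) style column labels into 'number1', ...
df_out.columns = [f"{i}{j}" for i, j in df_out.columns]
df_out = df_out.reset_index()

print(df_out)
```

Because the position counter restarts in every group, the single 3000 value lands in number1, which is why this output has four number columns rather than the five shown in the expected output.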