Python: transform a DataFrame, adding row values as column headers


I have a pandas DataFrame like this:

COMMIT_ID | FILE_NAME     | COMMITTER | CHANGE TYPE
-------------------------------------------------------------
  1       |  package.json | A         | MODIFY
  2       |  main.js      | B         | ADD
  2       |  class.java   | B         | DELETE
I want the row values of FILE_NAME to become column headers, with CHANGE TYPE as the values:

COMMIT_ID | package.json | main.js     | class.java     | COMMITTER
-----------------------------------------------------------------------------
  1       |  MODIFY      |  NONE       |  NONE          | A         
  2       |  NONE        |  ADD        |  DELETE        | B      
I tried using pandas.pivot_table, but without much success. Is there an easy way to do this?

I think you need to pivot the data and then reset the index:

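The solution with pivot_table needs an aggregate function, such as sum (joins strings without a separator) or '_'.join (joins strings with a separator), in case there are duplicates: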

print (df)
   COMMIT_ID     FILE_NAME COMMITTER CHANGE TYPE
0          1  package.json         A      MODIFY
1          2       main.js         B         ADD
2          2    class.java         B      DELETE
3          2    class.java         B         ADD


df = df.pivot_table(index=['COMMIT_ID','COMMITTER'], 
                    columns='FILE_NAME', 
                    values='CHANGE TYPE', 
                    aggfunc='sum').reset_index()
print (df)
FILE_NAME  COMMIT_ID COMMITTER class.java main.js package.json
0                  1         A       None    None       MODIFY
1                  2         B  DELETEADD     ADD         None
Or:

Aggregating with first also works, but duplicate values may be lost:

df = df.pivot_table(index=['COMMIT_ID','COMMITTER'], 
                    columns='FILE_NAME', 
                    values='CHANGE TYPE', 
                    aggfunc='first').reset_index()
print (df)
FILE_NAME  COMMIT_ID COMMITTER class.java main.js package.json
0                  1         A       None    None       MODIFY
1                  2         B     DELETE     ADD         None
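To make the difference between the aggregate functions easier to try out, here is a minimal self-contained sketch. It rebuilds the sample data shown above by hand (the construction of `df` and the helper name `pivot_changes` are assumptions for illustration, not part of the original answer) and compares '_'.join with first on the duplicated class.java row:

```python
import pandas as pd

# Sample data from the answer, including the duplicated (2, class.java) row.
df = pd.DataFrame({
    "COMMIT_ID": [1, 2, 2, 2],
    "FILE_NAME": ["package.json", "main.js", "class.java", "class.java"],
    "COMMITTER": ["A", "B", "B", "B"],
    "CHANGE TYPE": ["MODIFY", "ADD", "DELETE", "ADD"],
})


def pivot_changes(frame, aggfunc):
    """Hypothetical helper: turn FILE_NAME values into columns,
    aggregating CHANGE TYPE with the given function."""
    return frame.pivot_table(
        index=["COMMIT_ID", "COMMITTER"],
        columns="FILE_NAME",
        values="CHANGE TYPE",
        aggfunc=aggfunc,
    ).reset_index()


joined = pivot_changes(df, "_".join)  # keeps both duplicate values
first = pivot_changes(df, "first")    # keeps only the first duplicate value

# Compare the cell for COMMIT_ID 2 / class.java under both aggregations.
joined_cell = joined.loc[joined["COMMIT_ID"] == 2, "class.java"].iloc[0]
first_cell = first.loc[first["COMMIT_ID"] == 2, "class.java"].iloc[0]
print(joined_cell)
print(first_cell)
```

If every (COMMIT_ID, FILE_NAME) pair is known to be unique, the choice of aggregate function should not matter, since each group then holds a single value.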
With '_'.join as the aggregate function, duplicate values are joined with a separator:

df = df.pivot_table(index=['COMMIT_ID','COMMITTER'], 
                    columns='FILE_NAME', 
                    values='CHANGE TYPE', 
                    aggfunc='_'.join).reset_index()
print (df)
FILE_NAME  COMMIT_ID COMMITTER  class.java main.js package.json
0                  1         A        None    None       MODIFY
1                  2         B  DELETE_ADD     ADD         None

Finally, add rename_axis to remove the FILE_NAME label from the columns axis:

df = df.rename_axis(None, axis=1)
print (df)
   COMMIT_ID COMMITTER class.java main.js package.json
0          1         A       None    None       MODIFY
1          2         B  DELETEADD     ADD         None

I seriously suspect you are a pandas bot, @jezrael.
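For reference, the steps from the answer can be chained into one expression that comes close to the layout requested in the question. This is only a sketch under a few assumptions: the three-row input is rebuilt by hand, missing cells are filled with the string "NONE" via fillna (the question's desired output suggests this, but the answer does not show it), and the final column reordering is an added illustration:

```python
import pandas as pd

# The three rows from the question (no duplicates).
df = pd.DataFrame({
    "COMMIT_ID": [1, 2, 2],
    "FILE_NAME": ["package.json", "main.js", "class.java"],
    "COMMITTER": ["A", "B", "B"],
    "CHANGE TYPE": ["MODIFY", "ADD", "DELETE"],
})

wide = (
    df.pivot_table(index=["COMMIT_ID", "COMMITTER"],
                   columns="FILE_NAME",
                   values="CHANGE TYPE",
                   aggfunc="first")
      .fillna("NONE")             # assumption: mark missing files as "NONE"
      .reset_index()              # turn COMMIT_ID / COMMITTER back into columns
      .rename_axis(None, axis=1)  # drop the FILE_NAME label on the columns axis
)

# Assumption: order the file columns by first appearance and put COMMITTER last,
# mirroring the layout sketched in the question.
file_cols = list(df["FILE_NAME"].unique())
wide = wide[["COMMIT_ID", *file_cols, "COMMITTER"]]
print(wide)
```

Using first here is safe only because the input has no duplicate (COMMIT_ID, FILE_NAME) pairs; with duplicates, an aggregate such as '_'.join would be needed to avoid dropping values.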