Python 3.x: Moving specific values in a pandas df based on a shared value


My current df:

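clinical #     date collected      name       result      submitter
123                3/2/2020       flu a       negative      hospital
123                3/2/2020       flu b       positive      hospital
123                3/2/2020       flu c       positive      hospital
123                3/2/2020       flu d       negative      hospital
567                7/7/1945       flu a       negative      hospital
567                7/7/1945       flu b       negative      hospital
567                7/7/1945       flu c       positive      hospital
567                7/7/1945       flu d       negative      hospital
989                8/8/1988       flu a       negative      hospice
989                8/8/1988       flu b       negative      hospice
989                8/8/1988       flu c       negative      hospice
989                8/8/1988       flu d       negative      hospice
989                8/8/1988       flu e       negative      hospice
989                8/8/1988       flu f       negative      hospice

My df has thousands of rows, and the row count is always changing. Each person is identified by a number in the first column; for example, Jane is 123. Jane was tested for flu a, flu b, flu c and flu d. I want to condense Jane's information into a single row. The only values that vary between her rows are in the "name" and "result" columns. Everything else is constant, so the repeated copies can be dropped. Some patients had more tests, such as patient 989, who had 6 flu tests instead of 4 like Jane. The same process needs to happen for them: the unique values (each flu type and its accompanying test result) should be moved into the same row.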

The ideal data frame would look like this:

12      3/2/2020   hospital flu a -  flu b +   flu c -  flu d -  
567     7/7/1977   hospital flu a +  flu b +   flu c -  flu d -  
989     8/8/1988   hospital flu a -  flu b +   flu c -  flu d -  flu e +  flu f +  
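Maybe there is a better way to do this, such as with keys or a dictionary? I would really appreciate any workable solution.

Thanks in advance for your advice :)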

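Try this: use map to convert the result words into + and - signs, then use groupby with agg and ' '.join to build one combined result text field per patient: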

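# Note: this appears to assume the table was read with whitespace splitting, so the
# header 'date collected' became two columns, 'date' and 'collected' (holding 'flu').
df['restxt'] = (df['collected'] + ' ' + 
                df['name'] + ' ' + 
                df['result'].map({'negative':'-', 'positive':'+'}))

# One row per patient: join that patient's test strings with single spaces.
df.groupby(['clinical #', 'date', 'submitter'], as_index=False)['restxt'].agg(' '.join)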
Output:

   clinical #      date submitter                                           restxt
0         123  3/2/2020  hospital                  flu a - flu b + flu c + flu d -
1         567  7/7/1945  hospital                  flu a - flu b - flu c + flu d -
2         989  8/8/1988   hospice  flu a - flu b - flu c - flu d - flu e - flu f -
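
For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It assumes the column names exactly as written in the question ('date collected' as a single column, with 'name' holding the full 'flu a' label), and it uses only a handful of the question's rows. The names signs, keys, joined and wide are made up for illustration. Option 2 is not part of the answer above: it swaps in set_index plus unstack to get one column per test, which may be closer to the "ideal" layout in the question.

```python
import pandas as pd

# A few rows copied from the question, keeping its original column names.
rows = [
    (123, "3/2/2020", "flu a", "negative", "hospital"),
    (123, "3/2/2020", "flu b", "positive", "hospital"),
    (123, "3/2/2020", "flu c", "positive", "hospital"),
    (123, "3/2/2020", "flu d", "negative", "hospital"),
    (989, "8/8/1988", "flu a", "negative", "hospice"),
    (989, "8/8/1988", "flu e", "negative", "hospice"),
]
df = pd.DataFrame(
    rows,
    columns=["clinical #", "date collected", "name", "result", "submitter"],
)

signs = {"negative": "-", "positive": "+"}            # hypothetical helper name
keys = ["clinical #", "date collected", "submitter"]  # columns constant per patient

# Option 1: one combined text field per patient (same idea as the answer above).
joined = (
    df.assign(restxt=df["name"] + " " + df["result"].map(signs))
      .groupby(keys, as_index=False)["restxt"]
      .agg(" ".join)
)

# Option 2 (a different technique): one column per test name.
# Tests that a patient did not take end up as NaN.
wide = (
    df.assign(sign=df["result"].map(signs))
      .set_index(keys + ["name"])["sign"]
      .unstack("name")
      .reset_index()
)

print(joined)
print(wide)
```

Option 1 copes with any number of tests per patient because everything lands in one string. Option 2 keeps each result in its own column, which is easier to filter on, but it requires each (patient, test name) pair to appear only once, otherwise unstack raises an error.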