Python: creating additional columns for repeated labels


I have the following DataFrame:

p_id    c_id    region  account state
A0001   100     east    CA1000  VA
A0001   100     east    CA1001  GA
A0001   100     west    CA1002  CA
A0001   101     west    CA1004  TX
A0001   101     west    CA1007  GA
A0002   108     east    CA1013  MI
A0002   109     east    CA1018  WA
A0002   109     east    CA1020  AZ
For each combination of p_id + c_id + region, I want to store every unique account and state in its own column, so the output looks like this:

p_id    c_id    region  account_1   state_1 account_2   state_2 account_3   state_3 account_4   state_4
A0001   100     east    CA1000      VA      CA1001      GA              
A0001   100     west    CA1002      CA                      
A0001   101     west    CA1004      TX      CA1007      GA              
A0002   108     east    CA1013      MI                      
A0002   109     east    CA1018      WA      CA1020      AZ              
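Notes:

  • I need 8 extra columns in the output (account_1 to account_4 and state_1 to state_4), even if every combination of p_id, c_id and region in the whole DataFrame has fewer than 4 distinct accounts and states
  • Where there is no value, the new column should be left empty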

  • You can combine several pandas operations:

    Output

        p_id  c_id region account_1  ... account_3 state_3 account_4  state_4
    0  A0001   100   east    CA1000  ...       NaN     NaN       NaN      NaN
    1  A0001   100   west    CA1002  ...       NaN     NaN       NaN      NaN
    2  A0001   101   west    CA1004  ...       NaN     NaN       NaN      NaN
    3  A0002   108   east    CA1013  ...       NaN     NaN       NaN      NaN
    4  A0002   109   east    CA1018  ...       NaN     NaN       NaN      NaN
    
    [5 rows x 11 columns]
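
The code for this answer is not shown here. As a minimal sketch of one way to get output of this shape, assuming the combination is `drop_duplicates`, `groupby(...).cumcount()`, `unstack` and `reindex` (this may differ from what the answer actually used; the names `keys`, `max_slots`, `slot` and `wanted` are introduced only for illustration):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "p_id": ["A0001"] * 5 + ["A0002"] * 3,
    "c_id": [100, 100, 100, 101, 101, 108, 109, 109],
    "region": ["east", "east", "west", "west", "west", "east", "east", "east"],
    "account": ["CA1000", "CA1001", "CA1002", "CA1004",
                "CA1007", "CA1013", "CA1018", "CA1020"],
    "state": ["VA", "GA", "CA", "TX", "GA", "MI", "WA", "AZ"],
})

keys = ["p_id", "c_id", "region"]  # grouping columns
max_slots = 4                      # assumption: always emit 4 account/state pairs

# Keep each (account, state) pair once per group, then number the
# remaining rows 1, 2, 3, ... within each group.
unique_rows = df.drop_duplicates(subset=keys + ["account", "state"]).copy()
unique_rows["slot"] = unique_rows.groupby(keys).cumcount() + 1

# Move the slot number from the index into the columns, giving a
# two-level column index of (field, slot).
wide = (
    unique_rows.set_index(keys + ["slot"])[["account", "state"]]
    .unstack("slot")
)

# Force all 8 columns to exist, in account_1, state_1, account_2, ... order.
# Note: a group with more than max_slots pairs would lose the extra pairs here.
wanted = pd.MultiIndex.from_tuples(
    [(field, slot)
     for slot in range(1, max_slots + 1)
     for field in ("account", "state")]
)
wide = wide.reindex(columns=wanted)

# Flatten (field, slot) into names such as "account_1".
wide.columns = [f"{field}_{slot}" for field, slot in wide.columns]
result = wide.reset_index()

print(result)
```

Missing values come out as NaN, as in the output above. If empty strings are preferred, `result.fillna("")` could be applied as a last step.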
    

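    This looks like a homework question. Could you post some sample code showing how you got started?
    Agreed, there are too many requests in one question, and there is no simple script showing what you have tried…
    Nice answer, +1 :-)
    Thank you for your answer. However, I get an error in the transformation code: ValueError: No objects to concatenate
    @Larry Hi, I ran the code with the example DataFrame you provided and it works. Without exactly the same data as yours, it is hard to reproduce the error. Could you try to narrow it down and see what kind of data produces that error?
    @DaniMesejo Thank you very much. I figured it out :)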