Python: Appending two MultiIndex DataFrames in pandas


I am working with the following simple setup:

In [94]: cc
Out[94]: 
                 d0         d1
class sample                    
5     66      0.128320  0.970817
      66      0.160488  0.969077
      77      0.919263  0.008597
6     77      0.811914  0.123960
      88      0.639887  0.262943
      88      0.312303  0.660786

In [101]: bb
Out[101]: 
                     d0         d1
class sample                    
2     22      0.730631  0.656266
      33      0.871292  0.942768
3     44      0.081831  0.714360
      55      0.600095  0.770108

In [102]: aa
Out[102]: 
                     d0         d1
class sample                    
0     00      0.190409  0.789750
      11      0.588001  0.250663
1     22      0.888343  0.428968
      33      0.185525  0.450020

I can run the following command:

In [103]: aa.append(bb)
Out[103]: 
                     d0         d1
class sample                    
0     00      0.190409  0.789750
      11      0.588001  0.250663
1     22      0.888343  0.428968
      33      0.185525  0.450020
2     22      0.730631  0.656266
      33      0.871292  0.942768
3     44      0.081831  0.714360
      55      0.600095  0.770108

Why can't I run the following command in the same way?

aa.append(cc)

[I get the following exception]

ValueError: all arrays must be same length

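Update: if I do not supply column names, it works fine. But if, for example, I have 4 columns, with 4x4 and 8x4 data and the names ['d0', 'd0', 'd1', 'd1'], it no longer works.

Here is code that reproduces the error: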

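import numpy as np
import pandas

# Two-level row index: outer 'class' labels, inner 'idx' labels
y1 = [['0','0','1','1'],['00','11','22','33']]
y2 = [['2','2','3','3','4','4'],['44','55','66','77','88','99']]
x1 = np.random.rand(4,4)
x2 = np.random.rand(6,4)
cols = ['d1']*2 + ['d2']*2  # duplicated column names
names = ['class','idx']
aa = pandas.DataFrame(x1,index=y1,columns = cols)
aa.index.names = names
print(aa)
bb = pandas.DataFrame(x2,index=y2,columns = cols)
bb.index.names = names
print(bb)

# On the asker's pandas version this raised the ValueError above;
# note that DataFrame.append was removed in pandas 2.0
aa.append(bb)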

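What should I do to make this run?

Thanks.

Answer to your edited question

To answer your edited question: the problem is that your column names contain duplicates: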

 cols = ['d1']*2 + ['d2']*2  # <-- this creates ['d1', 'd1', 'd2', 'd2']

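and DataFrame.append() (or the concat() function) can only append correctly when the column names are unique, at least in the pandas version used here.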
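A quick way to confirm that duplicated column names are the culprit is to inspect the column index before appending. The following is a minimal sketch, not part of the original answer; the positional suffix used to rename the columns is a hypothetical scheme chosen only for illustration.

```python
import numpy as np
import pandas as pd

cols = ['d1'] * 2 + ['d2'] * 2          # ['d1', 'd1', 'd2', 'd2']
df = pd.DataFrame(np.random.rand(4, 4), columns=cols)

# Index.is_unique is False as soon as any label repeats.
print(df.columns.is_unique)

# Index.duplicated() flags every repeat after the first occurrence.
dup_mask = df.columns.duplicated()
print(list(df.columns[dup_mask]))

# Hypothetical renaming scheme (an assumption for this sketch):
# suffix each label with its position so that all names become unique.
unique_cols = [f"{name}_{i}" for i, name in enumerate(df.columns)]
df.columns = unique_cols
print(df.columns.is_unique)
```

Any renaming that yields unique labels would do; the suffix is just one option.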

Try this; it runs without any error:

cols2 = ['d1', 'd2', 'd3', 'd4']

cc = pandas.DataFrame(x1, index=y1, columns=cols2)
cc.index.names = names

dd = pandas.DataFrame(x2, index=y2, columns=cols2)
dd.index.names = names

Now:

In [70]: cc.append(dd)
Out[70]: 
                 d1        d2        d3        d4
class idx                                        
0     00   0.805445  0.442059  0.296162  0.041271
      11   0.384600  0.723297  0.997918  0.006661
1     22   0.685997  0.794470  0.541922  0.326008
      33   0.117422  0.667745  0.662031  0.634429
2     44   0.465559  0.496039  0.044766  0.649145
      55   0.560626  0.684286  0.929473  0.607542
3     66   0.526605  0.836667  0.608098  0.159471
      77   0.216756  0.749625  0.096782  0.547273
4     88   0.619338  0.032676  0.218736  0.684045
      99   0.987934  0.349520  0.346036  0.926373
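On current pandas releases, DataFrame.append is no longer available (it was removed in pandas 2.0), so the same row-wise append would be written with pandas.concat. The sketch below assumes the same index labels and unique column names as above; building the row index with MultiIndex.from_arrays is a choice made for this sketch, not something the original answer did.

```python
import numpy as np
import pandas as pd

names = ['class', 'idx']
cols2 = ['d1', 'd2', 'd3', 'd4']

# Build the two-level row indexes explicitly.
idx1 = pd.MultiIndex.from_arrays(
    [['0', '0', '1', '1'], ['00', '11', '22', '33']], names=names)
idx2 = pd.MultiIndex.from_arrays(
    [['2', '2', '3', '3', '4', '4'], ['44', '55', '66', '77', '88', '99']],
    names=names)

cc = pd.DataFrame(np.random.rand(4, 4), index=idx1, columns=cols2)
dd = pd.DataFrame(np.random.rand(6, 4), index=idx2, columns=cols2)

# Stack dd below cc; both levels of the row MultiIndex are kept.
combined = pd.concat([cc, dd])
print(combined)
```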

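Which data type are you using: Series, DataFrame, Panel, etc.? Here are my complete IPython results; tell me what is different when you do this in your IPython shell.

You are right, I forgot to mention that I supplied the column names given in the updated version of the question. I actually need to use the same name for several columns. I think in that case I will also need a MultiIndex on the column level. Thanks.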

In [64]: bb
Out[64]: 
                 d1        d1        d2        d2
class idx                                        
2     44   0.465559  0.496039  0.044766  0.649145
      55   0.560626  0.684286  0.929473  0.607542
3     66   0.526605  0.836667  0.608098  0.159471
      77   0.216756  0.749625  0.096782  0.547273
4     88   0.619338  0.032676  0.218736  0.684045
      99   0.987934  0.349520  0.346036  0.926373
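The idea raised in the comments, keeping the repeated names by moving to a MultiIndex on the columns, might look like the sketch below. The second column level ('a'/'b') and the level names 'group' and 'rep' are hypothetical and were not given in the thread; they only serve to make each column label unique.

```python
import numpy as np
import pandas as pd

# Assumed second level ('a', 'b') disambiguates the repeated 'd1'/'d2' names.
columns = pd.MultiIndex.from_arrays(
    [['d1', 'd1', 'd2', 'd2'], ['a', 'b', 'a', 'b']],
    names=['group', 'rep'])

names = ['class', 'idx']
idx1 = pd.MultiIndex.from_arrays(
    [['0', '0', '1', '1'], ['00', '11', '22', '33']], names=names)
idx2 = pd.MultiIndex.from_arrays(
    [['2', '2', '3', '3', '4', '4'], ['44', '55', '66', '77', '88', '99']],
    names=names)

aa = pd.DataFrame(np.random.rand(4, 4), index=idx1, columns=columns)
bb = pd.DataFrame(np.random.rand(6, 4), index=idx2, columns=columns)

# Each (group, rep) pair is unique, so the frames stack row-wise.
combined = pd.concat([aa, bb])

# Selecting the outer label returns both columns grouped under 'd1'.
print(combined['d1'])
```

With this layout the shared name stays available for selection, while each full column label remains unique.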
