Python pandas: combining datasets


python pandas


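I have 3 datasets that I am trying to combine with pandas.

The first dataset has repeated postcode index values, because the data frame contains several restaurants per postcode (I am trying to give these restaurants more demographic context).

The second and third look like this (essentially a postcode key paired with a single value, rather than one or two attributes):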

    postcode    burgers
    2640          38064

    postcode    soda
    3000       23715
    3002         854
    3003         780
    3004          35
    3006        3288

These have all been simplified.

When I use concat or merge with pandas, I get the following error:

ValueError: Plan shapes are not aligned

with this code:

result = pd.concat(frames,join='outer')

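How can I simply join these datasets into one? What mistake am I making?

Expected output, as requested in the comments: I basically want burgers and soda added to the data frame as one value per postcode.

Example: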

    postcode    pop growth    burgers    soda    address
    3793        3,577         123123     1231    AbyRoad
    3793        3,577         12351      5151    northst
    3971        26            6666       7777    northunder abby

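First, you need to make sure the postcode column is the (unique) index of each data frame.

Next, once every data frame is indexed by postcode, put them in a list called frames (a list of data frames) and use the following code: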

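dfList = [df1, df2, df3]
# Use postcode as the index of every frame so concat can align rows on it
frames = [df.set_index('postcode') for df in dfList]
# axis=1 places the frames side by side instead of stacking them
result = pd.concat(frames, axis=1)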
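
To make the column-wise concat concrete, here is a minimal sketch. The two frames are hypothetical stand-ins for the burgers and soda tables in the question (the second burgers row is made up so the frames overlap on one postcode), and the sketch assumes every frame has unique postcodes. The restaurant table does not, so it is left out here.

```python
import pandas as pd

# Hypothetical sample data modelled on the simplified tables in the question
burgers = pd.DataFrame({"postcode": [2640, 3000], "burgers": [38064, 500]})
soda = pd.DataFrame({"postcode": [3000, 3002], "soda": [23715, 854]})

# Index every frame by postcode so concat can align rows on that key
frames = [df.set_index("postcode") for df in (burgers, soda)]

# This approach assumes each postcode appears at most once per frame
assert all(frame.index.is_unique for frame in frames)

# axis=1 aligns on the index; postcodes missing from a frame become NaN
lookup = pd.concat(frames, axis=1)
print(lookup)
```

Because the default join is an outer join, every postcode from either frame is kept, and the value columns are likely to be upcast to float wherever NaN has to be stored.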

If that does not work, maybe try this:

from functools import reduce

# Reset the indexes so postcode is a regular column in every frame
frames = [df.reset_index() for df in dfList]
# Merge the frames pairwise on the shared postcode column
df_final = reduce(lambda left, right: pd.merge(left, right, on='postcode'), frames)
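
For the expected output in the question, where each restaurant row picks up the burgers and soda value for its postcode, a left merge may fit better than the default inner merge. The sketch below is only an illustration: the frame names and values are hypothetical, and `how="left"` is an addition that is not part of the code above.

```python
from functools import reduce

import pandas as pd

# Hypothetical restaurant table with repeated postcodes, as in the question
restaurants = pd.DataFrame({
    "postcode": [3793, 3793, 3971],
    "address": ["AbyRoad", "northst", "northunder abby"],
})
# Hypothetical lookup tables with exactly one row per postcode
burgers = pd.DataFrame({"postcode": [3793, 3971], "burgers": [123123, 6666]})
soda = pd.DataFrame({"postcode": [3793, 3971], "soda": [1231, 7777]})

# how="left" keeps every restaurant row, even if a lookup lacks its postcode
df_final = reduce(
    lambda left, right: pd.merge(left, right, on="postcode", how="left"),
    [restaurants, burgers, soda],
)
print(df_final)
```

Since the lookup tables have unique postcodes, the merge repeats each lookup value across all restaurants that share the postcode and does not add rows.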

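Next error: ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

This means that some of your 'postcode' columns have different data types. You can convert the column in each data frame before merging with

df['postcode'] = df['postcode'].astype(int)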
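
A small sketch of that dtype mismatch and the fix follows. The frames and values are hypothetical, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

# Hypothetical frames: one stores postcodes as strings, the other as integers
left = pd.DataFrame({"postcode": ["3000", "3002"], "soda": [23715, 854]})
right = pd.DataFrame({"postcode": [3000, 3002], "burgers": [10, 20]})

# Merging keys of mismatched types is expected to raise a ValueError
try:
    pd.merge(left, right, on="postcode")
except ValueError as exc:
    print("merge failed:", exc)

# Cast the key column to a common integer type in every frame, then merge
for df in (left, right):
    df["postcode"] = df["postcode"].astype(int)

merged = pd.merge(left, right, on="postcode")
print(merged)
```

Printing `df.dtypes` for each frame before merging is a quick way to spot which postcode column has the unexpected type.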
