Outer join of three DataFrames in pandas not working

Tags: python, python-3.x, pandas

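The goal of the code below is to perform a full outer join across three DataFrames. Every record from every DataFrame should be printed, and records that are related across two or three of them should appear on the same row.

The fields used to relate the DataFrames are type_1 and id_1 in the first DataFrame, type_2 and id_2 in the second, and type_3 and id_3 in the third.

The problem is that the relationship between the second and third DataFrames is not picked up. Look at rows 11 and 13 of the result: they should be a single row, because type_2 = type_3 and id_2 = id_3. The expected output for row 11 is

11 NaN NaN NaN 7.0 8 KoKo 7.0 8 Kuku

and row 13 should not be printed. How can I fix this?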

import pandas as pd
raw_data = {
        'type_1': [0, 1, 1, 2, 2],
        'id_1': ['3', '4', '5', '3', '3'],
        'name_1': ['Alex', 'Amy', 'Allen', 'Peter', 'Liz']}
df_a = pd.DataFrame(raw_data, columns = ['type_1', 'id_1', 'name_1' ])

raw_datab = {
        'type_2': [1, 1, 1, 0,7],
        'id_2': ['4', '5', '5', '7', '8'],
        'name_2': ['Billy', 'Brian', 'Joe', 'Bryce', 'KoKo']}
df_b = pd.DataFrame(raw_datab, columns = ['type_2', 'id_2', 'name_2'])

raw_datac = {
        'type_3': [1, 1, 1, 1, 2, 2, 7],
        'id_3': ['4', '6', '5', '5', '3', '3','8'],
        'name_3': ['School', 'White', 'Jane', 'Homer', 'Paul', 'Lorel', 'Kuku']}
df_c = pd.DataFrame(raw_datac, columns = ['type_3', 'id_3', 'name_3'])

merged = df_a
merged = merged.merge(df_b, how='outer', left_on=['type_1', 'id_1'],
                      right_on=['type_2', 'id_2'])
merged = merged.merge(df_c, how='outer', left_on=['type_1', 'id_1'], 
                      right_on=['type_3', 'id_3'])

print(merged)
Result:

    type_1 id_1 name_1  type_2 id_2 name_2  type_3 id_3  name_3
0      0.0    3   Alex     NaN  NaN    NaN     NaN  NaN     NaN
1      1.0    4    Amy     1.0    4  Billy     1.0    4  School
2      1.0    5  Allen     1.0    5  Brian     1.0    5    Jane
3      1.0    5  Allen     1.0    5  Brian     1.0    5   Homer
4      1.0    5  Allen     1.0    5    Joe     1.0    5    Jane
5      1.0    5  Allen     1.0    5    Joe     1.0    5   Homer
6      2.0    3  Peter     NaN  NaN    NaN     2.0    3    Paul
7      2.0    3  Peter     NaN  NaN    NaN     2.0    3   Lorel
8      2.0    3    Liz     NaN  NaN    NaN     2.0    3    Paul
9      2.0    3    Liz     NaN  NaN    NaN     2.0    3   Lorel
10     NaN  NaN    NaN     0.0    7  Bryce     NaN  NaN     NaN
11     NaN  NaN    NaN     7.0    8   KoKo     NaN  NaN     NaN
12     NaN  NaN    NaN     NaN  NaN    NaN     1.0    6   White
13     NaN  NaN    NaN     NaN  NaN    NaN     7.0    8    Kuku
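
To see why rows 11 and 13 stay apart, it may help to inspect the intermediate frame after the first merge. The sketch below reuses the question's data for df_a and df_b; the names first and only_b are introduced here purely for illustration. Rows that exist only in df_b have no type_1/id_1 values, so a second merge keyed on left_on=['type_1', 'id_1'] has no (7, '8') key on its left side to pair with Kuku.

```python
import pandas as pd

df_a = pd.DataFrame({'type_1': [0, 1, 1, 2, 2],
                     'id_1': ['3', '4', '5', '3', '3'],
                     'name_1': ['Alex', 'Amy', 'Allen', 'Peter', 'Liz']})
df_b = pd.DataFrame({'type_2': [1, 1, 1, 0, 7],
                     'id_2': ['4', '5', '5', '7', '8'],
                     'name_2': ['Billy', 'Brian', 'Joe', 'Bryce', 'KoKo']})

# Same first merge as in the question.
first = df_a.merge(df_b, how='outer',
                   left_on=['type_1', 'id_1'],
                   right_on=['type_2', 'id_2'])

# Rows that came only from df_b: their type_1/id_1 cells are empty,
# so they cannot act as a left key for the merge with df_c.
only_b = first[first['name_1'].isna()]
print(only_b[['type_1', 'id_1', 'type_2', 'id_2', 'name_2']])
```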

You need to create shared key columns before calling merge:

df_a[['key1','key2']]=df_a[['type_1', 'id_1']]
df_b[['key1','key2']]=df_b[['type_2', 'id_2']]
df_c[['key1','key2']]=df_c[['type_3', 'id_3']]


merged = df_a
merged = merged.merge(df_b, how='outer')
merged = merged.merge(df_c, how='outer')
merged.drop(columns=['key1', 'key2'])
Out[81]: 
    type_1 id_1 name_1  type_2 id_2 name_2  type_3 id_3  name_3
0      0.0    3   Alex     NaN  NaN    NaN     NaN  NaN     NaN
1      1.0    4    Amy     1.0    4  Billy     1.0    4  School
2      1.0    5  Allen     1.0    5  Brian     1.0    5    Jane
3      1.0    5  Allen     1.0    5  Brian     1.0    5   Homer
4      1.0    5  Allen     1.0    5    Joe     1.0    5    Jane
5      1.0    5  Allen     1.0    5    Joe     1.0    5   Homer
6      2.0    3  Peter     NaN  NaN    NaN     2.0    3    Paul
7      2.0    3  Peter     NaN  NaN    NaN     2.0    3   Lorel
8      2.0    3    Liz     NaN  NaN    NaN     2.0    3    Paul
9      2.0    3    Liz     NaN  NaN    NaN     2.0    3   Lorel
10     NaN  NaN    NaN     0.0    7  Bryce     NaN  NaN     NaN
11     NaN  NaN    NaN     7.0    8   KoKo     7.0    8    Kuku
12     NaN  NaN    NaN     NaN  NaN    NaN     1.0    6   White
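
For reference, here is a self-contained sketch of the same shared-key idea that can be run as is. It is one possible way to write it, not necessarily how the answer's author would: with_keys is a hypothetical helper introduced here, and functools.reduce is used only to chain the merges. It copies each frame's join columns to key1/key2 without modifying the original frames, merges on those keys, and then drops them.

```python
from functools import reduce

import pandas as pd

df_a = pd.DataFrame({'type_1': [0, 1, 1, 2, 2],
                     'id_1': ['3', '4', '5', '3', '3'],
                     'name_1': ['Alex', 'Amy', 'Allen', 'Peter', 'Liz']})
df_b = pd.DataFrame({'type_2': [1, 1, 1, 0, 7],
                     'id_2': ['4', '5', '5', '7', '8'],
                     'name_2': ['Billy', 'Brian', 'Joe', 'Bryce', 'KoKo']})
df_c = pd.DataFrame({'type_3': [1, 1, 1, 1, 2, 2, 7],
                     'id_3': ['4', '6', '5', '5', '3', '3', '8'],
                     'name_3': ['School', 'White', 'Jane', 'Homer',
                                'Paul', 'Lorel', 'Kuku']})


def with_keys(df, type_col, id_col):
    """Hypothetical helper: return a copy with the join columns
    duplicated under the shared names key1/key2."""
    return df.assign(key1=df[type_col], key2=df[id_col])


frames = [with_keys(df_a, 'type_1', 'id_1'),
          with_keys(df_b, 'type_2', 'id_2'),
          with_keys(df_c, 'type_3', 'id_3')]

# Outer-merge the frames pairwise on the shared keys, then drop the keys.
merged = reduce(
    lambda left, right: left.merge(right, how='outer', on=['key1', 'key2']),
    frames,
)
merged = merged.drop(columns=['key1', 'key2'])

# KoKo (df_b) and Kuku (df_c) should now share a row.
koko_row = merged[merged['name_2'] == 'KoKo']
print(koko_row)
```

Because the key columns are shared, the outer merge keeps them filled in for every row, including rows that came only from df_b, which is what lets the last merge pair KoKo with Kuku.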

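Comments:

Rows 11 and 13 have different type_2 and id_2 values; they are already in the merged DataFrame after the first merge. Their type_3 and id_3 values differ too. Why is this result not acceptable for your task?

For rows 11 and 13, type_2 = type_3 and id_2 = id_3, which is why they should be on one row, but no merge command enforces that.

Are you sure about left_on=['type_1', 'id_1']?

I'm not sure, that may be the problem. But how do I tell pandas to use either type_1/id_1 or type_2/id_2 from the result of the first merge when merging with the third DataFrame? It should work like a SQL full outer join of three tables.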
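
One way to express "use type_1/id_1 or type_2/id_2" for the second merge is to coalesce the two key pairs after the first merge, much like COALESCE in a SQL join condition. The sketch below is an assumption about what the last comment is asking for, not something stated in the thread; key_type and key_id are temporary column names introduced for illustration, and Series.combine_first fills the gaps in the first pair from the second.

```python
import pandas as pd

df_a = pd.DataFrame({'type_1': [0, 1, 1, 2, 2],
                     'id_1': ['3', '4', '5', '3', '3'],
                     'name_1': ['Alex', 'Amy', 'Allen', 'Peter', 'Liz']})
df_b = pd.DataFrame({'type_2': [1, 1, 1, 0, 7],
                     'id_2': ['4', '5', '5', '7', '8'],
                     'name_2': ['Billy', 'Brian', 'Joe', 'Bryce', 'KoKo']})
df_c = pd.DataFrame({'type_3': [1, 1, 1, 1, 2, 2, 7],
                     'id_3': ['4', '6', '5', '5', '3', '3', '8'],
                     'name_3': ['School', 'White', 'Jane', 'Homer',
                                'Paul', 'Lorel', 'Kuku']})

first = df_a.merge(df_b, how='outer',
                   left_on=['type_1', 'id_1'],
                   right_on=['type_2', 'id_2'])

# Coalesce the keys: take type_1/id_1 where present, else type_2/id_2.
# Every row of `first` has at least one of the two pairs, so the
# coalesced type can be cast back to an integer dtype.
first['key_type'] = first['type_1'].combine_first(first['type_2']).astype('int64')
first['key_id'] = first['id_1'].combine_first(first['id_2'])

merged = first.merge(df_c, how='outer',
                     left_on=['key_type', 'key_id'],
                     right_on=['type_3', 'id_3'])
merged = merged.drop(columns=['key_type', 'key_id'])

koko_row = merged[merged['name_2'] == 'KoKo']
print(koko_row)
```

On this data it yields the same 13 rows as the shared-key approach, while leaving df_a, df_b and df_c unmodified.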