Note (save) the non-matching entries of dataset_a and dataset_b using Python

Tags: python, database, pandas, dataframe

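As you can see, values from both frames are lost because some keys do not match. What I am looking for is a way to record the number of non-matching entries of left_frame and right_frame. I don't know how to do that.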

left_frame

   key left_value
0    0          a
1    1          b
2    2          c
3    3          d
4    4          e

right_frame

   key right_value
0    2           f
1    3           g
2    4           h
3    5           i
4    6           j

pd.merge(left_frame, right_frame, on='key', how='inner')
**Desired output 1:**

    key  left_value right_value
0   2    c           f
1   3    d           g
2   4    e           h
**Desired output 2:**

   key left_value right_value      _merge
0    0          a         NaN   left_only
1    1          b         NaN   left_only
5    5        NaN           i  right_only
6    6        NaN           j  right_only

So basically, I want two dataframes: one for the 'inner' matches and one for the non-matches.

If you change the merge type to 'outer' and pass indicator=True, you can see where the non-matching rows come from:

In [193]:
pd.merge(left, right, how='outer', indicator=True)

Out[193]:
   key left_value right_value      _merge
0    0          a         NaN   left_only
1    1          b         NaN   left_only
2    2          c           f        both
3    3          d           g        both
4    4          e           h        both
5    5        NaN           i  right_only
6    6        NaN           j  right_only
You can then groupby on this column and call count:

In [194]:
pd.merge(left, right, how='outer', indicator=True).groupby('_merge').count()

Out[194]:
            key  left_value  right_value
_merge                                  
left_only     2           2            0
right_only    2           0            2
both          3           3            3
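
If only the number of non-matching entries per side is needed, a minimal sketch is to compare the _merge column against each label and sum the resulting booleans. The DataFrame constructors below are an assumption reconstructed from the tables in the question, since the original post does not show how the frames were built.

```python
import pandas as pd

# Assumed reconstruction of the frames shown in the question.
left = pd.DataFrame({'key': range(5), 'left_value': list('abcde')})
right = pd.DataFrame({'key': range(2, 7), 'right_value': list('fghij')})

# Outer merge keeps every key; indicator=True adds the _merge column.
merged = pd.merge(left, right, on='key', how='outer', indicator=True)

# Summing a boolean mask counts the rows where the condition holds.
n_left_only = int((merged['_merge'] == 'left_only').sum())
n_right_only = int((merged['_merge'] == 'right_only').sum())
n_unmatched = n_left_only + n_right_only

print(n_left_only, n_right_only, n_unmatched)
```

Unlike groupby('_merge').count(), which counts the non-null cells in each column, this counts rows directly.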
If you want to filter and save the results, do the following:

In [198]:
merged = pd.merge(left, right, how='outer', indicator=True)
merged

Out[198]:
   key left_value right_value      _merge
0    0          a         NaN   left_only
1    1          b         NaN   left_only
2    2          c           f        both
3    3          d           g        both
4    4          e           h        both
5    5        NaN           i  right_only
6    6        NaN           j  right_only

In [199]:    
both = merged[merged['_merge'] == 'both']
both

Out[199]:
   key left_value right_value _merge
2    2          c           f   both
3    3          d           g   both
4    4          e           h   both

In [200]:
other = merged[merged['_merge'] != 'both']
other

Out[200]:
   key left_value right_value      _merge
0    0          a         NaN   left_only
1    1          b         NaN   left_only
5    5        NaN           i  right_only
6    6        NaN           j  right_only
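
To keep the two results on disk as well, one possible sketch writes each filtered frame to a CSV file. The frame constructors, the temporary output directory, and the file names matched.csv and unmatched.csv are all illustrative assumptions, not part of the original answer.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Assumed reconstruction of the frames shown in the question.
left = pd.DataFrame({'key': range(5), 'left_value': list('abcde')})
right = pd.DataFrame({'key': range(2, 7), 'right_value': list('fghij')})

merged = pd.merge(left, right, on='key', how='outer', indicator=True)

# Matched rows: the indicator is redundant here, so drop it.
matched = merged[merged['_merge'] == 'both'].drop(columns='_merge')
# Non-matching rows: keep _merge so the source of each row stays visible.
unmatched = merged[merged['_merge'] != 'both']

# Hypothetical output location and file names.
out_dir = Path(tempfile.mkdtemp())
matched.to_csv(out_dir / 'matched.csv', index=False)
unmatched.to_csv(out_dir / 'unmatched.csv', index=False)
```

Dropping _merge from the matched frame makes it identical in content to the how='inner' result, while the non-matching frame keeps the column so left_only and right_only rows can still be told apart after reloading.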

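Yes, but I would like to store the non-matches after removing them with the drop method.
You can filter my output:
both = merged[merged['_merge'] == 'both']
others = merged[merged['_merge'] != 'both']
That sounds like it could work. I'll give it a try. Thanks a lot.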