Python: displaying the common elements and differences of 2 DataFrames of different sizes


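I have two DataFrames containing strings and values. They also have different sizes. I want to display the common elements and the differences between the two DataFrames.

My approach: I created a function compare(DataFrame1, DataFrame2) that compares the two DataFrames with the equals method. If they are identical, I don't need to look for any differences. I need a second function that actually shows the differences between the DataFrames. Can anyone help me continue?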

import pandas as pd

# `graph` is assumed to be an already-connected Neo4j client; its setup is not shown here.
def test2_expansion():
    test1 = graph.run('match (n:Disease)-[:HAS_CHILD]->(m:Disease) where n.id ="C0039446" return distinct m.id order by m.id;')
    test1 = pd.DataFrame(test1.data())
    return test1

g = test2_expansion()
g = g.to_dict(orient='list')
print("The result of test 2 for expansion in Neo4j is ")
for key, value in g.items():
    for n in value:
        print(n)


def compareResults(a, b):
    # DataFrame.equals already returns a bool
    return a.equals(b)

def takeDifferences(a, b):
    # Placeholder: the differences still have to be computed here
    if compareResults(a, b):
        return "Amazing!"
    else:
        return "Search differences"


DataFrame1       
C0494228             
C0272078
C2242772

DataFrame2
C2242772
C1882062
C1579212
C1541065
C1306459
C0442867
C0349036
C0343748
C0027651
C0272078

Display Common Elements: C0272078 C2242772
Elements found only in DataFrame1: C0494228
Elements found only in DataFrame2: C1882062
C1579212
C1541065
C1306459
C0442867
C0349036
C0343748
C0027651

If both DataFrames have the same column, e.g. m.id, use merge with the indicator parameter:

df = df1.merge(df2, how='outer', indicator=True)
print (df)
        m.id      _merge
0   C0494228   left_only
1   C0272078        both
2   C2242772        both
3   C1882062  right_only
4   C1579212  right_only
5   C1541065  right_only
6   C1306459  right_only
7   C0442867  right_only
8   C0349036  right_only
9   C0343748  right_only
10  C0027651  right_only
Then filter as follows:
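
A minimal sketch of that filtering step, assuming the sample data from the question and the names a, b and c that the print calls use. Building the two frames by hand is only for illustration; in the question they come from a Neo4j query.

```python
import pandas as pd

# Sample data copied from the question; constructing the frames by hand is an assumption.
df1 = pd.DataFrame({'m.id': ['C0494228', 'C0272078', 'C2242772']})
df2 = pd.DataFrame({'m.id': ['C2242772', 'C1882062', 'C1579212', 'C1541065',
                             'C1306459', 'C0442867', 'C0349036', 'C0343748',
                             'C0027651', 'C0272078']})

# The outer merge keeps every id; indicator=True adds the _merge column.
df = df1.merge(df2, how='outer', indicator=True)

# A boolean mask on _merge selects one group of rows each.
a = df.loc[df['_merge'] == 'both', 'm.id']        # present in both frames
b = df.loc[df['_merge'] == 'left_only', 'm.id']   # present only in df1
c = df.loc[df['_merge'] == 'right_only', 'm.id']  # present only in df2
```

The row order of the merged frame can differ between pandas versions, so the order of the values in a, b and c should not be relied on.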

Finally, join the values with f-strings:

print (f'Display Common Element: {", ".join(a)}')
Display Common Element: C0272078, C2242772

print (f'Elements found only in DataFrame1: {", ".join(b)}')
Elements found only in DataFrame1: C0494228

print (f'Elements found only in DataFrame2: {", ".join(c)}')
Elements found only in DataFrame2: C1882062, C1579212, C1541065, 
                                   C1306459, C0442867, C0349036, 
                                   C0343748, C0027651

Now I can show you my generic function, which answers my question:

def compare(a,b):
    if a.equals(b):
        print("SAME!")
    else:
        df = a.merge(b, how='outer',indicator=True)
        x = df.loc[df['_merge'] == 'both', 'm.id']
        y = df.loc[df['_merge'] == 'left_only', 'm.id']
        z = df.loc[df['_merge'] == 'right_only', 'm.id']
        print (f'Display Common Element: {", ".join(x)}')
        print (f'Elements found only in DataFrame1: {", ".join(y)}')
        print (f'Elements found only in DataFrame2: {", ".join(z)}')

At the moment my function returns None, because I don't know whether I should return something, but it works well. Thanks @jezrael
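
If a return value turns out to be useful, one possible variant is sketched here. The name compare_frames, the column parameter and the dictionary keys are all hypothetical, and the two small frames are illustrative rather than real query results.

```python
import pandas as pd

def compare_frames(a, b, column='m.id'):
    """Return the shared and the one-sided values of `column` as lists."""
    df = a.merge(b, how='outer', on=column, indicator=True)
    return {
        'common': df.loc[df['_merge'] == 'both', column].tolist(),
        'only_first': df.loc[df['_merge'] == 'left_only', column].tolist(),
        'only_second': df.loc[df['_merge'] == 'right_only', column].tolist(),
    }

# Tiny illustrative frames, not the real Neo4j results.
left = pd.DataFrame({'m.id': ['C0494228', 'C0272078']})
right = pd.DataFrame({'m.id': ['C0272078', 'C1882062']})

result = compare_frames(left, right)
print(f'Display Common Element: {", ".join(result["common"])}')
```

Returning lists leaves the printing to the caller. Identical frames simply produce empty 'only_first' and 'only_second' lists, so a separate equals check is optional in this variant.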

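This is how I get the DataFrame result: g = dataFrame1() g = g.to_dict(orient='list') print("The result of test 2 for expansion in Neo4j is") for key, value in g.items(): for n in value: print(n)

Everything works until the filtering. Then I get an error: KeyError: 'the label [coll] is not in the [columns]'

@stetcoana - what do print(df1.info()) and print(df2.info()) show?

RangeIndex: 49 entries, 0 to 48 Data columns (total 1 columns): m.id 49 non-null object dtypes: object(1) memory usage: 472.0+ bytes None RangeIndex: 39 entries, 0 to 38 Data columns (total 1 columns): m.id 39 non-null object dtypes: object(1) memory usage: 392.0+ bytes None [Finished in 3.445s]

Yes. My DataFrame has only one column containing the strings, across multiple rows.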