Row count doubles in the Python output when merging two identical files that contain duplicate rows
first.xlsx
name rol no pass
manoj sad654 1254 asd 76325
anhi as76 4954 243jh
abdul ahsg786 984523 asjdf987
manoj sad654 4954 243jh
abhi as76 1254 asd 76325
manoj sad654 1254 asd 76325
anhi as76 4954 243jh
abdul ahsg786 984523 asjdf987
manoj sad654 4954 243jh
abhi as76 1254 asd 76325
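second.xlsx contains the same rows.
So when I try to merge first.xlsx and second.xlsx, every row is printed 4 times.
I expected each row to appear 2 times. This is with pandas: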
import pandas as pd
first=pd.read_excel("C:/Users/mjb952/Desktop/first.xlsx")
second=pd.read_excel("C:/Users/mjb952/Desktop/second.xlsx")
comparison_df = pd.merge(first,second,indicator=True,how='outer')
print(comparison_df)
The output is:
name rol no pass _merge
0 manoj sad654 1254 asd 76325 both
1 manoj sad654 1254 asd 76325 both
2 manoj sad654 1254 asd 76325 both
3 manoj sad654 1254 asd 76325 both
4 anhi as76 4954 243jh both
5 anhi as76 4954 243jh both
6 anhi as76 4954 243jh both
7 anhi as76 4954 243jh both
8 abdul ahsg786 984523 asjdf987 both
9 abdul ahsg786 984523 asjdf987 both
10 abdul ahsg786 984523 asjdf987 both
11 abdul ahsg786 984523 asjdf987 both
12 manoj sad654 4954 243jh both
13 manoj sad654 4954 243jh both
14 manoj sad654 4954 243jh both
15 manoj sad654 4954 243jh both
16 abhi as76 1254 asd 76325 both
17 abhi as76 1254 asd 76325 both
18 abhi as76 1254 asd 76325 both
19 abhi as76 1254 asd 76325 both
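The fourfold output is consistent with how a join treats repeated keys: with no on= argument, pd.merge joins on every column the two frames share, and each left row is paired with every matching right row. A row that occurs 2 times in each file therefore comes out 2 * 2 = 4 times. The sketch below reproduces this on a small in-memory frame; the data and the column names ("name", "rol no", "pass") are assumptions standing in for the Excel files, not their real layout.

```python
import pandas as pd

# Hypothetical stand-in for first.xlsx / second.xlsx:
# two distinct rows, each appearing twice.
rows = [
    ("manoj", "sad654", "1254"),
    ("abdul", "ahsg786", "984523"),
    ("manoj", "sad654", "1254"),
    ("abdul", "ahsg786", "984523"),
]
first = pd.DataFrame(rows, columns=["name", "rol no", "pass"])
second = first.copy()

# With no `on=` argument, merge joins on all shared columns.
merged = pd.merge(first, second, indicator=True, how="outer")

# Each distinct row occurs 2 times on the left and 2 times on the right,
# so the join emits 2 * 2 = 4 copies of it.
copies = merged.groupby(["name", "rol no", "pass"]).size()
print(len(merged))      # total number of rows after the merge
print(copies.tolist())  # number of copies per distinct row
```

If the real files hold 5 distinct rows that each occur twice, the same arithmetic gives 5 * 4 = 20 rows, which matches the index 0 to 19 in the output above.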
The result has 20 rows. What happens if you join first with first in the same way?
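Joining first with itself should behave the same way, since the duplicates are matched against each other on both sides. If the goal is to have each repeated row matched only once, one possible approach (a sketch, not necessarily what the asker needs) is to number the repeats of every row with groupby(...).cumcount() and include that counter in the join, so the k-th occurrence on the left is paired only with the k-th occurrence on the right. The helper column name "occurrence" and the sample data are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the Excel files: two distinct rows, each twice.
rows = [
    ("manoj", "sad654", "1254"),
    ("abdul", "ahsg786", "984523"),
    ("manoj", "sad654", "1254"),
    ("abdul", "ahsg786", "984523"),
]
first = pd.DataFrame(rows, columns=["name", "rol no", "pass"])
second = first.copy()
cols = list(first.columns)

# Option A: number the repeats of each row (0, 1, ...) and join on that
# counter as well, so duplicates are paired one-to-one.
first_n = first.assign(occurrence=first.groupby(cols).cumcount())
second_n = second.assign(occurrence=second.groupby(cols).cumcount())
paired = pd.merge(first_n, second_n, how="outer", indicator=True)

# Option B: drop exact duplicates first, if one copy per row is enough.
deduped = pd.merge(
    first.drop_duplicates(),
    second.drop_duplicates(),
    how="outer",
    indicator=True,
)

print(len(paired))   # every distinct row kept as many times as it occurs
print(len(deduped))  # every distinct row kept once
```

Option A keeps the duplicate counts visible, which helps when comparing two files that may differ in how often a row occurs; Option B is simpler when only the set of distinct rows matters.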
name rol no pass