Python: compare the "Name" column of two different CSV/Excel files; if a name appears in both, ignore that row and output the rest

Compare the "Name" column of two different CSV/Excel files. If a name appears in both files, ignore that row and write the remaining rows to a new file.

File 1:

KeyField,Name,City,Zip,Location
123,Fred,Chicago,60558,A2
234,Mary,Orlando,12376,4L6
345,George,Boston, 40567,22
456,Peter,Topeka,00341,234
567,Doc,Birmingham,7654,H86
678,Isabel,Guadalajara,87654,M111

File 2:

KeyField,Name,City,Zip,Location
567,Doc,Birmingham,76543,H86
234,Michele,Orlando,12376,4L6
678,Isabel,Guadalajara,87654,U869
567,Doc,Birmingham,7654,H86
123,tony,Chicago,60558,A2
456,Peter,Topeka,00341,659

Output, file 3:

KeyField,Name,City,Zip,Location
123,Fred,Chicago,60558,A2
234,Mary,Orlando,12376,4L6
345,George,Boston, 40567,22

You can use Python's pandas library.

1.) Read the CSV files into pandas DataFrames (with pandas imported as pd):

In [383]: df1 = pd.read_csv('f1.csv')

In [384]: df1
Out[384]: 
   KeyField    Name         City    Zip Location
0       123    Fred      Chicago  60558       A2
1       234    Mary      Orlando  12376      4L6
2       345  George       Boston  40567       22
3       456   Peter       Topeka    341      234
4       567     Doc   Birmingham   7654      H86
5       678  Isabel  Guadalajara  87654     M111

In [385]: df2 = pd.read_csv('f2.csv')

In [386]: df2
Out[386]: 
   KeyField     Name         City    Zip Location
0       567      Doc   Birmingham  76543      H86
1       234  Michele      Orlando  12376      4L6
2       678   Isabel  Guadalajara  87654     U869
3       567      Doc   Birmingham   7654      H86
4       123     tony      Chicago  60558       A2
5       456    Peter       Topeka    341      659
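
Note that in the output above the Zip value 00341 has become 341, because read_csv infers an integer column. Below is a minimal sketch of one way to keep such values intact, assuming the files look like the samples in the question. It reads every column as a string with dtype=str and passes skipinitialspace=True to drop the stray space before 40567. The inline FILE1 text is only a stand-in for f1.csv.

```python
import io

import pandas as pd

# Inline stand-in for f1.csv (same rows as file 1 in the question).
FILE1 = """KeyField,Name,City,Zip,Location
123,Fred,Chicago,60558,A2
234,Mary,Orlando,12376,4L6
345,George,Boston, 40567,22
456,Peter,Topeka,00341,234
567,Doc,Birmingham,7654,H86
678,Isabel,Guadalajara,87654,M111
"""

# dtype=str keeps leading zeros such as 00341;
# skipinitialspace=True strips the space that follows a comma.
df1 = pd.read_csv(io.StringIO(FILE1), dtype=str, skipinitialspace=True)

print(df1[["Name", "Zip"]])
```

With a real file, the same keyword arguments can be passed to pd.read_csv('f1.csv', ...).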
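
2.) Do a left join of df1 with df2 on Name. Rows of df1 that have no match in df2 will have null values in the columns that come from df2.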

In [392]: merged = pd.merge(df1,df2, on='Name', how='left')
In [396]: merged[merged.iloc[:,-1].isnull()][['KeyField_x','Name','City_x','Zip_x','Location_x']]
Out[396]: 
   KeyField_x    Name   City_x  Zip_x Location_x
0         123    Fred  Chicago  60558         A2
1         234    Mary  Orlando  12376        4L6
2         345  George   Boston  40567         22

These are the records you want. Let me know if this helps.
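
A variation on the left join, offered as a sketch rather than the only way to do it: merging against just the distinct names of df2 with indicator=True avoids the _x/_y column suffixes and the extra rows that appear when a name (here Doc) occurs twice in df2. The inline FILE1 and FILE2 strings stand in for f1.csv and f2.csv, and result.to_csv is called without a path only to keep the example self-contained; passing a file name (the question's "file 3") would write the rows to disk instead.

```python
import io

import pandas as pd

# Inline stand-ins for f1.csv and f2.csv (rows copied from the question).
FILE1 = """KeyField,Name,City,Zip,Location
123,Fred,Chicago,60558,A2
234,Mary,Orlando,12376,4L6
345,George,Boston, 40567,22
456,Peter,Topeka,00341,234
567,Doc,Birmingham,7654,H86
678,Isabel,Guadalajara,87654,M111
"""
FILE2 = """KeyField,Name,City,Zip,Location
567,Doc,Birmingham,76543,H86
234,Michele,Orlando,12376,4L6
678,Isabel,Guadalajara,87654,U869
567,Doc,Birmingham,7654,H86
123,tony,Chicago,60558,A2
456,Peter,Topeka,00341,659
"""

df1 = pd.read_csv(io.StringIO(FILE1), dtype=str, skipinitialspace=True)
df2 = pd.read_csv(io.StringIO(FILE2), dtype=str, skipinitialspace=True)

# Only the distinct names of df2 are needed to decide whether a row matches.
names2 = df2[["Name"]].drop_duplicates()

# indicator=True adds a "_merge" column: "both" or "left_only" for a left join.
merged = df1.merge(names2, on="Name", how="left", indicator=True)

# Keep the rows of df1 whose Name was not found in df2.
result = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

# Without a path, to_csv returns the CSV text instead of writing a file.
csv_text = result.to_csv(index=False)
print(csv_text)
```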

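Thanks, Mayank! In the output, file 3 also needs to show the original row 4 (123, 'tony', 'Chicago', '60558', 'A2'), because it is not a duplicate. Sorry that I did not mention this in the main question.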
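
One possible reading of this follow-up is that rows whose Name occurs in only one of the two files should be kept from both sides. The sketch below works under that assumption: it filters each DataFrame with Series.isin and concatenates the two results. Under this reading the Michele row from file 2 is kept as well; the comment mentions only tony, so whether that is wanted is unclear. The inline FILE1 and FILE2 strings stand in for f1.csv and f2.csv.

```python
import io

import pandas as pd

# Inline stand-ins for f1.csv and f2.csv (rows copied from the question).
FILE1 = """KeyField,Name,City,Zip,Location
123,Fred,Chicago,60558,A2
234,Mary,Orlando,12376,4L6
345,George,Boston, 40567,22
456,Peter,Topeka,00341,234
567,Doc,Birmingham,7654,H86
678,Isabel,Guadalajara,87654,M111
"""
FILE2 = """KeyField,Name,City,Zip,Location
567,Doc,Birmingham,76543,H86
234,Michele,Orlando,12376,4L6
678,Isabel,Guadalajara,87654,U869
567,Doc,Birmingham,7654,H86
123,tony,Chicago,60558,A2
456,Peter,Topeka,00341,659
"""

df1 = pd.read_csv(io.StringIO(FILE1), dtype=str, skipinitialspace=True)
df2 = pd.read_csv(io.StringIO(FILE2), dtype=str, skipinitialspace=True)

# Rows whose Name is missing from the other file (comparison is case-sensitive).
only_in_df1 = df1[~df1["Name"].isin(df2["Name"])]
only_in_df2 = df2[~df2["Name"].isin(df1["Name"])]

# Stack both sets; ignore_index=True renumbers the rows from 0.
result = pd.concat([only_in_df1, only_in_df2], ignore_index=True)

# Names kept: Fred, Mary, George, Michele, tony
print(result)
```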