Python: compare 2 files and extract the common rows into 2 separate files


I have two files.

File 1:

| Name | Age    | Place  |Work_Start|  
|:---- |:------:| -----: |    -----:|
|Ester | 27     | Beijing|   8      |     
|Jack  |  29    | Tokyo  |   9      | 
|Mary  | 31     |New_York|   8      |    
|Leo   | 25     |England |   10     |

File 2:

| Name | Age    | Place  |Work_Start|  
|:---- |:------:| -----: |    -----:|
|Ethan | 29     |Osaka   |  7       |   
|Mary  | 31     |New_York|  8       |         
|Leo   | 25     |England |  9       |


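I want to extract the rows that are common to both files, matching on columns 1, 2 and 3 (Name, Age, Place), and write them to separate files. After extraction, the output should look like this:

File 1 common: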

| Name | Age    | Place  |Work_Start|  
|:---- |:------:| -----: |    -----:|
|Mary  |31      |New_York|  8       |           
|Leo   |25      |England |  10      |

File 2 common:

| Name | Age    | Place  |Work_Start|  
|:---- |:------:| -----: |    -----:| 
|Mary  |   31   |New_York|  8       |          
|Leo   | 25     |England |  9       |

My attempt:

import pandas as pd
import csv

file1 = open('file1.csv')
file2 = open('file2.csv')
file1_common = open('file1_common.csv', 'w')

file1_r=csv.reader(file1)
file2_r=csv.reader(file2)
file1_common_w = csv.writer(file1_common)


count = 0

file2_set = set()

header1 = next(file1_r)
header2 = next(file2_r)

file1_common_w.writerow(header1)


for row2 in file2_r:
    file2_set.add(row2[0])
    
for row1 in file1_r:
    for row2 in file2_set:
        if (row1[0] in row2[0]) and (row1[1] in row2[1]) and (row1[2] in row2[2]):
            break
    else:
            count=count+1
            file1_common_w.writerow(row1)

file1.close()
file2.close()

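It writes out the rows of file 1 rather than the common rows. For file2_common, I was thinking of simply swapping file1 and file2 in the code. Would that work? Thank you for your help.

(I actually used code from another post as a template, but I forgot which one.)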

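If you use `csv.DictReader` from the `csv` module, you get a dict for each row of the CSV. `operator.itemgetter` is a convenient way to pull specific keys out of those dicts, which lets you build tuples of values that can be added to a set. With that, you can just take the intersection: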

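import csv
from operator import itemgetter

# Build a tuple key from the three columns used for the comparison
getter = itemgetter('Name', 'Age', 'Place')

# file1_path and file2_path are placeholders for the paths to your two CSV files
with open(file1_path, newline='') as file1, open(file2_path, newline='') as file2:
    file1_r = csv.DictReader(file1)
    file2_r = csv.DictReader(file2)
    # Keys that appear in both files
    common = set(map(getter, file1_r)).intersection(map(getter, file2_r))

print(common)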

This gives the following set:

 {('Leo', '25', 'England'), ('Mary', '31', 'New_York')}

You can then turn that back into a CSV.
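One possible way to do that last step, as a rough sketch rather than a definitive implementation: re-read each input file and copy only the rows whose (Name, Age, Place) key is in the shared set, so that each output keeps its own Work_Start values. The helper names `key_set`, `write_common` and `extract_common` are made up for illustration, and the sketch assumes ordinary comma-separated files with a header row and no padding around the values.

```python
import csv
from operator import itemgetter

KEY_COLUMNS = ('Name', 'Age', 'Place')  # columns compared in the question
getter = itemgetter(*KEY_COLUMNS)


def key_set(path):
    """Return the set of key tuples found in the CSV file at `path`."""
    with open(path, newline='') as f:
        return set(map(getter, csv.DictReader(f)))


def write_common(src_path, dst_path, common):
    """Copy the rows of `src_path` whose key is in `common` to `dst_path`."""
    with open(src_path, newline='') as src, open(dst_path, 'w', newline='') as dst:
        reader = csv.DictReader(src)
        # reader.fieldnames reads the header row, so the output keeps every column
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if getter(row) in common:
                writer.writerow(row)


def extract_common(file1_path, file2_path, out1_path, out2_path):
    """Write the common rows of each input file to its own output file."""
    common = key_set(file1_path) & key_set(file2_path)
    write_common(file1_path, out1_path, common)
    write_common(file2_path, out2_path, common)
    return common
```

With the file names from the question, this would be called as `extract_common('file1.csv', 'file2.csv', 'file1_common.csv', 'file2_common.csv')`. Reading each file twice keeps the code short; for very large files you might prefer to cache the rows instead.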

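Thank you very much for your help. Is there a way to get the column headers without having to type every column into itemgetter? I actually have other files with more columns than this one.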
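One option that might work here, sketched under assumptions: `csv.DictReader` exposes the header row as `fieldnames`, so the key columns can be taken from it by position instead of being typed out. The names `make_key_func` and `common_keys` and the `n_key_columns` parameter are hypothetical, and the sketch assumes both files are non-empty and start with the same key columns in the same order.

```python
import csv


def make_key_func(fieldnames, n_key_columns=3):
    """Return a function mapping a row dict to a tuple of its leading columns."""
    key_columns = list(fieldnames[:n_key_columns])
    return lambda row: tuple(row[column] for column in key_columns)


def common_keys(file1_path, file2_path, n_key_columns=3):
    """Return the key tuples that appear in both CSV files."""
    with open(file1_path, newline='') as f1, open(file2_path, newline='') as f2:
        reader1 = csv.DictReader(f1)
        reader2 = csv.DictReader(f2)
        # The header of file 1 decides which columns form the key (an assumption)
        key = make_key_func(reader1.fieldnames, n_key_columns)
        return set(map(key, reader1)) & set(map(key, reader2))
```

If the key columns are not the leading ones, the slice could be replaced with any other selection from `fieldnames`, for example every column except `Work_Start`.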