Python: merge two CSV files based on a key and a secondary key


I want to merge two CSV files, as shown below:

csv1:

formula,solver,runtime,conflicts
CBS_k3_n100_m403_b30_13.cnf,SWDiA5BY,0.001842,318
CBS_k3_n100_m403_b30_13.cnf,glucose,0.001842,318
csv2:

formula,entropy,num sols
CBS_k3_n100_m403_b30_13.cnf,0.202,707286
Desired output:

formula,solver,runtime,conflicts,entropy,solutions
CBS_k3_n100_m403_b30_13.cnf,SWDiA5BY,0.001842,318,0.202,707286
CBS_k3_n100_m403_b30_13.cnf,glucose,0.001842,318,0.202,707286
So I took the intersection of the keys of the two dictionaries (one per CSV) and used a list comprehension:

keysA = set(dict1.keys())
keysB = set(dict2.keys())
keys = keysA & keysB
...
[[key] + dict1.get(key, []) + dict2.get(key, []) for key in keys]
But there are some "duplicate" rows (which I need), where the formula field is the same but the solver field differs, and my output is:

formula,solver,runtime,conflicts,entropy,solutions
CBS_k3_n100_m403_b30_13.cnf,SWDiA5BY,0.001842,318,0.202,707286
How can I keep these rows using a list comprehension? Or in any other way?

Thanks for your help.


Edit: added an example.
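The rows disappear because a dictionary keyed only on formula can hold one value per key, so the second solver row overwrites the first. A minimal sketch of the list-comprehension route follows, assuming each dictionary maps a formula to a list of rows instead of a single row. The helper name `rows_by_key` and the in-memory sample strings are illustrative and not part of the original code.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question; real code would open the files instead.
CSV1 = """formula,solver,runtime,conflicts
CBS_k3_n100_m403_b30_13.cnf,SWDiA5BY,0.001842,318
CBS_k3_n100_m403_b30_13.cnf,glucose,0.001842,318
"""
CSV2 = """formula,entropy,num sols
CBS_k3_n100_m403_b30_13.cnf,0.202,707286
"""


def rows_by_key(text):
    """Hypothetical helper: return (header, {first column: [rest of row, ...]})."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    groups = defaultdict(list)
    for row in reader:
        groups[row[0]].append(row[1:])  # append, so repeated keys are kept
    return header, groups


header1, dict1 = rows_by_key(CSV1)
header2, dict2 = rows_by_key(CSV2)

# One output row per (csv1 row, csv2 row) pair that shares a formula.
merged = [
    [key] + left + right
    for key in sorted(dict1.keys() & dict2.keys())
    for left in dict1[key]
    for right in dict2[key]
]

for row in merged:
    print(",".join(row))
```

With the sample data this yields one merged row per solver, since both csv1 rows pair with the single csv2 row for the same formula.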

Why not use pandas? This is easy to do in pandas:

import pandas as pd
df1=pd.read_csv("1.csv")
df=pd.read_csv("2.csv")
result=df1.merge(df,on="formula")
result.to_csv("result.csv", index=False)
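As a self-contained check of the merge above, here is a sketch that runs it on the sample rows from the question, read from in-memory strings instead of `1.csv` and `2.csv`. Renaming `num sols` to `solutions` is an assumption made only to match the header shown in the desired output.

```python
import io

import pandas as pd

# Sample data copied from the question.
csv1 = io.StringIO(
    "formula,solver,runtime,conflicts\n"
    "CBS_k3_n100_m403_b30_13.cnf,SWDiA5BY,0.001842,318\n"
    "CBS_k3_n100_m403_b30_13.cnf,glucose,0.001842,318\n"
)
csv2 = io.StringIO(
    "formula,entropy,num sols\n"
    "CBS_k3_n100_m403_b30_13.cnf,0.202,707286\n"
)

df1 = pd.read_csv(csv1)
# Assumption: rename the column so the header matches the desired output.
df2 = pd.read_csv(csv2).rename(columns={"num sols": "solutions"})

# Default inner join on "formula": every csv1 row is paired with the
# matching csv2 row, so both solver rows survive.
result = df1.merge(df2, on="formula")

# index=False keeps the pandas row index out of the CSV text.
print(result.to_csv(index=False))
```

Because the merge is row-based and not dictionary-based, repeated formulas in csv1 are not collapsed.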

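In addition, you can use result=df1.merge(df,on="formula",how="outer") to keep the formulas that one of your CSVs has but the other does not.

Your question is unclear. Are these "duplicate rows" all in csv1? Is csv1 the only file that has a solver? How do you want to handle the duplicate rows? Each key can have at most one value, but that value can be a list. Do you want a list of values per formula? You should show us an example of the data and the desired result from that data.

I believe you mean result=df1.merge(df,on="formula",how="left"), since an outer join keeps all rows. Thanks.

"left" means a left outer join, which means that rows exclusive to df would be missing from the result. What I am describing here is a full outer join, which keeps all rows.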
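The difference between the join types discussed here can be seen on a small example. The data below is hypothetical: `a.cnf` appears in both frames, `b.cnf` only in `df1`, and `c.cnf` only in `df`. The `indicator=True` argument is a standard `merge` option that adds a `_merge` column recording where each row came from.

```python
import pandas as pd

# Hypothetical data, chosen so that each frame has one formula the other lacks.
df1 = pd.DataFrame({
    "formula": ["a.cnf", "a.cnf", "b.cnf"],
    "solver": ["SWDiA5BY", "glucose", "glucose"],
})
df = pd.DataFrame({
    "formula": ["a.cnf", "c.cnf"],
    "entropy": [0.202, 0.5],
})

# Inner join (the default): only formulas present in both frames.
inner = df1.merge(df, on="formula")

# Left join: every row of df1 is kept; rows exclusive to df are dropped.
left = df1.merge(df, on="formula", how="left")

# Full outer join: rows from both sides are kept; missing values become NaN.
outer = df1.merge(df, on="formula", how="outer", indicator=True)

print(len(inner), len(left), len(outer))
print(outer)
```

Here the inner join has 2 rows, the left join 3, and the outer join 4, so only `how="outer"` keeps `c.cnf`.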