How to compare 2 CSV files based on multiple columns with awk?

I have 2 CSV files in the sample format below, each with about 5,000 rows:

File 1:

    EMPLOYEE_NUMBER,LAST_NAME,FIRST_NAME,MIDDLE_NAME,BRANCH,DEPARTMENT,LEVEL,POSITION,EMAIL_ADDRESS
    110426,Balbon,Susan,Lagat,"abc Equity Ventures, Inc.",Group Internal Audit,Supervisor,I.S. Audit Supervisor,susan.balbon@abc.com
    30083,Mendezona,Bingen,Roemer,"abc Equity Ventures, Inc.",Risk Management Office,Vice President,VP - AEV Security,bing.mendezona@abc.test
    110773,Casas,Joyce Grace,Bea,"abc Equity Ventures, Inc.",Tax Advisory and Compliance,Manager,Tax Counsel,joyce.grace.casas@abc.com
    286,Fernandez,Mark Brian,Tato,abc Foundation Inc.,Computer Services Division,Supervisor,Senior Applications Supervisor,mark.fernandez@abc.com
    291,Plando,Marilou,Polleros,"abc Equity Ventures, Inc.",Administration,Assistant Vice President,AVP - Risk Management,marilou.plando@abc.test
    110813,Gemelo-Abarca,Therese Xyza,Dableo,"abc Equity Ventures, Inc.",Governance & Compliance Team,Manager,Associate General Counsel - Corporate Secretarial and Compliance,therese.xyza.abarca@abc.com
    30096,Abay,Joanna Marie,Saluria,"abc Equity Ventures, Inc.",Tax Advisory and Compliance,Supervisor,Tax Compliance Officer,joanna.abay@abc.com
    110711,Ostan,Margilyn,Salibio,"abc Equity Ventures, Inc.",Accounting,Staff,Senior Accountant 1,margilyn.ostan@abc.com
    110732,Fumar-Gonzales,Vanessa Concepcion,Altarejos,"abc Equity Ventures, Inc.",Legal and Corporate Services,Manager,Associate General Counsel - Labor & Litigation,vanessa.gonzales@abc.com

File 2:

    EMPLOYEE_NUMBER,LAST_NAME,FIRST_NAME,MIDDLE_NAME,BRANCH,DEPARTMENT,LEVEL,POSITION,EMAIL_ADDRESS
    110426,Balbon,Susan,Lagat,"abc Equity Ventures, Inc.",Group Internal Audit,Supervisor,I.S. Audit Supervisor,susan.balbon@abc.com
    30083,Mendezona,Bingen,Roemer,"abc Equity Ventures, Inc.",Security,Vice President,VP - AEV Security,jetee.velante@abc.com
    110773,Casas,Joyce Grace,Bea,"abc Equity Ventures, Inc.",Tax Advisory and Compliance,Supervisor,Tax Counsel,joyce.grace.casas@abc.com
    286,Fernandez,Mark Brian,Tato,abc Foundation Inc.,Computer Services Division,Supervisor,Senior Applications Supervisor,mark.fernandez@abc.com
    291,Plando,Marilou,Polleros,"abc Equity Ventures, Inc.",Risk Management Office,Assistant Vice President,AVP - Risk Management,marilou.plando@abc.test
    110866,Dugan,Belinda,Escultura,"abc Equity Ventures, Inc.",Legal Management,Vice President,Vice President for Legal Services Management,dixie.dugan@abc.test
    221,Montehermoso,Gladys,Enoy,"abc Equity Ventures, Inc.",Accounting,Staff,Senior Accountant,gladys.montehermoso@abc.com
    30102,Oblianda,Anna Cielo,Salud,"abc Equity Ventures, Inc.",Accounting,Supervisor,Accounting Supervisor,cielo.oblianda@abc.com
    110499,Bucol,Charmaine Ann,Rebusa,"abc Equity Ventures, Inc.",Group Internal Audit,Staff,Audit Senior,charmaine.ann.bucol@abc.com

Using awk, I want to get all rows that have the same values in the EMPLOYEE_NUMBER and EMAIL_ADDRESS columns but different values in any of the other columns.

Ideally, I would merge the 2 CSV files vertically, keyed on the EMPLOYEE_NUMBER and EMAIL_ADDRESS columns, and remove the duplicate rows with awk. Thanks.

The output would look like this:

    EMPLOYEE_NUMBER,LAST_NAME,FIRST_NAME,MIDDLE_NAME,BRANCH,DEPARTMENT,LEVEL,POSITION,EMAIL_ADDRESS
    110773,Casas,Joyce Grace,Bea,"abc Equity Ventures, Inc.",Tax Advisory and Compliance,Manager,Tax Counsel,joyce.grace.casas@abc.com
    110773,Casas,Joyce Grace,Bea,"abc Equity Ventures, Inc.",Tax Advisory and Compliance,Supervisor,Tax Counsel,joyce.grace.casas@abc.com

This can be done with a simple awk script.

awk_script:

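    # Pass 1 (file1): print the header and index every row by EMPLOYEE_NUMBER + EMAIL_ADDRESS.
    NR==FNR{
      if(FNR==1){print}
      a[$1 FS $NF]=$0   # $NF is EMAIL_ADDRESS even when a quoted field contains a comma
      next
    }
    # Pass 2 (file2): the key exists in file1 but the row differs, so print both versions.
    ($1 FS $NF) in a && a[$1 FS $NF]!=$0{
      print a[$1 FS $NF],$0
    }

Command to run: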

    awk -F',' -v OFS="\n" -f awk_script file1 file2
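
As a rough end-to-end check of the approach above, the sketch below builds two tiny CSV files and runs the same kind of two-pass comparison inline. The three-column layout (ID,LEVEL,EMAIL), the example.com addresses, and the temporary directory are made up for illustration. The key is assumed to be the first and the last field, so a quoted field that contains a comma does not shift it.

```shell
#!/bin/sh
# Hypothetical sample data, reduced to three columns: ID,LEVEL,EMAIL.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
ID,LEVEL,EMAIL
1,Manager,a@example.com
2,Staff,b@example.com
3,"Lead, Senior",c@example.com
EOF
cat > "$tmp/file2" <<'EOF'
ID,LEVEL,EMAIL
1,Supervisor,a@example.com
2,Staff,b@example.com
3,"Lead, Senior",changed@example.com
EOF

# Pass 1 indexes file1 by first field + last field and prints the header.
# Pass 2 prints both versions of every file2 row whose key exists in file1
# but whose full content differs.
result=$(awk -F',' -v OFS='\n' '
  NR==FNR { if (FNR==1) print; a[$1 FS $NF] = $0; next }
  ($1 FS $NF) in a && a[$1 FS $NF] != $0 { print a[$1 FS $NF], $0 }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample, only ID 1 is reported: ID 2 is identical in both files, and ID 3 has a different email address, so its key does not match. Real CSV parsing (quoted fields with embedded newlines, escaped quotes) is beyond what a plain `-F','` split handles, so treat this as a sketch for simple data.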

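Post the final expected result.

Please use a simplified example, i.e. remove all columns that are not needed to ask the question. That may not be your original question, but it may be a better one.

This is real personal data; you probably should not share it on the internet.

Possible duplicate.

@Inian why isn't bash an appropriate tag?

I think it is. In that case, please check whether my answer can be accepted. :)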