Awk: compare multiple columns and replace only on a match

I have two files (FILE1 and FILE2). I am trying to compare the strings in column 1 and column 2 of FILE1 with column 4 and column 5 of FILE2. In addition to that match, column 6 of FILE2 must match a specific string such as SO or CO (these are the names of FILE1's columns 3 and 4 in its header line). When everything matches, column 7 of FILE2 should be replaced with the value from the corresponding FILE1 column (column 3 for SO, column 4 for CO); otherwise the line should stay unchanged.

I tried to adapt solutions posted on the forum for similar problems, but without success.
FILE1
type code SO CO other
7757 1 6941.958 138.922 149.17
7757 2 8666.123 198.908 225.67
7757 4 2795.885 334.875 378.68
7759 GT3 222.104 13.5 734.62
7768 CT2 0 0 0
7805 6 3796.677 75.175 79.09
FILE2
"US","01073",,"7757","1","SO","10","299"
"US","01073",,"7758","1","SO","10","299"
"US","01073",,"7757","1","NO","10","299"
"US","01073",,"7757","1","CO","10","299"
"US","01073",,"7757","4","MO","10","299"
"US","01073",,"7757","1","GO","10","299"
"US","01073",,"7805","6","CO","10","299"
Required output:
"US","01073",,"7757","1","SO","6941.958","299"
"US","01073",,"7758","1","SO","10","299"
"US","01073",,"7757","1","NO","10","299"
"US","01073",,"7757","1","CO","138.922","299"
"US","01073",,"7757","4","MO","10","299"
"US","01073",,"7757","1","GO","10","299"
"US","01073",,"7805","6","CO","75.175","299"
The solution I tried (handles CO only):
tr -d '"' < FILE2 > temp   # remove the double quotes
awk 'NR==FNR{A[$1,$2]=$3;next} A[$4,$5] && $6=="CO"{$7=A[$1,$2];print}' FS=" " OFS="," FILE1 temp > out
- Complex awk solution:
awk 'function unquote(f){
return substr(f, 2, length(f)-2)
}
NR==FNR{
if (NR==1){ f3=$3; f4=$4 }
else if (NF){ a[$1,$2,f3]=$3; a[$1,$2,f4]=$4 }
next;
}
{ k=unquote($4) SUBSEP unquote($5) SUBSEP unquote($6) }
k in a{ $7=a[k] }1' file1 FS=',' OFS=',' file2
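- function unquote(f){...}: unquotes, i.e. extracts the value between the double quotes (strictly, between the first and last characters of the string)
- a[$1,$2,f3]=$3; a[$1,$2,f4]=$4: stores each value under a composite key made of type, code and column name (SO or CO)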
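
To make the two points above concrete, here is a minimal sketch, separate from the answer's script, of how unquote() and a comma-separated subscript interact. In awk, a[x, y, z] is stored under the single string key x SUBSEP y SUBSEP z, which is why the lookup key for file2 can be built by joining the unquoted fields with SUBSEP. The sample values are taken from the question; the variable name demo is only for illustration.

```shell
demo=$(awk '
function unquote(f) { return substr(f, 2, length(f) - 2) }
BEGIN {
    # a[x, y, z] is stored under the string key x SUBSEP y SUBSEP z
    a["7757", "1", "SO"] = "6941.958"

    # Key built the same way the answer builds it from quoted fields
    k = unquote("\"7757\"") SUBSEP unquote("\"1\"") SUBSEP unquote("\"SO\"")
    if (k in a) print "hit: " a[k]

    # A type that is not in the table finds nothing
    k = unquote("\"7758\"") SUBSEP "1" SUBSEP "SO"
    if (!(k in a)) print "miss"
}')
printf '%s\n' "$demo"
```

Using `k in a` rather than testing `a[k]` directly avoids creating empty entries for keys that are absent, and it still matches when the stored value is 0 (as in the 7768 CT2 row).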
The output:
"US","01073",,"7757","1","SO",6941.958,"299"
"US","01073",,"7758","1","SO","10","299"
"US","01073",,"7757","1","NO","10","299"
"US","01073",,"7757","1","CO",138.922,"299"
"US","01073",,"7757","4","MO","10","299"
"US","01073",,"7757","1","GO","10","299"
"US","01073",,"7805","6","CO",75.175,"299"
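Thank you very much for editing my code, Randomir!

Hi RomanPerekhrest, thanks for your help. Your script looks great, but I keep getting output identical to "file2", meaning nothing is replaced in column 7. Any hints?

@kelly, hint: make sure the input samples you posted are the actual ones, exactly as copied and tested. The solution works fine on the currently posted samples.

RomanPerekhrest, the problem was on my side; your code works perfectly. Thank you very much for your help and time.

@kelly, you're welcome.

@RomanPerekhrest, your solution works perfectly with the test data. However, the real data in file2 has a problem: column 2 looks like "abc,45" or "abc23", meaning some of the double-quoted values contain a comma and some do not. Since I cannot use the double quote as the delimiter for this problem, how should I handle it? Thanks for your help.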
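
The thread does not answer the last comment, so the following is only a sketch of one possible approach, not the answerer's method. It assumes the real file2 rows look like the posted ones except that quoted values may contain commas, and that quoted values never contain escaped double quotes. Instead of relying on FS=',', it splits each file2 line with a hand-written helper (csvsplit, a hypothetical name) that treats a quoted string as one field, and it re-quotes the replacement so the result matches the required output. The sample rows with "abc,45" and "abc23" are invented from the comment's description.

```shell
# Scratch inputs: file1 rows from the question, file2 rows altered so that
# column 2 sometimes contains a comma inside the quotes (an assumption).
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
type code SO CO other
7757 1 6941.958 138.922 149.17
7805 6 3796.677 75.175 79.09
EOF
cat > "$tmp/file2" <<'EOF'
"US","abc,45",,"7757","1","SO","10","299"
"US","abc23",,"7757","1","NO","10","299"
"US","abc,45",,"7805","6","CO","10","299"
EOF

result=$(awk '
# Split a CSV line into out[1..n]; commas inside "..." do not split.
# Returns the number of fields. Escaped quotes ("") are not handled.
function csvsplit(line, out,    n, rest) {
    n = 0
    rest = line ","                       # sentinel comma closes the last field
    while (rest != "") {
        match(rest, /^("[^"]*"|[^,]*),/)  # one field plus its trailing comma
        out[++n] = substr(rest, 1, RLENGTH - 1)
        rest = substr(rest, RLENGTH + 1)
    }
    return n
}
# Strip one pair of surrounding double quotes, if present.
function unquote(f) {
    return (f ~ /^".*"$/) ? substr(f, 2, length(f) - 2) : f
}
NR == FNR {                               # first file: build the lookup table
    if (NR == 1) { f3 = $3; f4 = $4 }     # header names: SO and CO
    else if (NF) { a[$1, $2, f3] = $3; a[$1, $2, f4] = $4 }
    next
}
{                                         # second file: parse each line by hand
    n = csvsplit($0, c)
    k = unquote(c[4]) SUBSEP unquote(c[5]) SUBSEP unquote(c[6])
    if (k in a) c[7] = "\"" a[k] "\""     # re-quote the replacement value
    line = c[1]
    for (i = 2; i <= n; i++) line = line "," c[i]
    print line
}' "$tmp/file1" "$tmp/file2")
rm -rf "$tmp"
printf '%s\n' "$result"
```

With GNU awk, setting FPAT to describe what a field looks like is another option for this kind of input; the manual splitter above is used here only because it does not depend on a gawk-specific feature.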
"US","01073",,"7757","1","SO",6941.958,"299"
"US","01073",,"7758","1","SO","10","299"
"US","01073",,"7757","1","NO","10","299"
"US","01073",,"7757","1","CO",138.922,"299"
"US","01073",,"7757","4","MO","10","299"
"US","01073",,"7757","1","GO","10","299"
"US","01073",,"7805","6","CO",75.175,"299"