Awk: check that the number in column 1 equals column 2, where column 1 must start and end with a fixed format
I want to check whether the number in column 1 equals column 2. Column 1 should start with "ABC" and end with "DEF", although it sometimes ends with "DEFZ" and a digit, as in the sample below. The number between "ABC" and "DEF" should match column 2. Can anyone help me?
My input:
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC95678DEF|45678|23132331331|
ABC87887DEF|86187|23132331331|
ABC89043DEF|89043|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
The output should be:
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC95678DEF|45678|23132331331|
ABC87887DEF|86187|23132331331|
ABC89043DEF|89043|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
I am trying the command below, but it does not work:
awk -F '|' '"ABC" $2 "DEF" == $1 && "ABC" $2 "DEFZ"+[0-9] == $1 { print }' WHTFile.txt > QC2Valid.txt
Can anyone help me?
Thanks in advance.
awk -v FS="|" '{tmpvar=$1;gsub(/^ABC|DEF(Z[0-9]+)?$/,"",tmpvar)}tmpvar == $2' infile
Input
akshay@db-3325:/tmp$ cat infile
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC95678DEF|45678|23132331331|
ABC87887DEF|86187|23132331331|
ABC89043DEF|89043|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
Output
akshay@db-3325:/tmp$ awk -v FS="|" '{tmpvar = $1; gsub(/^ABC|DEF(Z[0-9]+)?$/,"",tmpvar)} tmpvar == $2' infile
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC89043DEF|89043|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
Explanation
awk -v FS="|" '{    # run awk with | as the field separator
    tmpvar = $1     # copy the first field into the variable tmpvar
    # Remove a leading ABC and a trailing DEF, where DEF may be
    # followed by Z and digits, replacing each match with the empty string,
    # so that tmpvar holds only the number between ABC and DEF*
    gsub(/^ABC|DEF(Z[0-9]+)?$/, "", tmpvar)
}
# A condition with no action prints the current record ($0) when it is true,
# that is, when tmpvar is equal to the second field
tmpvar == $2
' infile
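As a side note, the gsub approach only strips the prefix and suffix when they are present, so a first field that is already a bare number would also pass the comparison. The attempt in the question fails because == compares strings rather than matching a pattern, and && requires both tests to hold at once. A minimal sketch of a stricter variant, which builds the expected pattern from field 2 and matches it with ~, is shown below. It assumes field 2 contains only digits (no regex metacharacters), and the last sample row is a hypothetical line added here to show the difference; it is not part of the question's data.

```shell
# Keep only records whose first field is exactly ABC + field 2 + DEF,
# optionally followed by Z and one or more digits.
# Assumption: field 2 holds digits only, so it is safe inside a regex.
strict=$(awk -F'|' '$1 ~ ("^ABC" $2 "DEF(Z[0-9]+)?$")' <<'EOF'
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC95678DEF|45678|23132331331|
ABC89043DEFZ3|89043|23132331331|
12345|12345|23132331331|
EOF
)
printf '%s\n' "$strict"
```

With this input, the mismatching row (ABC95678DEF) and the bare-number row are both dropped, while the three well-formed rows are printed.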
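Regex breakdown: /^ABC|DEF(Z[0-9]+)?$/
1st alternative: ^ABC
  ^ asserts position at the start of the string
  ABC matches the characters ABC literally (case sensitive)
2nd alternative: DEF(Z[0-9]+)?$
  DEF matches the characters DEF literally (case sensitive)
  (Z[0-9]+)? is the first group
    ? quantifier: matches between zero and one times, as many times as possible, giving back as needed (greedy)
    Z matches the character Z literally (case sensitive)
    [0-9] matches a single character in the range 0 to 9
    + quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
  $ asserts position at the end of the string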
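To see the two alternatives at work, the following sketch applies the same gsub call to a few first-field values and prints what is left. The value ending in Z12 is a hypothetical example chosen to show that [0-9]+ accepts more than one digit; it does not appear in the question's data.

```shell
# Show each sample value next to what remains after stripping
# the leading ABC and the trailing DEF / DEFZ<digits>.
stripped=$(printf '%s\n' ABC12345DEF ABC12345DEFZ1 ABC89043DEFZ12 |
  awk '{ orig = $0; gsub(/^ABC|DEF(Z[0-9]+)?$/, ""); print orig " -> " $0 }')
printf '%s\n' "$stripped"
```

Each line should be reduced to just the digits between ABC and DEF, which is the value the answer then compares with the second field.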
Hey Akshay, everything works fine, thank you very much.
Sure, thank you, Akshay.