Awk: delete duplicate lines containing an unknown string
file.txt:
test (CODE:700|SIZE:2356)
asdasdad (CODE:700|SIZE:124)
xcvxcva (CODE:700|SIZE:8974)
asdavasdasdasd (CODE:700|SIZE:124)
link-categories (CODE:700|SIZE:8974)
edit (CODE:700|SIZE:124)
I need a command that finds every duplicated SIZE: value and then deletes all but one of the lines that share it. The output should look like this:
test (CODE:700|SIZE:2356)
xcvxcva (CODE:700|SIZE:8974)
asdavasdasdasd (CODE:700|SIZE:124)
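I found this command: sed '/SIZE:124/,+1d' file.txt
But it deletes all of the matching lines, whereas I need to delete the duplicates and keep one line. It also does not search for duplicated SIZE: values on its own, so it does not work.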
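What I need is:
- Search for duplicated values, such as SIZE:124 above.
- Find all lines that have this value and, if possible, delete them, keeping only one (or two) of them.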
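The first requirement (finding which SIZE: values repeat) can be sketched on its own before deleting anything. This is only an illustration: the temporary file stands in for file.txt, and the variable names `input`, `dups` and `count` are made up for the example.

```shell
# Sample data copied from the question; the temp file stands in for file.txt.
input=$(mktemp)
cat > "$input" <<'EOF'
test (CODE:700|SIZE:2356)
asdasdad (CODE:700|SIZE:124)
xcvxcva (CODE:700|SIZE:8974)
asdavasdasdasd (CODE:700|SIZE:124)
link-categories (CODE:700|SIZE:8974)
edit (CODE:700|SIZE:124)
EOF

# Count how often each SIZE:<digits> token occurs and report those seen
# more than once. sort makes the report order deterministic, because
# awk's for-in traversal order is unspecified.
dups=$(awk 'match($0, /SIZE:[0-9]+/) { count[substr($0, RSTART, RLENGTH)]++ }
            END { for (v in count) if (count[v] > 1) print v, count[v] }' "$input" | sort)
printf '%s\n' "$dups"
rm -f "$input"
```

On the sample data this reports SIZE:124 (3 lines) and SIZE:8974 (2 lines), which are the values whose extra lines need to go.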
Could you please try the following:
awk 'match($0,/SIZE:[0-9]+/){val=substr($0,RSTART,RLENGTH);array[val]=$0;val=""} END{for(key in array){print array[key]}}' Input_file
Or, as a non-one-liner form of the solution:
awk '
match($0,/SIZE:[0-9]+/){
  val=substr($0,RSTART,RLENGTH)
  array[val]=$0
  val=""
}
END{
  for(key in array){
    print array[key]
  }
}
' Input_file
Explanation: a detailed explanation of the above code.
awk ' ##Starting awk program from here.
match($0,/SIZE:[0-9]+/){ ##Using match function to match regex of SIZE: then digits in each line here.
val=substr($0,RSTART,RLENGTH) ##Creating variable val whose value is sub string of current line which has matched value from current line.
array[val]=$0 ##Creating an array named array with index of variable val and value is current line.
val="" ##Nullify variable val here.
}
END{ ##Starting END block of this awk program here.
for(key in array){ ##Traversing through array here.
print array[key] ##Printing array value here.
}
}
' Input_file ##Mentioning Input_file name here.
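Two properties of the program above are worth keeping in mind: array[val]=$0 overwrites earlier entries, so the last line seen for each SIZE value is the one kept, and for(key in array) visits keys in an unspecified order. If the first occurrence and the input order matter, a variant along these lines may help. It is a sketch rather than the answerer's code; the array name `seen` and the choice to pass through lines without a SIZE token are assumptions.

```shell
# Sample data copied from the question; the temp file stands in for Input_file.
input=$(mktemp)
cat > "$input" <<'EOF'
test (CODE:700|SIZE:2356)
asdasdad (CODE:700|SIZE:124)
xcvxcva (CODE:700|SIZE:8974)
asdavasdasdasd (CODE:700|SIZE:124)
link-categories (CODE:700|SIZE:8974)
edit (CODE:700|SIZE:124)
EOF

# Print a line only the first time its SIZE:<digits> token is seen,
# so output follows input order.
# Assumption: lines without a SIZE token are printed unchanged.
result=$(awk '
  match($0, /SIZE:[0-9]+/) {
    key = substr($0, RSTART, RLENGTH)
    if (!(key in seen)) { seen[key] = 1; print }
    next
  }
  { print }
' "$input")
printf '%s\n' "$result"
rm -f "$input"
```

Printing as lines are read also avoids holding one line per key until the END block, although the set of keys is still kept in memory.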
You can also do this with a simple awk command:
awk -F '[ |]+' '!seen[$NF]++{print}' file
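To see why that one-liner works, it may help to look at how the field separator splits a single line. The snippet below is only a demonstration on one line from the question; the variable names `line` and `split_info` are made up for the example.

```shell
# One line from the question's file.txt.
line='test (CODE:700|SIZE:2356)'

# With -F '[ |]+' any run of spaces or pipes separates fields, so the
# line splits into: "test", "(CODE:700", "SIZE:2356)".
# Print the field count and the last field, which is what seen[$NF] keys on.
split_info=$(printf '%s\n' "$line" | awk -F '[ |]+' '{ print NF ": " $NF }')
printf '%s\n' "$split_info"
```

The last field still carries the closing parenthesis, which is harmless here because every line ends the same way. The !seen[$NF]++ test is true only the first time a given key appears, so the first line for each SIZE value is printed.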
On SO, we encourage users to show the effort they have put into solving their own problem, so please add that to your question and let us know.
awk '!seen[$2]++' file.txt
test (CODE:700|SIZE:2356)
asdasdad (CODE:700|SIZE:124)
xcvxcva (CODE:700|SIZE:8974)
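Keying on $2 compares the whole "(CODE:...|SIZE:...)" field, so it removes duplicates here only because every line has the same CODE. A rough sketch of the difference, using hypothetical lines that are not from the question (the CODE values and the names `data`, `by_field` and `by_size` are invented for illustration):

```shell
# Hypothetical lines (not from the question): CODE differs, SIZE repeats.
data='a (CODE:700|SIZE:124)
b (CODE:404|SIZE:124)
c (CODE:700|SIZE:99)'

# Keying on the whole second field: lines a and b look different,
# so nothing is removed.
by_field=$(printf '%s\n' "$data" | awk '!seen[$2]++')

# Splitting on "|" or ")" makes $2 the bare SIZE:<digits> token,
# so line b is recognised as a duplicate of line a.
by_size=$(printf '%s\n' "$data" | awk -F '[|)]' '!seen[$2]++')

printf 'keyed on $2 (default FS):\n%s\n' "$by_field"
printf 'keyed on SIZE only:\n%s\n' "$by_size"
```

For the file in the question both approaches give the same result; the distinction only matters if the CODE part can vary.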