Bash: how to print the first occurrences of a column that matches multiple times with awk
Tags: bash, awk, grep
I have a log file that lists all backups, with a column whose value is YES, meaning the backup will not be deleted by the retention policy (it is retained). For a given vmname there may be one or more rows with the retained column = YES. My input is:
= FULL == 20210105 == 2100 == ASR-FULL-20210105-2100 == YES
= FULL == 20210202 == 2100 == ASR-FULL-20210202-2100 == YES
= FULL == 20210302 == 2100 == ASR-FULL-20210302-2100 == YES
= FULL == 20210406 == 2100 == ASR-FULL-20210406-2100 == YES
= FULL == 20210105 == 2146 == DNS10_7-FULL-20210105-2146 == YES
= FULL == 20210202 == 2153 == DNS10_7-FULL-20210202-2153 == YES
= FULL == 20210302 == 2148 == DNS10_7-FULL-20210302-2148 == YES
= FULL == 20210406 == 2122 == DNS10_7-FULL-20210406-2122 == YES
= FULL == 20210105 == 2105 == execnet.0-FULL-20210105-2105 == YES
= FULL == 20210202 == 2106 == execnet.0-FULL-20210202-2106 == YES
= FULL == 20210302 == 2106 == execnet.0-FULL-20210302-2106 == YES
= FULL == 20210406 == 2105 == execnet.0-FULL-20210406-2105 == YES
= FULL == 20210106 == 0200 == Prtgadmin.0-FULL-20210106-0200 == YES
= FULL == 20210105 == 2216 == sandbox.0-FULL-20210105-2216 == YES
= FULL == 20210202 == 2227 == sandbox.0-FULL-20210202-2227 == YES
= FULL == 20210406 == 2152 == sandbox.0-FULL-20210406-2152 == YES
= FULL == 20210105 == 2236 == wwwp.0-FULL-20210105-2236 == YES
= FULL == 20210202 == 2249 == wwwp.0-FULL-20210202-2249 == YES
= FULL == 20210105 == 2259 == wwws.0-FULL-20210105-2259 == YES
= FULL == 20210202 == 2314 == wwws.0-FULL-20210202-2314 == YES
= FULL == 20210105 == 2259 == webhost.0-FULL-20210105-2259 == YES
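The output I want is the n-1 oldest matches for each VM (the top n-1), that is, every match except the most recent one.
So far, running the awk command below gets me a result, but it shows the most recent matches instead. I would also prefer a pure awk command.
The year filter is not that important.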
# cat bkp_list.log| grep -E '*2021.*YES'| awk -F[==-] 'cnt[$8]++{if (cnt[$8]>1) print prev=$0;next}' |awk -F[==] '{print $8}'
Thanks.
If you want to filter on the YES in the column, you can do it with a condition expression placed before the block.
$ cat file
= FULL == 20210105 == 2100 == ASR-FULL-20210105-2100 == NO
= FULL == 20210202 == 2100 == ASR-FULL-20210202-2100 == YES
= FULL == 20210302 == 2100 == ASR-FULL-20210302-2100 == YES
= FULL == 20210406 == 2100 == ASR-FULL-20210406-2100 == YES
= FULL == 20210105 == 2146 == DNS10_7-FULL-20210105-2146 == YES
= FULL == 20210202 == 2153 == DNS10_7-FULL-20210202-2153 == YES
= FULL == 20210302 == 2148 == DNS10_7-FULL-20210302-2148 == YES
= FULL == 20210406 == 2122 == DNS10_7-FULL-20210406-2122 == YES
= FULL == 20210105 == 2105 == execnet.0-FULL-20210105-2105 == YES
= FULL == 20210202 == 2106 == execnet.0-FULL-20210202-2106 == YES
= FULL == 20210302 == 2106 == execnet.0-FULL-20210302-2106 == YES
= FULL == 20210406 == 2105 == execnet.0-FULL-20210406-2105 == YES
= FULL == 20210106 == 0200 == Prtgadmin.0-FULL-20210106-0200 == YES
= FULL == 20210105 == 2216 == sandbox.0-FULL-20210105-2216 == YES
= FULL == 20210202 == 2227 == sandbox.0-FULL-20210202-2227 == YES
= FULL == 20210406 == 2152 == sandbox.0-FULL-20210406-2152 == YES
= FULL == 20210105 == 2236 == wwwp.0-FULL-20210105-2236 == YES
= FULL == 20210202 == 2249 == wwwp.0-FULL-20210202-2249 == YES
= FULL == 20210105 == 2259 == wwws.0-FULL-20210105-2259 == YES
= FULL == 20210202 == 2314 == wwws.0-FULL-20210202-2314 == YES
= FULL == 20210105 == 2259 == webhost.0-FULL-20210105-2259 == YES
**Note: I changed the YES on the first line to NO to check that the behavior is correct.
Anyway, if you need any other special filtering, such as checking the year, please specify it.
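The condition-before-the-block idea can be sketched as follows. This is a minimal example, assuming the fields are whitespace-separated as in the sample, so that the retained flag is the last field (`$NF`) and the backup name sits two fields earlier (`$(NF-2)`, which is `$8` here). The function name `retained_names` and the temporary sample file are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the backup name of every retained (YES) row.
# The condition before the block selects rows; the block runs only for them.
retained_names() {
    awk '$NF == "YES" { print $(NF-2) }' "$@"
}

# Small sample in the same layout as the log above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
= FULL == 20210105 == 2100 == ASR-FULL-20210105-2100 == NO
= FULL == 20210202 == 2100 == ASR-FULL-20210202-2100 == YES
= FULL == 20210106 == 0200 == Prtgadmin.0-FULL-20210106-0200 == YES
EOF

retained_names "$sample"
rm -f "$sample"
```

With no file argument the function reads standard input, so it can also sit at the end of a pipeline.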
To print all but the last match of each $8 substring, you can use this awk:
awk '
$NF != "YES" {next}
{
   s = $8
   sub(/-FULL-.*/, "", s)
}
s == ps {
   print pval
}
{
   ps = s
   pval = $8
}' file
ASR-FULL-20210105-2100
ASR-FULL-20210202-2100
ASR-FULL-20210302-2100
DNS10_7-FULL-20210105-2146
DNS10_7-FULL-20210202-2153
DNS10_7-FULL-20210302-2148
execnet.0-FULL-20210105-2105
execnet.0-FULL-20210202-2106
execnet.0-FULL-20210302-2106
sandbox.0-FULL-20210105-2216
sandbox.0-FULL-20210202-2227
wwwp.0-FULL-20210105-2236
wwws.0-FULL-20210105-2259
Or as a one-liner:
awk '$NF != "YES" {next} {s=$8; sub(/-FULL-.*/, "", s)} s == ps {print pval} {ps=s; pval=$8}' file
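To make the previous-row comparison easier to follow, here is a commented sketch of the same logic wrapped in a shell function. It relies on one assumption that the sample data happens to satisfy: all rows of a given VM are adjacent in the file. The function name `all_but_last_grouped` and the three-row sample are invented for this illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the previous-row comparison shown above.
# Assumes all rows of one VM are adjacent in the input.
all_but_last_grouped() {
    awk '
    $NF != "YES" { next }            # skip rows that are not retained
    {
        s = $8                       # e.g. wwwp.0-FULL-20210105-2236
        sub(/-FULL-.*/, "", s)       # keep only the VM name: wwwp.0
    }
    s == ps { print pval }           # same VM as the previous row: that row was not the last
    { ps = s; pval = $8 }            # remember this row for the next comparison
    ' "$@"
}

all_but_last_grouped <<'EOF'
= FULL == 20210105 == 2236 == wwwp.0-FULL-20210105-2236 == YES
= FULL == 20210202 == 2249 == wwwp.0-FULL-20210202-2249 == YES
= FULL == 20210105 == 2259 == webhost.0-FULL-20210105-2259 == YES
EOF
```

A row is printed only once its successor for the same VM has been seen, which is why the newest backup of each VM, and any VM with a single backup, never appears in the output.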
With GNU awk you can use gensub() to derive the key; or, with any awk:
$ tac file | awk '$NF!="YES"{next} {k=$8; sub(/-.*/,"",k)} seen[k]++{print $8}' | tac
ASR-FULL-20210105-2100
ASR-FULL-20210202-2100
ASR-FULL-20210302-2100
DNS10_7-FULL-20210105-2146
DNS10_7-FULL-20210202-2153
DNS10_7-FULL-20210302-2148
execnet.0-FULL-20210105-2105
execnet.0-FULL-20210202-2106
execnet.0-FULL-20210302-2106
sandbox.0-FULL-20210105-2216
sandbox.0-FULL-20210202-2227
wwwp.0-FULL-20210105-2236
wwws.0-FULL-20210105-2259
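If tac is not available, or the rows of one VM are not guaranteed to be adjacent, a single pass that remembers the newest retained name per VM may be enough. This is only a sketch and not taken from the answers: the function name `all_but_last` and the array name `last` are made up, and the key is cut at `-FULL-` rather than at the first hyphen, on the assumption that VM names could themselves contain hyphens.

```shell
#!/bin/sh
# Hypothetical single-pass variant: no tac, input need not be grouped by VM.
all_but_last() {
    awk '
    $NF != "YES" { next }              # only retained rows count
    {
        k = $8
        sub(/-FULL-.*/, "", k)         # VM name is the part before -FULL-
        if (k in last) print last[k]   # a newer row exists, so the stored one is not the last
        last[k] = $8                   # remember the newest row seen for this VM
    }
    ' "$@"
}

all_but_last <<'EOF'
= FULL == 20210105 == 2236 == wwwp.0-FULL-20210105-2236 == YES
= FULL == 20210105 == 2259 == wwws.0-FULL-20210105-2259 == YES
= FULL == 20210202 == 2249 == wwwp.0-FULL-20210202-2249 == YES
= FULL == 20210202 == 2314 == wwws.0-FULL-20210202-2314 == YES
EOF
```

Each name is emitted at the moment its successor is read, so for input that is already grouped by VM the order matches the tac-based command, while for interleaved input the lines come out in the order their successors arrive.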
Dude, why don't you have this line == ASR-FULL-20210105-2100 == YES? I didn't get the logic.
It is extracted from the output of another backup tool command that lists the backups. Sorry, I was away, thank you.
Yes, you're right, I meant to have two consecutive "=" in my filter. I'll use your suggestion as well.
@anubhava What do you mean? It is in the first line: ASR-FULL-20210105-2100 ==
Sorry, I meant the n-1 oldest matches (the top n-1 matches). I have edited the OP. Thanks, everyone, for the help.
No, I never implied that anyone should have to figure it out on their own. It was my mistake in the first version of the post, and a correction has been added.
The year really isn't that important. I think your approach is good, because I no longer need the awk and grep parts of the command in the OP. But I still need to show, for each VM's FULL-* backups, all occurrences except the last one (i.e. the 2 previous ones out of 3). We're almost there: `awk '$NF=="YES"{print $(NF-2)}' file.log` `awk -F[=-] 'cnt[$1]++{if(cnt[$1]>1)print prev=$0;next}'` By the way, the cnt syntax above did not print the top n-1 occurrences, so I'm not there yet.
That looks scary. I'm trying it on one line; I hope it can run as a one-liner.
`awk '!/2021.*YES/{next}{s=$8;sub(/-FULL-.*/,"",s)}s==ps{print pval}{ps=s;pval=$8}' file` is a one-liner, and that's exactly what I wanted. You could add a variant without 2021. I'll mark it as the accepted answer. Thank you very much.
`awk '!/.*YES/{next}{s=$8;sub(/-FULL-.*/,"",s)}s==ps{print pval}{ps=s;pval=$8}'`
Thank you so much, sir!! Awesome, anubhava.
$ awk ' $NF == "NO" { print $(NF-2) }' file
ASR-FULL-20210105-2100
With GNU awk for gensub():
$ tac file | awk '$NF=="YES" && seen[gensub(/-.*/,"",1,$8)]++{print $8}' | tac
ASR-FULL-20210105-2100
ASR-FULL-20210202-2100
ASR-FULL-20210302-2100
DNS10_7-FULL-20210105-2146
DNS10_7-FULL-20210202-2153
DNS10_7-FULL-20210302-2148
execnet.0-FULL-20210105-2105
execnet.0-FULL-20210202-2106
execnet.0-FULL-20210302-2106
sandbox.0-FULL-20210105-2216
sandbox.0-FULL-20210202-2227
wwwp.0-FULL-20210105-2236
wwws.0-FULL-20210105-2259