In Linux, how to check whether two different patterns occur on consecutive lines


I have an ASCII text file that I am validating. The file contains two types of context:

Necessary Context: One which should be present at least once in its exact position.
Optional Context: One which may or may not be present, but if present should hold its proper place.
Detailed layout of the file:

[INDEX] <-- optional context, but if present should be the first context immediately followed by [FEATURE_ID], else file invalid
[FEATURE_ID] <-- necessary context and should always be immediately followed by [FEATURE_REV], else file is invalid. If [INDEX] context there then this should be the second CONTEXT in file else first.
[FEATURE_REV] <-- necessary context (must exist one per FEATURE_ID) and should always be immediately after [FEATURE_ID], else file is invalid.
[PRL_ID] <-- optional context, but if present should always be immediately after [FEATURE_REV], else file invalid
[NO_OF_BYTES] <--optional context, but if present, should always be immediately after [PRL_ID] if it is present, else immediately after [FEATURE_REV] if [PRL_ID] not present. Otherwise file invalid.
[NO_OF_SIGNIF_BITS] <-- optional context, but if present should always be between [NO_OF_BYTES] ( can be only present if [NO_OF_BYTES] present else not) and [CRC], else file invalid
[CRC] <-- necessary context,(must exist one per FEATURE_ID and FEATURE_REV). This is always the last context.
I tried to implement this in a Linux script with multiple if-else statements and egrep calls, but the code keeps getting more complicated.

The code I have in mind is something like:

f_id_c=`egrep "[ ]*\[FEATURE_ID=[0-9].*\][ ]*" $1 | wc -l`
f_rev_c=`egrep "[ ]*\[FEATURE_REV=[0-9].*\][ ]*" $1 | wc -l`
crc_c=`egrep "[ ]*\[CRC\][ ]*" $1 | wc -l`
[[ $((f_id_c)) -eq 0 ]] && { echo "Invalid! No [FEATURE_ID=] context defined in profile file !"; exit 1; }
[[ $((f_rev_c)) -ne $((f_id_c)) ]] && { echo "Invalid! Not all [FEATURE_REV=] contexts have leading [FEATURE_ID=] defined"; exit 1; }
[[ $((crc_c)) -ne $((f_id_c)) ]] && { echo "Invalid! Not all [CRC] contexts have leading [FEATURE_ID=] defined"; exit 1; }
for ((i=0; i<f_id_c; i++))
  do
    # Have a check with SED that will confirm there is a [FEATURE_REV=] immediately following [FEATURE_ID=]
    :  # placeholder so the loop body is not empty
  done
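The placeholder check in the loop above (a [FEATURE_REV=] line immediately after every [FEATURE_ID=] line) can be done in a single pass without a loop. The following is only a minimal sketch: the function name check_rev_follows_id and the variable names are made up for illustration, and the patterns are copied from the egrep calls above.

```shell
# Hypothetical helper: succeeds only if every [FEATURE_ID=...] line is
# immediately followed by a [FEATURE_REV=...] line in the file given as $1.
check_rev_follows_id() {
    awk '
        # the previous line was a FEATURE_ID, so this line must be a FEATURE_REV
        expect_rev && !/\[FEATURE_REV=[0-9].*\]/ {
            print "line " NR ": expected [FEATURE_REV=] after [FEATURE_ID=]" | "cat >&2"
            bad = 1
            exit
        }
        # remember whether the current line is a FEATURE_ID
        { expect_rev = /\[FEATURE_ID=[0-9].*\]/ }
        # a FEATURE_ID on the very last line has nothing following it
        END { exit (bad || expect_rev) }
    ' "$1"
}
```

It could be called as check_rev_follows_id "$1" || exit 1. This covers only the one adjacency rule; the full ordering of all contexts needs more state than a single flag.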

You will need an FSM (finite state machine) something like the tst.awk script shown after these example files:

Validfile_1:

[FEATURE_ID]
[FEATURE_REV]
[CRC]

[INDEX]
[FEATURE_ID]
[FEATURE_REV]
[CRC]

Validfile_2:

[FEATURE_ID]
[FEATURE_REV]
[NO_OF_BYTES]
[CRC]

[INDEX]
[FEATURE_ID]
[FEATURE_REV]
[PRL_ID]
[NO_OF_BYTES]
[NO_OF_SIGNIF_BITS]
[CRC]

Validfile_3:

[FEATURE_ID]
[FEATURE_REV]
[CRC]

Invalidfile_1 (order of contexts not ok):

[FEATURE_ID]
[INDEX]
[FEATURE_REV]
[NO_OF_BYTES]
[CRC]
[PRL_ID]

Invalidfile_2(FEATURE_REV or CRC can never exist without a FEATURE_ID):

[FEATURE_REV]
[NO_OF_BYTES]
[CRC]

Invalidfile_3 (NO_OF_SIGNIF_BITS cannot exist without NO_OF_BYTES):

[FEATURE_ID]
[FEATURE_REV]
[NO_OF_SIGNIF_BITS]
[CRC]
$ cat tst.awk
BEGIN {
    # define the allowed state transitions
    ns["IDLE","INDEX"]
    ns["IDLE","FEATURE_ID"]
    ns["INDEX","FEATURE_ID"]
    ns["FEATURE_ID","FEATURE_REV"]
    ns["FEATURE_REV","PRL_ID"]
    ns["FEATURE_REV","NO_OF_BYTES"]
    ns["FEATURE_REV","CRC"]
    ns["PRL_ID","NO_OF_BYTES"]
    ns["PRL_ID","CRC"]
    ns["NO_OF_BYTES","NO_OF_SIGNIF_BITS"]
    ns["NO_OF_BYTES","CRC"]
    ns["NO_OF_SIGNIF_BITS","CRC"]
    ns["CRC","INDEX"]
    ns["CRC","FEATURE_ID"]

    # create a regexp of the state names for use in match()
    for (state in ns) {
        sub(SUBSEP".*","",state)
        if (!seen[state]++) {
            states = states (states ? "|" : "") state
        }
    }

    # set the initial state
    state = "IDLE"
}

# parse the input
match($0,states) {
    nextState = substr($0,RSTART,RLENGTH)
    if ( ! ((state,nextState) in ns) ) {
        print "ERROR", NR, state, nextState, $0 | "cat>&2"
        exit 1
    }
    state = nextState
}
When run against the posted sample input file:

$ cat file
....
[FEATURE_ID]
[FEATURE_REV]
...
...
[CRC]

[INDEX]
[FEATURE_ID]
[FEATURE_REV]
...
...
...
[CRC]
$
$ awk -f tst.awk file
$

It produces no output because the sample you provided does not contain any of the errors you want to find.

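Please post your code.

@Harry I have posted part of the code. Can you tell me a way to check with sed or awk that, if a line matches a pattern, the line immediately after it must match some other pattern, and otherwise return false?

@EdMorton Sorry, the explanation may be rather lengthy, but I don't see where it becomes ambiguous. I think the requirements are very clear. Anyway, I have tried to revise the post further and clarified where [CRC] can be placed. Please see whether you can help. Yes, the whole point of posting is to get a precise awk script for it.

When posting sample input, it is important to make it something a proposed solution can be tested against. What you posted currently does not contain any of the errors you want the script to catch, so we could write a script that does nothing but printf ""; exit and it would produce the expected output given that input. Put some effort into creating an input file that contains some good records but also the types of errors you want the tool to catch, and then show the expected output for that input file. Otherwise we cannot test potential solutions.

@EdMorton No surprises. Tested with multiple bad files. I have to change some of the allowed state transitions slightly, because I just recalled that [CRC] can only appear when [NO_OF_BYTES] is present. But I can change that in your AWK script. For the question as I posted it, it works fine, so many thanks.

Hi Ed, revisiting this comment. I found that the script above does not flag this file content: [FEATURE_ID=4][FEATURE_REV=2]. Basically this should be an invalid file, because [CRC] is absent and it is a necessary context. How can this be solved? Even a file containing only [FEATURE_ID=4] does not show any error. My requirement is that it should, because the other necessary contexts, [FEATURE_REV=] and [CRC], are missing.

Yes, that is the problem with creating sample input/output in your question that does not adequately reflect your real data: you get a script that works for the sample you posted but not for your real data. As I always tell people, a script that produces the expected output given a specific set of sample input is the starting point for identifying a solution. If you update this question now, no one will notice, so post a follow-up question with some truly representative sample input/output (really think about the error cases) and someone will help you.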
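One possible way to close the gap raised in the last comments (a file that stops before [CRC] passes silently) is to check the final state in an END block. The sketch below is not from the original answer: it wraps a condensed copy of the same transition table in a hypothetical shell function, validate_profile, and it assumes that CRC is the only state a valid file may end in, so an empty file is also rejected. The rc variable and the "EOF" marker in the error message are likewise made up for illustration.

```shell
# Hypothetical wrapper: same FSM as tst.awk plus an end-of-file check.
validate_profile() {
    awk '
    BEGIN {
        # allowed state transitions, as in tst.awk
        ns["IDLE","INDEX"];          ns["IDLE","FEATURE_ID"]
        ns["INDEX","FEATURE_ID"];    ns["FEATURE_ID","FEATURE_REV"]
        ns["FEATURE_REV","PRL_ID"];  ns["FEATURE_REV","NO_OF_BYTES"]
        ns["FEATURE_REV","CRC"];     ns["PRL_ID","NO_OF_BYTES"]
        ns["PRL_ID","CRC"];          ns["NO_OF_BYTES","NO_OF_SIGNIF_BITS"]
        ns["NO_OF_BYTES","CRC"];     ns["NO_OF_SIGNIF_BITS","CRC"]
        ns["CRC","INDEX"];           ns["CRC","FEATURE_ID"]

        # build a regexp of the state names for use in match()
        for (state in ns) {
            sub(SUBSEP".*", "", state)
            if (!seen[state]++) {
                states = states (states ? "|" : "") state
            }
        }
        state = "IDLE"
    }

    match($0, states) {
        nextState = substr($0, RSTART, RLENGTH)
        if ( !((state,nextState) in ns) ) {
            print "ERROR", NR, state, nextState, $0 | "cat>&2"
            rc = 1
            exit            # END still runs after exit
        }
        state = nextState
    }

    END {
        # assumption: the last context seen must be CRC
        if ( !rc && state != "CRC" ) {
            print "ERROR", NR, state, "EOF" | "cat>&2"
            rc = 1
        }
        exit rc+0
    }
    ' "$1"
}
```

With this, a file containing only [FEATURE_ID=4] ends in state FEATURE_ID and is reported as an error. Note that match() picks up only the first context on each line, so contexts are assumed to sit on separate lines.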