Bash: extract a string from between multiple brackets

bash awk sed scripting

I have a file with the following content:

ok: [10.9.22.122] => {
    "out.stdout_lines": [
        "cgit-1.1-11.el7.x86_64",
        "python-paramiko-2.1.1-0.9.el7.noarch",
        "varnish-libs-4.0.5-1.el7.x86_64",
        "kernel-3.10.0-862.el7.x86_64"
    ]
}
ok: [10.9.33.123] => {
    "out.stdout_lines": [
        "python-paramiko-2.1.1-0.9.el7.noarch"
    ]
}

ok: [10.9.44.124] => {
    "out.stdout_lines": [
        "python-paramiko-2.1.1-0.9.el7.noarch",
        "kernel-3.10.0-862.el7.x86_64"
    ]
}

ok: [10.9.33.29] => {
    "out.stdout_lines": []
}
ok: [10.9.22.28] => {
    "out.stdout_lines": [
        "NetworkManager-tui-1:1.12.0-8.el7_6.x86_64",
        "java-1.8.0-openjdk-javadoc-zip-debug-1:1.8.0.171-8.b10.el7_5.noarch",
        "java-1.8.0-openjdk-src-1:1.8.0.171-8.b10.el7_5.x86_64",
        "kernel-3.10.0-862.el7.x86_64",
        "kernel-tools-3.10.0-862.el7.x86_64",
    ]
}

ok: [10.2.2.2] => {
    "out.stdout_lines": [
        "monitorix-3.10.1-1.el6.noarch", 
        "singularity-runtime-2.6.1-1.1.el6.x86_64"
    ]
}

ok: [10.9.22.33] => {
    "out.stdout_lines": [
        "NetworkManager-1:1.12.0-8.el7_6.x86_64",
        "gnupg2-2.0.22-5.el7_5.x86_64", 
        "kernel-3.10.0-862.el7.x86_64", 
    ]
}

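I need to extract the IP between the [] whenever its block contains kernel*.

I would like to emulate a substring operation: save each "block" of content into a variable and iterate over the whole file.

Since there are many delimiters, how could I do this with sed or some other tool?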

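Since your data is very well formatted, you can use awk (gawk, because gensub() is a GNU extension):

awk '
    # get the IP address: strip everything but digits and dots from field 2
    /ok:/ {ip = gensub(/[^0-9\.]/, "", "g", $2) }

    # within the stdout_lines block, print the saved IP for every kernel line
    /"out.stdout_lines":/,/\]/ { if (/\<[Kk]ernel\>/) print ip}
' file
#10.9.22.122
#10.9.44.124
#10.9.22.28
#10.9.22.28
#10.9.22.33

Note:

I adjusted the regex to reflect the updated data. Within an out.stdout_lines block there may be several kernel packages for the same IP, which prints that IP several times. If that happens, just pipe the result through | uniq.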
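One caveat worth illustrating: uniq only collapses adjacent duplicates. That is enough for the output above, where the repeats of one IP come out back to back, but if the same host showed up in several separate blocks, an order-preserving filter such as awk '!seen[$0]++' may be the safer choice. A small sketch; the IP list is made up to show the difference and is not the output of the command above:

```shell
#!/bin/bash
# Hypothetical IP list: 10.9.22.28 is repeated back to back,
# and 10.9.22.122 shows up again later (non-adjacent).
ips=$(printf '%s\n' 10.9.22.122 10.9.22.28 10.9.22.28 10.9.22.33 10.9.22.122)

# uniq collapses only the adjacent pair; the late 10.9.22.122 survives.
adjacent_only=$(printf '%s\n' "$ips" | uniq)

# Keep the first occurrence of every line, preserving input order.
first_seen=$(printf '%s\n' "$ips" | awk '!seen[$0]++')

printf 'uniq:\n%s\n\nawk:\n%s\n' "$adjacent_only" "$first_seen"
```

If the order of the IPs does not matter, sort -u would do the same job.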

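Quick solution:

#!/bin/bash

AWK='
    # remember the IP from the "ok: [IP] => {" line (leading whitespace allowed)
    /^[[:space:]]*ok:/ { gsub(/^.*\[/,""); gsub(/].*$/,""); ip=$0 }
    # print each IP once, at its first kernel package; the original pattern
    # /"Kernel-default/ matches none of the package names in the sample above
    /"[Kk]ernel/ { if (ip) print ip; ip="" }
'
awk "$AWK" INPUT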

GNU awk solution:

awk -F'\\]|\\[' 'tolower($3)~/"out.stdout_lines" *:/ && tolower($4)~/"kernel/{print "The IP " $2 " contains Kernel"}' RS='}' file

Output:

The IP 10.9.22.122 contains Kernel
The IP 10.9.44.124 contains Kernel
The IP 10.9.22.28 contains Kernel
The IP 10.9.22.33 contains Kernel

I use ] or [ as FS (the field separator) and } as RS (the record separator), so the IP becomes $2. This solution depends on the structure: out.stdout_lines has to be in the field right after [ip], as it is in your sample.
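To see why the IP ends up in $2, it can help to dump the fields of each record. The following is only a sketch: the two-host sample and the temporary file are made up for illustration, and the printf layout is not part of the answer.

```shell
#!/bin/bash
# Made-up sample in the same shape as the file from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ok: [10.0.0.1] => {
    "out.stdout_lines": [
        "kernel-3.10.0-862.el7.x86_64"
    ]
}
ok: [10.0.0.2] => {
    "out.stdout_lines": []
}
EOF

# Split records on "}" and fields on "[" or "]", then show fields 2 to 4
# of every record that has at least four fields.
fields=$(awk -F'\\]|\\[' -v RS='}' '
    NF >= 4 {
        gsub(/[[:space:]]+/, " ")   # flatten newlines so each record fits on one line
        printf "ip=<%s> key=<%s> list=<%s>\n", $2, $3, $4
    }' "$sample")
printf '%s\n' "$fields"
rm -f "$sample"
```

Each printed line shows the IP in the second field, the text up to the opening bracket of the list in the third, and the package list itself in the fourth, which is what the two conditions in the one-liner test.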

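Another GNU awk way, without the above limitation:

awk -F']' 'match(tolower($0),/"out\.stdout_lines": *\[([^\]]+)/,m){if(m[1]~/"kernel/)print "The IP " substr($1, index($1,"[")+1) " contains Kernel"}' RS='}' file

The output lists the same four IPs. The tolower() calls make the match case-insensitive; if you want an exact match, remove them or use the solution from another answer.

A third way, combining the advantages of the two above:

If case-insensitive matching is not needed, change tolower($0) to $0.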

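Could you please try the following; I believe it should work in most awks. I added [kK] to the condition so that it looks for either kernel or Kernel: the OP's previous sample had a capital K and the current one has a lowercase k, so both are covered here.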

awk '
/ok/{
   gsub(/.*\[|\].*/,"")
   ip=$0
}
/stdout_line/{
   found=1
   next
}
found && /[kK]ernel/{
   print ip
}
/}/{
   ip=found=""
}
'  Input_file

Explanation: the same code with comments added.

awk '                       ## Start of the awk program.
/ok/{                       ## If the line contains the string ok, do the following.
   gsub(/.*\[|\].*/,"")     ## Delete everything up to and including [ and everything from ] onward.
   ip=$0                    ## Save the edited line (now just the IP) in the variable ip.
}                           ## End of the block for the ok check.
/stdout_line/{              ## If the line contains stdout_line, do the following.
   found=1                  ## Set the flag found to 1.
   next                     ## Skip the remaining rules for this line.
}                           ## End of the block for the stdout_line check.
found && /[kK]ernel/{       ## If found is set and the line contains kernel or Kernel, do the following.
   print ip                 ## Print the saved IP.
}                           ## End of the block for the kernel check.
/}/{                        ## If the line contains }, do the following.
   ip=found=""              ## Reset ip and found.
}                           ## End of the block for the } check.
'   Input_file              ## Name of the input file.

Output:

10.9.22.122
10.9.44.124
10.9.22.28
10.9.22.28
10.9.22.33
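The question also asks to keep the matching IPs in a variable and loop over them. A minimal sketch of that, assuming Bash 4 or later (for mapfile) and a made-up sample file; clearing found after the first hit is an addition of this sketch, not part of the answer above, so that each block yields its IP at most once:

```shell
#!/bin/bash
# Made-up sample file; the contents are illustrative only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ok: [10.0.0.1] => {
    "out.stdout_lines": [
        "kernel-3.10.0-862.el7.x86_64",
        "kernel-tools-3.10.0-862.el7.x86_64"
    ]
}
ok: [10.0.0.2] => {
    "out.stdout_lines": [
        "gnupg2-2.0.22-5.el7_5.x86_64"
    ]
}
ok: [10.0.0.3] => {
    "out.stdout_lines": [
        "Kernel-default-4.4.0"
    ]
}
EOF

# Same rules as the answer, except that found is cleared after the first
# hit so that a block with several kernel packages prints its IP only once.
mapfile -t kernel_ips < <(awk '
    /ok/                 { gsub(/.*\[|\].*/, ""); ip = $0 }
    /stdout_line/        { found = 1; next }
    found && /[kK]ernel/ { print ip; found = 0 }
    /}/                  { ip = found = "" }
' "$sample")
rm -f "$sample"

# Iterate over the saved IPs.
for ip in "${kernel_ips[@]}"; do
    printf 'host with a kernel package: %s\n' "$ip"
done
```

Reading the awk output into an array keeps one IP per element, so the later loop does not depend on word splitting.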

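This might work for you (GNU sed):

sed -n '/ok:/{s/[^0-9.]//g;:a;N;/]/!ba;/stdout_line.*kernel/P}' file

The -n option suppresses implicit printing.

If a line contains the string ok: (that is, it holds an IP address), everything on that line except digits and periods is removed.

Further lines are then appended until one containing ] is encountered; if the pattern space contains both stdout_line and kernel, the first line is printed.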
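The command above matches only a lowercase kernel. If a capitalised Kernel should count as well, GNU sed accepts an I flag after an address regex to make it case-insensitive. The sketch below assumes GNU sed and uses a made-up sample written to a temporary file:

```shell
#!/bin/bash
# Made-up sample; the first host lists a capitalised "Kernel" package.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ok: [10.0.0.1] => {
    "out.stdout_lines": [
        "Kernel-default-4.4.0"
    ]
}
ok: [10.0.0.2] => {
    "out.stdout_lines": []
}
ok: [10.0.0.3] => {
    "out.stdout_lines": [
        "kernel-3.10.0-862.el7.x86_64"
    ]
}
EOF

# Same script as above, except for the I flag between the closing slash
# of the last address and the P command (GNU extension).
sed_ips=$(sed -n '/ok:/{s/[^0-9.]//g;:a;N;/]/!ba;/stdout_line.*kernel/IP}' "$sample")
printf '%s\n' "$sed_ips"
rm -f "$sample"
```

The host with the empty list is skipped because its pattern space never contains kernel in any spelling.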

Using Perl:

$ perl -0777 -ne 's!\[(\S+)\].+?\{(.+?)\}!$y=$1;$x=$2;$x=~/kernel/ ? print "$y\n":""!sge'  brenn.log
10.9.22.122
10.9.44.124
10.9.22.28
10.9.22.33

So the output would be: The IP 10.9.44.124 contains Kernel, and 10.9.22.122 should not be printed?

Yes, sorry. I need to save every IP whose block contains Kernel.

This solution repeats the IPs that contain kernel many times. Example: 10.9.22.123 is printed 35 times.

For the given sample it works. So do you have the same IP many times and want the duplicates removed? Please confirm. Trust me, this is one of the simplest solutions offered here, and it is portable too. I don't see any other answer that handles duplicates, but if you can confirm the requirement, I can enhance this one.

I tried the new solution and it works well. I tried revision 6 and I prefer it because it is the fastest... Thanks!