Awk: grep the IPs and ports in a file

Tags: awk, sed, grep

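I tried the following, but it does not work (among other problems, the second pattern also matches text that belongs to the first):

egrep -oP '([0-9]{1,3}\.){3}[0-9]{1,3} | [0-9]{2,5}' file.txt

The two patterns only work separately:

egrep -oP '([0-9]{1,3}\.){3}[0-9]{1,3}' file.txt
grep -oP "'[0-9]{2,5}'" file.txt

-> The second command works, but I cannot get rid of the ' at the start and end of each match; if I remove the quotes from the pattern, it also matches parts of the IPs, which I do not want.

I also tried:

sed 's/\\document\.write\(\'//g' file.txt)\'//g'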

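The idea here is to trim away all the junk before and after each IP and port.

Desired result:

ip0 port0 (I will store the results in an array that is used later for SSH connections)
ip1 port1
ip2 port2
…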

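You can try the following:

sed "s/.*<td><script> *document\.write('//" file.txt | sed "s/')<\/script><\/td>.*//"

However, note that this is generally very error-prone. If the order of your IP and port lines is ever swapped somewhere in the sequence, everything breaks.

In general, for parsing HTML files you can use other, better-suited languages such as Python and …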
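The fragility mentioned above can at least be made visible instead of silently producing wrong pairs. The following is only a sketch under assumptions: the helper name `print_pairs` and the temporary sample file are made up for illustration, and the checks are deliberately loose (for example, they do not verify that each octet is at most 255).

```shell
#!/bin/sh
# Sketch: pair up ip/port lines, but report pairs that do not look right.
# print_pairs is a hypothetical helper name, not part of the original answer.
print_pairs() {
    # Keep only the text between write(' and '), one value per line.
    sed -n "s/.*write('//; s/').*//p" "$1" |
    awk '
        NR % 2 == 1 { ip = $0; next }   # odd line: candidate ip
        ip ~ /^[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+$/ && $0 ~ /^[0-9]+$/ { print ip, $0; next }
        { printf "value %d: unexpected pair \"%s\" / \"%s\"\n", NR, ip, $0 > "/dev/stderr" }
    '
}

# Sample input in which the second pair is deliberately swapped.
sample=$(mktemp)
cat > "$sample" <<'EOF'
        <td><script>                            document.write('89.223.92.30')</script></td>
        <td><script>                            document.write('9027')</script></td>
        <td><script>                            document.write('1081')</script></td>
        <td><script>                            document.write('185.204.3.105')</script></td>
EOF

good_pairs=$(print_pairs "$sample" 2>/dev/null)
echo "$good_pairs"
rm -f "$sample"
```

Only the well-formed pair reaches standard output; the swapped pair is reported on standard error, which is discarded in this example.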

A simpler version, without escaping the single quotes:

$ cat ipport.txt  | sed 's/.*write('"'"'//g' | sed 's/'"'"').*//g' | while read -r ip && read -r port; do echo "$ip $port"; done
89.223.92.30 9027
185.204.3.105 1081
91.238.137.108 8975
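As for the quote problem described in the question (the match either includes the surrounding `'` or also hits parts of the IPs), one possible way around it is to use lookarounds, so that the quotes are required but not part of the match. This is a sketch that assumes GNU grep built with PCRE support (the `-P` option); the temporary sample file is made up for illustration.

```shell
#!/bin/sh
# Sketch, assuming GNU grep with -P (PCRE) support.
sample=$(mktemp)
cat > "$sample" <<'EOF'
        <td><script>                            document.write('89.223.92.30')</script></td>
        <td><script>                            document.write('9027')</script></td>
        <td><script>                            document.write('185.204.3.105')</script></td>
        <td><script>                            document.write('1081')</script></td>
EOF

# (?<=write\(')  start matching only right after  write('
# [0-9.]+        digits and dots, which covers both an IP and a port
# (?=')          the closing quote must follow, but is not part of the match
# paste - -      join every two output lines into one "ip port" line
pairs=$(grep -oP "(?<=write\(')[0-9.]+(?=')" "$sample" | paste -d' ' - -)
echo "$pairs"
rm -f "$sample"
```

Like the sed pipeline, this relies on every IP line being followed directly by its port line.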

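The sed pipeline again, this time with the sed expressions in double quotes, so the single quotes need no special handling:

cat ipport.txt | sed "s/.*write('//g" | sed "s/').*//g" | while read -r ip && read -r port; do echo "$ip $port"; done

Assumptions:

- we are only interested in lines that contain `document.write` (i.e., we do not know what the other lines in the file look like, but we can safely ignore them)
- each ip/port pair sits on consecutive `document.write` lines in the file
- each `ip` value is a valid IPv4 address
- we do not have to worry about any other kind of data on the `document.write` lines; the `ip` and `port` values sit between the first pair of single quotes (`'`)

Our sample data file (ip.dat):

        <td><script>                            document.write('89.223.92.30')</script></td>
        <td><script>                            document.write('9027')</script></td>
        <td><script>                            document.write('185.204.3.105')</script></td>
        <td><script>                            document.write('1081')</script></td>
        <td><script>                            document.write('91.238.137.108')</script></td>
        <td><script>                            document.write('8975')</script></td>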

An awk script for this data:

$ awk -F"'" '
/document.write/ && $2  ~ /[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+/ { ip=$2 ; next }
/document.write/ && $2 !~ /[.]/                               { print ip,$2  }
' ip.dat

Where:

- `-F"'"` - use the single quote (`'`) as the field separator
- `/document.write/` - we are only interested in lines containing the string `document.write`; all other lines are ignored
- `$2 ~ /[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+/` - if the second field is a 4-tuple of numbers separated by periods (`.`), we save `$2` as the current `ip` value
- `next` - once we have an `ip` value, skip to the next line of the input file
- `$2 !~ /[.]/` - if the second field does not contain a period, we treat it as the port number
- `print ip,$2` - print our `ip` and port values to standard output

Running the above awk script against our data file (ip.dat) generates:

89.223.92.30 9027
185.204.3.105 1081
91.238.137.108 8975

Comments on the sed answer:

@favoretti: can you explain this: '"'"' ?
@achille: got rid of the '"'"', since I am just used to quoting sed expressions with single quotes. Added a simpler version…
@achille: if you are getting an error message, are you by any chance on a Mac? If so, brew install gnu-sed and replace sed with gsed in both places.

Try this awk script:

awk -F "(^[^']*')|('[^']*$)" 'NR%2 {v = $2; next;}{print v OFS $2}' input.txt

Input file (input.txt):

        <td><script>                            document.write('89.223.92.30')</script></td>
        <td><script>                            document.write('9027')</script></td>
        <td><script>                            document.write('185.204.3.105')</script></td>
        <td><script>                            document.write('1081')</script></td>
        <td><script>                            document.write('91.238.137.108')</script></td>
        <td><script>                            document.write('8975')</script></td>

Output:

89.223.92.30 9027
185.204.3.105 1081
91.238.137.108 8975

Explanation:

BEGIN { # pre-processing block, runs once before any input is read
    FS = "(^[^']*')|('[^']*$)"; # set the field separator to the text outside the single quotes
    # FS is the internal variable equivalent to the awk argument -F
}
NR % 2 == 1 { # for each odd input line
    v = $2; # save the 2nd field in variable v
    next; # skip to the next line (an even input line)
}
{ # for each even input line
    print v OFS $2; # print the saved variable v, followed by OFS and the current 2nd field
}
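
The odd/even logic above can also be kept in a file and run with `awk -f`, and its output consumed one pair at a time. A minimal sketch, with several assumptions: the file names `pairs.awk`, `input.txt` and `pairs.txt` are made up for illustration, and the field separator is a plain single quote (as in the other awk answer) rather than the anchored regex, because what `^` means inside `FS` is not well specified across awk implementations. The loop only prints the ssh command it would run.

```shell
#!/bin/sh
# Sketch: run the odd/even pairing from a file and consume the result line by line.
# All file names here are assumptions for illustration.
workdir=$(mktemp -d)

# Same odd/even logic as above, but splitting on a plain single quote.
cat > "$workdir/pairs.awk" <<'EOF'
BEGIN { FS = "'" }              # field 2 is the text between the first two quotes
NR % 2 == 1 { v = $2; next }    # odd line: remember the ip
{ print v OFS $2 }              # even line: print "ip port"
EOF

cat > "$workdir/input.txt" <<'EOF'
        <td><script>                            document.write('89.223.92.30')</script></td>
        <td><script>                            document.write('9027')</script></td>
        <td><script>                            document.write('185.204.3.105')</script></td>
        <td><script>                            document.write('1081')</script></td>
EOF

awk -f "$workdir/pairs.awk" "$workdir/input.txt" > "$workdir/pairs.txt"
pairs=$(cat "$workdir/pairs.txt")

# read splits each "ip port" line on whitespace into two variables.
count=0
while read -r ip port; do
    count=$((count + 1))
    echo "would run: ssh -p $port $ip"
done < "$workdir/pairs.txt"

rm -rf "$workdir"
```

In a shell with array support such as bash, the body of the loop could append `$ip` and `$port` to arrays instead of printing, which is closer to what the question describes.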