Text filtering with a shell script


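I am finding it difficult to filter some text with a shell script. Basically, I log in to several network devices and look up their directly connected neighbors. I then export the results to a .txt file that looks like this: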

    Host IP: 175.334.2.43

-------------------------
Device ID: first_device
Entry address(es):
  IP address: 323.43.5.32
Platform: cisco 428,  Capabilities: Router Switch IGMP
Interface: GigabitEthernet0/3,  Port ID (outgoing port): GigabitEthernet0/10
Holdtime : 130 sec


advertisement version: 2
Protocol Hello:  OUI=0x0fsdfs0C, Protocol ID=0x0fdf2; payload len=27, value=0dsgfjhb2CAE00FF0000
VTP Management Domain: ''
Native VLAN: 453
Duplex: full
Management address(es):
  IP address: 323.43.5.32

-------------------------
Device ID: second_device
Entry address(es):
  IP address: 323.43.5.398
Platform: cisco 428,  Capabilities: Router Switch IGMP
Interface: GigabitEthernet0/5,  Port ID (outgoing port): GigabitEthernet0/123
Holdtime : 130 sec


advertisement version: 2
Protocol Hello:  OUI=0x0fsdfs0C, Protocol ID=0x0fdf2; payload len=27, value=0dsgfjhb2CAE00FF0000
VTP Management Domain: ''
Native VLAN: 453
Duplex: full
Management address(es):
  IP address: 323.43.5.398

Host IP: 342.52.5.2

-------------------------
Device ID: third_device
Entry address(es):
  IP address: 32.43.15.32
Platform: cisco 428,  Capabilities: Router Switch IGMP
Interface: GigabitEthernet0/98,  Port ID (outgoing port): GigabitEthernet0/165
Holdtime : 130 sec


advertisement version: 2
Protocol Hello:  OUI=0x0fsdfs0C, Protocol ID=0x0fdf2; payload len=27, value=0dsgfjhb2CAE00FF0000
VTP Management Domain: ''
Native VLAN: 453
Duplex: full
Management address(es):
  IP address: 32.43.15.32

-------------------------
Device ID: fourth_device
Entry address(es):
  IP address: 0832.54.254.6
Platform: cisco 428,  Capabilities: Router Switch IGMP
Interface: GigabitEthernet0/543,  Port ID (outgoing port): GigabitEthernet0/16
Holdtime : 130 sec


advertisement version: 2
Protocol Hello:  OUI=0x0fsdfs0C, Protocol ID=0x0fdf2; payload len=27, value=0dsgfjhb2CAE00FF0000
VTP Management Domain: ''
Native VLAN: 453
Duplex: full
Management address(es):
  IP address: 0832.54.254.6
I want to filter this file and organize it into columns. I do this with the script filter_res.sh:

#!/bin/bash
sed -e '/Management address(es):/{N;d;}' results.txt >results2.txt
grep "Host IP:" results2.txt | awk  '{print $3}' >host_ip.txt
grep "Device ID:.*" results2.txt | awk '{print $3 ","}' >dev_ids.txt
grep "IP address: " results2.txt | awk '{print $3 ","}' >cpe_ip.txt
grep "Platform: " results2.txt | awk '{print $2 $3}' >chassis.txt
grep "Interface:" results2.txt >interfaces.txt
awk '{print $7}' interfaces.txt >cpe_int.txt
awk '{print $2}' interfaces.txt >agg_int.txt
pr -mts' ' dev_ids.txt cpe_ip.txt chassis.txt agg_int.txt cpe_int.txt >final_results.txt
final_results.txt is fine, except that I want to add one last column at the end holding the host IP for every row. This is the result I get:

first_device, 323.43.5.32, cisco428, GigabitEthernet0/3, GigabitEthernet0/10
second_device, 323.43.5.398, cisco428, GigabitEthernet0/5, GigabitEthernet0/123
third_device, 32.43.15.32, cisco428, GigabitEthernet0/98, GigabitEthernet0/165
fourth_device, 0832.54.254.6, cisco428, GigabitEthernet0/543, GigabitEthernet0/16
What I want is:

first_device, 323.43.5.32, cisco428, GigabitEthernet0/3, GigabitEthernet0/10, 175.334.2.43
second_device, 323.43.5.398, cisco428, GigabitEthernet0/5, GigabitEthernet0/123, 175.334.2.43
third_device, 32.43.15.32, cisco428, GigabitEthernet0/98, GigabitEthernet0/165, 342.52.5.2
fourth_device, 0832.54.254.6, cisco428, GigabitEthernet0/543, GigabitEthernet0/16, 342.52.5.2

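You don't need all those intermediate steps; they can be combined into a single awk script. Here is a hacky way to do it. It is not recommended for long-term use, but perhaps you can use it as a starting point: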

$ awk -v RS="[-]+\n" -v c=',' '
              NR>1{print $3 c,$8 c,$10$11,$17,$22 c,hip} 
        /Host IP:/{hip=$NF}' file


first_device, 323.43.5.32, cisco428, GigabitEthernet0/3, GigabitEthernet0/10, 175.334.2.43
second_device, 323.43.5.398, cisco428, GigabitEthernet0/5, GigabitEthernet0/123, 175.334.2.43
third_device, 32.43.15.32, cisco428, GigabitEthernet0/98, GigabitEthernet0/165, 342.52.5.2
fourth_device, 0832.54.254.6, cisco428, GigabitEthernet0/543, GigabitEthernet0/16, 342.52.5.2

P.S. This requires gawk because of the multi-character RS specification.
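
If gawk is not available, the same table can probably be produced line by line with any POSIX awk. The following is only a sketch, assuming every neighbor block lists its fields in the order shown in the question (Device ID, entry IP address, Platform, Interface). The function name `neighbors_to_csv` is made up for illustration and is not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): turn the neighbor dump into
# comma-separated rows, appending the Host IP that each device was found under.
neighbors_to_csv() {
    awk '
        /^ *Host IP:/    { hip = $NF; next }             # remember the current host
        /^Device ID:/    { dev = $3; ip = ""; next }     # a new neighbor block starts
        /^ *IP address:/ { if (ip == "") ip = $3; next } # keep only the first (entry) address
        /^Platform:/     { plat = $2 $3; sub(/,$/, "", plat); next }
        /^Interface:/    {
            agg = $2; sub(/,$/, "", agg)                 # local interface, trailing comma removed
            print dev ", " ip ", " plat ", " agg ", " $NF ", " hip
        }
    ' "$@"
}

# Demo on a shortened copy of the sample from the question.
neighbors_to_csv <<'EOF'
Host IP: 175.334.2.43

-------------------------
Device ID: first_device
Entry address(es):
  IP address: 323.43.5.32
Platform: cisco 428,  Capabilities: Router Switch IGMP
Interface: GigabitEthernet0/3,  Port ID (outgoing port): GigabitEthernet0/10
Management address(es):
  IP address: 323.43.5.32

Host IP: 342.52.5.2

-------------------------
Device ID: third_device
Entry address(es):
  IP address: 32.43.15.32
Platform: cisco 428,  Capabilities: Router Switch IGMP
Interface: GigabitEthernet0/98,  Port ID (outgoing port): GigabitEthernet0/165
EOF
```

With the full file it would be called as `neighbors_to_csv results.txt > final_results.txt`. Matching on the line labels rather than on field positions within a whole record means extra lines inside a block should not shift the columns, at the cost of relying on the label order stated above.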

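What happens when you add host_ip.txt to the list of column files for the pr command?
If I add it, the host IPs do not end up in the right place; each one should be repeated for every connected device.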
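
The mismatch comes from line counts: `pr -m` pairs its input files line by line, and host_ip.txt holds one line per host while the other files hold one line per neighbor. One possible fix, sketched here, is to write the most recently seen host IP once for each `Device ID:` line. The function name `host_ip_per_device` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emit the current Host IP once for every Device ID line,
# so the output has as many lines as dev_ids.txt.
host_ip_per_device() {
    awk '
        /Host IP:/   { hip = $NF }   # remember the host being described
        /Device ID:/ { print hip }   # repeat it for each of its neighbors
    ' "$@"
}

# Demo with two hosts: the first has two neighbors, the second has one.
printf '%s\n' \
    'Host IP: 175.334.2.43' 'Device ID: first_device' 'Device ID: second_device' \
    'Host IP: 342.52.5.2'   'Device ID: third_device' |
    host_ip_per_device
```

In filter_res.sh this could replace the grep that builds host_ip.txt, for example `host_ip_per_device results2.txt > host_ip.txt`, after which host_ip.txt can be appended to the `pr` file list. The cpe_int.txt column would then also need a trailing comma to match the desired output.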
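"I am finding it difficult to filter some text with a shell script." - That is exactly as it should be, because shell scripts are not meant for this. Shell is for manipulating files and processes and for sequencing calls to tools. Awk is for manipulating text, so on UNIX, if you need to filter (or otherwise manipulate) text, the shell simply calls awk.
Thank you very much! This is exactly what I wanted to do!
If it solved your problem, please accept the answer.
There is no need to put - inside a bracket expression; it has no special meaning in a regexp unless it is inside a bracket expression. You should also mention that this is gawk-specific because of the multi-character RS.
@Ed Morton, yes, I know, but -+ looks funny. [-] is a valid regexp because it does not specify a range. Added the gawk-only note.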
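
The point about `-+` and `[-]+` can be checked directly. This small check, written for any POSIX awk, tests a separator line against both spellings; the variable names are only illustrative.

```shell
#!/bin/sh
# Compare the two spellings of "one or more hyphens" on a separator line.
sep_check=$(printf '%s\n' '-------------------------' |
    awk '{
        plain   = ($0 ~ /^-+$/)     # hyphen outside a bracket expression
        bracket = ($0 ~ /^[-]+$/)   # hyphen as the only bracket member
        print plain, bracket
    }')
echo "$sep_check"   # each value is 1 when the corresponding regexp matched
```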