Merging lines based on keyword matching with awk, sed, or perl
With my limited awk/sed magic, I keep hitting a wall on this one. I'm happy to use awk, sed, bash, perl, or any other tool for this text manipulation. I have the following output and would like to merge lines based on a kind of key match:
Node: server1
Active Server: SECONDARY
Standby Server: PRIMARY
Primary 192.168.1.1
Secondary 192.168.1.2
Node: server2
Active Server: PRIMARY
Standby Server: SECONDARY
Primary 10.1.1.1
Secondary 10.1.1.2
Desired output:
Node: server1
Active Server: Secondary 192.168.1.2
Standby Server: Primary 192.168.1.1
Node: server2
Active Server: Primary 10.1.1.1
Standby Server: Secondary 10.1.1.2
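So I need to merge lines based on the words "primary" and "secondary". My first thought was to change "Primary" to "PRIMARY" so that it is easier to match.
My end goal is:
server1,Active,192.168.1.2,Standby,192.168.1.1
server2,Active,10.1.1.1,Standby,10.1.1.2
(but I can figure that part out once I have help merging the lines)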
Thanks for your help.

You can use this awk:
awk -v RS="" '{ip[toupper($9)]=$10; ip[toupper($11)]=$12; a=ip[$5]; s=ip[$8]; $5=tolower($5); sub(".",substr(toupper($5),1,1),$5); $8=tolower($8); sub(".",substr(toupper($8),1,1),$8); print $1,$2"\n"$3,$4,$5,a"\n"$6,$7,$8,s}' file
Node: server1
Active Server: Secondary 192.168.1.2
Standby Server: Primary 192.168.1.1
Node: server2
Active Server: Primary 10.1.1.1
Standby Server: Secondary 10.1.1.2
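Setting RS to the empty string makes awk work on a group of lines (one blank-line-separated block) at a time.

It's a dense and very ugly multi-liner: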
perl -00 -nE'
s/ ^(\w+)\s+([\d.]+)\s* / $s{$1}=$2; ""/xmge;
($l=$_) =~ s! \s*\w+:\s*|\n !,!xg;
$l =~ s|\U$_|$s{$_}| for keys %s;
($_=$l) =~ s/^,|,$//g;
say
' file
Output:
server1,Active,192.168.1.2,Standby,192.168.1.1
server2,Active,10.1.1.1,Standby,10.1.1.2
Explanation:
# -00 => instead of single line read lines into $_ until \n\n+
perl -00 -nE'
# read and remove 'Primary|Secondary IP' into $s{Primary} = IP
s/ ^(\w+)\s+([\d.]+)\s* / $s{$1}=$2; ""/xmge;
# replace 'something:' or new line by ','
($l=$_) =~ s! \s*\w+:\s*|\n !,!xg;
# replace SECONDARY|PRIMARY with actual IP address
$l =~ s|\U$_|$s{$_}| for keys %s;
# remove ',' at beginning and end of the string
($_=$l) =~ s/^,|,$//g;
# print result
say
' file
Or use a Perl one-liner to get the desired intermediate result. Output:
Node: server1
Active Server: Secondary 192.168.1.2
Standby Server: Primary 192.168.1.1
Node: server2
Active Server: Primary 10.1.1.1
Standby Server: Secondary 10.1.1.2
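Explanation:
- Switches:
  - -00: process input in paragraph mode (records separated by two newlines)
  - -l: enable line-ending processing
  - -p: assume a "while (<>) { ...; print; }" loop around the program
  - -e: evaluate Perl code
- Code:
  - Replace every server value with the matching line that begins with the same key
  - Remove the server list at the bottom
For the final solution, use -n instead of -p, because we want to go from two newlines between records to a single newline. The regex tools are the same, however: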
perl -00 -ne'
s/ Server: (\w+)(?=.*^\1\s+(\S+))/:$2/ismg;
s/\n[^:]+$//;
s/^Node: //;
s/[\n:]/,/g;
print "$_\n";
' file.txt
Output:
server1,Active,192.168.1.2,Standby,192.168.1.1
server2,Active,10.1.1.1,Standby,10.1.1.2
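This Perl solution seems to do what you ask. It simply pulls values into a hash line by line, and dumps the hash contents once all the required values are present.
Update: I have used any from List::Util in place of grep to make the code clearer.
use strict;
use warnings;
use autodie;
use List::Util 'any';

my @names = qw/ node active standby primary secondary /;

open my $fh, '<', 'myfile.txt';
my %server;
while (my $line = <$fh>) {
    # First word of the line is the key, last word is the value (both lower-cased)
    next unless my ($key, $val) = lc($line) =~ /(\w+).*\s+(\S+)/;
    %server = () if $key eq 'node';    # a new record starts here
    $server{$key} = $val;
    # Dump the record as soon as every required key is present
    unless ( any { not exists $server{$_} } @names ) {
        printf "%s,Active,%s,Standby,%s\n", @server{'node', $server{active}, $server{standby}};
        %server = ();
    }
}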
A bit more verbose:
use strict;
use warnings;
use feature qw/say/;

my $struct;
local $/ = 'Node: ';    # read one node per record

for my $record (<DATA>) {
    next if $record =~ /^Node:/;    # skip first
    my ($node, @values) = split /\n\s*/, $record;
    for my $line (@values) {
        my ($intent, $actual, $ip);
        if ( ($intent, $actual) = $line =~ /(Active|Standby) Server: (.*)$/ ) {
            $struct->{$node}{lc($intent)} = lc($actual);
        }
        elsif ( ($actual, $ip) = $line =~ /(Primary|Secondary) (.*)$/ ) {
            $struct->{$node}{lc($actual)} = $ip;
        }
    }
}

for my $node (sort keys %$struct) {
    printf "Node: %s\n", $node;
    printf "Active Server: %s %s\n", ucfirst $struct->{$node}{active}, $struct->{$node}{$struct->{$node}{active}};
    printf "Standby Server: %s %s\n", ucfirst $struct->{$node}{standby}, $struct->{$node}{$struct->{$node}{standby}};
    print "\n";
}

## Desired final output is simpler:
for my $node (sort keys %$struct) {
    say join ',', $node, 'Active', $struct->{$node}{$struct->{$node}{active}}, 'Standby', $struct->{$node}{$struct->{$node}{standby}};
}
__DATA__
Node: server1
Active Server: SECONDARY
Standby Server: PRIMARY
Primary 192.168.1.1
Secondary 192.168.1.2
Node: server2
Active Server: PRIMARY
Standby Server: SECONDARY
Primary 10.1.1.1
Secondary 10.1.1.2
Here's an option in awk:
#!/usr/bin/awk -f

# Output processing goes in a function, as it's called from different places
function spew() {
    split(servers[d["active"]], active);
    split(servers[d["standby"]], standby);
    printf("%s,%s,%s,%s,%s\n",
        d["name"], active[1], active[2], standby[1], standby[2]);
}

# Trim unnecessary (leading) whitespace
1 { $1=$1; }

# Store our references
$1=="Active" {
    d["active"]=tolower($3);
}

$1=="Standby" {
    d["standby"]=tolower($3);
}

# And store our data
/^ *[A-Za-z]+ [0-9.]+$/ {
    servers[tolower($1)]=tolower($0);
}

# Then, if we hit a new record, process the last one.
$1=="Node:" && length(d["name"]) {
    spew();
}

# And if we've just processed a record, clear our workspace.
$1=="Node:" {
    delete d;
    delete servers;
    d["name"]=$2;
}

# Finally, process the last record.
END {
    spew();
}
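One advantage of this approach over some of the other solutions is that it can handle names other than "primary" and "secondary". The idea is that if you have data like this: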
Node: serverN
Active Server: starfleet
Standby Server: babylon5
starfleet 172.16.0.1
babylon5 172.16.0.2
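the Active/Standby lines refer to the records through their index, rather than assuming "primary" or "secondary".
I have normalised everything to lower case for easier handling, but you can of course adjust the tolower() calls to suit.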
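The indirection described above can be sketched in a few lines of awk wrapped in a shell function. This is only an illustration, not the script from this answer: the function name resolve_nodes is made up, and it assumes that records are separated by blank lines and that every address line has exactly two fields (a name and an address).

```shell
# resolve_nodes FILE  (hypothetical helper name)
# Prints "node,Active,<address>,Standby,<address>" for each blank-line-separated record.
resolve_nodes() {
    awk -v RS="" -F'\n' '
    {
        split("", addr)                      # clear the per-record lookup table
        node = active = standby = ""
        for (i = 1; i <= NF; i++) {          # each field is one line of the record
            n = split($i, w, " ")            # split the line on whitespace
            if (w[1] == "Node:")
                node = w[2]
            else if (w[1] == "Active" && w[2] == "Server:")
                active = tolower(w[3])       # remember the name, not the address
            else if (w[1] == "Standby" && w[2] == "Server:")
                standby = tolower(w[3])
            else if (n == 2)
                addr[tolower(w[1])] = w[2]   # any "name address" line becomes a key
        }
        # Resolve the remembered names only after the whole record has been read
        printf "%s,Active,%s,Standby,%s\n", node, addr[active], addr[standby]
    }' "$1"
}
```

Because the names are resolved only after the whole record has been read, the address lines may come before or after the Active/Standby lines. For the serverN sample above, resolve_nodes file should print serverN,Active,172.16.0.1,Standby,172.16.0.2.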
awk ' s==0{print;s=1;next;}
s==1{i=$0;s=2;next;}
s==2{j=$0;s=3;next;}
s==3{r1=$0;s=4;next;}
s==4{r2=$0;
sub(/SECONDARY/,r2,i);sub(/PRIMARY/,r1,j);
sub(/SECONDARY/,r2,j);sub(/PRIMARY/,r1,i);
s=5; print i;print j;next}
s==5{s=0;print}' input.txt
Output:
Node: server1
Active Server: Secondary 192.168.1.2
Standby Server: Primary 192.168.1.1
Node: server2
Active Server: Primary 10.1.1.1
Standby Server: Secondary 10.1.1.2
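This prints the first line of the current input section, stores the next four lines in variables, performs the substitutions and prints the result. It then reads and prints the blank line and starts over with the next section.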
awk '
$1 == "Active" {active = tolower($NF); next}
$1 == "Standby" {standby = tolower($NF); next}
$1 == "Primary" {ip["primary"] = $0; next}
$1 == "Secondary" {
ip["secondary"] = $0
print "Active Server:",ip[active]
print "Standby Server:",ip[standby]
next
}
1
'
This assumes that the "Secondary" line is at the end of the "block".
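If that ordering cannot be relied on, one option is to postpone printing until the block is known to be complete, that is, when the next "Node:" line or the end of the input is reached. A rough sketch under that assumption (the names merge_blocks and emit are invented here and are not part of the answer above):

```shell
# merge_blocks FILE  (hypothetical helper name)
# Prints the merged three-line form for each node, whatever the line order inside a block.
merge_blocks() {
    awk '
    # Print the block collected so far, if there is one
    function emit() {
        if (node == "") return
        print "Node: " node
        print "Active Server:", ip[active]
        print "Standby Server:", ip[standby]
    }
    # A new block starts: flush the previous one and reset the state
    $1 == "Node:"     { emit(); node = $2; active = standby = ""; split("", ip); next }
    $1 == "Active"    { active  = tolower($NF); next }
    $1 == "Standby"   { standby = tolower($NF); next }
    $1 == "Primary"   { ip["primary"]   = $1 " " $2; next }
    $1 == "Secondary" { ip["secondary"] = $1 " " $2; next }
    # Flush the last block
    END { emit() }
    ' "$1"
}
```

Unlike the version above, this sketch drops blank lines and any other unrecognised lines instead of passing them through.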
To get the final output:
awk -v OFS="," '
$1 == "Node:" {node = $NF}
$1 == "Active" {active = tolower($NF)}
$1 == "Standby" {standby = tolower($NF)}
$1 == "Primary" {ip["primary"] = $2}
$1 == "Secondary" {
ip["secondary"] = $2;
print node, "Active",ip[active],"Standby",ip[standby]
}
'
In response to jhill's comment:
awk -v RS="" -v OFS=, '{
node = active = standby = ""
delete ip
for (i=1; i<NF; i++) {
if ($i == "Node:") {node=$(++i)}
else if ($i == "Active") {active = tolower( $(i+=2) )}
else if ($i == "Standby") {standby = tolower( $(i+=2) )}
else if ($i == "Primary") {ip["primary"] = $(++i)}
else if ($i == "Secondary") {ip["secondary"] = $(++i)}
}
print node, "Active", ip[active], "Standby", ip[standby]
}'

You can use tr to get rid of the newlines, then sed to put them back in the right places, and perl to get the desired output:
Input file:
tiago@dell:/tmp$ cat file
Node: server1
Active Server: SECONDARY
Standby Server: PRIMARY
Primary 192.168.1.1
Secondary 192.168.1.2
Node: server2
Active Server: PRIMARY
Standby Server: SECONDARY
Primary 10.1.1.1
Secondary 10.1.1.2
Script:
tiago@dell:/tmp$ cat test.sh
#!/bin/bash
# Join all lines with spaces, then start a new line before each "Node:"
tr '\n' ' ' < "$1" | sed -r 's/(Node:)/\n\1/g' |
perl -lne '
    /^\s*$/ && next;    # skip empty lines
    /Node:\s+(\S+)/           && ($server  = $1);
    /Active Server:\s+(\S+)/  && ($active  = $1);
    /Standby Server:\s+(\S+)/ && ($standby = $1);
    /Primary\s+(\S+)/         && ($pri     = $1);
    /Secondary\s+(\S+)/       && ($sec     = $1);
    if ( $active eq "PRIMARY" ) {
        $out = "$server,Active,$pri,Standby,$sec";
    } else {
        $out = "$server,Active,$sec,Standby,$pri";
    }
    print $out;
'
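Output:
tiago@dell:/tmp$ bash test.sh file
server1,Active,192.168.1.2,Standby,192.168.1.1
server2,Active,10.1.1.1,Standby,10.1.1.2

Comments:
- Look again at the desired output in the question.
- Hey Jotne, thanks for the input. However, this does not merge the lines based on the key "Primary" or "Secondary".
- @user3574338 Fixed :)
- For the benefit of Perl newbies like me, could you explain what is going on here?
- I think this would benefit from some explanation, as there seems to be a lot going on!
- @TomFenech You are probably right. I'll see what I can do.
- Have you tested this? As far as I can tell there is only one $h{primary} and one $h{secondary}, so all servers will be shown with the same pair of IP addresses.
- @Borodin Yes, I tested it. There can only be one primary and one secondary per node, but that is how the OP presented the data.
- @Borodin I changed my implementation to use a lookahead assertion to make it simpler, so it now matches the approach I used for the full solution.
- That's the tricky part, Avinash! That is why I need to match based on a common key.
- Pretty self-documenting, although that grep line took me a moment... Do you need to do %server = () twice?
- @TomFenech: No, not if the data file is reliable. I was just insuring against anything odd in there.
- @TomFenech: Would you prefer if (@names == grep { exists $server{$_} } @names) { ... }?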