Merging 3 files in awk based on a common key
I have 3 files:
n.txt
id-3a,oc-ctrl-jr-0,ACTIVE,-,Running,cp=172.31.0.7
id-5e,oc-ctrl-jr-1,ACTIVE,-,Running,cp=172.31.0.6
id-5f,oc-ctrl-jr-2,ACTIVE,-,Running,cp=172.31.0.5
id-0,oc-comp-jr-0,ACTIVE,-,Running,cp=172.31.0.9
id-77,oc-comp-jr-1,ACTIVE,-,Running,cp=172.31.0.8
bm.txt
server-10,id-77,power on,active,False
server-2,id-5f,power on,active,False
server-32,id-3a,power on,active,False
server-11,id-5e,power on,active,False
server-25,id-0,power on,active,False
The third file contains one section for each line in bm.txt:
hosts.yaml
[..]
- arch: x86_64
  zone: foo
  cpu: 1
  disk: 10
  hw_model_type:
  - bar-8
  mac:
  - aa:aa:aa:aa:aa:aa:aa
  memory: 4096
  name: server-32
  desc: 'my host'
  ip_addr: 192.168.117.33
  info: false
  type: bla
[..]
Desired output:
n name,b name,n power,n desc,n state,n cp,bm power,bm state,b error,ip
oc-ctrl-jr-0,server-32,ACTIVE,-,Running,cp=172.31.0.7,power on,active,False,192.168.117.33
oc-ctrl-jr-1,server-11,ACTIVE,-,Running,cp=172.31.0.6,power on,active,False,192.168.117.47
oc-ctrl-jr-2,server-2,ACTIVE,-,Running,cp=172.31.0.5,power on,active,False,192.168.117.87
oc-comp-jr-0,server-25,ACTIVE,-,Running,cp=172.31.0.9,power on,active,False,192.168.117.111
oc-comp-jr-1,server-10,ACTIVE,-,Running,cp=172.31.0.8,power on,active,False,192.168.117.3
I can join the first two files with this code, but the column order differs from what I need:
awk -F, 'BEGIN{print"N Name,N Power,N Desc,N State,N CP,BM Name,BM Power,BM State,BM Error"}
NR==FNR{OFS=",";a[$2]=$1 OFS $3 OFS $4 OFS $5; next}
$1 in a {print $2,$3,$4,$5,$6,a[$1]}' bm.txt n.txt
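But I don't know how to reorder the columns, or how to add the parsing of the third file to the main code.
I can parse the third file on its own like this: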
awk '$0~"name: server-32$"{getline;getline;print $NF}' hosts.yaml
I would appreciate help getting the desired output, as well as any suggestions for improving the current code.
Thanks
Output:
oc-ctrl-jr-1,server-11,ACTIVE,-,Running,cp=172.31.0.6,power on,active,False,
oc-ctrl-jr-2,server-2,ACTIVE,-,Running,cp=172.31.0.5,power on,active,False,
oc-comp-jr-0,server-25,ACTIVE,-,Running,cp=172.31.0.9,power on,active,False,
oc-ctrl-jr-0,server-32,ACTIVE,-,Running,cp=172.31.0.7,power on,active,False,192.168.117.33
oc-comp-jr-1,server-10,ACTIVE,-,Running,cp=172.31.0.8,power on,active,False,
With GNU awk, could you please try the following, written and tested with the shown samples.
awk '
ARGIND==1{
if($0~/server-[0-9]+/){
foundServer=1
serverName=$2
}
if(foundServer && $0 ~ /ip_addr:/){
servername[serverName]=$2
serverName=foundServer=""
}
next
}
ARGIND==2{
if(setFS==""){ FS=OFS=",";setFS=1;$0=$0 }  ##$0=$0 re-splits the current line with the new FS.
server[$2]=$1
powerarr[$2]=$3 OFS $4 OFS $5
next
}
ARGIND==3{
if($1 in server){
print $2 OFS server[$1] OFS $3 OFS $4 OFS $5 OFS $6 OFS powerarr[$1],(servername[server[$1]]!=""?servername[server[$1]]:"NA")
}
}
' hosts.yaml bm.txt n.txt
Explanation: adding a detailed explanation of the above.
awk '
##Starting awk program from here.
ARGIND==1{
##Checking condition if this is first Input_file then do following.
if($0~/server-[0-9]+/){
##Checking condition if line has server with digits then do following.
foundServer=1
##Setting foundServer to 1 here.
serverName=$2
##Setting serverName to 2nd field which is value from yaml file.
}
if(foundServer && $0 ~ /ip_addr:/){
##Checking condition if foundServer is SET and line has ip_addr in it then do following.
servername[serverName]=$2
##Creating servername array with index of serverName with value of 2nd field.
serverName=foundServer=""
##Nullifying serverName and foundServer here.
}
next
##next will skip all further statements from here.
}
ARGIND==2{
##Checking condition if this is 2nd Input_file is being read then do following.
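if(setFS==""){ FS=OFS=",";setFS=1;$0=$0 }
##Checking condition if setFS is NULL then set FS and OFS as comma here and setting setFS to 1.
##A new FS only applies from the next line on, so $0=$0 re-splits the current (first) line with the comma.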
server[$2]=$1
##Creating server with index of 2nd field which has 1st field as value.
powerarr[$2]=$3 OFS $4 OFS $5
##Creating powerarr with index as 2nd field and $3 OFS $4 OFS $5 as value.
next
##next will skip all further statements from here.
}
ARGIND==3{
##Checking condition if this is 3rd Input_file is being read then do following.
if($1 in server){
##Checking condition if 1st field is present in server then do following.
print $2 OFS server[$1] OFS $3 OFS $4 OFS $5 OFS $6 OFS powerarr[$1],(servername[server[$1]]!=""?servername[server[$1]]:"NA")
##Printing needed values as per OP here.
}
}
' hosts.yaml bm.txt n.txt ##Mentioning Input_file names here.
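Switching FS part-way through the input, as the program above does when it reaches the second file, has a subtlety: in POSIX awk a new FS takes effect from the next record onward, so the record already in hand stays split under the old separator unless it is re-split. The following is a minimal sketch of the `$0 = $0` idiom that forces the re-split. The helper name `first_field` and the two sample lines are only illustrative.

```shell
# Hypothetical helper: print the first comma-separated field of every line,
# switching FS from inside the program rather than with -F.
first_field() {
  awk '
    NR == 1 { FS = ","; $0 = $0 }   # $0 = $0 re-splits the current record with the new FS
    { print $1 }
  '
}

printf '%s\n' 'server-10,id-77,power on' 'server-2,id-5f,power on' | first_field
```

Setting the separator before the file is opened (with `-F,`, or with an `FS=,` assignment placed ahead of the file name on the command line) avoids the issue altogether.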
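How big are these files? If the total is under 3-4 GB, I would probably suggest loading each file entirely into awk and taking advantage of associative arrays, which are basically DB indexes already. That may be less painful than trying to keep up with a multi-way file join. Unless Unicode is an issue or gensub() is essential, consider running the exact code others have suggested under mawk 1.3.4 and mawk2 beta.
In my day-to-day use cases, it is uncanny to see a dinosaur-era scripting language beat tr, cut, sed and grep at their own game.
They are relatively small, usually fewer than 100 lines. The answer suggested above works well.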
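Following up on the suggestion to hold each file in associative arrays and to try other awk implementations: ARGIND is a gawk extension, so below is a sketch of the same join that relies only on POSIX awk features. It is an illustration under stated assumptions, not a tested replacement for the answer above. The function name `merge_three` is hypothetical, the YAML keys `name:` and `ip_addr:` are assumed to be the first token on their line, none of the input files is assumed to be empty (the `FNR == 1` counter depends on that), and the header text is copied from the desired output in the question.

```shell
# Hypothetical helper: merge_three HOSTS_YAML BM_CSV N_CSV
merge_three() {
  awk '
    BEGIN { print "n name,b name,n power,n desc,n state,n cp,bm power,bm state,b error,ip" }

    FNR == 1 { fileno++ }            # count input files without ARGIND (assumes no empty file)

    fileno == 1 {                    # hosts.yaml, split on whitespace
      if ($1 == "name:") name = $2
      if ($1 == "ip_addr:" && name != "") { ip[name] = $2; name = "" }
      next
    }

    fileno == 2 {                    # bm.txt: server,id,power,state,error
      server[$2] = $1
      bm[$2] = $3 OFS $4 OFS $5
      next
    }

    fileno == 3 && ($1 in server) {  # n.txt: id,name,power,desc,state,cp
      print $2, server[$1], $3, $4, $5, $6, bm[$1], ((server[$1] in ip) ? ip[server[$1]] : "NA")
    }
  ' "$1" FS=, OFS=, "$2" "$3"        # the assignments apply before the CSV files are opened
}

# Usage, with the file names from the question:
# merge_three hosts.yaml bm.txt n.txt
```

Because `FS=,` and `OFS=,` are given as command-line assignments between the file names, the comma separator is already active when the first line of bm.txt is read, so no re-splitting is needed. Rows come out in the order of n.txt, and hosts without a matching `ip_addr` get `NA` in the last column.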