vlookup with awk: how to match a group/pattern against IDs
I am using awk and want to do something like a vlookup: for each group pattern, find the IDs in a master list that match it.
master.txt
1.1.1: name00
1.1.2: name01
1.1.3: name02
1.1.4: name03
1.1.5: name04
1.2.2: name05
1.2.3: name06
1.2.4: name07
1.2.5: name08
1.2.6: name09
1.3.13: name10
1.3.14: name11
1.3.15: name12
1.3.16: name13
1.3.17: name14
group.txt
1.1: groupvalue0
1.2: groupvalue1
1.3: groupvalue2
1.1: groupvalue10
1.3: groupvalue12
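Each group can have multiple values.
For group 1.1, the corresponding master entries are the ones whose ID matches ^1.1. (matching from the start of the ID only), that is 1.1.1 through 1.1.5.
The same applies to 1.2: ^1.2. matches 1.2.2 through 1.2.6 (again matching from the start only).
Based on these matches, I want the result to be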
name00: groupvalue0 # 1.1.1 just shown for reference
name01: groupvalue0 # 1.1.2 just shown for reference
name02: groupvalue0 # 1.1.3 just shown for reference
name03: groupvalue0 # 1.1.4 just shown for reference
name04: groupvalue0 # 1.1.5 just shown for reference
name05: groupvalue1 # 1.2.2 just shown for reference
name06: groupvalue1 # 1.2.3 just shown for reference
name07: groupvalue1 # 1.2.4 just shown for reference
name08: groupvalue1 # 1.2.5 just shown for reference
name09: groupvalue1 # 1.2.6 just shown for reference
name10: groupvalue2 # 1.3.13 just shown for reference
name11: groupvalue2 # 1.3.14 just shown for reference
name12: groupvalue2 # 1.3.15 just shown for reference
name13: groupvalue2 # 1.3.16 just shown for reference
name14: groupvalue2 # 1.3.17 just shown for reference
name00: groupvalue10 # 1.1.1 just shown for reference
name01: groupvalue10 # 1.1.2 just shown for reference
name02: groupvalue10 # 1.1.3 just shown for reference
name03: groupvalue10 # 1.1.4 just shown for reference
name04: groupvalue10 # 1.1.5 just shown for reference
name10: groupvalue12 # 1.3.13 just shown for reference
name11: groupvalue12 # 1.3.14 just shown for reference
name12: groupvalue12 # 1.3.15 just shown for reference
name13: groupvalue12 # 1.3.16 just shown for reference
name14: groupvalue12 # 1.3.17 just shown for reference
I am trying the awk code below. But how can I use a pattern as an array index?
BEGIN{
FS=": "
#print(var)
}
{
if(NR==FNR) # process first file only
{
a[$1]=$2; # hash to a array {id is key, name} value
next; # process next record without executing following code
} else
{ # process second file
pattern= "^"$1"\..*" # eg: ^1\.1\..*
# can i use pattern in array
print a[pattern]":",$2 # output name (the value of) from array a and property
}
} master.txt group.txt
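A note on the question in the code comment above: awk array subscripts are always literal strings, so a[pattern] looks up the pattern text itself as a key and finds nothing. The sketch below is only a minimal illustration, not the method used in any of the answers; the function name prefix_lookup, the temporary files, and the extra 1.10.1 line are assumptions made up for the example. It keeps the master IDs in input order and compares each one against a literal prefix with index(), so no regexp escaping is needed.

```shell
# Hypothetical helper: print "name: groupvalue" for every master id that
# starts with "<group id>." (a literal string comparison, not a regexp).
prefix_lookup() {
    awk -F': ' '
    NR == FNR {                # first file (master): keep ids in input order
        id[++n] = $1
        name[n] = $2
        next
    }
    {                          # second file (group): $1 is e.g. 1.1
        prefix = $1 "."
        for (i = 1; i <= n; i++)
            if (index(id[i], prefix) == 1)   # id begins with the prefix
                print name[i] ": " $2
    }' "$1" "$2"
}

# Sample data reduced from the question; the 1.10.1 line is made up to show
# that group 1.1 does not accidentally pick up 1.10.x.
tmp=$(mktemp -d)
printf '%s\n' '1.1.1: name00' '1.1.2: name01' '1.2.2: name05' '1.10.1: other' > "$tmp/master.txt"
printf '%s\n' '1.1: groupvalue0' '1.2: groupvalue1' '1.1: groupvalue10' > "$tmp/group.txt"

prefix_lookup "$tmp/master.txt" "$tmp/group.txt"
# name00: groupvalue0
# name01: groupvalue0
# name05: groupvalue1
# name00: groupvalue10
# name01: groupvalue10
```

The inner loop visits every master ID for each group line, which is fine for small lists. The answers avoid that loop by deriving the group key from each ID instead.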
EDIT1: Since the OP has changed the question somewhat, adding a matching solution here.
awk '
BEGIN{
  OFS=":"
}
FNR==NR{
  a[$1]=(a[$1]?a[$1]",":"")$2
  next
}
{
  sub(/\.[0-9]+:/,OFS,$1)
  key=$1
}
(key in a){
  delete array
  num=split(a[key],array,",")
  for(i=1;i<=num;i++){
    printf("%s%s",$2 OFS " "array[i],i==num?ORS:" ")
  }
}
' groups.txt masters.txt
First solution, with a detailed explanation in the comments. The key is built from the first 3 characters of each line plus a colon, so it assumes group IDs that are exactly 3 characters long.
awk '                        ##Starting awk program from here.
FNR==NR{                     ##This condition is TRUE only while the first file, groups.txt, is being read.
  a[$1]=$2                   ##Creating array a with index $1 and value $2.
  next                       ##next skips all further statements for this line.
}
{
  key=substr($0,1,3)":"      ##Creating key from the first 3 characters plus a colon.
}
(key in a){                  ##If key is present in array a then do the following.
  print $2":\t"a[key]        ##Printing 2nd field, colon, tab and the value of a for this key.
}
' groups.txt masters.txt     ##Mentioning Input_file names here.
Second solution: slightly different from the first one.
awk '
BEGIN{
  OFS=":"
}
FNR==NR{
  a[$1]=$2
  next
}
{
  sub(/\.[0-9]+:/,OFS,$1)
  key=$1
}
(key in a){
  print $2,"\t"a[key]
}
' groups.txt masters.txt
Explanation: the same code as above with a detailed explanation in the comments.
awk '                        ##Starting awk program from here.
BEGIN{                       ##Starting BEGIN section of this program from here.
  OFS=":"                    ##Setting OFS as : here.
}
FNR==NR{                     ##This condition is TRUE only while the first file, groups.txt, is being read.
  a[$1]=$2                   ##Creating array a with index $1 and value $2 here.
  next                       ##next skips all further statements for this line.
}
{
  sub(/\.[0-9]+:/,OFS,$1)    ##Substituting dot, digits and colon with OFS in 1st field.
  key=$1                     ##Creating variable key which has the 1st field in it.
}
(key in a){                  ##If key from the current line is present in array a then do the following.
  print $2,"\t"a[key]        ##Printing 2nd field, OFS, tab and the value of array a for this key.
}
' groups.txt masters.txt     ##Mentioning Input_file names here.
A slightly different approach:
awk -F": " '{ a[$1]=$2 }
END{ for (i in a) { x=substr(i,1,3);
if (x in a && x!=i) {
print a[i] ":", a[x] }
}
}' master.txt group.txt
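I simply add all the information to one array (a) and then check whether there is an index equal to the first 3 characters of the current index.
The following will work even if your input contains other :s or spaces than just those shown in the example: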
$ cat tst.awk
BEGIN { OFS=": " }
{
    group = substr($0,match($0,/[0-9]+\.[0-9]+/),RLENGTH)
    name  = substr($0,match($0,/[^:]+: */)+RLENGTH)
}
NR==FNR {
    map[group,++cnt[group]] = name
    next
}
{
    for (i=1; i<=cnt[group]; i++) {
        print map[group,i], name
    }
}
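Don't use the word "pattern" as it is ambiguous. Use "string" or "regexp", whichever you mean in each context. That will go a long way toward helping you reach a robust solution, because your requirements and code will then no longer be ambiguous.
What is ++cnt[group]? Can you show it in a less compact form?
++cnt[1.1] is the same as cnt[1.1]=cnt[1.1]+1. So by using the multi-dimensional array entries map[1.1,1] ... map[1.1,5] we can save all the data and retrieve it later. We also know from cnt[1.1] how many entries were stored for each group.
Thank you for the solution. Really wonderful logic.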
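The counter and the two-index array discussed in these comments can be written out in longer form. The sketch below is an expanded variant of tst.awk under some assumptions: the function name lookup_groups, the slot variable, and the temporary sample files are made up for illustration, and the master file is assumed to be passed first and the group file second.

```shell
# Hypothetical wrapper around an expanded form of the tst.awk logic.
lookup_groups() {
    awk '
    {
        # First "digits.digits" on the line is the group id.
        match($0, /[0-9]+\.[0-9]+/)
        group = substr($0, RSTART, RLENGTH)
        # Everything after the first ": " is the name or group value.
        match($0, /[^:]+: */)
        name = substr($0, RSTART + RLENGTH)
    }
    NR == FNR {                          # first file: master list
        cnt[group] = cnt[group] + 1      # long form of ++cnt[group]
        slot = cnt[group]                # 1, 2, 3, ... within this group
        map[group, slot] = name          # subscript is group SUBSEP slot
        next
    }
    {                                    # second file: group values
        for (i = 1; i <= cnt[group]; i++)
            print map[group, i] ": " name
    }' "$1" "$2"
}

tmp=$(mktemp -d)
printf '%s\n' '1.1.1: name00' '1.1.2: name01' '1.2.2: name05' > "$tmp/master.txt"
printf '%s\n' '1.1: groupvalue0' '1.2: groupvalue1' '1.1: groupvalue10' > "$tmp/group.txt"

lookup_groups "$tmp/master.txt" "$tmp/group.txt"
# name00: groupvalue0
# name01: groupvalue0
# name05: groupvalue1
# name00: groupvalue10
# name01: groupvalue10
```

Calling match() on its own line before reading RSTART and RLENGTH avoids relying on the order in which awk evaluates function arguments.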