vlookup with awk: how to match a group/pattern to IDs

I'm using awk and want to do something like a vlookup for a group pattern that matches the IDs in a master list.

master.txt

1.1.1: name00
1.1.2: name01
1.1.3: name02
1.1.4: name03
1.1.5: name04
1.2.2: name05
1.2.3: name06
1.2.4: name07
1.2.5: name08
1.2.6: name09
1.3.13: name10
1.3.14: name11
1.3.15: name12
1.3.16: name13
1.3.17: name14

group.txt

1.1: groupvalue0
1.2: groupvalue1
1.3: groupvalue2
1.1: groupvalue10
1.3: groupvalue12

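Each group can have multiple values.

The master data corresponding to group 1.1 is every ID that (^1.1.) matches, anchored at the start only, i.e. 1.1.1 through 1.1.5.

The same applies to 1.2: here (^1.2.) matches 1.2.2 through 1.2.6, again anchored at the start only.

Based on that matching, I want the result to be: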

name00: groupvalue0  # 1.1.1 just shown for reference
name01: groupvalue0  # 1.1.2 just shown for reference
name02: groupvalue0  # 1.1.3 just shown for reference
name03: groupvalue0  # 1.1.4 just shown for reference
name04: groupvalue0  # 1.1.5 just shown for reference
name05: groupvalue1  # 1.2.2 just shown for reference
name06: groupvalue1  # 1.2.3 just shown for reference
name07: groupvalue1  # 1.2.4 just shown for reference
name08: groupvalue1  # 1.2.5 just shown for reference
name09: groupvalue1  # 1.2.6 just shown for reference
name10: groupvalue2  # 1.3.13 just shown for reference
name11: groupvalue2  # 1.3.14 just shown for reference
name12: groupvalue2  # 1.3.15 just shown for reference
name13: groupvalue2  # 1.3.16 just shown for reference
name14: groupvalue2  # 1.3.17 just shown for reference
name00: groupvalue10  # 1.1.1 just shown for reference
name01: groupvalue10  # 1.1.2 just shown for reference
name02: groupvalue10  # 1.1.3 just shown for reference
name03: groupvalue10  # 1.1.4 just shown for reference
name04: groupvalue10  # 1.1.5 just shown for reference
name10: groupvalue12  # 1.3.13 just shown for reference
name11: groupvalue12  # 1.3.14 just shown for reference
name12: groupvalue12  # 1.3.15 just shown for reference
name13: groupvalue12  # 1.3.16 just shown for reference
name14: groupvalue12  # 1.3.17 just shown for reference
I'm trying the awk code below, but how can I use a pattern as an array index?

awk '
BEGIN{
    FS=": "
    #print(var)
}
{
  if(NR==FNR)         # process first file only
  {
    a[$1]=$2;           # hash to a array {id is key, name} value
    next;               # process next record without executing following code
  } else
  {                    # process second file
    pattern= "^"$1"\..*" # eg: ^1\.1\..*
    # can i use pattern in array
    print a[pattern]":",$2  # output name (the value of) from array a and property
  }

}' master.txt group.txt

EDIT1: Since the OP has changed the question, adding a solution for the updated requirements here.

awk '
BEGIN{
  OFS=":"
}
FNR==NR{
  a[$1]=(a[$1]?a[$1]",":"")$2
  next
}
{
  sub(/\.[0-9]+:/,OFS,$1)
  key=$1
}
(key in a){
  delete array
  num=split(a[key],array,",")
  for(i=1;i<=num;i++){
    printf("%s%s",$2 OFS " "array[i],i==num?ORS:" ")
  }
}
' groups.txt masters.txt
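Explanation: detailed, line-by-line annotation of the two single-value solutions (first the one with a substr-based key, then the one with a sub-based key).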

awk '                         ##Starting awk program from here.
FNR==NR{                      ##Checking condition if FNR==NR which will be TRUE when groups.txt is being read.
  a[$1]=$2                    ##Creating array a with index of $1 and value is $2.
  next                        ##next will skip all statements from here.
}
{
  key=substr($0,1,3)":"       ##Creating key which has 1st 3 characters and colon here.
}
(key in a){                   ##Checking key in array a if yes then do following.
  print $2":\t"a[key]         ##Printing 2nd field colon tab and value of a with key index here.
}
' groups.txt masters.txt      ##Mentioning Input_file names here.

awk '                           ##Starting awk program from here.
BEGIN{                          ##Starting BEGIN section of this program from here.
  OFS=":"                       ##Setting OFS as : here.
}
FNR==NR{                        ##Checking condition if FNR==NR which will be TRUE when groups.txt is being read.
  a[$1]=$2                      ##Creating array a with index $1 and value $2 here.
  next                          ##next will skip all further statements from here.
}
{
  sub(/\.[0-9]+:/,OFS,$1)       ##Substituting dot digits and colon with OFS in 1st field.
  key=$1                        ##Creating variable key which has 1st field in it.
}
(key in a){                     ##Checking condition if key from current line is present in array a then do following.
  print $2,"\t"a[key]           ##Printing 2nd field tab and value of array a here.
}
' groups.txt masters.txt        ##Mentioning Input_file names here.


2nd solution: slightly different from the 1st solution.

awk '
BEGIN{
  OFS=":"
}
FNR==NR{
  a[$1]=$2
  next
}
{
  sub(/\.[0-9]+:/,OFS,$1)
  key=$1
}
(key in a){
  print $2,"\t"a[key]
}
' groups.txt masters.txt
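
A quick way to see the practical difference between the two key-building styles, substr($0,1,3) in the annotated first program versus sub(/\.[0-9]+:/,OFS,$1) here, is to feed both a group ID wider than three characters. The sketch below assumes a hypothetical group 10.2 with the made-up values groupvalueX and nameX, none of which appear in the question's data.

```shell
tmp=$(mktemp -d)
printf '%s\n' '10.2: groupvalueX' > "$tmp/groups.txt"
printf '%s\n' '10.2.7: nameX' > "$tmp/masters.txt"

# Fixed-width key: the first 3 characters plus ":" give "10.:", which is not in a.
by_substr=$(awk '
FNR==NR { a[$1]=$2; next }
{ key=substr($0,1,3)":" }
(key in a) { print $2":\t"a[key] }
' "$tmp/groups.txt" "$tmp/masters.txt")

# Regexp-trimmed key: only the last ".digits:" is replaced, so "10.2.7:" becomes "10.2:".
by_sub=$(awk '
BEGIN { OFS=":" }
FNR==NR { a[$1]=$2; next }
{ sub(/\.[0-9]+:/,OFS,$1); key=$1 }
(key in a) { print $2,"\t"a[key] }
' "$tmp/groups.txt" "$tmp/masters.txt")

printf 'substr key: [%s]\nsub key:    [%s]\n' "$by_substr" "$by_sub"
rm -rf "$tmp"
```

With this input the substr variant prints nothing, while the sub variant still finds the group.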

A slightly different approach:

awk -F": " '{ a[$1]=$2 }
            END{ for (i in a) { x=substr(i,1,3); 
                                if (x in a && x!=i) { 
                                   print a[i] ":", a[x] }
                              }
            }' master.txt group.txt

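I simply add all the information from both files to one array (a) and then check whether there is an index made up of the first 3 characters of each key. Note that a group listed more than once in group.txt keeps only its last value here, and that for (i in a) visits the entries in an unspecified order.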
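
On the literal question of using a pattern as an array index: awk array subscripts are plain strings, so a[pattern] only ever finds an entry whose key is exactly that string. One way to express "ID starts with this group" without deriving a fixed-width key is to loop over the stored IDs and compare prefixes with index(). The sketch below is an illustration under assumptions, not code from any of the answers: the names ids, name, prefix and prefix_out, the temporary directory, and the extra 11.1.1 row (added to show that the prefix test is anchored at the start) are all made up.

```shell
tmp=$(mktemp -d)
cat > "$tmp/master.txt" <<'EOF'
1.1.1: name00
1.1.2: name01
1.2.2: name05
11.1.1: other
EOF
cat > "$tmp/group.txt" <<'EOF'
1.1: groupvalue0
1.2: groupvalue1
1.1: groupvalue10
EOF

prefix_out=$(awk -F': ' '
NR==FNR {                        # master.txt: remember IDs in input order
  ids[++n] = $1
  name[$1] = $2
  next
}
{                                # group.txt: one pass over all stored IDs per group line
  prefix = $1 "."                # "1.1" -> "1.1."
  for (i = 1; i <= n; i++)
    if (index(ids[i], prefix) == 1)      # literal prefix test, anchored at the start
      print name[ids[i]] ": " $2
}' "$tmp/master.txt" "$tmp/group.txt")

printf '%s\n' "$prefix_out"
rm -rf "$tmp"
```

This scans every master ID for each group line, which is fine for small files; deriving the key once per master line, as the other answers do, avoids the inner loop.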

The following will work even if your input contains other :s or spaces than just the ones shown in your example:

$ cat tst.awk
BEGIN { OFS=": " }
{
    group = substr($0,match($0,/[0-9]+\.[0-9]+/),RLENGTH)
    name  = substr($0,match($0,/[^:]+: */)+RLENGTH)
}
NR==FNR {
    map[group,++cnt[group]] = name
    next
}
{
    for (i=1; i<=cnt[group]; i++) {
        print map[group,i], name
    }
}
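
The invocation is not shown above. Because the NR==FNR block stores the names from the first file and the last block prints them against each line of the second file, the sketch below assumes the script is run as awk -f tst.awk master.txt group.txt, which gives output in the name: groupvalue shape from the question. The sample rows with an extra colon and extra spaces are made up to exercise that case, and the temporary directory and the tst_out variable are likewise assumptions.

```shell
tmp=$(mktemp -d)
cat > "$tmp/tst.awk" <<'EOF'
BEGIN { OFS=": " }
{
    group = substr($0,match($0,/[0-9]+\.[0-9]+/),RLENGTH)
    name  = substr($0,match($0,/[^:]+: */)+RLENGTH)
}
NR==FNR {
    map[group,++cnt[group]] = name
    next
}
{
    for (i=1; i<=cnt[group]; i++) {
        print map[group,i], name
    }
}
EOF
cat > "$tmp/master.txt" <<'EOF'
1.1.1: name00
1.1.2: name: with colon
1.3.13: name10
EOF
cat > "$tmp/group.txt" <<'EOF'
1.1: groupvalue0
1.3: group value 2
EOF

# Assumed file order: master list first, group list second.
tst_out=$(awk -f "$tmp/tst.awk" "$tmp/master.txt" "$tmp/group.txt")
printf '%s\n' "$tst_out"
rm -rf "$tmp"
```

Since the script never relies on field splitting, everything after the first "id: " is kept as the name or value, including further colons and spaces.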

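Don't use the word pattern, as it's ambiguous. Use string or regexp, whichever you mean in each context; it will go a long way toward helping you come up with a robust solution, because then your requirements and code are no longer fuzzy.

What is ++cnt[group]? Can you show it less compactly?

++cnt[1.1] is the same as cnt[1.1]=cnt[1.1]+1.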
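
Spelled out step by step, the compact statement map[group,++cnt[group]] = name could be sketched as below. The variable idx and the hard-coded sample values are assumptions added for illustration; the script itself does everything in one statement.

```shell
expanded=$(awk 'BEGIN {
  group = "1.1"

  # Compact form in the script: map[group, ++cnt[group]] = "name00"
  cnt[group] = cnt[group] + 1        # unset entries start as 0, so this becomes 1
  idx = cnt[group]
  map[group, idx] = "name00"         # stored under the string group SUBSEP idx

  cnt[group] = cnt[group] + 1        # second name for the same group: 2
  idx = cnt[group]
  map[group, idx] = "name01"

  for (i = 1; i <= cnt[group]; i++)  # cnt[group] doubles as the list length
    print group, i, map[group, i]
}')
printf '%s\n' "$expanded"
```

So cnt[group] counts how many names have been stored for a group so far, and map[group,i] holds the i-th of them.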