vlookup with awk: how to match a group/pattern to IDs


I'm using awk and want to do something like a vlookup: match each group ID against the IDs in a master list.

master.txt

1.1.1: name00
1.1.2: name01
1.1.3: name02
1.1.4: name03
1.1.5: name04
1.2.2: name05
1.2.3: name06
1.2.4: name07
1.2.5: name08
1.2.6: name09
1.3.13: name10
1.3.14: name11
1.3.15: name12
1.3.16: name13
1.3.17: name14

group.txt

1.1: groupvalue0
1.2: groupvalue1
1.3: groupvalue2
1.1: groupvalue10
1.3: groupvalue12

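Each group can have multiple values.

For group 1.1, the corresponding master entries are those whose ID matches ^1.1. (matched from the start of the ID only), i.e. 1.1.1 through 1.1.5.

The same applies to 1.2, where ^1.2. matches 1.2.2 through 1.2.6 (again, from the start only).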

Based on these matches, I want the result to be:

name00: groupvalue0  # 1.1.1 just shown for reference
name01: groupvalue0  # 1.1.2 just shown for reference
name02: groupvalue0  # 1.1.3 just shown for reference
name03: groupvalue0  # 1.1.4 just shown for reference
name04: groupvalue0  # 1.1.5 just shown for reference
name05: groupvalue1  # 1.2.2 just shown for reference
name06: groupvalue1  # 1.2.3 just shown for reference
name07: groupvalue1  # 1.2.4 just shown for reference
name08: groupvalue1  # 1.2.5 just shown for reference
name09: groupvalue1  # 1.2.6 just shown for reference
name10: groupvalue2  # 1.3.13 just shown for reference
name11: groupvalue2  # 1.3.14 just shown for reference
name12: groupvalue2  # 1.3.15 just shown for reference
name13: groupvalue2  # 1.3.16 just shown for reference
name14: groupvalue2  # 1.3.17 just shown for reference
name00: groupvalue10  # 1.1.1 just shown for reference
name01: groupvalue10  # 1.1.2 just shown for reference
name02: groupvalue10  # 1.1.3 just shown for reference
name03: groupvalue10  # 1.1.4 just shown for reference
name04: groupvalue10  # 1.1.5 just shown for reference
name10: groupvalue12  # 1.3.13 just shown for reference
name11: groupvalue12  # 1.3.14 just shown for reference
name12: groupvalue12  # 1.3.15 just shown for reference
name13: groupvalue12  # 1.3.16 just shown for reference
name14: groupvalue12  # 1.3.17 just shown for reference

I'm trying the awk code below. But how can I use a pattern with the array?

BEGIN{
    FS=": "
    #print(var)
}
{ 
  if(NR==FNR)         # process first file only
  {                    
    a[$1]=$2;           # hash to a array {id is key, name} value
    next;               # process next record without executing following code
  } else
  {                    # process second file
    pattern= "^"$1"\..*" # eg: ^1\.1\..*
    # can i use pattern in array
    print a[pattern]":",$2  # output name (the value of) from array a and property
  }

} master.txt group.txt
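
A note on the question in the code comment above: awk array subscripts are compared as exact strings, so a[pattern] looks up an element whose index is literally the text ^1.1\..* and finds nothing. One way around this is to loop over the stored IDs and test each one. The sketch below is only an illustration under assumptions: the sample data is a shortened, made-up subset (with an extra 1.11.1 entry to show that it is not confused with group 1.1), and the names dir, out, name and prefix are hypothetical. It uses index() for a literal starts-with test rather than a regexp, so the dots need no escaping. The output is piped through sort because the for-in scan order is unspecified.

```shell
# Assumption: shortened sample data written to a temporary directory.
dir=$(mktemp -d)
printf '%s\n' '1.1.1: name00' '1.1.2: name01' '1.2.2: name05' '1.11.1: name99' > "$dir/master.txt"
printf '%s\n' '1.1: groupvalue0' '1.2: groupvalue1' '1.1: groupvalue10' > "$dir/group.txt"

out=$(awk -F': ' '
NR == FNR { name[$1] = $2; next }         # first file: remember id -> name
{
    prefix = $1 "."                       # e.g. "1.1."
    for (id in name)                      # visit every stored id (order unspecified)
        if (index(id, prefix) == 1)       # true only when id starts with prefix
            print name[id] ": " $2
}' "$dir/master.txt" "$dir/group.txt" | LC_ALL=C sort)

printf '%s\n' "$out"
rm -rf "$dir"
```

This scans the whole master list once per group line, which is fine for small inputs. The answers below avoid the scan by deriving the group key from each master ID instead.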

EDIT1: Since the OP made some changes to the question, adding a solution for the updated requirements here.

awk '
BEGIN{
  OFS=":"
}
FNR==NR{
  a[$1]=(a[$1]?a[$1]",":"")$2
  next
}
{
  sub(/\.[0-9]+:/,OFS,$1)
  key=$1
}
(key in a){
  delete array
  num=split(a[key],array,",")
  for(i=1;i<=num;i++){
    printf("%s%s",$2 OFS " "array[i],i==num?ORS:" ")
  }
}
' groups.txt masters.txt
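
The EDIT1 program above relies on two idioms: appending every value seen for a key to one comma-separated string, then splitting that string back into an array when printing. A minimal sketch of just those two steps, using made-up input lines and the hypothetical names out and parts:

```shell
out=$(printf '%s\n' '1.1: groupvalue0' '1.2: groupvalue1' '1.1: groupvalue10' |
awk '
{
    # Append $2 to whatever is already stored for this key,
    # adding a comma only when a value is already present.
    a[$1] = (a[$1] ? a[$1] "," : "") $2
}
END {
    n = split(a["1.1:"], parts, ",")      # break the stored string apart again
    for (i = 1; i <= n; i++)
        print i, parts[i]
}')

printf '%s\n' "$out"
```

With the default field separator the key keeps its trailing colon ("1.1:"), which is why the EDIT1 program rewrites the first field of each master line into the same shape before the lookup.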

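Explanation: a detailed, commented walkthrough of the first solution, which builds the key from the first 3 characters of each line.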

awk '                         ##Starting awk program from here.
FNR==NR{                      ##Checking condition if FNR==NR which will be TRUE when groups.txt is being read.
  a[$1]=$2                    ##Creating array a with index of $1 and value $2.
  next                        ##next will skip all statements from here.
}
{
  key=substr($0,1,3)":"       ##Creating key which has 1st 3 characters and colon here.
}
(key in a){                   ##Checking key in array a if yes then do following.
  print $2":\t"a[key]         ##Printing 2nd field colon tab and value of a with key index here.
}
' groups.txt masters.txt      ##Mentioning Input_file names here.

2nd solution: slightly different from the first one.

awk '
BEGIN{
  OFS=":"
}
FNR==NR{
  a[$1]=$2
  next
}
{
  sub(/\.[0-9]+:/,OFS,$1)
  key=$1
}
(key in a){
  print $2,"\t"a[key]
}
' groups.txt masters.txt

Explanation: a detailed, commented version of the above.

awk '                           ##Starting awk program from here.
BEGIN{                          ##Starting BEGIN section of this program from here.
  OFS=":"                       ##Setting OFS as : here.
}
FNR==NR{                        ##Checking condition if FNR==NR which will be TRUE when groups.txt is being read.
  a[$1]=$2                      ##Creating array a with index $1 and value $2 here.
  next                          ##next will skip all further statements from here.
}
{
  sub(/\.[0-9]+:/,OFS,$1)       ##Substituting dot digits and colon with OFS in 1st field.
  key=$1                        ##Creating variable key which has 1st field in it.
}
(key in a){                     ##Checking condition if key from current line is present in array a then do following.
  print $2,"\t"a[key]           ##Printing 2nd field tab and value of array a here.
}
' groups.txt masters.txt        ##Mentioning Input_file names here.

A slightly different approach:

awk -F": " '{ a[$1]=$2 }
            END{ for (i in a) { x=substr(i,1,3); 
                                if (x in a && x!=i) { 
                                   print a[i] ":", a[x] }
                              }
            }' master.txt group.txt

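I simply add all the data from both files to one array (a), then check, for each index, whether its first 3 characters also exist as an index.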

The following will work even if your input contains additional :s or spaces beyond those shown in your example:

$ cat tst.awk
BEGIN { OFS=": " }
{
    group = substr($0,match($0,/[0-9]+\.[0-9]+/),RLENGTH)
    name  = substr($0,match($0,/[^:]+: */)+RLENGTH)
}
NR==FNR {
    map[group,++cnt[group]] = name
    next
}
{
    for (i=1; i<=cnt[group]; i++) {
        print map[group,i], name
    }
}
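
A sketch of how the script above might be run. The invocation is not shown here, so passing master.txt first and group.txt second is an assumption, inferred from the NR==FNR block: it stores names from the first file, and they are printed while the second file is read. The data files are shortened from the question's sample, and dir and out are hypothetical names.

```shell
dir=$(mktemp -d)

cat > "$dir/master.txt" <<'EOF'
1.1.1: name00
1.1.2: name01
1.2.2: name05
1.3.13: name10
EOF

cat > "$dir/group.txt" <<'EOF'
1.1: groupvalue0
1.2: groupvalue1
1.3: groupvalue2
1.1: groupvalue10
1.3: groupvalue12
EOF

# Same program as tst.awk above, written out so this sketch is self-contained.
cat > "$dir/tst.awk" <<'EOF'
BEGIN { OFS=": " }
{
    group = substr($0,match($0,/[0-9]+\.[0-9]+/),RLENGTH)   # leading "N.N" of the id
    name  = substr($0,match($0,/[^:]+: */)+RLENGTH)         # text after the first ": "
}
NR==FNR {
    map[group,++cnt[group]] = name    # first file: remember names per group, in order
    next
}
{
    for (i=1; i<=cnt[group]; i++) {   # second file: one output line per stored name
        print map[group,i], name
    }
}
EOF

out=$(awk -f "$dir/tst.awk" "$dir/master.txt" "$dir/group.txt")
printf '%s\n' "$out"
rm -rf "$dir"
```

With this file order the output follows group.txt: every name in a group is printed once for each value that group has.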

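Don't use the word "pattern", as it's ambiguous. Use "string" or "regexp", whichever you mean in each context. That will go a long way toward helping you arrive at a robust solution, because your requirements and code will no longer be ambiguous.

What is ++cnt[group]? Could you show it in a less compact form?

++cnt[1.1] is the same as cnt[1.1]=cnt[1.1]+1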
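
To spell that out a little further, here is a sketch of the same bookkeeping written in long form, with the counter updated on its own line before it is used as the second subscript. The input lines are made up, and out and n are hypothetical names.

```shell
out=$(printf '%s\n' '1.1: a' '1.1: b' '1.3: c' | awk -F': ' '
{
    group = $1
    cnt[group] = cnt[group] + 1     # long form of ++cnt[group]
    n = cnt[group]                  # 1 for the first line of a group, 2 for the second, ...
    map[group, n] = $2              # same effect as map[group, ++cnt[group]] = $2
    print group, n, map[group, n]
}')

printf '%s\n' "$out"
```

An element that has never been assigned evaluates to 0 in a numeric context, so the first line of each group gets the count 1 with no explicit initialisation.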