Shell: processing two files in awk and printing content

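I am referring to this link. I have a config file in which column 2 is the feature and column 3 is the action. I also have a large file; I need to match column 1 of this file against column 1 of the config file and perform the action according to the feature.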

Assumption: in File.txt, the columns are named Min (third column), Median (fourth column), and Max (fifth column).

Config.txt

Apple  All  Max
Car    abc  Median
Car    xyz  Min
Book   cvb  Median
Book   pqr  Max
File.txt

Apple  first   10  20  30
Apple  second  20  30  40
Car    abc     10  20  30
Car    xyz     20  30  40
Car    wxyz    10  20  30
Book   cvb     60  70  80
Book   pqr     80  90  100
Expected output:

Apple  first   30
Apple  second  40
Car    abc     20
Car    xyz     20
Car    wxyz    10
Book   cvb     70
Book   pqr     100
The above output is generated as follows:

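1) Since file.txt is large, if the feature (second column) in the config file is All, then every row whose first column matches has the action from the third column of the config file applied.

2) Otherwise, if the second column of the config file matches as a **substring** of the second column of file.txt, the action from that config row is applied.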

Here is what I tried:

awk 'BEGIN {m["Min"]=3;m["Median"]=4;m["Max"]=5}

        NR==FNR{ arr[$1]=$2;brr[$1]=$3;next}
                ($1 in arr && arr[$1]=="All") {print $1,$2,$m[brr[$1]]}
                ($1 in arr && $2==arr[$1] ) {print $1 ,$2,$m[brr[$1]]}
' Config.txt File.txt
Code output:

Apple  first   30
Apple  second  40
Book   pqr     100
Car    xyz     20 
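The above output prints only one entry per matched column 1 value (e.g. Book cvb 70 is not printed). Also, how can I match the string as an ending string (e.g. xyz defined in config.txt should match both xyz and wxyz in file.txt)?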


Please help me solve the above issue. Thanks!
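The missing rows can be traced to how the attempt stores the config: arr[$1] and brr[$1] are keyed on column 1 alone, so a later config row with the same key overwrites the earlier one. A minimal demonstration of that effect follows; show_last_row is a hypothetical name used only for illustration.

```shell
# show_last_row is a hypothetical helper name, not part of the original attempt.
# It reads config rows on stdin and reports what survives for the key "Car".
show_last_row() {
  awk '{ arr[$1] = $2; brr[$1] = $3 }          # same assignments as the attempt
       END { print "Car", arr["Car"], brr["Car"] }'
}

printf '%s\n' 'Car abc Median' 'Car xyz Min' | show_last_row
# prints: Car xyz Min
```

Only the last row per key is kept, which is why Car abc and Book cvb never match in the attempt.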

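Your expected sample output does not match the shown sample input files (e.g. --> Car abc 200: there is no 200 in file.txt). If I understood it correctly, please try the following: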

awk '
BEGIN{
  b["min"]=3
  b["max"]=5
  b["median"]=4
}
FNR==NR{
  c[$1]
  ++d[$1]
  a[$1 d[$1]]=tolower($NF)
  next
}
($1 in c){
  if(e[$1]<d[$1]){
      ++e[$1]
  }
  else{
      e[$1]!=""?e[$1]:++e[$1]
  }
  print $1,$2,$b[a[$1 e[$1]]]
}' config.txt file.txt
Explanation: adding an explanation for the above code.

awk '                                       ##Starting awk program here.
BEGIN{                                      ##Mentioning BEGIN section here which will be executed once and before reading Input_file only.
  b["min"]=3                                ##Creating an array named b whose index is string min and value is 3.
  b["max"]=5                                ##Creating an array named b whose index is string max and value is 5.
  b["median"]=4                             ##Creating an array named b whose index is string median and value is 4.
}                                           ##Closing BEGIN section here.
FNR==NR{                                    ##Checking condition FNR==NR which will be executed when 1st Input_file named config.txt is being read.
  c[$1]                                     ##Creating an array named c whose index is $1.
  ++d[$1]                                   ##Creating an array named d with index $1 whose value increases by 1 on each occurrence of $1.
  a[$1 d[$1]]=tolower($NF)                  ##Creating an array named a whose index is $1 concatenated with d[$1] and whose value is the lowercase form of $NF (last column) of the current line.
  next                                      ##Using next keyword of awk to skip all further statements from here.
}
($1 in c){                                  ##Checking conditions if $1 of current line is present of array c then do following.
  if(e[$1]<d[$1]){                          ##Checking condition if value of e[$1] is lesser than d[$1] then do following.
      ++e[$1]                               ##Creating array named e whose index is $1 and incrementing its value with 1 here.
  }
  else{                                     ##Using else for above if condition here.
      e[$1]!=""?e[$1]:++e[$1]               ##Checking if e[$1] is NULL then increment it with 1 or leave it as it is.
  }
  print $1,$2,$b[a[$1 e[$1]]]               ##Printing 1st, 2nd fields value along with field value of array b whose index is value of array a with index of $1 e[$1] here.
}' config.txt file.txt                      ##Mentioning Input_files here.
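The code above pairs config rows with data rows by their order of appearance. As an alternative, here is a minimal sketch that applies the two rules from the question directly: All matches every row for the key, and otherwise the config feature must be an ending string of column 2. The function name apply_config and the array names col, cnt, feat and act are assumptions made for this sketch, not part of the original answer.

```shell
# apply_config is a hypothetical helper; $1 is the config file, $2 the data file.
# Usage: apply_config Config.txt File.txt
apply_config() {
  awk '
  BEGIN { col["min"] = 3; col["median"] = 4; col["max"] = 5 }
  NR == FNR {                        # first file: the config
    n = ++cnt[$1]                    # number of config rows seen for this key
    feat[$1, n] = $2                 # keep every feature, not only the last one
    act[$1, n]  = tolower($3)        # action name, normalized to lowercase
    next
  }
  ($1 in cnt) {                      # second file: the data
    for (i = 1; i <= cnt[$1]; i++) {
      f = feat[$1, i]
      is_all    = (tolower(f) == "all")
      is_suffix = (length($2) >= length(f) &&
                   substr($2, length($2) - length(f) + 1) == f)
      if (is_all || is_suffix) {     # rule 1 or rule 2; first matching row wins
        print $1, $2, $(col[act[$1, i]])
        break
      }
    }
  }' "$1" "$2"
}
```

With the sample files from the question this should print the seven expected rows, separated by single spaces. The sketch assumes a non-empty config file and action names limited to Min, Median and Max.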
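In your assumption, perhaps you meant File.txt rather than Config.txt (?)

Yes, in File.txt.

@RATNESHTIWARI, please be careful when posting samples, since the line Car abc 200 does not match your first input file; if that is the case, please update your samples.

@RavinderSingh13, thanks, I updated it. Thanks for the quick reply!! Could you briefly explain the code, and tolower here? Please explain your approach!!

@RATNESHTIWARI, yes, the explanation is in progress. By the way, I use tolower to convert the string to lowercase, so that differently cased spellings of a value such as MAX in your file do not cause confusion. Because it converts them all to lowercase, I do not need to hard-code each variant that has to match.

@RATNESHTIWARI, the full explanation has been added now, please check it.

Thanks for the explanation above. But the explanation should cover the purpose of the variables/arrays used above and their effect on the output.

@RATNESHTIWARI, the purpose is to match the OP's/your expected output. If you read the explanation more carefully you may get the gist of it; let me know what confuses you and I will try to resolve it.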
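The tolower point from the comments can be illustrated with a small sketch: the lookup table holds lowercase keys only, and each action name is lowercased before the lookup. The function name action_column is hypothetical, and the mixed-case input rows are made up for the demonstration.

```shell
# action_column is a hypothetical helper name used only for this illustration.
# It reads config rows on stdin and prints the data column each action maps to.
action_column() {
  awk 'BEGIN { b["min"] = 3; b["max"] = 5; b["median"] = 4 }   # lowercase keys only
       { print $3, "->", b[tolower($3)] }'                     # normalize before lookup
}

printf '%s\n' 'Apple All MAX' 'Car abc Median' 'Car xyz min' | action_column
# prints:
# MAX -> 5
# Median -> 4
# min -> 3
```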