Shell: merging rows by a duplicated column in awk/unix, with a condition


How can I process this data with awk/sed/unix commands? My data looks like this:

/abc/def1.0/Acc101 500 50
/abc/def1.0/Acc101 401 27
/abc/def1.0/Acc101 200 101
/abc/def1.0/Acc201 200 4
/abc/def1.0/Acc301 304 2
/abc/def1.0/Acc401 200 204

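For each unique string in the first column ($1), how can we merge the rows, with the counts split by the value of $2? Column $2 is a status code: 200 means success, anything else means failure. $3 is the event count.

Below is the sample output: we group by distinct $1, check whether $2 is 200 or not, and sum the counts in $3 for each case. Sample: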

/abc/def1.0/Acc101 101 77
/abc/def1.0/Acc201 4 0
/abc/def1.0/Acc301 0 2
/abc/def1.0/Acc401 204 0

About the row /abc/def1.0/Acc101 and its value 77:

77 is the sum 50 + 27 taken from $3, over the rows where $2 != 200.

Thanks a lot for your help.

You could try the following, which reads the input file twice:

awk '
FNR==NR{
  mainarray[$1]
  if($2!=200){
    sum[$1]+=$NF
  }
  if($2==200){
    Found200[$1]+=$NF
  }
  next
}
($1 in mainarray) && !($1 in Found200){
  print $1,0,sum[$1]!=""?sum[$1]:0
  next
}
$2==200{
  print $1,Found200[$1]!=""?Found200[$1]:0,sum[$1]!=""?sum[$1]:0
}
'  Input_file  Input_file

Explanation: the same script as above, with detailed comments.

awk '                                                           ##Start of the awk program.
FNR==NR{                                                        ##FNR==NR is TRUE only while Input_file is being read the first time.
  mainarray[$1]                                                 ##Create an entry in mainarray indexed by $1.
  if($2!=200){                                                  ##If the 2nd field is not 200, do the following.
    sum[$1]+=$NF                                                ##Add the last column to the sum array, indexed by $1.
  }
  if($2==200){                                                  ##If the 2nd field is equal to 200, do the following.
    Found200[$1]+=$NF                                           ##Add the last column to the Found200 array, indexed by $1.
  }
  next                                                          ##next skips all further statements for this line.
}
($1 in mainarray) && !($1 in Found200){                         ##Second read: $1 is in mainarray and NOT in Found200.
  print $1,0,sum[$1]!=""?sum[$1]:0                              ##Print the first field, zero, and the sum for $1.
  next                                                          ##next skips all further statements for this line.
}
$2==200{                                                        ##If the 2nd field is 200, do the following.
  print $1,Found200[$1]!=""?Found200[$1]:0,sum[$1]!=""?sum[$1]:0   ##Print the first field, the Found200 value, and the sum value.
}
' Input_file  Input_file                                        ##Input_file is named twice so that it is read twice.
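One caveat: on its second pass, the two-pass script prints once per matching input row. A key with several non-200 rows and no 200 row, or with several 200 rows, would therefore be printed more than once. The sample data does not trigger this. The following is a minimal sketch of one way to guard against it, not the original author's code: the array names ok, fail, and seen, the temporary file, and the Acc501 rows are all assumptions made for illustration.

```shell
# Hypothetical input: Acc501 has two failure rows and no 200 row.
input=$(mktemp)
printf '%s\n' \
  '/abc/def1.0/Acc101 500 50' \
  '/abc/def1.0/Acc101 200 101' \
  '/abc/def1.0/Acc501 500 3' \
  '/abc/def1.0/Acc501 404 4' > "$input"

# Pass 1 accumulates the per-key sums.
# Pass 2 prints each key only the first time it is seen.
result=$(awk '
FNR==NR{
  if($2==200){ ok[$1]+=$NF } else { fail[$1]+=$NF }
  next
}
!seen[$1]++{
  print $1, ok[$1]+0, fail[$1]+0   # +0 turns a missing entry into 0
}
' "$input" "$input")

rm -f "$input"
printf '%s\n' "$result"
```

Each key is printed once, in order of first appearance, so the two passes still preserve the input order.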

Much the same idea, in a single pass:

awk '{ groups[$1] = 1; if ($2 == 200) succ[$1] += $3; else fail[$1] += $3 }
     END { PROCINFO["sorted_in"] = "@ind_str_asc"
           for (g in groups) print g, succ[g]+0, fail[g]+0 }' input.txt
/abc/def1.0/Acc101 101 77
/abc/def1.0/Acc201 4 0
/abc/def1.0/Acc301 0 2
/abc/def1.0/Acc401 204 0

If you use GNU awk, the PROCINFO line produces sorted output; otherwise the order is arbitrary, and if you want it sorted you can pipe the output to sort.
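For an awk without PROCINFO support, the sort-based variant could look like the sketch below. It is the same one-pass script with the PROCINFO line dropped and the output piped through sort. The temporary file and the shuffled row order are assumptions for illustration, and LC_ALL=C is set only to make the ordering locale-independent.

```shell
# The question's sample rows, deliberately out of order.
input=$(mktemp)
printf '%s\n' \
  '/abc/def1.0/Acc301 304 2' \
  '/abc/def1.0/Acc101 500 50' \
  '/abc/def1.0/Acc101 401 27' \
  '/abc/def1.0/Acc101 200 101' \
  '/abc/def1.0/Acc401 200 204' \
  '/abc/def1.0/Acc201 200 4' > "$input"

# Sum successes and failures per key, then let sort order the result.
sorted=$(awk '{ groups[$1] = 1; if ($2 == 200) succ[$1] += $3; else fail[$1] += $3 }
     END { for (g in groups) print g, succ[g]+0, fail[g]+0 }' "$input" | LC_ALL=C sort)

rm -f "$input"
printf '%s\n' "$sorted"
```

Because the key is the first field and every key appears on only one output line, a plain line sort is enough here.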