Shell: merging rows by a duplicated column in awk/unix, with a condition
How can I use awk/sed/unix commands to process this data? My data looks like this:
/abc/def1.0/Acc101 500 50
/abc/def1.0/Acc101 401 27
/abc/def1.0/Acc101 200 101
/abc/def1.0/Acc201 200 4
/abc/def1.0/Acc301 304 2
/abc/def1.0/Acc401 200 204
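For each unique string in the first column ($1), how can we merge the counts, split by the value in $2? Column $2 is a status code: 200 means success and anything else means failure. $3 is the event count.
Below is the sample output: we take the distinct values of $1, check whether $2 is 200 or not, and sum the counts in $3 for each case. Sample: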
/abc/def1.0/Acc101 101 77
/abc/def1.0/Acc201 4 0
/abc/def1.0/Acc301 0 2
/abc/def1.0/Acc401 204 0
Explanation of this row:
/abc/def1.0/Acc101 77
77 = 50+27, the sum of the $3 values whose $2 != 200.
Thanks a lot for your help.
You could read Input_file twice and try the following:
awk '
FNR==NR{
  mainarray[$1]
  if($2!=200){
    sum[$1]+=$NF
  }
  if($2==200){
    Found200[$1]+=$NF
  }
  next
}
($1 in mainarray) && !($1 in Found200){
  print $1,0,sum[$1]!=""?sum[$1]:0
  next
}
$2==200{
  print $1,Found200[$1]!=""?Found200[$1]:0,sum[$1]!=""?sum[$1]:0
}
' Input_file Input_file
Explanation: a commented version of the above.
awk '                                       ##Starting awk program from here.
FNR==NR{                                    ##FNR==NR is TRUE only while Input_file is being read the first time.
  mainarray[$1]                             ##Creating an entry in mainarray with index $1.
  if($2!=200){                              ##Checking condition if 2nd field is NOT equal to 200 then do following.
    sum[$1]+=$NF                            ##Adding last column value to array sum at index $1.
  }
  if($2==200){                              ##Checking condition if 2nd field is equal to 200 then do following.
    Found200[$1]+=$NF                       ##Adding last column value to array Found200 at index $1.
  }
  next                                      ##next will skip all further statements from here.
}
($1 in mainarray) && !($1 in Found200){     ##Second read: $1 is present in mainarray and NOT present in Found200.
  print $1,0,sum[$1]!=""?sum[$1]:0          ##Printing first field, zero and the value of sum for $1.
  next                                      ##next will skip all further statements from here.
}
$2==200{                                    ##Checking condition if 2nd field is 200 then do following.
  print $1,Found200[$1]!=""?Found200[$1]:0,sum[$1]!=""?sum[$1]:0 ##Printing first field, Found200 value and sum value.
}
' Input_file Input_file                     ##Mentioning Input_file name twice so it is read two times.
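One thing to watch with the two-pass script above: the second pass prints once per matching record, so a key with several 200 rows, or several non-200 rows and no 200 row, would appear more than once in the output. A minimal sketch of a variant that keeps the two-pass structure but prints each key only once, in input order, is shown here. The function name merge_two_pass, the array names ok, bad and seen, and the file name sample_input.txt are illustrative assumptions, not part of the original answer.

```shell
# Sample data from the question, written to a scratch file (file name is an assumption).
cat > sample_input.txt <<'EOF'
/abc/def1.0/Acc101 500 50
/abc/def1.0/Acc101 401 27
/abc/def1.0/Acc101 200 101
/abc/def1.0/Acc201 200 4
/abc/def1.0/Acc301 304 2
/abc/def1.0/Acc401 200 204
EOF

# Hypothetical helper: reads the same file twice, like the answer above.
merge_two_pass() {
  awk '
    FNR==NR {                         # first pass: accumulate per-key totals
      if ($2 == 200) ok[$1]  += $3    # success counts
      else           bad[$1] += $3    # failure counts
      next
    }
    !seen[$1]++ {                     # second pass: act on the first occurrence of each key only
      print $1, ok[$1]+0, bad[$1]+0   # +0 turns an unset total into 0
    }
  ' "$1" "$1"
}

merge_two_pass sample_input.txt
```

On the sample data this should give the same four rows as the expected output in the question. The `!seen[$1]++` guard is what stops repeated keys from being printed again.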
Something like this:
awk '{ groups[$1] = 1; if ($2 == 200) succ[$1] += $3; else fail[$1] += $3 }
END { PROCINFO["sorted_in"] = "@ind_str_asc"
for (g in groups) print g, succ[g]+0, fail[g]+0 }' input.txt
/abc/def1.0/Acc101 101 77
/abc/def1.0/Acc201 4 0
/abc/def1.0/Acc301 0 2
/abc/def1.0/Acc401 204 0
If you are using GNU awk, the PROCINFO line makes the output sorted; otherwise the order is arbitrary, and if you want it sorted you can pipe the output to sort.
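For awks other than GNU awk, the sort-by-pipe route mentioned above could look roughly like this. It is only a sketch: the function name merge_sorted and the file name shuffled_input.txt are made up for illustration, and the sample rows are the question's data in a deliberately shuffled order so the effect of sort is visible.

```shell
# The question's rows in shuffled order (file name is an assumption).
cat > shuffled_input.txt <<'EOF'
/abc/def1.0/Acc401 200 204
/abc/def1.0/Acc101 500 50
/abc/def1.0/Acc301 304 2
/abc/def1.0/Acc101 200 101
/abc/def1.0/Acc201 200 4
/abc/def1.0/Acc101 401 27
EOF

# Hypothetical helper: single pass without PROCINFO, then an external sort on the key column.
merge_sorted() {
  awk '
    { groups[$1] = 1                                   # remember every key
      if ($2 == 200) succ[$1] += $3; else fail[$1] += $3 }
    END { for (g in groups) print g, succ[g]+0, fail[g]+0 }   # iteration order is unspecified
  ' "$1" | LC_ALL=C sort -k1,1                         # byte-order sort on the first field
}

merge_sorted shuffled_input.txt
```

LC_ALL=C is set only to make the ordering independent of the user's locale; drop it if locale-aware ordering is preferred.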