awk: subtract the next row from the current row when the first two columns match
We want to subtract the $17 values of rows whose $1 and $2 are the same. Input:
targetID,cpd_number,Cell_assay_id,Cell_alt_assay_id,Cell_type_desc,Cell_Operator,Cell_result_value,Cell_unit_value,assay_id,alt_assay_id,type_desc,operator,result_value,unit_value,Ratio_operator,Ratio,log_ratio,Cell_experiment_date,experiment_date,Cell_discipline,discipline
111,CPD-123456,2222,1111,IC50,,6.1,uM,1183,1265,Ki,,0.16,uM,,38.125,1.7511,2003-03-03 00:00:00,2003-02-10 00:00:00,Cell,Enzyme
111,CPD-123456,2222,1111,IC50,,9.02053,uM,1183,1265,Ki,,0.16,uM,,56.3783,-1.5812,2003-02-27 00:00:00,2003-02-10 00:00:00,Cell,Enzyme
111,CPD-777888,3333,4444,IC50,,6.1,uM,1183,1265,Ki,,0.16,uM,,38.125,-1,2003-03-03 00:00:00,2003-02-10 00:00:00,Cell,Enzyme
111,CPD-777888,3333,4444,IC50,,9.02053,uM,1183,1265,Ki,,0.16,uM,,56.3783,-3,2003-02-27 00:00:00,2003-02-10 00:00:00,Cell,Enzyme
The expected output is 3.3323 (1.7511 - (-1.5812)) and 2 (-1 - (-3)).
First attempt:
awk -F, ' last != $1""$2 && last{   # ONLY when the last key "targetID + cpd_number"
    print C                          # differs from the current one: print the subtraction
    C=0}                             # reset accumulator
{                                    # This block processes each line of the input file
    C -= $17                         # update C
    line=$0                          # keep the current line
    last=$1""$2}                     # Store the key in order to track switching
END{                                 # This block triggers after the complete file is read
                                     # to print the last result, which cannot be triggered during
                                     # the previous block
    print C}' input
It gives the following output:
-0.1699
4
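A likely reading of these two numbers, offered as a sketch rather than a full diagnosis: C starts at 0 and every row subtracts its $17, so each group prints -(a + b) instead of a - b. The snippet replays that accumulator on the log_ratio values alone and reproduces -0.1699 and 4. The helper name `negated_sum` is made up for illustration.

```shell
# Hypothetical helper: replay the first attempt's accumulator (C -= value)
# on the log_ratio values of a single group, passed as arguments.
negated_sum() {
    printf '%s\n' "$@" | awk '{ C -= $1 } END { print C }'
}

negated_sum 1.7511 -1.5812   # computes -(1.7511 + -1.5812)
negated_sum -1 -3            # computes -(-1 + -3)
```

To get a - b, the first row of each key has to initialise the accumulator instead of being subtracted from 0.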
Second attempt:
#!/bin/bash
tail -n+2 test > test2 # remove the title/header
awk -F, '$1 == $1 && $2 == $2 {print $17}' test2 >> test3 # print $17 if the $1 and $2 are the same
awk 'NR==1{s=$1;next}{s-=$1}END{print s}' test3
rm test2 test3
test3 will be
1.7511
-1.5812
-1
-3
The output is
7.3323
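One way to see why this yields 7.3323: the condition `$1 == $1 && $2 == $2` compares each field with itself, so it is true for every row and never looks at a neighbouring row. All four values therefore land in test3, and the running subtraction computes 1.7511 - (-1.5812) - (-1) - (-3). The small check below illustrates the first point; `count_self_matches` is a hypothetical name and the three-row input is invented.

```shell
# Count how many rows satisfy the second attempt's condition.
# Every row matches, because each field is compared with itself.
count_self_matches() {
    awk -F, '$1 == $1 && $2 == $2 { n++ } END { print n + 0 }'
}

# Three rows with different keys: all three still match.
printf '%s\n' 'a,x' 'a,y' 'b,z' | count_self_matches
```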
Could any gurus offer some comments? Thanks.
You can try the awk command below:
awk '
BEGIN { FS = "," }
NR == 1 { next } # skip header line
{ # accumulate totals
if ($1 SUBSEP $2 in a) # if key already exists
a[$1,$2] -= $17 # subtract $17 from value
else # if first appearance of this key
a[$1,$2] = $17 # set value to $17
}
END { # print results
for (x in a)
print a[x]
}
' file
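Note that `for (x in a)` visits keys in an unspecified order, so the results may not come out in input order. A minimal sketch of one way around that, assuming the goal is first-seen order: record each new key in a second, integer-indexed array. The names `diff_by_key_in_order`, `total` and `order` are illustrative and not part of the original answer.

```shell
# Sketch: same accumulation as above, but remember the order in which
# keys first appear so results print in input order.
diff_by_key_in_order() {
    awk -F, '
        NR == 1 { next }                  # skip header line
        {
            key = $1 SUBSEP $2
            if (key in total) {
                total[key] -= $17         # later rows: subtract
            } else {
                total[key] = $17          # first row of this key
                order[++n] = key          # remember first-seen order
            }
        }
        END {
            for (i = 1; i <= n; i++)
                print total[order[i]]
        }
    ' "$1"
}
```

Called as `diff_by_key_in_order file`, this should print one difference per key, in the order the keys first appear in the file.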
$ awk -F, 'NR==1{next} {var=$1; foo=$2; bar=$17; getline;} $1==var && $2==foo{xxx=bar-$17; print xxx}' file
3.3323
2
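The getline version consumes rows in fixed pairs (rows 1-2, 3-4, and so on), so a key that appears only once shifts every later pair. Also, with an odd number of data rows, getline fails at end of input and leaves $0 unchanged, so the last row is compared with itself and a spurious 0 is printed. The sketch below avoids getline by remembering the previous row. It assumes each difference is formed from two consecutive rows with the same key; `pairwise_diff` and its variable names are hypothetical.

```shell
# Sketch: pair each row with the stored previous row when the keys match.
pairwise_diff() {
    awk -F, '
        NR == 1 { next }                       # skip header line
        have_prev && $1 == k1 && $2 == k2 {    # same key as the stored row
            print v - $17
            have_prev = 0                      # pair consumed
            next
        }
        { k1 = $1; k2 = $2; v = $17; have_prev = 1 }   # store as candidate first row
    ' "$1"
}
```

A row without a partner is simply replaced by the next row as the new candidate, so the pairing does not drift.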
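Please remove ** from your input.
Thanks for your comment. However, the output becomes "2 3.3323" instead of "3.3323 2".
Thanks for your comment! It can be simplified to "awk -F, 'NR==1{next}{bar=$17;getline;}$1==$1&&$2=$2{xxx=bar-$17;print xxx}'". I think my script was missing "getline". Even after reading the getline manual, I still do not quite understand what getline does here. Could you explain? Thanks.
@Chubaka The getline function fetches the next line. After the first row's column 1 and column 2 values are assigned to var and foo, and $17 to bar, getline reads the next line: that line is assigned to $0, its first column to $1, and so on. I think you understand now.
I see. So "getline" reads the next line, while "next" just jumps to the next line and ignores the current one, right?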
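The difference between the two can be seen on a toy input. In this sketch (the function names are made up), plain `getline` pulls the following record into $0 while staying inside the current rule, whereas `next` abandons the current record and restarts the rules with the following one.

```shell
# getline: read the following record into $0 inside the same rule,
# so one pass through the rule sees two consecutive lines.
pair_with_getline() {
    awk '{ first = $0; if ((getline) > 0) print first "+" $0 }'
}

# next: stop processing the current record and start the rules over
# with the following record. Here odd-numbered lines are skipped.
skip_with_next() {
    awk 'NR % 2 == 1 { next } { print }'
}

printf '%s\n' a b c d | pair_with_getline   # a+b, then c+d
printf '%s\n' a b c d | skip_with_next      # b, then d
```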