Shell: separating and counting the elements of a list by condition
I want to separate and count the elements of an input list. input.txt contains two columns: $1 is the element ID and $2 is its ratio (a number). I want to sort these ratios into the following categories:
Greater than or equal to 2
Between 1 and 2
Between 0.5 and 1
Between -0.5 and 0.5
Between -1 and -0.5
Between -2 and -1
Less than or equal to -2
and then print the count for each category to a separate output file, results.txt:
Total 16
> 2 3
1 to 2 1
0.5 to 1 2
-0.5 to 0.5 6
-0.5 to -1 1
-1 to -2 1
< -2 2
I can do this with the following command lines:
awk '$2 > 2 {print $1,$2}' input.txt | wc -l
awk '$2 > 0.5 && $2 < 1 {print $1,$2}' input.txt | wc -l
awk '$2 > -0.5 && $2 < 0.5 {print $1,$2}' input.txt | wc -l
awk '$2 > -0.5 && $2 < -1 {print $1,$2}' input.txt | wc -l
awk '$2 > -1 && $2 < -0.5 {print $1,$2}' input.txt | wc -l
awk '$2 > -2 && $2 < -1 {print $1,$2}' input.txt | wc -l
awk '$2 < -2 {print $1,$2}' input.txt | wc -l
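I think there is a faster way to do this with a shell script using a while or for loop, but I don't know how. Any suggestions would be great.

You only need to process the file once. A simple way is: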
awk '$2>=2{a++;next}
$2>0.5 && $2 <1 {b++;next}
$2>-0.5 && $2 <0.5 {c++;next}
...
$2<=-2{x++;next}
END{print "total:",NR;
print ">2:",a;
print "1-2:",b;
...
print "<-2:",x
}' file
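The elided lines in the snippet above can be filled in along these lines. This is only a sketch: the function name count_ratios is hypothetical, and the boundary handling (each value goes into the first bin whose lower edge it reaches, with exactly -2 falling into the last bin) is an assumption based on the category list in the question.

```shell
#!/bin/sh
# count_ratios FILE: hypothetical helper that bins column 2 of FILE in one pass.
# Assumption: a value belongs to the first bin whose lower edge it reaches.
count_ratios() {
    awk '
        $2 >= 2    { a++; next }   # 2 and above
        $2 >= 1    { b++; next }   # from 1 up to, but not including, 2
        $2 >= 0.5  { c++; next }
        $2 >= -0.5 { d++; next }
        $2 >= -1   { e++; next }
        $2 >  -2   { f++; next }   # strictly above -2
                   { g++ }         # everything else: -2 and below
        END {
            # Adding 0 makes an empty bin print as 0 instead of an empty string.
            print "Total", NR
            print ">= 2", a + 0
            print "1 to 2", b + 0
            print "0.5 to 1", c + 0
            print "-0.5 to 0.5", d + 0
            print "-1 to -0.5", e + 0
            print "-2 to -1", f + 0
            print "<= -2", g + 0
        }
    ' "$1"
}
```

Calling count_ratios input.txt > results.txt would then write the counts to the output file requested in the question.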
One way is to do this with a single awk command, keeping a running count for each category you are interested in:
#!/bin/bash
if [ $# -ne 1 ]
then
echo "Usage: $0 INPUT"
exit 1
fi
awk ' {
if ($2 > 2) count[0]++
else if ($2 > 1) count[1]++
else if ($2 > 0.5) count[2]++
else if ($2 > -0.5) count[3]++
else if ($2 > -1) count[4]++
else if ($2 > -2) count[5]++
else count[6]++
} END {
print " > 2\t", count[0]
print " 1 to 2\t", count[1]
print " 0.5 to 1\t", count[2]
print "-0.5 to 0.5\t", count[3]
print "-1 to -0.5\t", count[4]
print "-2 to -1\t", count[5]
print " < -2\t", count[6]
}' "$1"
You can use sort to order the entries numerically and then count the number of entries in each interval. For example, given your input:
cut -f 2 -d ' ' input.txt | sort -nr | awk '
BEGIN { split("2 1 0.5 -0.5 -1 -2", inter); i = 1; }
{
if (i > 6) { ++c; next; }
if ($1 >= inter[i]) ++c;
else if (i == 1) { print c, "greater than", inter[i++]; c = 1; }
else { print c, "between", inter[i - 1], "and", inter[i++]; c = 1; }
}
END { print c, "lower than", inter[i - 1]; }'
If your input is already sorted, you can even shorten the command line to:
awk 'BEGIN { split("2 1 0.5 -0.5 -1 -2", inter); i = 1; }
{
if (i > 6) { ++c; next; }
if ($2 >= inter[i]) ++c;
else if (i == 1) { print c, "greater than", inter[i++]; c = 1; }
else { print c, "between", inter[i - 1], "and", inter[i++]; c = 1; }
}
END { print c, "lower than", inter[i - 1]; }' input.txt
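This produces the following output, which you can format as needed. Note that the script assumes every interval contains at least one entry: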
3 greater than 2
1 between 2 and 1
2 between 1 and 0.5
6 between 0.5 and -0.5
1 between -0.5 and -1
1 between -1 and -2
2 lower than -2
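The same idea of listing the thresholds once can also be written without sorting, and in a way that copes with empty intervals. The following is a sketch rather than the answerer's method: the function name bin_by_thresholds is hypothetical, and it keeps the convention used above that a value equal to a threshold counts toward the interval above it.

```shell
#!/bin/sh
# bin_by_thresholds FILE: hypothetical helper. The thresholds are listed once,
# in descending order, and a for loop finds the first one each value reaches.
bin_by_thresholds() {
    awk '
        BEGIN { n = split("2 1 0.5 -0.5 -1 -2", inter, " ") }
        {
            # Adding 0 forces a numeric comparison on both sides.
            # If no threshold is reached, the loop ends with i equal to n + 1.
            for (i = 1; i <= n; i++)
                if ($2 + 0 >= inter[i] + 0) break
            count[i]++
        }
        END {
            print count[1] + 0, "greater than or equal to", inter[1]
            for (i = 2; i <= n; i++)
                print count[i] + 0, "between", inter[i - 1], "and", inter[i]
            print count[n + 1] + 0, "lower than", inter[n]
        }
    ' "$1"
}
```

Changing the categories then only means editing the string passed to split; the loop and the printing code stay the same.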
Using script.awk:
{
if ($2>=2) counter1++
else if ($2>=1) counter2++
else if ($2>=0.5) counter3++
else if ($2>=-0.5) counter4++
else if ($2>=-1) counter5++
else if ($2>=-2) counter6++
else counter7++
}
END{
print "Greater than 2: "counter1
print "Between 1 and 2: "counter2
print "Between 0.5 and 1: "counter3
print "Between -0.5 and 0.5: "counter4
print "Between -1 and -0.5: "counter5
print "Between -2 and -1: "counter6
print "Less than -2: "counter7
}
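The answer does not show how the script is invoked. A minimal sketch, assuming the program above has been saved as script.awk in the current directory (the wrapper name run_counts is hypothetical):

```shell
#!/bin/sh
# run_counts INPUT OUTPUT: hypothetical wrapper around awk -f.
# Assumes script.awk exists in the current directory.
run_counts() {
    awk -f script.awk "$1" > "$2"
}
```

For the files named in the question this would be called as run_counts input.txt results.txt.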
The script toto:
awk '
$2>2 { count[1]++; label[1]="Greater than or equal to 2"; }
($2>1 && $2<=2) { count[2]++; label[2]="Between 1 and 2"; }
($2>0.5 && $2<=1) { count[3]++; label[3]="Between 0.5 and 1"; }
($2>-0.5 && $2<=0.5) { count[4]++; label[4]="Between -0.5 and 0.5"; }
($2>-1 && $2<=-0.5) { count[5]++; label[5]="Between -1 and -0.5"; }
($2>-2 && $2<=-1) { count[6]++; label[6]="Between -2 and -1"; }
$2<=-2 { count[7]++; label[7]="Less than or equal to -2"; }
END { for (i=1;i<=7;i++)
{ printf "%-30s %s\n" ,label[i], count[i];
}
}
' /tmp/input.txt
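Fixed. I was aiming for clarity rather than brevity.
Haha! Thanks for your comment! It is up to the OP to choose the best method for their own purposes (yours is concise too!).
Ah, more concise than mine.
+1. This should be the chosen method... it really is what the OP was hoping for with a while/for loop (i.e. a way to separate the procedure/method from the actual values).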
. /tmp/toto
Greater than or equal to 2 3
Between 1 and 2 1
Between 0.5 and 1 2
Between -0.5 and 0.5 6
Between -1 and -0.5 1
Between -2 and -1 1
Less than or equal to -2 2