How to find the X lowest values in a list using bash/awk?
OK, so the problem is that I have a list of N lines like this:
4.96035894 2.94014535 9.71651378 On
8.37470259 9.08139103 10.23145322 Off
5.73085411 4.21656546 9.98718707 On
6.40892867 9.44195654 8.83707549 On
4.26065784 3.74966832 7.89520829 On
8.89601431 9.84208918 9.63054539 On
9.10538764 8.58408119 10.87454882 On
6.21494725 4.61164407 9.08378204 Off
7.62256424 9.59449339 10.84506558 Off
6.49210768 4.03768151 10.75221925 Off
5.04079861 4.99362253 10.34349177 Off
...
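The goal is to find the X lowest values in the third field and switch the last field of those rows to Off. With X=3, for example, the result should look like this:
4.96035894 2.94014535 9.71651378 On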
8.37470259 9.08139103 10.23145322 Off
5.73085411 4.21656546 9.98718707 On
6.40892867 9.44195654 8.83707549 Off // this row is changed
4.26065784 3.74966832 7.89520829 Off // this row is changed
8.89601431 9.84208918 9.63054539 On
9.10538764 8.58408119 10.87454882 On
6.21494725 4.61164407 9.08378204 Off // this row is changed
7.62256424 9.59449339 10.84506558 Off
6.49210768 4.03768151 10.75221925 Off
5.04079861 4.99362253 10.34349177 Off
...
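I think I can handle the specific case of X=1, i.e. the row with the minimum value, but I don't know how to extend it to an arbitrary X. Maybe an array of size X that is filled and edited while walking through the list?
Something like this would do: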
x=3
f=3
awk -v f="$f" '{print $f, NR, $0}' file |
sort -n |
awk -v x="$x" 'NR<=x{sub(/On/,"Off")} {print}' |
sort -k2n |
awk '{sub(/[^ ]+ +[^ ]+ +/,""); print}'
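The pipeline above follows a decorate, sort, undecorate pattern: each line is prefixed with its sort key and its line number, the x lowest keys are edited, the original order is restored from the line number, and the two helper columns are stripped again. A minimal sketch that wraps it in a function is shown here; the name lowest_off and its argument order are assumptions made for illustration, and LC_ALL=C plus the anchored On$ are small hardening choices that are not part of the original answer.

```shell
# Hypothetical helper (not from the original answer):
#   lowest_off X FIELD FILE
# switches a trailing "On" to "Off" on the X rows with the lowest values
# in FIELD, keeping the rows in their original order.
lowest_off() {
    x=$1 f=$2 file=$3
    awk -v f="$f" '{print $f, NR, $0}' "$file" |          # decorate: key, line number, line
    LC_ALL=C sort -n |                                    # lowest keys first
    awk -v x="$x" 'NR<=x{sub(/On$/,"Off")} {print}' |     # edit the first x rows
    LC_ALL=C sort -k2n |                                  # restore the original order
    awk '{sub(/[^ ]+ +[^ ]+ +/,""); print}'               # undecorate: drop key and line number
}

# Usage: lowest_off 3 3 file
```

Forcing the C locale keeps sort -n from misreading the decimal point in locales that use a comma.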
Interesting problem; you need some clever use of arrays:
BEGIN {
    if (!x)         # If x wasn't set using -v, default to 3
        x = 3
    if (!field)     # If field wasn't set using -v, default to 3
        field = 3
}
{
    lines[NR] = $0          # Store each line in an array
    sort[NR] = $field       # Store the field in an array
    field_a[$field] = $0    # Line lookup on field (assumes the values are unique)
}
END {
    asort(sort)                                   # Sort the field values (GNU awk only)
    for (j = 1; j <= NR; j++) {                   # For every line in the file
        for (i = 1; i <= x; i++) {                # For the x lowest values
            if (lines[j] == field_a[sort[i]]) {   # If the current line is among them
                sub(/On/, "Off", lines[j])        # Do the substitution
                break                             # Move on to the next line
            }
        }
        print lines[j]                            # Print the line
    }
}
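By default it switches off the 3 lowest values in field 3, but you can use the -v option to specify the field and the number of values. For example, let's switch off the 10 lowest values in field 3, leaving only the maximum switched on: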
$ awk -v x=10 -f script.awk file
4.96035894 2.94014535 9.71651378 Off
8.37470259 9.08139103 10.23145322 Off
5.73085411 4.21656546 9.98718707 Off
6.40892867 9.44195654 8.83707549 Off
4.26065784 3.74966832 7.89520829 Off
8.89601431 9.84208918 9.63054539 Off
9.10538764 8.58408119 10.87454882 On
6.21494725 4.61164407 9.08378204 Off
7.62256424 9.59449339 10.84506558 Off
6.49210768 4.03768151 10.75221925 Off
5.04079861 4.99362253 10.34349177 Off
And the same for the maximum value in field 2:
$ awk -v x=10 -v field=2 -f script.awk file
4.96035894 2.94014535 9.71651378 Off
8.37470259 9.08139103 10.23145322 Off
5.73085411 4.21656546 9.98718707 Off
6.40892867 9.44195654 8.83707549 Off
4.26065784 3.74966832 7.89520829 Off
8.89601431 9.84208918 9.63054539 On
9.10538764 8.58408119 10.87454882 Off
6.21494725 4.61164407 9.08378204 Off
7.62256424 9.59449339 10.84506558 Off
6.49210768 4.03768151 10.75221925 Off
5.04079861 4.99362253 10.34349177 Off
Note: using the asort() function requires GNU awk.
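Where GNU awk is not available, the same selection can be done without asort(). The sketch below is one possible alternative rather than the answerer's method: it runs only the first x rounds of a selection sort over the stored values and remembers the chosen line numbers, so it does not depend on the values being unique. The function name lowest_off_portable and the array name picked are assumptions made for illustration.

```shell
# Hypothetical helper (not from the original answer):
#   lowest_off_portable X FIELD FILE
# Same idea as the asort() script, written without GNU extensions.
lowest_off_portable() {
    x=$1 field=$2 file=$3
    awk -v x="$x" -v field="$field" '
        { line[NR] = $0; val[NR] = $field + 0 }   # keep every line and its numeric key
        END {
            # Pick the smallest not-yet-picked value, x times
            for (k = 1; k <= x && k <= NR; k++) {
                best = 0
                for (j = 1; j <= NR; j++)
                    if (!(j in picked) && (best == 0 || val[j] < val[best]))
                        best = j
                picked[best]                      # remember the line number
            }
            for (j = 1; j <= NR; j++) {
                if (j in picked)
                    sub(/On$/, "Off", line[j])
                print line[j]
            }
        }
    ' "$file"
}

# Usage: lowest_off_portable 3 3 file
```

This costs roughly x passes over the stored values, which is fine for small x but slower than a real sort when x approaches the number of lines. Ties are broken in favour of the earlier line.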
And another approach:
n=4
field=3
newval=FOO
# find the line numbers that need to be updated
set -- $(
cat -n file |
sort -nk $((++field)),$field |
awk -v n=$n 'FNR <= n {print $1}'
)
# now, update the value for the specific lines
awk -v val="$newval" -v lines=" $* " 'lines ~ " "FNR" " {$NF = val} 1' file
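The second awk call above tests whether the current line number is in the list by matching " "FNR" " against a space-padded string. The padding on both sides is what stops line 1 from matching an entry such as 11. A small sketch of just that membership test follows; the helper name pick_lines is an assumption made for illustration.

```shell
# Hypothetical helper (not from the original answer):
# print only the input lines whose numbers appear in the
# space-separated list given as the first argument.
pick_lines() {
    awk -v lines=" $1 " 'lines ~ " " FNR " " {print FNR ": " $0}'
}

# Usage: printf '%s\n' a b c d e f g h i j k | pick_lines "4 11"
```

With the list "4 11" this selects lines 4 and 11 only. Without the surrounding spaces, the pattern for line 1 would also match inside 11.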
Another approach, which reads the file twice and sorts as it goes:
awk '
NR==FNR {
    S[0] = $field
    # sort the value into place
    for (i = 1; i <= n; i++) {
        if (S[i-1] > S[i]) {
            c = S[i-1]
            S[i-1] = S[i]
            S[i] = c
        }
    }
    # shift the highest value into oblivion
    if (NR > n) for (i = n; i >= 1; i--) S[i] = S[i-1]
    next
}
# Create associative array entries for the values
FNR==1 {
    for (i = 1; i <= n; i++) {
        A[S[i]]
    }
}
# if $field is one of the values then change the last field (assuming there are no other fields with value of $NF)
$field in A {
    sub($NF, "Off")
}
1
' n=3 field=3 file file
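Comments:
So, what have you tried yourself so far? What specifically can't you manage? Can you give an example of what your expected output should look like? Show us what you did for X=1 and we'll try to extend it for you.
You may want to throw a column -t on the end to sort out the formatting, but note that the third column will then be left-aligned rather than right-aligned as in the original file. I also changed $6 to $NF so that the solution works for files with more than 4 fields.
@sudo_O: good points, I've updated my answer to address both.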