Sorting multiple tables within the same file using awk


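In my workflow, files are created that contain simple tables with a two-line header (see the end of this post). I want to sort these tables numerically using: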

(head -n 2 && tail -n +3 | sort -n -r) > ordered.txt
This works fine, but I don't know how to split the file so that I can sort each table and print them all to one file. My attempt:

awk '/^TARGET/ {(head -n 2 && tail -n +3 | sort -n -r) >> ordered.txt}' output.txt
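However, this results in an error message. I want to avoid any intermediate output files. What is my awk command missing?

The input file looks like this: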

TARGET  1
Sample1 Sample2 Sample3 Pattern
3   3   3   z..........................Z........................................z.........Z...z
147 171 49  Z..........................Z........................................Z.........Z...Z
27  28  13  z..........................Z........................................z.........z...z
75  64  32  Z..........................Z........................................Z.........z...Z

TARGET  2
Sample1 Sample2 Sample3 Pattern
2   0   1   z..........................z........................................z.........Z...Z
21  21  7   z..........................Z........................................Z.........Z...Z
1   0   0   ...........................Z........................................Z.............Z
4   8   6   Z..........................Z........................................z.........Z...z
2   0   1   Z..........................Z........................................Z.........Z....
1   0   0   z..........................Z........................................Z.............Z
1   0   0   z...................................................................Z.........Z...Z

TARGET  3
Sample1 Sample2 Sample3 Pattern
1   0   0   z..........................Z........................................z.............z
1   3   0   z..........................z........................................Z.........Z...Z
1   1   0   Z..........................Z........................................Z.............z
1   0   0   Z..........................Z........................................Z.............Z
0   1   2   ...........................Z........................................Z.........Z...Z
0   0   1   z..........................z........................................z..............
My output should look like this, with no rows dropped:

TARGET  1
Sample1 Sample2 Sample3 Pattern
147 171 49  Z..........................Z........................................Z.........Z...Z
75  64  32  Z..........................Z........................................Z.........z...Z
27  28  13  z..........................Z........................................z.........z...z
3   3   3   z..........................Z........................................z.........Z...z

TARGET  2
Sample1 Sample2 Sample3 Pattern
21  21  7   z..........................Z........................................Z.........Z...Z
4   8   6   Z..........................Z........................................z.........Z...z
2   0   1   z..........................z........................................z.........Z...Z
2   0   1   z..........................z........................................z.........Z...Z
1   0   0   ...........................Z........................................Z.............Z
1   0   0   ...........................Z........................................Z.............Z
1   0   0   ...........................Z........................................Z.............Z

TARGET  3
Sample1 Sample2 Sample3 Pattern
1   0   0   z..........................Z........................................z.............z
1   0   0   z..........................Z........................................z.............z
1   0   0   z..........................Z........................................z.............z
1   0   0   z..........................Z........................................z.............z
0   1   2   ...........................Z........................................Z.........Z...Z
0   0   1   z..........................z........................................z..............
Requires GNU awk, for its sorted array traversal.

Output:

TARGET  1
Sample1 Sample2 Sample3 Pattern
3   3   3   z..........................Z........................................z.........Z...z
27  28  13  z..........................Z........................................z.........z...z
75  64  32  Z..........................Z........................................Z.........z...Z
147 171 49  Z..........................Z........................................Z.........Z...Z

TARGET  2
Sample1 Sample2 Sample3 Pattern
1   0   0   ...........................Z........................................Z.............Z
1   0   0   z...................................................................Z.........Z...Z
1   0   0   z..........................Z........................................Z.............Z
2   0   1   Z..........................Z........................................Z.........Z....
2   0   1   z..........................z........................................z.........Z...Z
4   8   6   Z..........................Z........................................z.........Z...z
21  21  7   z..........................Z........................................Z.........Z...Z

TARGET  3
Sample1 Sample2 Sample3 Pattern
0   0   1   z..........................z........................................z..............
0   1   2   ...........................Z........................................Z.........Z...Z
1   0   0   Z..........................Z........................................Z.............Z
1   0   0   z..........................Z........................................z.............z
1   1   0   Z..........................Z........................................Z.............z
1   3   0   z..........................z........................................Z.........Z...Z

This is a bit messy, but assuming you don't want to lose records while sorting, it should work (it needs GNU awk for asort):

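 awk 'function sortit(){
           x=asort(a)                                  # gawk only: sort the first-column values, ascending
           for(i=1;i<=x;i++)print b[a[i]" "d[a[i]]++]  # d counts per value, so tied rows are each printed once
           delete a;delete b;delete c;delete d         # reset state for the next table
     }
     /^[0-9]/{a[NR]=$1;b[$1" "c[$1]++]=$0}   # a: first column per row; b: value plus occurrence -> full row
     /TARGET/{print;getline;print}           # pass both header lines through unchanged
     !NF{sortit();print}                     # a blank line ends a table
     END{sortit()}' file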
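The asort() call above is a gawk extension. For an awk without it, one possible alternative is to let awk split the file into tables and hand the rows of each table to the external sort through a pipe, closing the pipe at every blank line so that each table is sorted on its own. This is a minimal sketch, not code from the thread: the function name sort_tables and the demo rows (taken from the question, with shortened patterns) are assumptions, and the order among rows with equal first columns is left to sort.

```shell
#!/bin/sh
# sort_tables FILE: hypothetical helper name, not from the original thread.
# Prints every table with its two header lines intact and its data rows
# sorted in descending numeric order.
sort_tables() {
    awk -v cmd='sort -n -r' '
        /^TARGET/ {                      # first header line of a table
            print
            if ((getline) > 0) print     # second header line (column names)
            fflush()                     # push headers out before sort writes
            next
        }
        NF { print | cmd; next }         # data row: send it to the external sort
        {                                # blank line: the current table is complete
            close(cmd)                   # sort emits the rows of this table now
            print
        }
        END { close(cmd) }               # the last table may lack a trailing blank line
    ' "$1"
}

# Demo input: two small tables with shortened pattern columns.
demo=$(mktemp)
printf '%s\n' \
    'TARGET  1' \
    'Sample1 Sample2 Sample3 Pattern' \
    '3   3   3   z..Z' \
    '147 171 49  Z..Z' \
    '27  28  13  z..z' \
    '' \
    'TARGET  2' \
    'Sample1 Sample2 Sample3 Pattern' \
    '4   8   6   Z..z' \
    '21  21  7   z..Z' > "$demo"
sort_tables "$demo"
rm -f "$demo"
```

Because sort -n -r does the ordering, the rows come out in descending order of the first column, as the question asks, and every input row is passed through exactly once.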
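It's not clear what the output should be.

The output should be like glenn jackman's output, only in descending numeric order.

So you want half of table 2 to disappear?

What your awk command is missing is the awk language. awk is not shell, just as C is not shell. It is a completely separate tool with its own language.

I agree with @Jidder: I can't tell what your output should be. Please explain it clearly when you post.

No, I don't want any line to disappear. I just noticed that glenn's code does that. Thanks for the hint!

Works great. I'm just having trouble understanding these two lines: {table[$1,$2,$3]=$0} and END{output_table()}. Why use GAWK? Is it faster?

First, I use GAWK because I'm on a GNU system, where GAWK is the default awk. However, GAWK implements sorted array traversal, so it is required for this answer. The line {table[$1,$2,$3]=$0} stores the line in the "table" array. The array key is the first 3 columns joined with the SUBSEP awk variable. This is what allows the numeric sort later on. END{output_table()} prints the third table. This is needed because there may be no blank line at the end of the file.

Should half of the records disappear?

Confirmed. Quite right. Answer updated. Only a minor change was needed.

You're right: I don't want to skip any lines, so your code does the job. There is just one typo in your code: a parenthesis is used instead of a curly brace after "END". Thanks for sharing this code!

But it doesn't work for all input files. Lines starting with "0 0 1 x…" get dropped, as in my modified input file!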
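To make the SUBSEP explanation concrete, here is a small sketch in portable awk. It shows that table[$1,$2,$3] is keyed by the three fields joined with SUBSEP, so rows whose first three columns are identical overwrite each other, which may be why rows went missing from table 2. Appending NR to the key is one way to keep every row. The helper name key_demo, the array names, and the shortened pattern columns are illustrative assumptions, not taken from either answer.

```shell
#!/bin/sh
# key_demo: hypothetical helper that reads rows on stdin and reports how many
# distinct array keys result with and without NR in the subscript.
key_demo() {
    awk '
        {
            by_cols[$1, $2, $3] = $0          # key is $1 SUBSEP $2 SUBSEP $3
            by_cols_nr[$1, $2, $3, NR] = $0   # NR makes the key unique per input line
        }
        END {
            n = 0; for (k in by_cols) n++
            m = 0; for (k in by_cols_nr) m++
            key = "1" SUBSEP "0" SUBSEP "0"   # the same key, spelled out by hand
            print "rows read:", NR
            print "keys without NR:", n
            print "keys with NR:", m
            print "explicit SUBSEP key found:", ((key in by_cols) ? "yes" : "no")
        }
    '
}

# The seven data rows of TARGET 2, with shortened pattern columns.
printf '%s\n' \
    '2   0   1   z..Z' \
    '21  21  7   z..Z' \
    '1   0   0   ...Z' \
    '4   8   6   Z..z' \
    '2   0   1   Z...' \
    '1   0   0   z..Z' \
    '1   0   0   z.ZZ' | key_demo
```

With these seven rows only four distinct keys remain when NR is left out, while all seven survive once NR is part of the subscript.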