Awk: extract each line and append it to a file named after one of its columns


How can I extract each line from the file below and append it to a file named after the value in column 3 (for example 主干线3710)?

HWI-ST9450069:2:1101:17889:2254#GNNNNN/11 16主干线3710113730 1 28M1D61M11M1D11M*0 AATAGGAAGCCGGCTATCGAAGAGAGAGAGAGAGAGAGAGAGAGAGAGGCCATACACACACAATCT*NM:1:5ms:144 AS:i:148NN:i:0TP:A:P:cm:i:5S1:i:40 s2:i:59 de:0.0490:i:0
HWI-ST945_0069:2:1101:17753:2257#GNNNNN/11 16主干线2546 1217877 23 S62M114M*0 ATTTGTGTGTGTGTTTTCATGTTGCATGCATGGCAAGGTCATAAAAAAATCAAAAGACATAGATAGATAGATAGATATATATTCAACA*NM:i:3MS:i:118 AS:118 nn:i:0TP:A:P cm:i:3 s1:i:51 s2:i:0 de:f:0.0390 rl:
HWI-ST945_0069:2:1101:17922:2282#annn/11 16主干线2065 955626 2 16S20M2I7M111M3I35M1D6M*0 GaAcGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaTaTaTaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGaGa
HWI-ST945_0069:2:1101:17799:2282#annnn/11 16主干线3859 11630 1 75M1I24M*0 tcgtgtgttcacagatatcatatcagatatcagataccatccatccatccatccatccatccatcttcatccatccatccatccatccatccatccatcgaatccatccatcgaatccatccatcgaatccatccatccatcgaatccatccatccatcgaatccatccatcgaatccatccatccatcgaatccatccatccatcgaatcgaatccatccatccatccatccatccatcgaatccatccatccatccatccatccatccatccatccatccatcgaat
HWI-ST945_0069:2:1101:17876:2290#GNNNNN/11 0主干线1630 114655 2 23S37M40S*0 AcatgcatagAttgCtCtCaaagCtCtCaaGtCtCaaGtCtCaaGtCaaGtGatGatGatGatGatGatGatGatGatGatGatGatGatGatGatAg*NM:1:1ms:i:64 nn:i:i:0 tp:P cm:i:3 s1:i:27 s2:i:0 de:f:0
HWI-ST945_0069:2:1101:17982:2293 #gnnnn/11 4*0 0**0 TGATTAAATATATATATATATATATATATATATATATATATATATATATATATATATATATATAGAGAGAGAGAAAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGATTTATATATAGATATATAGATATATATATAGATAGATAGATAGAGAGAGAGAGAGAGAGAGAGATATATATATATATATATATATATA
HWI-ST945_0069:2:1101:17833:2296#gnnnn/11 4*0 0**0 TGAGGTTCCAGATTAATGCCATTCAAACTTCTACTGGGAATTCAGGTTCAGCGAGTTGCAGCACTTAGTGAGGAGAGAGTCAGTCAGCTGACTTAGGAGAGAGAGATGGATTCCCATCCC*rl:i:0
HWI-ST945_0069:2:1101:17908:2302#gnnnn/11 4*0 0**0 TAATTTAATAGTGCATGGACTTTCAGATTGTGTTCATCAATGGGTCGACTTCTTATGGATGGATGGATGTGTGAGGATGGATGGATGGATGGATGGATGGATTTAGATGGATTTAGATGGATTTG*rl:i:0
HWI-ST945_0069:2:1101:17759:2305#annnn/11 16主干线870 367318 10 27M1I34M3D6M1I31M*0 acataggcctcatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatgcatccatccatccagtttttttttagatagatgcatgcatgcatgcatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatccatcttt*NM:13 ms:78 AS:78 AS:78 AS:78 nn:78 nn:i:0tp:0tp:0tp
HWI-ST945_0069:2:1101:17878:2318#GNNNNN/11 0主干线2304 815440 1 67M1D27M6S*0 tgggttcttttattaaagaaccttatgctttacttacttactgaatcaatgaaatactcttactcaattaccaattcaatccaattcaattacattatctctctctt*NM:i:3 ms:i:160 AS:i:160 nn:i:i:0 tp:A:P cm:i:7 s1:60 s2:i:71 de:f:0.0316 rl:0
Thank you in advance.

Something like

awk '$3 != "*" { print $0 >> $3; close($3) }' input.txt

should work. Note the close() after each print: it avoids running out of file descriptors when there are many distinct output files.
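To see what the one-liner does, here is a minimal sketch that runs it on a tiny made-up input. The record contents and the names contigA and contigB are hypothetical stand-ins for the real alignment data; only the awk command itself comes from the answer.

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not touch existing files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical whitespace-separated records: column 3 is the target file
# name, and "*" marks an unmapped read that should be skipped.
printf '%s\n' \
  'read1 16 contigA 100 ACGT' \
  'read2 4 * 0 TTTT' \
  'read3 0 contigB 200 GGCC' \
  'read4 16 contigA 300 CCAA' > input.txt

# The command from the answer: append each line to the file named by $3,
# closing it right away so only one output file is open at a time.
awk '$3 != "*" { print $0 >> $3; close($3) }' input.txt

# Show which files were created and what contigA received.
ls
cat contigA
```

Because the redirection is >> (append), running the command twice duplicates the lines in each output file, so it is worth starting from an empty directory.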

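Do you want a file named *?

Good point. If it is an asterisk, the line is ignored.

@user977828: Please post your own attempts at solving this problem.

@Inian, you changed the tag from bash to awk, but I don't see either language in the question. What am I missing?

@ghoti: Both tags came from the OP. I don't see how the bash tag is relevant here.

This works well if there are no repeated $3 entries. If there are, it opens and closes the file handle for every repeated entry. Ideally you would keep the handles open and close them all at the end, for example

awk '$3 !~ "*" { fh[$3]; print $0 >> $3 } END { for (handle in fh) close(handle) }' file

@Inian That END block is pointless, because all open files are closed on exit anyway. If the OP's data only ever writes to a few dozen distinct files, omitting close() is probably fine. Hundreds or thousands? Then you run into the limit on open files and need some way to prune them. This is the simplest way.

There could be 2,000 to 10,000 files.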
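One possible middle ground between the two positions in the comments, offered as a sketch and not as anything proposed in the thread: keep a single file open and close it only when column 3 changes. Consecutive lines with the same target then share one open/close cycle, while the number of open descriptors stays at one however many distinct files there are. The variable names key and prev and the sample records are hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not touch existing files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical records; column 3 names the output file, "*" means skip.
printf '%s\n' \
  'read1 16 contigA 100 ACGT' \
  'read2 16 contigA 150 ACGA' \
  'read3 4 * 0 TTTT' \
  'read4 0 contigB 200 GGCC' \
  'read5 16 contigA 300 CCAA' > input.txt

awk '
  BEGIN { prev = "" }
  { key = $3 "" }                 # force a string so names like 01 and 1 stay distinct
  key == "*" { next }             # skip unmapped reads, as in the answer
  key != prev {                   # target differs from the previous kept record
    if (prev != "") close(prev)   # release the single descriptor being held
    prev = key
  }
  { print $0 >> key }             # ">>" appends, so reopening a file later is safe
' input.txt

cat contigA
```

If the input is already grouped or sorted by column 3, each output file is opened exactly once with this approach; on fully interleaved input it degrades to roughly the same open/close count as calling close() after every print.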