Shell: cat a file until a certain number of occurrences in bash

I have a file such as:

@SRR9110374.1 1/1
GAGTATAAAGAAGAAAGTAAATCTCGGTTCGTCTCTTCATCGAGAGAAATGTCGACGAGAAAAAAAAAACAAGGGCTCATTTAAAGCCTTTCAAATCCT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR9110374.2 2/1
ATATGGAACAAGTTAAAAAAAATAAAAAGCAAAGAAATAATGTTTTGTCATCGAAAGTGTCGACATAAAAACAGGTTGGCATCTGGCCTGGTATCTCA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<BFFFFFFF<FFFFFFFFFFF
@SRR9110374.3 3/1
NTATAACCGTATCAAAGAAGTTTACCCCGAGAGAAGCACGCAGTTTCCCACAGGTAATTTTCTCACAAGCGAGAGAAACATCATACCGCAATCAGGAAC
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFF
@SRR9110374.4 4/1
GATAAAGAATATAGCTATGTATAGCCGGGATATATTAAGTGATTGAAATATCTCTTAGAAATCCATAGAATAGTAGTGTATCGAATAGGAGGAAGCGAAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR9110374.5 5/1
CTTCCAATGCTTGCCAAAGTTCATTGTCGTTGTAATTATCGAAAGGATCTAAATTCTTTCTCAACGAACCCGAGAATAGGAAGGGTTCTTGAGGAATTAT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFFBFFF/FFF
@SRR9110374.6 6/1
ACCGATAATCTTTCCTTCTCAAGAATTTTGTTAATATTCCACATTTTTAAATAGATTTCATTTCTCTCTCTCTTTCTCTCTCTTTTTCTTGTCCTCGATG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFBFFFF///FF
@SRR9110374.7 7/1
GTTGTGCTGAGAATGTTAATAAATTACAAAATGTTATCACTAACTTGGAAATATTCGAATCGACAGATATCGCGTTTGTCGTGTTGTATTAATATATTC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR9110374.8 8/1
GTCATAGAACGGGGGAGGGGAGGAAGAAGAAAGGAAGGGAAAAAAACGAGAGAGAGAGAGGGGATTACGCTCGCCGTTCGAATCGTTAGGCGTCCGTTT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFBBFBBFF
@SRR9110374.9 9/1
AATTATTATTTAATCGACGCGTCTATCGATAAATCATCCTCGAATGCTAAGCAAAACTGAACTTCCGCAAATATTGCACACGAAACGTTGAAACAAAG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
I should get (the first four records, i.e. everything before the fifth @ header):

@SRR9110374.1 1/1
GAGTATAAAGAAGAAAGTAAATCTCGGTTCGTCTCTTCATCGAGAGAAATGTCGACGAGAAAAAAAAAACAAGGGCTCATTTAAAGCCTTTCAAATCCT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR9110374.2 2/1
ATATGGAACAAGTTAAAAAAAATAAAAAGCAAAGAAATAATGTTTTGTCATCGAAAGTGTCGACATAAAAACAGGTTGGCATCTGGCCTGGTATCTCA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<BFFFFFFF<FFFFFFFFFFF
@SRR9110374.3 3/1
NTATAACCGTATCAAAGAAGTTTACCCCGAGAGAAGCACGCAGTTTCCCACAGGTAATTTTCTCACAAGCGAGAGAAACATCATACCGCAATCAGGAAC
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFF
@SRR9110374.4 4/1
GATAAAGAATATAGCTATGTATAGCCGGGATATATTAAGTGATTGAAATATCTCTTAGAAATCCATAGAATAGTAGTGTATCGAATAGGAGGAAGCGAAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

PS: the real file is very large, so a method that scales well would be great.

Could you please try the following:

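Nb_occurence=4
awk -v nb_occur="$Nb_occurence" '
BEGIN{
  occur=0
}
/@/{                # count every line containing "@"
  occur++
}
occur>nb_occur{     # past the requested number of occurrences: stop reading
  exit
}
occur               # print lines once the first "@" has been seen
' Input_file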
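Regarding "PS: the real file is very large, so a method that scales well would be great":

To speed up processing of your Input_file, I used exit: as soon as the requested number of occurrences has been printed, awk stops reading Input_file, because nothing further in it is needed. It should therefore be faster than your solution.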
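One caveat: in FASTQ data, `@` is also a valid quality character, so counting lines that contain `@` can miscount on real files. Below is a minimal sketch of an alternative that counts lines instead. It assumes every record occupies exactly four lines (header, sequence, `+`, quality), with no wrapped sequences. The function name `first_records` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first N records of a FASTQ file.
# Assumption: each record is exactly 4 lines (header, sequence, "+", quality).
first_records() {
  local n="$1" file="$2"
  # Print lines 1..4*N, then exit so the rest of the file is never read.
  awk -v n="$n" 'NR > 4 * n { exit } 1' "$file"
}

# Usage (assumed file name):
# first_records 4 Input_file
```

Like the answer above, this relies on `exit` to avoid reading the rest of a large file. It only differs in how it decides where to stop.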

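You should rewrite your awk this way:

awk -v occurence="$Nb_occurence" 'BEGIN{ found=0} /@/{found=found+1} {if ( found < occurence ) print }' file
You also do not need cat; awk can read the file itself. Note that the matching line that brings found up to occurence is not printed, so Nb_occurence=5 is needed to get the four records shown in the question.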

Another awk:

$ awk -v n=4 '/@/&&!n--{exit}1' file
Output:

@SRR9110374.1 1/1
GAGTATAAAGAAGAAAGTAAATCTCGGTTCGTCTCTTCATCGAGAGAAATGTCGACGAGAAAAAAAAAACAAGGGCTCATTTAAAGCCTTTCAAATCCT
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@SRR9110374.2 2/1
ATATGGAACAAGTTAAAAAAAATAAAAAGCAAAGAAATAATGTTTTGTCATCGAAAGTGTCGACATAAAAACAGGTTGGCATCTGGCCTGGTATCTCA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<BFFFFFFF<FFFFFFFFFFF
@SRR9110374.3 3/1
NTATAACCGTATCAAAGAAGTTTACCCCGAGAGAAGCACGCAGTTTCCCACAGGTAATTTTCTCACAAGCGAGAGAAACATCATACCGCAATCAGGAAC
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFF
@SRR9110374.4 4/1
GATAAAGAATATAGCTATGTATAGCCGGGATATATTAAGTGATTGAAATATCTCTTAGAAATCCATAGAATAGTAGTGTATCGAATAGGAGGAAGCGAAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
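
The one-liner above is dense, so here is a sketch of the same logic written out step by step. The function name `until_nth` is hypothetical. Like the one-liner, it treats any line containing `@` as a match.

```shell
#!/usr/bin/env bash
# Expanded equivalent of: awk -v n=4 '/@/&&!n--{exit}1' file
# Hypothetical wrapper: print lines until the (N+1)-th line containing "@".
until_nth() {
  local n="$1" file="$2"
  awk -v n="$n" '
    /@/ {                 # line contains "@"
      if (n == 0) exit    # all N allowed matches are used up: stop reading
      n--                 # otherwise use up one allowed match
    }
    { print }             # same as the trailing "1": print the line
  ' "$file"
}

# Usage (assumed file name):
# until_nth 4 file
```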

Comments:

"Doing it this way, you will read the whole file, which can be harmful if the file is large! Take a look!"

"Right, but that is the OP's way. I just adjusted his/her solution."
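
The concern raised in the comment can be addressed by keeping the `found < occurence` counting logic and exiting once the limit is reached. This is a minimal sketch under that assumption, and the function name `before_nth` is hypothetical. As with `found < occurence`, the `occurence`-th matching line is itself excluded.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print everything before the N-th line containing "@",
# and stop reading the file at that point.
before_nth() {
  local occurence="$1" file="$2"
  awk -v occurence="$occurence" '
    /@/ { found++ }                # count lines containing "@"
    found >= occurence { exit }    # limit reached: stop reading the file
    { print }                      # still below the limit: print the line
  ' "$file"
}

# Usage (assumed file name): prints the first four records
# before_nth 5 file
```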