Awk: filter file based on max number in one column for each line
I have the following file:
chr11_pilon3.g3568.t1 transcript:OIT01734 transcript:OIT01734 1.1e-107 389.8 1000 218 992 1 216 130 345 MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDA MDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDA MDALTRHIQGDVPWCMLFADDIILIDETRAGVSERLEIWRQTLESKGFKISRSKTEYLECKFGDEPSGVGREVMLGSQAIAKRDSVRYLGSVIQGDGEIDGDVTHRIGAGWSKWRLASGVLCDKKIPHKLKGKFFRAMVRPAMFYEAECWPVKNSHIQRMKVAEMRMLRWMCGHTRLDKIKNEVIRQKVGVAPVDKKMGEARLRWFGHVRRRGPDAR* MKVWERVVEARVREMTSISVNQFGFMPGRSTTEAIHLVRRLVEHFRDKKKDLHMVFIDLENAYDKVPREVLWRCLEAKSVPEAYIRVIKDMYDGAKTRVRTVGGDSDHFPVVMGLHQGSALSPLLFALVMDALTRHIQGDVPWCMLFADDIVLIDETRVGVNERLEVWRQTLESKGFKLSRSKTEYLECKFSAESSEVGRDVKLGSQVIAKRDSFRYLGSVIQGEGEIDGDVTHRIGAGWSKWRLASGVLCDKKVPQKLKGKFYRAVVRPAMLYGAECWPVKNSHVQRMKVAEMRMLRWMRGLTRLDRIRNEVIREKVGVALVDEKMREARLRWYGHVRRRRPDAPVRIYKSAILGHLNSHGSQNALAGPVEAEENRQKTKKEVMEEIIQKSKFFKAQKAKDREENDELTEQLDKDFTSLVESKALLSLTQPDKINALKALVNKNISVGNVKKDEVADVPRKASIGKEKPDTYEMLVSEMALDMRARPSDRTKTPEEIAQEEKERLELLEQEXXXXXXXXXXXXXXDGNASDDNSKLVKDPRTVSGDDLGDDLEEVPRTKLGWIGEILRRKENELESEDAASSGDSDDGEDEGXXXXXXXXXXXXXXXXXXXXDEEQGKTQTIKDWEQSDDDIIDTELEDDDEGFGDDAKKVVKIKDHKEENLSITVAAENKKKMQVFYGVLLQYFAVLANKKPLNSKLLNLLVKPLMEMSAVSPYFAAICARQRLQRTRAQFCEDLKNTGKSSWPSLKTIFLLRLWSMIFPCSDFRHCVMTPAILLMCEYLMRCTIISGRDIAIASFLCSLLLSVIKQSQKFCPEAIVFIQTLLMAALDRKQRSNSQLDNLMEIKELGPLLCIRSSKVEMDSLDFLTLMDLPEDSQYFHSDNYRTSMLVTVLETLQGFVNVYKELISFPEIFMLISKLLCKMAGENHIPDALREKIKDVSQLIDTKAQEHHMLRQPLKMRKKKPVPIRMLNPKFEENFVKGRDYDPDRERA 389.8 1000 216 85.6 185 31 200 0 0 92.6 0 22IV6AV2SN4IV11IL12GSDA1PS1GE3ED1MK4AV6VF9DE29IV1HQ6FY2MV5FL1EG10IV14CR1HL4KR1KR5QE5PL2KE2GR6FY6GR3 85.6 1.1e-107 99.1
gene.10002.1.1.p1 NisylKD957037g0001.1 NisylKD957037g0001.1 0.0e+00 1218.8 3152 668 780 5 667 122 780 KVIARCRPELAHIPSLEEAPVFHPSEEEFEDTLKYVGSILPHVKHYGICRIVPPSSWKPPSCIEEESTVYGVNTHIQRTSELQNLFFKKRLEGACTRTNNKQQKTLSRKSDFGLDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESGFPHERGVTIHRPQYVESGWNLNNTPKLQDSLLRFGSHESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLFQNMAFQFSPSILTSEGIPVYRCVQNPKEFVLILPGAYHAHVDSGFNCSEAVNFAPFDWLPHGQNAVDLYSEQRRKTSISYDKLLFEAATERIRALAELPLLHKKFFDNLKWRAVCRSNEILTKALKSRFATEVRRRKYMCASLESRKMEDDFCATAKRECSICYYDLYLSAIGCTCSPQKYTCLLHAKQLCSCAWREKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGFPVSDFSKDASKDEMKVKSESGQSLDVEQDRKEASIPSVGPSARTNNLNRVTGSWVEADGLSHQPQPKGIVNDTVEVIFPKISQHATVGKNIMISSNTVLKKHLARESSSTKRTVIILSDDEN KVIARCRPELARIPSLEEAPVFHPNTLKYVASILPHVKHYGICRIVPPSSWKPPSRIEEPSTVYGVNTHIQRTSDLQNLFFKKRLEGACTRTNNKQQKTLSGKSDFGHDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESSFPHEGDVTSRRPPQYVESGWNLNNTPKLQDSLLRFGSRESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLLQNIAFQFSPSVLTSEGIPVYRCVQNPKEFVLLLPGAYHAHADSGFNCSEAVNFAPFDWLPHGQNAVELYSEQGRKTSISYDKLLFEAATEGIRALPELPLLHKNFFDNLKWRAVYRSNEILTKALKSRVSTEVRRRTYLCASLESRKMEDDFCATTKRECPICYYDLYLSAIGCKCSPHKYTCLLHAKQLCPCAWSEKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGLPVSDVFKDASKDGMKVKSESGQSLDIEQDRKEEVSIPSVGPSARTNNVNRVSGSWVEADGSSHRPQSKGIINDKIEVLFPKISQHATVGKNIMTSSNTVLKKHLARESSSTKRSVIILSDDEN 
MFGFKVIARCRPELAHIPSLEEAPVFHPSEEEFEDTLKYVGSILPHVKHYGICRIVPPSSWKPPSCIEEESTVYGVNTHIQRTSELQNLFFKKRLEGACTRTNNKQQKTLSRKSDFGLDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESGFPHERGVTIHRPQYVESGWNLNNTPKLQDSLLRFGSHESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLFQNMAFQFSPSILTSEGIPVYRCVQNPKEFVLILPGAYHAHVDSGFNCSEAVNFAPFDWLPHGQNAVDLYSEQRRKTSISYDKLLFEAATERIRALAELPLLHKKFFDNLKWRAVCRSNEILTKALKSRFATEVRRRKYMCASLESRKMEDDFCATAKRECSICYYDLYLSAIGCTCSPQKYTCLLHAKQLCSCAWREKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGFPVSDFSKDASKDEMKVKSESGQSLDVEQDRKEASIPSVGPSARTNNLNRVTGSWVEADGLSHQPQPKGIVNDTVEVIFPKISQHATVGKNIMISSNTVLKKHLARESSSTKRTVIILSDDEN* MGAKRTRSNSESDDGYKLSVPPGFESLMSFTLKKVKNSEEACNSVALGSGFAQGPSLVAATSTIISTGKLKSSVRHRPWILDDHVDHIEDDSEFEDDKSLSSSAFLPKGVIRGCSSCHNCQKVIARCRPELARIPSLEEAPVFHPNTLKYVASILPHVKHYGICRIVPPSSWKPPSRIEEPSTVYGVNTHIQRTSDLQNLFFKKRLEGACTRTNNKQQKTLSGKSDFGHDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESSFPHEGDVTSRRPPQYVESGWNLNNTPKLQDSLLRFGSRESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLLQNIAFQFSPSVLTSEGIPVYRCVQNPKEFVLLLPGAYHAHADSGFNCSEAVNFAPFDWLPHGQNAVELYSEQGRKTSISYDKLLFEAATEGIRALPELPLLHKNFFDNLKWRAVYRSNEILTKALKSRVSTEVRRRTYLCASLESRKMEDDFCATTKRECPICYYDLYLSAIGCKCSPHKYTCLLHAKQLCPCAWSEKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGLPVSDVFKDASKDGMKVKSESGQSLDIEQDRKEEVSIPSVGPSARTNNVNRVSGSWVEADGSSHRPQSKGIINDKIEVLFPKISQHATVGKNIMTSSNTVLKKHLARESSSTKRSVIILSDDEN 1218.8 3152 665 91.0 605 52 621 3 8 93.4 0 11HR12SNE-E-E-F-E-D-5GA24CR3EP14ED26RG5LH85GS4RGGD2ISHR2-P24HR70FL2MI7IV20IL8VA25DE5RG17RG4AP7KN10CY13FVAS6KT1ML16AT4SP13TK3QH12SP3RS36FL4FVSF6EG12VI6-EAV13LV3TS8LS2QR2PS3VI2TKVI2IL15IT19TS9 91.0 0.0e+00 99.3
gene.10002.1.4.p1 NisylKD957037g0001.1 NisylKD957037g0001.1 0.0e+00 1216.8 3147 671 780 9 670 123 780 VIARCRPELAHIPSLEEAPVFHPSEEEFEDTLKYVGSILPHVKHYGICRIVPPSSWKPPSCIEEESTVYGVNTHIQRTSELQNLFFKKRLEGACTRTNNKQQKTLSRKSDFGLDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESGFPHERGVTIHRPQYVESGWNLNNTPKLQDSLLRFGSHESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLFQNMAFQFSPSILTSEGIPVYRCVQNPKEFVLILPGAYHAHVDSGFNCSEAVNFAPFDWLPHGQNAVDLYSEQRRKTSISYDKLLFEAATERIRALAELPLLHKKFFDNLKWRAVCRSNEILTKALKSRFATEVRRRKYMCASLESRKMEDDFCATAKRECSICYYDLYLSAIGCTCSPQKYTCLLHAKQLCSCAWREKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGFPVSDFSKDASKDEMKVKSESGQSLDVEQDRKEASIPSVGPSARTNNLNRVTGSWVEADGLSHQPQPKGIVNDTVEVIFPKISQHATVGKNIMISSNTVLKKHLARESSSTKRTVIILSDDEN VIARCRPELARIPSLEEAPVFHPNTLKYVASILPHVKHYGICRIVPPSSWKPPSRIEEPSTVYGVNTHIQRTSDLQNLFFKKRLEGACTRTNNKQQKTLSGKSDFGHDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESSFPHEGDVTSRRPPQYVESGWNLNNTPKLQDSLLRFGSRESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLLQNIAFQFSPSVLTSEGIPVYRCVQNPKEFVLLLPGAYHAHADSGFNCSEAVNFAPFDWLPHGQNAVELYSEQGRKTSISYDKLLFEAATEGIRALPELPLLHKNFFDNLKWRAVYRSNEILTKALKSRVSTEVRRRTYLCASLESRKMEDDFCATTKRECPICYYDLYLSAIGCKCSPHKYTCLLHAKQLCPCAWSEKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGLPVSDVFKDASKDGMKVKSESGQSLDIEQDRKEEVSIPSVGPSARTNNVNRVSGSWVEADGSSHRPQSKGIINDKIEVLFPKISQHATVGKNIMTSSNTVLKKHLARESSSTKRSVIILSDDEN 
MFGFKARIVIARCRPELAHIPSLEEAPVFHPSEEEFEDTLKYVGSILPHVKHYGICRIVPPSSWKPPSCIEEESTVYGVNTHIQRTSELQNLFFKKRLEGACTRTNNKQQKTLSRKSDFGLDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESGFPHERGVTIHRPQYVESGWNLNNTPKLQDSLLRFGSHESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLFQNMAFQFSPSILTSEGIPVYRCVQNPKEFVLILPGAYHAHVDSGFNCSEAVNFAPFDWLPHGQNAVDLYSEQRRKTSISYDKLLFEAATERIRALAELPLLHKKFFDNLKWRAVCRSNEILTKALKSRFATEVRRRKYMCASLESRKMEDDFCATAKRECSICYYDLYLSAIGCTCSPQKYTCLLHAKQLCSCAWREKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGFPVSDFSKDASKDEMKVKSESGQSLDVEQDRKEASIPSVGPSARTNNLNRVTGSWVEADGLSHQPQPKGIVNDTVEVIFPKISQHATVGKNIMISSNTVLKKHLARESSSTKRTVIILSDDEN* MGAKRTRSNSESDDGYKLSVPPGFESLMSFTLKKVKNSEEACNSVALGSGFAQGPSLVAATSTIISTGKLKSSVRHRPWILDDHVDHIEDDSEFEDDKSLSSSAFLPKGVIRGCSSCHNCQKVIARCRPELARIPSLEEAPVFHPNTLKYVASILPHVKHYGICRIVPPSSWKPPSRIEEPSTVYGVNTHIQRTSDLQNLFFKKRLEGACTRTNNKQQKTLSGKSDFGHDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESSFPHEGDVTSRRPPQYVESGWNLNNTPKLQDSLLRFGSRESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLLQNIAFQFSPSVLTSEGIPVYRCVQNPKEFVLLLPGAYHAHADSGFNCSEAVNFAPFDWLPHGQNAVELYSEQGRKTSISYDKLLFEAATEGIRALPELPLLHKNFFDNLKWRAVYRSNEILTKALKSRVSTEVRRRTYLCASLESRKMEDDFCATTKRECPICYYDLYLSAIGCKCSPHKYTCLLHAKQLCPCAWSEKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGLPVSDVFKDASKDGMKVKSESGQSLDIEQDRKEEVSIPSVGPSARTNNVNRVSGSWVEADGSSHRPQSKGIINDKIEVLFPKISQHATVGKNIMTSSNTVLKKHLARESSSTKRSVIILSDDEN 1216.8 3147 664 91.0 604 52 620 3 8 93.4 0 10HR12SNE-E-E-F-E-D-5GA24CR3EP14ED26RG5LH85GS4RGGD2ISHR2-P24HR70FL2MI7IV20IL8VA25DE5RG17RG4AP7KN10CY13FVAS6KT1ML16AT4SP13TK3QH12SP3RS36FL4FVSF6EG12VI6-EAV13LV3TS8LS2QR2PS3VI2TKVI2IL15IT19TS9 91.0 0.0e+00 98.7
gene.10002.1.5.p1 NisylKD957037g0001.1 NisylKD957037g0001.1 0.0e+00 1218.8 3152 668 780 5 667 122 780 KVIARCRPELAHIPSLEEAPVFHPSEEEFEDTLKYVGSILPHVKHYGICRIVPPSSWKPPSCIEEESTVYGVNTHIQRTSELQNLFFKKRLEGACTRTNNKQQKTLSRKSDFGLDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESGFPHERGVTIHRPQYVESGWNLNNTPKLQDSLLRFGSHESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLFQNMAFQFSPSILTSEGIPVYRCVQNPKEFVLILPGAYHAHVDSGFNCSEAVNFAPFDWLPHGQNAVDLYSEQRRKTSISYDKLLFEAATERIRALAELPLLHKKFFDNLKWRAVCRSNEILTKALKSRFATEVRRRKYMCASLESRKMEDDFCATAKRECSICYYDLYLSAIGCTCSPQKYTCLLHAKQLCSCAWREKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGFPVSDFSKDASKDEMKVKSESGQSLDVEQDRKEASIPSVGPSARTNNLNRVTGSWVEADGLSHQPQPKGIVNDTVEVIFPKISQHATVGKNIMISSNTVLKKHLARESSSTKRTVIILSDDEN KVIARCRPELARIPSLEEAPVFHPNTLKYVASILPHVKHYGICRIVPPSSWKPPSRIEEPSTVYGVNTHIQRTSDLQNLFFKKRLEGACTRTNNKQQKTLSGKSDFGHDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESSFPHEGDVTSRRPPQYVESGWNLNNTPKLQDSLLRFGSRESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLLQNIAFQFSPSVLTSEGIPVYRCVQNPKEFVLLLPGAYHAHADSGFNCSEAVNFAPFDWLPHGQNAVELYSEQGRKTSISYDKLLFEAATEGIRALPELPLLHKNFFDNLKWRAVYRSNEILTKALKSRVSTEVRRRTYLCASLESRKMEDDFCATTKRECPICYYDLYLSAIGCKCSPHKYTCLLHAKQLCPCAWSEKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGLPVSDVFKDASKDGMKVKSESGQSLDIEQDRKEEVSIPSVGPSARTNNVNRVSGSWVEADGSSHRPQSKGIINDKIEVLFPKISQHATVGKNIMTSSNTVLKKHLARESSSTKRSVIILSDDEN 
MFGFKVIARCRPELAHIPSLEEAPVFHPSEEEFEDTLKYVGSILPHVKHYGICRIVPPSSWKPPSCIEEESTVYGVNTHIQRTSELQNLFFKKRLEGACTRTNNKQQKTLSRKSDFGLDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESGFPHERGVTIHRPQYVESGWNLNNTPKLQDSLLRFGSHESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLFQNMAFQFSPSILTSEGIPVYRCVQNPKEFVLILPGAYHAHVDSGFNCSEAVNFAPFDWLPHGQNAVDLYSEQRRKTSISYDKLLFEAATERIRALAELPLLHKKFFDNLKWRAVCRSNEILTKALKSRFATEVRRRKYMCASLESRKMEDDFCATAKRECSICYYDLYLSAIGCTCSPQKYTCLLHAKQLCSCAWREKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGFPVSDFSKDASKDEMKVKSESGQSLDVEQDRKEASIPSVGPSARTNNLNRVTGSWVEADGLSHQPQPKGIVNDTVEVIFPKISQHATVGKNIMISSNTVLKKHLARESSSTKRTVIILSDDEN* MGAKRTRSNSESDDGYKLSVPPGFESLMSFTLKKVKNSEEACNSVALGSGFAQGPSLVAATSTIISTGKLKSSVRHRPWILDDHVDHIEDDSEFEDDKSLSSSAFLPKGVIRGCSSCHNCQKVIARCRPELARIPSLEEAPVFHPNTLKYVASILPHVKHYGICRIVPPSSWKPPSRIEEPSTVYGVNTHIQRTSDLQNLFFKKRLEGACTRTNNKQQKTLSGKSDFGHDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESSFPHEGDVTSRRPPQYVESGWNLNNTPKLQDSLLRFGSRESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLLQNIAFQFSPSVLTSEGIPVYRCVQNPKEFVLLLPGAYHAHADSGFNCSEAVNFAPFDWLPHGQNAVELYSEQGRKTSISYDKLLFEAATEGIRALPELPLLHKNFFDNLKWRAVYRSNEILTKALKSRVSTEVRRRTYLCASLESRKMEDDFCATTKRECPICYYDLYLSAIGCKCSPHKYTCLLHAKQLCPCAWSEKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGLPVSDVFKDASKDGMKVKSESGQSLDIEQDRKEEVSIPSVGPSARTNNVNRVSGSWVEADGSSHRPQSKGIINDKIEVLFPKISQHATVGKNIMTSSNTVLKKHLARESSSTKRSVIILSDDEN 1218.8 3152 665 91.0 605 52 621 3 8 93.4 0 11HR12SNE-E-E-F-E-D-5GA24CR3EP14ED26RG5LH85GS4RGGD2ISHR2-P24HR70FL2MI7IV20IL8VA25DE5RG17RG4AP7KN10CY13FVAS6KT1ML16AT4SP13TK3QH12SP3RS36FL4FVSF6EG12VI6-EAV13LV3TS8LS2QR2PS3VI2TKVI2IL15IT19TS9 91.0 0.0e+00 99.3
gene.10002.1.6.p1 NisylKD957037g0001.1 NisylKD957037g0001.1 0.0e+00 1440.2 3727 799 780 15 798 1 780 MGAKRTRSNGESDDGYKLSVPPGFESLMSFTLKKVKNSEEACNSVALESEFAQSPSQVAATSTIISIGKLKSSVRHRPWILDDHVDHIEDDSEFEDDKSLSSIAFLPKGVIRGCSSCHNCQKVIARCRPELAHIPSLEEAPVFHPSEEEFEDTLKYVGSILPHVKHYGICRIVPPSSWKPPSCIEEESTVYGVNTHIQRTSELQNLFFKKRLEGACTRTNNKQQKTLSRKSDFGLDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESGFPHERGVTIHRPQYVESGWNLNNTPKLQDSLLRFGSHESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLFQNMAFQFSPSILTSEGIPVYRCVQNPKEFVLILPGAYHAHVDSGFNCSEAVNFAPFDWLPHGQNAVDLYSEQRRKTSISYDKLLFEAATERIRALAELPLLHKKFFDNLKWRAVCRSNEILTKALKSRFATEVRRRKYMCASLESRKMEDDFCATAKRECSICYYDLYLSAIGCTCSPQKYTCLLHAKQLCSCAWREKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGFPVSDFSKDASKDEMKVKSESGQSLDVEQDRKEASIPSVGPSARTNNLNRVTGSWVEADGLSHQPQPKGIVNDTVEVIFPKISQHATVGKNIMISSNTVLKKHLARESSSTKRTVIILSDDEN MGAKRTRSNSESDDGYKLSVPPGFESLMSFTLKKVKNSEEACNSVALGSGFAQGPSLVAATSTIISTGKLKSSVRHRPWILDDHVDHIEDDSEFEDDKSLSSSAFLPKGVIRGCSSCHNCQKVIARCRPELARIPSLEEAPVFHPNTLKYVASILPHVKHYGICRIVPPSSWKPPSRIEEPSTVYGVNTHIQRTSDLQNLFFKKRLEGACTRTNNKQQKTLSGKSDFGHDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESSFPHEGDVTSRRPPQYVESGWNLNNTPKLQDSLLRFGSRESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLLQNIAFQFSPSVLTSEGIPVYRCVQNPKEFVLLLPGAYHAHADSGFNCSEAVNFAPFDWLPHGQNAVELYSEQGRKTSISYDKLLFEAATEGIRALPELPLLHKNFFDNLKWRAVYRSNEILTKALKSRVSTEVRRRTYLCASLESRKMEDDFCATTKRECPICYYDLYLSAIGCKCSPHKYTCLLHAKQLCPCAWSEKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGLPVSDVFKDASKDGMKVKSESGQSLDIEQDRKEEVSIPSVGPSARTNNVNRVSGSWVEADGSSHRPQSKGIINDKIEVLFPKISQHATVGKNIMTSSNTVLKKHLARESSSTKRSVIILSDDEN 
MSDCTWQRYKGEVLMGAKRTRSNGESDDGYKLSVPPGFESLMSFTLKKVKNSEEACNSVALESEFAQSPSQVAATSTIISIGKLKSSVRHRPWILDDHVDHIEDDSEFEDDKSLSSIAFLPKGVIRGCSSCHNCQKVIARCRPELAHIPSLEEAPVFHPSEEEFEDTLKYVGSILPHVKHYGICRIVPPSSWKPPSCIEEESTVYGVNTHIQRTSELQNLFFKKRLEGACTRTNNKQQKTLSRKSDFGLDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESGFPHERGVTIHRPQYVESGWNLNNTPKLQDSLLRFGSHESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLFQNMAFQFSPSILTSEGIPVYRCVQNPKEFVLILPGAYHAHVDSGFNCSEAVNFAPFDWLPHGQNAVDLYSEQRRKTSISYDKLLFEAATERIRALAELPLLHKKFFDNLKWRAVCRSNEILTKALKSRFATEVRRRKYMCASLESRKMEDDFCATAKRECSICYYDLYLSAIGCTCSPQKYTCLLHAKQLCSCAWREKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGFPVSDFSKDASKDEMKVKSESGQSLDVEQDRKEASIPSVGPSARTNNLNRVTGSWVEADGLSHQPQPKGIVNDTVEVIFPKISQHATVGKNIMISSNTVLKKHLARESSSTKRTVIILSDDEN* MGAKRTRSNSESDDGYKLSVPPGFESLMSFTLKKVKNSEEACNSVALGSGFAQGPSLVAATSTIISTGKLKSSVRHRPWILDDHVDHIEDDSEFEDDKSLSSSAFLPKGVIRGCSSCHNCQKVIARCRPELARIPSLEEAPVFHPNTLKYVASILPHVKHYGICRIVPPSSWKPPSRIEEPSTVYGVNTHIQRTSDLQNLFFKKRLEGACTRTNNKQQKTLSGKSDFGHDIERKEFGCCNEHFEFENGPKLMLKYFKHYADHFKKQYFVKEDQITASEPSIQDIEGEYWRIIENPTEEIEVLQGTSAEIKATESSFPHEGDVTSRRPPQYVESGWNLNNTPKLQDSLLRFGSRESSSILLPRLSIGMCFSSNLWRIEEHHLYLLSYIHFGAPKIFYGVPGSYRCKFEEAVKKHLPQLSAHPCLLQNIAFQFSPSVLTSEGIPVYRCVQNPKEFVLLLPGAYHAHADSGFNCSEAVNFAPFDWLPHGQNAVELYSEQGRKTSISYDKLLFEAATEGIRALPELPLLHKNFFDNLKWRAVYRSNEILTKALKSRVSTEVRRRTYLCASLESRKMEDDFCATTKRECPICYYDLYLSAIGCKCSPHKYTCLLHAKQLCPCAWSEKYLLIRYEIDELNIMVEALDGKVSAVHKWAKEKLGLPVSDVFKDASKDGMKVKSESGQSLDIEQDRKEEVSIPSVGPSARTNNVNRVSGSWVEADGSSHRPQSKGIINDKIEVLFPKISQHATVGKNIMTSSNTVLKKHLARESSSTKRSVIILSDDEN 1440.2 3727 786 91.5 719 59 735 3 8 93.5 0 9GS37EG1EG3SG2QL9IT35IS29HR12SNE-E-E-F-E-D-5GA24CR3EP14ED26RG5LH85GS4RGGD2ISHR2-P24HR70FL2MI7IV20IL8VA25DE5RG17RG4AP7KN10CY13FVAS6KT1ML16AT4SP13TK3QH12SP3RS36FL4FVSF6EG12VI6-EAV13LV3TS8LS2QR2PS3VI2TKVI2IL15IT19TS9 91.5 0.0e+00 98.1
The file above contains some similar IDs:
gene.10002.1.1.p1
gene.10002.1.4.p1
gene.10002.1.5.p1
gene.10002.1.6.p1
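By keeping only gene.10002, these IDs become identical. I used this awk script (thanks @anubhava) to keep, among the lines that share an ID, only the line with the minimum value in column 30:
awk '{
  if (/^gene\./) {
    split($1, a, /\./)
    k = a[1] "." a[2]
  }
  else
    k = $1
}
!(k in min) || $30 [...]

Fixing the OP's attempt here; please try the following. You should change your condition to compare with $31 >= max[k], including the = case, since we are now looking for the maximum value. A detailed explanation is also added in a later section of this post.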
awk '{
if (/^gene\./) {
split($1, a, /\./)
k = a[1] "." a[2]
}
else
k = $1
}
!(k in max) || $31 >= max[k] {
if(!(k in max))
ord[++n] = k
else if (max[k] == $31) {
print
next
}
max[k] = $31
rec[k] = $0
}
END {
for (i=1; i<=n; ++i)
print rec[ord[i]]
}' Input_file
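One caveat with printing ties straight away: a line that ties the running maximum is emitted before any later, larger value for the same ID has been seen, and such lines come out ahead of the END output. Below is a minimal sketch of a variant that buffers ties and prints everything at the end. The function name max_rows, the col variable, and the three-column toy data are assumptions made for illustration; they are not part of the original answer.

```shell
# Hypothetical helper: keep every row whose value in column `col` equals the
# maximum for its key. Ties are buffered and printed in END, so a tie that is
# later beaten by a larger value is discarded.
# Usage: max_rows FILE [COLUMN]   (COLUMN defaults to 31)
max_rows() {
  awk -v col="${2:-31}" '
  {
    if (/^gene\./) {               # gene.10002.1.4.p1 -> gene.10002
      split($1, a, /\./)
      k = a[1] "." a[2]
    } else
      k = $1
    v = $col + 0                   # force a numeric comparison
  }
  !(k in max) { ord[++n] = k; max[k] = v; rec[k] = $0; next }  # first row for this key
  v > max[k]  { max[k] = v; rec[k] = $0; next }                # new maximum: drop older rows
  v == max[k] { rec[k] = rec[k] ORS $0 }                       # tie: append to the buffer
  END {
    for (i = 1; i <= n; ++i)       # keys in first-seen order
      print rec[ord[i]]
  }' "$1"
}

# Toy input (3 columns instead of 31); the value to maximise is in column 3.
sample=$(mktemp)
printf '%s\n' \
  'gene.1.1.p1 x 99.3' \
  'gene.1.4.p1 x 98.7' \
  'gene.1.5.p1 x 99.3' \
  'chrA.g1.t1 x 50' \
  'gene.1.6.p1 x 99.9' > "$sample"

max_rows "$sample" 3
```

On this toy input the two 99.3 rows are dropped once 99.9 is seen, so the output is the gene.1.6.p1 row followed by the chrA.g1.t1 row. For the real file the call would be max_rows Input_file 31, or 13 if the value sits in column 13.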
Is this a different question from the one on unix.stackexchange?
Yes, this is a different question, because this one is about finding the highest value. The question on unix.stackexchange is about removing identical hits.
Thanks, this seems to work. However, do I need to convert the string value to an integer with $31+0?
@user977828, please look at this difference: echo "abcd12234" | awk '{print $0+0}' gives 0, while echo "1234abcd12234" | awk '{print $0+0}' gives 1234. So if your 31st field starts with digits and you want that leading number, go ahead; otherwise use the sub function to remove the non-digit content and keep only the digits in that field.
@user977828, to get only the digits from the 31st field, do something like sub(/[^0-9]+/,"",$31) and you should be good.
Thank you. I updated my question with new data, so I had to change $31 to $13. Unfortunately, I found a problem in the awk script, which I described in the updated part of the question. How can I fix it?
@user977828, sorry, it is not clear; please be more specific. Add more details and then let me know.
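The exchange above turns on how awk converts strings to numbers. The sketch below illustrates it with made-up values (none are taken from the file): adding 0 keeps only a leading numeric prefix, sub() removes just the first non-digit run, and gsub() removes every run. A class of [^0-9] also strips the decimal point, so a value such as 99.3 would become 993; the last call uses [^0-9.] as one way to avoid that. The helper name coerce is hypothetical.

```shell
# Hypothetical helper: show four ways of turning a string into a number in awk.
coerce() {
  printf '%s\n' "$1" | awk '{
    plus = $0 + 0                       # leading numeric prefix, else 0
    s = $0; sub(/[^0-9]+/, "", s)       # remove only the FIRST non-digit run
    g = $0; gsub(/[^0-9]+/, "", g)      # remove EVERY non-digit run
    d = $0; gsub(/[^0-9.]+/, "", d)     # same, but keep the decimal point
    print plus, s, g, d
  }'
}

coerce 'abcd12234'        # prints: 0 12234 12234 12234
coerce '1234abcd12234'    # prints: 1234 123412234 123412234 123412234
coerce 'id=99.3%'         # prints: 0 99.3% 993 99.3
```

If the field already holds a plain number such as 99.3, no cleanup is needed: awk compares fields that look numeric as numbers, and $31+0 only makes that explicit.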
Explanation: a detailed, line-by-line explanation of the above code.
awk '{                             ## Start of the awk program.
  if (/^gene\./) {                 ## If the line starts with "gene." then:
    split($1, a, /\./)             ## split the 1st field on dots into array a,
    k = a[1] "." a[2]              ## and build key k from a[1], a dot and a[2] (e.g. gene.10002).
  }
  else                             ## Otherwise (the line does not start with "gene."):
    k = $1                         ## use the whole 1st field as key k.
}
!(k in max) || $31 >= max[k] {     ## Run this block if k is not yet in max OR the 31st field is >= max[k].
  if (!(k in max))                 ## If this is the first time k is seen,
    ord[++n] = k                   ## record k in ord so keys keep their first-seen order.
  else if (max[k] == $31) {        ## Else, if the 31st field ties the current maximum,
    print                          ## print this line immediately
    next                           ## and skip the remaining statements for this line.
  }
  max[k] = $31                     ## Store the 31st field as the maximum for k.
  rec[k] = $0                      ## Store the current line as the record for k.
}
END {                              ## END block, run after all input has been read.
  for (i=1; i<=n; ++i)             ## Loop over the keys in first-seen order
    print rec[ord[i]]              ## and print the stored record for each key.
}' Input_file                      ## Name of the input file.