File: replacing the whitespace between strings
I have a file like this:
>TCONS_00000066 +1
PPAAARTDLSPPQHVLHVYKRYGPPRQRRRPCPQTWWWQLPHRAAATHPRGEGPRASNPTRQQHFILVYNFSSFLSSWLSLSLLSSPFCYLYICDCHGNTEDEGPLMY*LVSSSLGAFVCKDFHLIDLLDLLFWIEAGYLHAVLHTILQSGRSDR*SRPKYRLTELSVCISVRTSSVINSKC*HN
>TCONS_00000066 +2
RRLLRAPTCHHPSTSSTYTSATVHRGSVDVLVRKHGGGSFLIEQQQLILEGKGPELLILHGNNTLYLCIISLRF*VHGYLCLSYLLPFAISIFVIAMEIQKTRGR*CIDL*VLVWGLSFARIFI*LIFLICYFGSKLATFMPCCIPYFSLVGQTDDRDRSID*PNFRFVYL*GQVLSSIQNVNII
>TCONS_00000066 +3
AGCCAHRLVTTPARPPRIQALRSTAAASTSLSANMVVAASSSSSSNSSSRGRAQSF*SYTATTLYTCV*FLFVSEFMAIFVSLIFSLLLSLYL*LPWKYRRRGAADVLTCEF*FGGFRLQGFSFD*SS*FVILDRSWLPSCRVAYHTSVWSVRPMIETEVSINRTFGLYICEDKFCHQFKMLT*
>TCONS_00000066 -1
YYVNILN**QNLSSQIYKPKVRLIDTSVSIIGLTDQTEVWYATRHEGSQLRSKITNQEDQSNENPCKRKPPN*NSQVNTSAAPRLLYFHGNHKYRDSKREKIRETKIAMNSETKRNYTQV*SVVAV*D*KLWALPLEDELLLLDEEAATTMFADKDVDAAAVDRSACIRGGRAGVVTSRCAQQPA
>TCONS_00000066 -2
IMLTF*IDDRTCPHRYTNRKFG*SILRSRSSV*PTRLKYGMQHGMKVASFDPK*QIKKINQMKILANESPQTRTHKSIHQRPLVFCISMAITNIEIAKGRR*ERQR*P*TQKRREIIHKYKVLLPCRIRSSGPFPSRMSCCCSMRKLPPPCLRTRTSTLPRWTVALVYVEDVLGW*QVGARSSRR
>TCONS_00000066 -3
LC*HFELMTELVLTDIQTESSVNRYFGLDHRSDRPD*SMVCNTA*R*PASIQNNKSRRSIK*KSLQTKAPKLELTSQYISGPSSSVFPWQSQI*R*QKGEDKRDKDSHELRNEEKLYTSIKCCCRVGLEALGPSPRG*VAAAR*GSCHHHVCGQGRRRCRGGP*RLYTWRTCWGGDKSVRAAAG
>TCONS_00000130 +1
LPARPRLQGALQRHRGGKPINQSINQWW*LGQLKTKKERSN*SSC*IVKWYAGEGGDSGSGGGGRGDGGGDGEPARRHHARRRPPRQELPLQVDEPVRANEEGWVQGSWHQAARHGTGRFLQRRAHPNRDHQFARTTA*NPLPNVHPSAGRAMEKKIKGKEEKMKSPCITN*FVMMQAAVRVRSSLIGSIR*ICFTKGATDRLSWLAVWVHIHTTQTQILTI*PFAKNIFTNEQLPKLISNLTLLLNAKSCGAEFRHLSAK*YGAECTLAR*LSLPSAVARHSAPADVALRCLSSAPHDLALSKKVRFEISFGSGSFVKLVFTKG*IVKICATQTHSQEDMNIK*SREGHGFSPGFVPFGCTCTEMIYVVGLTDTKEHM***MIFVLLCQSFTLVFLTCFLSSTVVLRIQ*PQLMRLKWILAN*AYSLIFWLMVIL
>TCONS_00000130 +2
FQLALAFRELCNGIAEVNQSTNQSINGGSWVNSKQRKKEAINHLVEL*NGMQAKVEIVVREGEVGETVVATVNQLAATTLVVGLHDKSFLYRSTNPYERMRRVGCRVLGIRQHATARDGSFNAELTQIETINLHVPPPKIPFPMFTLPLGVLWRKRSKAKKRK*SHHASQINL**CRLQCELGAH*LDQSDEFVLPKEQLTD*AG*LSGYIYTRHKHKF*QFNPLQKIFLQMNSYQNLFQI*PFCLTPNRVALNLDTSAPNSMALNVRWHADLVSHPPWHGIQRQLTWR*GV*VPRHMI*R*AKRSDLK*VLAAVHL*N*FLQRVKLSKFVRHKHTHKKT*TSSEAGRGTVSHLDLCHLVVLVQR*SMLLD*QTPRNTCSSK*FLFYFVKVLHLYS*PVSCLAQ*C*EFSNLS**D*NGYWPIKLIASSFGLWLYL
>TCONS_00000130 +3
SSSPSPSGSSATASRR*TNQPINQSMVVVGSTQNKERKKQLIILLNCEMVCRRRWR*WFGRGRSGRRWWRR*TSSPPPRSSSASTTRASSTGRRTRTSE*GGLGAGFLASGSTPRHGTVPSTPSSPKSRPSICTYHRLKSPSQCSPFRWACYGEKDQRQRRENEVTMHHKLICDDAGCSAS*ELTDWINPMNLFYQRSN*QIELASCLGTYTHDTNTNFDNLTLCKKYFYK*TATKTYFKSDPFA*RQIVWR*I*TPQRQIVWR*MYVGTLT*SPIRRGTAFSAS*RGAEVSKFRAT*FSVEQKGQI*NKFWQRFICKISFYKGLNCQNLCDTNTLTRRHEHQVKQGGARFLTWICAIWLYLYRDDLCCWIDRHQGTHVVVNDFCFTLSKFYTCIPDLFLV*HSSVKNSVTSVDEIKMDIGQLSL*PHLLAYGYTY
>TCONS_00000130 -1
ISITISQKMRL*A*LANIHFNLIN*GY*ILNTTVLDKKQVRNTSVKL*QSKTKIIYYYMCSLVSVNPTT*IISVQVQPNGTNPGEKPCPSLLHLMFMSSCECVCVAQILTI*PFVKTNFTNEPLPKLISNLTFLLNAKSCGAELRHLSATSAGAECRATADGRLSQRANVHSAPYYLALRCLNSAPHDLALSKRVRFEISFGSCSFVKIFFAKG*IVKICVCVVCICTQTASQLNLSVAPLVKQIHRIDPISELLTRTAACIITN*FVMHGDFIFSSLPLIFFSIARPAEG*TLGRGF*AVVRAN*WSRFG*ARR*RNRPVPWRAA*CQEPCTQPSSFARTGSSTCRGSSCRGGRRRAWWRRAGSPSPPPSPRPPPPEPLSPPSPAYHFTIQQDD*LLLSFFVLS*PNYHH*LIDWLIGLPPRCRCRAP*RRGRAG
>TCONS_00000130 -2
I want to get rid of the space between the strings in the ID lines.
The new file should be:
>TCONS_00000066_+1
PPAAARTDLSPPQHVLHVYKRYGPPRQRRRPCPQTWWWQLPHRAAATHPRGEGPRASNPTRQQHFILVYNFSSFLSSWLSLSLLSSPFCYLYICDCHGNTEDEGPLMY*LVSSSLGAFVCKDFHLIDLLDLLFWIEAGYLHAVLHTILQSGRSDR*SRPKYRLTELSVCISVRTSSVINSKC*HN
>TCONS_00000066_+2
RRLLRAPTCHHPSTSSTYTSATVHRGSVDVLVRKHGGGSFLIEQQQLILEGKGPELLILHGNNTLYLCIISLRF*VHGYLCLSYLLPFAISIFVIAMEIQKTRGR*CIDL*VLVWGLSFARIFI*LIFLICYFGSKLATFMPCCIPYFSLVGQTDDRDRSID*PNFRFVYL*GQVLSSIQNVNII
>TCONS_00000066_+3
AGCCAHRLVTTPARPPRIQALRSTAAASTSLSANMVVAASSSSSSNSSSRGRAQSF*SYTATTLYTCV*FLFVSEFMAIFVSLIFSLLLSLYL*LPWKYRRRGAADVLTCEF*FGGFRLQGFSFD*SS*FVILDRSWLPSCRVAYHTSVWSVRPMIETEVSINRTFGLYICEDKFCHQFKMLT*
>TCONS_00000066_-1
YYVNILN**QNLSSQIYKPKVRLIDTSVSIIGLTDQTEVWYATRHEGSQLRSKITNQEDQSNENPCKRKPPN*NSQVNTSAAPRLLYFHGNHKYRDSKREKIRETKIAMNSETKRNYTQV*SVVAV*D*KLWALPLEDELLLLDEEAATTMFADKDVDAAAVDRSACIRGGRAGVVTSRCAQQPA
>TCONS_00000066_-2
IMLTF*IDDRTCPHRYTNRKFG*SILRSRSSV*PTRLKYGMQHGMKVASFDPK*QIKKINQMKILANESPQTRTHKSIHQRPLVFCISMAITNIEIAKGRR*ERQR*P*TQKRREIIHKYKVLLPCRIRSSGPFPSRMSCCCSMRKLPPPCLRTRTSTLPRWTVALVYVEDVLGW*QVGARSSRR
>TCONS_00000066_-3
LC*HFELMTELVLTDIQTESSVNRYFGLDHRSDRPD*SMVCNTA*R*PASIQNNKSRRSIK*KSLQTKAPKLELTSQYISGPSSSVFPWQSQI*R*QKGEDKRDKDSHELRNEEKLYTSIKCCCRVGLEALGPSPRG*VAAAR*GSCHHHVCGQGRRRCRGGP*RLYTWRTCWGGDKSVRAAAG
>TCONS_00000130_+1
LPARPRLQGALQRHRGGKPINQSINQWW*LGQLKTKKERSN*SSC*IVKWYAGEGGDSGSGGGGRGDGGGDGEPARRHHARRRPPRQELPLQVDEPVRANEEGWVQGSWHQAARHGTGRFLQRRAHPNRDHQFARTTA*NPLPNVHPSAGRAMEKKIKGKEEKMKSPCITN*FVMMQAAVRVRSSLIGSIR*ICFTKGATDRLSWLAVWVHIHTTQTQILTI*PFAKNIFTNEQLPKLISNLTLLLNAKSCGAEFRHLSAK*YGAECTLAR*LSLPSAVARHSAPADVALRCLSSAPHDLALSKKVRFEISFGSGSFVKLVFTKG*IVKICATQTHSQEDMNIK*SREGHGFSPGFVPFGCTCTEMIYVVGLTDTKEHM***MIFVLLCQSFTLVFLTCFLSSTVVLRIQ*PQLMRLKWILAN*AYSLIFWLMVIL
>TCONS_00000130_+2
FQLALAFRELCNGIAEVNQSTNQSINGGSWVNSKQRKKEAINHLVEL*NGMQAKVEIVVREGEVGETVVATVNQLAATTLVVGLHDKSFLYRSTNPYERMRRVGCRVLGIRQHATARDGSFNAELTQIETINLHVPPPKIPFPMFTLPLGVLWRKRSKAKKRK*SHHASQINL**CRLQCELGAH*LDQSDEFVLPKEQLTD*AG*LSGYIYTRHKHKF*QFNPLQKIFLQMNSYQNLFQI*PFCLTPNRVALNLDTSAPNSMALNVRWHADLVSHPPWHGIQRQLTWR*GV*VPRHMI*R*AKRSDLK*VLAAVHL*N*FLQRVKLSKFVRHKHTHKKT*TSSEAGRGTVSHLDLCHLVVLVQR*SMLLD*QTPRNTCSSK*FLFYFVKVLHLYS*PVSCLAQ*C*EFSNLS**D*NGYWPIKLIASSFGLWLYL
>TCONS_00000130_+3
SSSPSPSGSSATASRR*TNQPINQSMVVVGSTQNKERKKQLIILLNCEMVCRRRWR*WFGRGRSGRRWWRR*TSSPPPRSSSASTTRASSTGRRTRTSE*GGLGAGFLASGSTPRHGTVPSTPSSPKSRPSICTYHRLKSPSQCSPFRWACYGEKDQRQRRENEVTMHHKLICDDAGCSAS*ELTDWINPMNLFYQRSN*QIELASCLGTYTHDTNTNFDNLTLCKKYFYK*TATKTYFKSDPFA*RQIVWR*I*TPQRQIVWR*MYVGTLT*SPIRRGTAFSAS*RGAEVSKFRAT*FSVEQKGQI*NKFWQRFICKISFYKGLNCQNLCDTNTLTRRHEHQVKQGGARFLTWICAIWLYLYRDDLCCWIDRHQGTHVVVNDFCFTLSKFYTCIPDLFLV*HSSVKNSVTSVDEIKMDIGQLSL*PHLLAYGYTY
>TCONS_00000130_-1
ISITISQKMRL*A*LANIHFNLIN*GY*ILNTTVLDKKQVRNTSVKL*QSKTKIIYYYMCSLVSVNPTT*IISVQVQPNGTNPGEKPCPSLLHLMFMSSCECVCVAQILTI*PFVKTNFTNEPLPKLISNLTFLLNAKSCGAELRHLSATSAGAECRATADGRLSQRANVHSAPYYLALRCLNSAPHDLALSKRVRFEISFGSCSFVKIFFAKG*IVKICVCVVCICTQTASQLNLSVAPLVKQIHRIDPISELLTRTAACIITN*FVMHGDFIFSSLPLIFFSIARPAEG*TLGRGF*AVVRAN*WSRFG*ARR*RNRPVPWRAA*CQEPCTQPSSFARTGSSTCRGSSCRGGRRRAWWRRAGSPSPPPSPRPPPPEPLSPPSPAYHFTIQQDD*LLLSFFVLS*PNYHH*LIDWLIGLPPRCRCRAP*RRGRAG
>TCONS_00000130_-2
I used sed and tr, but did not get the desired output.

It seems you are trying to replace the spaces with the _ symbol. If so, you could consider this:
sed 's/[[:blank:]]\+/_/g' file
or
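A minimal sketch of what a capture-group variant could look like, assuming the IDs have the form TCONS_ followed by exactly 8 digits, as in the sample above. The helper name join_id is hypothetical, and \+ relies on GNU sed.

```shell
# Hypothetical helper: join the ID and the frame label with "_".
# Assumption: IDs look like TCONS_ followed by exactly 8 digits.
join_id() {
  # \( ... \) captures the ID, \+ (GNU sed) matches one or more blanks,
  # and \1 writes the captured ID back, followed by "_".
  sed 's/\(TCONS_[0-9]\{8\}\)[[:blank:]]\+/\1_/'
}

# Header lines are rewritten; sequence lines pass through unchanged.
printf '>TCONS_00000066 +1\nPPAAARTDLS\n' | join_id
```

Because the pattern is anchored on the ID, blanks elsewhere in the file are left alone, which the plain [[:blank:]]\+ substitution does not guarantee.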
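You need to capture the characters you want to keep. Here, the part to keep is the ID, which ends in 8 digits (for example TCONS_00000066), so put a pattern that matches it into a capturing group, \(…\), and follow the group with a pattern that matches one or more blanks. You must escape the +, writing \+, so that it repeats the preceding token one or more times; otherwise it matches a literal + symbol, because sed uses BRE (Basic Regular Expressions) by default.

Use tr, whose exact purpose is to replace characters with other characters:
tr -s ' ' '_' < file
For example, it has the following effect: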
$ cat a
hello world this
is a sample file
$ tr -s ' ' '_' < a
hello_world_this
is_a_sample_file
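As a small illustration of what the -s option adds, the sketch below compares tr with and without it on input containing a run of two spaces. The helper names plain_tr and squeeze_tr are hypothetical and exist only for this comparison.

```shell
# Hypothetical helpers that read standard input.
plain_tr()   { tr ' ' '_'; }      # translate each space separately
squeeze_tr() { tr -s ' ' '_'; }   # translate, then squeeze repeated "_"

printf 'a  b\n' | plain_tr     # two spaces become two underscores
printf 'a  b\n' | squeeze_tr   # the run collapses to a single underscore
```

Note that -s squeezes every run of underscores in the output, including runs that were already present in the input.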
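Of course, if you want to save the changes in the original file, you have to write the output to a new file and then move that file back over the original.

I used something like sed 's/TCONS[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][[:blank:][0-9][0-9][0-9][0-9]_+/g' ORF6frame_orfpredictor.fa > new. I also used this command: sed s/\ /\_/g ORF6frame_orfpredictor.fa > new

The second command you issued works fine.

Can you point out my mistake? Why didn't my command work?

Wow, I didn't know you could combine -s and ' ' '_'. Nice, +1!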
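The remark about saving the changes back to the original file could be sketched as follows. The function name replace_spaces_in_place is hypothetical, and the sketch assumes mktemp is available.

```shell
# Hypothetical helper: rewrite a file in place via a temporary file.
replace_spaces_in_place() {
  target=$1
  tmp=$(mktemp) || return 1
  # The shell truncates an output file before tr reads it, so the result
  # goes to a temporary file first and is moved over the original afterwards.
  tr -s ' ' '_' < "$target" > "$tmp" && mv "$tmp" "$target"
}

# Usage (file name taken from the comment above):
#   replace_spaces_in_place ORF6frame_orfpredictor.fa
```

The file created by mktemp usually has restrictive permissions, so the replaced file may need a chmod afterwards.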