Unix: remove the letters between two characters in FASTA headers
This should be trivial, but I am struggling with it. I have a FASTA file whose headers I changed by looking up the matching headers in another file.
grep -e ">" original.fasta | head
>TRINITY_GG_86181_c0_g1_i4.mrna1 strand=+
>TRINITY_GG_136405_c0_g1_i4.mrna1 strand=+
>TRINITY_GG_52087_c0_g1_i1.mrna1 strand=+
>TRINITY_GG_22558_c0_g1_i2.mrna1 strand=+
>TRINITY_GG_29872_c0_g1_i3.mrna1 strand=+
head matches.txt
TRINITY_GG_86181_c0_g1_i4.mrna1 NM_001123383.1
TRINITY_GG_136405_c0_g1_i4.mrna1 NM_001321912.3
TRINITY_GG_52087_c0_g1_i1.mrna1 NM_001376885.1
TRINITY_GG_22558_c0_g1_i2.mrna1 NM_003043.6
TRINITY_GG_29872_c0_g1_i3.mrna1 NM_001363619.2
TRINITY_GG_129652_c0_g1_i3.mrna1 NM_001258446.1
awk 'FNR==NR{
a[">"$1]=$2;next
}
$1 in a{
sub(/>/,">"a[$1]"|",$1)
}1' matches.txt original.fasta > new.fa
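The problem is that the new ID is only prepended, so now I need to remove the old part (everything from the | through TRINITY… up to the space).
Can anyone help me?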
grep -e ">" new.fa | head
>NM_001123383.1|TRINITY_GG_86181_c0_g1_i4.mrna1 strand=+
>NM_001321912.3|TRINITY_GG_136405_c0_g1_i4.mrna1 strand=+
>NM_001376885.1|TRINITY_GG_52087_c0_g1_i1.mrna1 strand=+
>NM_003043.6|TRINITY_GG_22558_c0_g1_i2.mrna1 strand=+
>NM_001363619.2|TRINITY_GG_29872_c0_g1_i3.mrna1 strand=+
>NM_001258446.1|TRINITY_GG_129652_c0_g1_i3.mrna1 strand=-
>NM_001252018.2|TRINITY_GG_141414_c0_g1_i1.mrna1 strand=-
>NM_001301072.2|TRINITY_GG_78808_c0_g1_i3.mrna1 strand=+
>NM_001128205.2|TRINITY_GG_13706_c0_g1_i1.mrna1 strand=+
>NM_001039802.2|TRINITY_GG_122170_c0_g1_i1.mrna1 strand=+
Desired output
>NM_001123383.1 strand=+
>NM_001321912.3 strand=+
>NM_001376885.1 strand=+
>NM_003043.6 strand=+
>NM_001363619.2 strand=+
>NM_001258446.1 strand=-
>NM_001252018.2 strand=-
>NM_001301072.2 strand=+
>NM_001128205.2 strand=+
>NM_001039802.2 strand=+
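So, effectively, I need to delete everything between the | and the space in each header.
Can anyone help me?
Thanks
Note: you can use "^>" to anchor the grep search to the start of the line.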
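One possible way to get the desired output is sketched below. It is not necessarily the only or the best approach. The function names `strip_old_id` and `rename_headers` are hypothetical and introduced only for illustration. The first function cleans up an already-modified file such as new.fa with sed, deleting the | and everything after it up to the first space, on header lines only. The second assumes the original files are still available and swaps the ID in a single awk pass instead of prepending it.

```shell
#!/bin/sh
# Hypothetical helper: on header lines (starting with ">"), delete the "|"
# and every non-space character after it, leaving "strand=..." untouched.
strip_old_id() {
    sed '/^>/ s/|[^ ]*//'
}

# Hypothetical helper: replace the old ID with the new one directly.
# $1 = two-column mapping file (old ID, new ID), $2 = FASTA file.
# Assigning to $1 rebuilds the header with single spaces between fields.
rename_headers() {
    awk 'FNR==NR { a[">" $1] = $2; next }   # first file: load old -> new map
         $1 in a { $1 = ">" a[$1] }         # header with a known ID: swap it
         1' "$1" "$2"                       # print every line
}

# Small demonstration on one header taken from the question.
printf '%s\n' '>NM_003043.6|TRINITY_GG_22558_c0_g1_i2.mrna1 strand=+' 'ACGT' | strip_old_id
# prints:
# >NM_003043.6 strand=+
# ACGT
```

With the files from the question, this could be run as `strip_old_id < new.fa > clean.fa`, or as `rename_headers matches.txt original.fasta > clean.fa`. The output name clean.fa is an assumption. Headers with no entry in matches.txt are left unchanged by both variants.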