Unix: remove the letters between two characters in a FASTA header

This should be trivial, but I am struggling with it. I have a FASTA file whose headers I changed by looking up the matching header in another file.

grep -e ">" original.fasta | head
>TRINITY_GG_86181_c0_g1_i4.mrna1 strand=+ 
>TRINITY_GG_136405_c0_g1_i4.mrna1 strand=+ 
>TRINITY_GG_52087_c0_g1_i1.mrna1 strand=+ 
>TRINITY_GG_22558_c0_g1_i2.mrna1 strand=+ 
>TRINITY_GG_29872_c0_g1_i3.mrna1 strand=+ 

head matches.txt 

TRINITY_GG_86181_c0_g1_i4.mrna1 NM_001123383.1
TRINITY_GG_136405_c0_g1_i4.mrna1    NM_001321912.3
TRINITY_GG_52087_c0_g1_i1.mrna1 NM_001376885.1
TRINITY_GG_22558_c0_g1_i2.mrna1 NM_003043.6
TRINITY_GG_29872_c0_g1_i3.mrna1 NM_001363619.2
TRINITY_GG_129652_c0_g1_i3.mrna1    NM_001258446.1
 
awk 'FNR==NR{
  a[">"$1]=$2;next
}
$1 in a{
  sub(/>/,">"a[$1]"|",$1)
}1'  matches.txt  original.fasta > new.fa

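The problem is that the new ID is only prepended, so now I need to remove the old part (between the | and the space, i.e. the TRINITY... ID).

Can anyone help me?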

grep -e ">" new.fa | head
>NM_001123383.1|TRINITY_GG_86181_c0_g1_i4.mrna1 strand=+
>NM_001321912.3|TRINITY_GG_136405_c0_g1_i4.mrna1 strand=+
>NM_001376885.1|TRINITY_GG_52087_c0_g1_i1.mrna1 strand=+
>NM_003043.6|TRINITY_GG_22558_c0_g1_i2.mrna1 strand=+
>NM_001363619.2|TRINITY_GG_29872_c0_g1_i3.mrna1 strand=+
>NM_001258446.1|TRINITY_GG_129652_c0_g1_i3.mrna1 strand=-
>NM_001252018.2|TRINITY_GG_141414_c0_g1_i1.mrna1 strand=-
>NM_001301072.2|TRINITY_GG_78808_c0_g1_i3.mrna1 strand=+
>NM_001128205.2|TRINITY_GG_13706_c0_g1_i1.mrna1 strand=+
>NM_001039802.2|TRINITY_GG_122170_c0_g1_i1.mrna1 strand=+
Expected output

>NM_001123383.1 strand=+
>NM_001321912.3 strand=+
>NM_001376885.1 strand=+
>NM_003043.6 strand=+
>NM_001363619.2 strand=+
>NM_001258446.1 strand=-
>NM_001252018.2 strand=-
>NM_001301072.2 strand=+
>NM_001128205.2 strand=+
>NM_001039802.2 strand=+
So, effectively, I need to remove everything between the | and the space in each header. Can anyone help?
Thanks
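
One possible way to do this, offered as a sketch rather than a definitive answer, is a sed substitution restricted to header lines. It deletes the | and every non-space character that follows it, so the strand field is left untouched. The sample below reuses two headers from the question; the sequence lines and the temporary file are made up for illustration.

```shell
# Build a small sample file (headers from the question, sequence lines invented).
tmp=$(mktemp)
printf '%s\n' \
  '>NM_001123383.1|TRINITY_GG_86181_c0_g1_i4.mrna1 strand=+' \
  'ACGTACGT' \
  '>NM_001258446.1|TRINITY_GG_129652_c0_g1_i3.mrna1 strand=-' \
  'TTGACA' > "$tmp"

# On lines starting with ">", delete from "|" up to (not including) the first space.
result=$(sed '/^>/ s/|[^ ]*//' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

On the real data the same expression could be run as sed '/^>/ s/|[^ ]*//' new.fa > cleaned.fa, where cleaned.fa is a hypothetical output name.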

Note: you can use "^>" to anchor the grep search to the start of the line.
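
An alternative to cleaning up new.fa afterwards might be to change the awk step so that it overwrites the first field instead of prefixing it with sub(). The sketch below assumes the same two-column matches.txt layout shown in the question; the temporary directory and the sequence lines are invented for illustration. It also lists the resulting headers with the anchored grep pattern from the note above.

```shell
# Sample inputs in a temporary directory (IDs from the question, sequences invented).
dir=$(mktemp -d)
printf '%s\n' \
  'TRINITY_GG_86181_c0_g1_i4.mrna1 NM_001123383.1' \
  'TRINITY_GG_129652_c0_g1_i3.mrna1    NM_001258446.1' > "$dir/matches.txt"
printf '%s\n' \
  '>TRINITY_GG_86181_c0_g1_i4.mrna1 strand=+' \
  'ACGTACGT' \
  '>TRINITY_GG_129652_c0_g1_i3.mrna1 strand=-' \
  'TTGACA' > "$dir/original.fasta"

# First file: map ">old_id" to the new ID.
# Second file: overwrite field 1 when it has a mapping; other lines pass through.
# Assigning to $1 rebuilds the line with single spaces between fields.
awk 'FNR==NR { a[">" $1] = $2; next }
     $1 in a { $1 = ">" a[$1] }
     1' "$dir/matches.txt" "$dir/original.fasta" > "$dir/new.fa"

# Show only the header lines; "^>" anchors the match to the start of the line.
headers=$(grep '^>' "$dir/new.fa")
printf '%s\n' "$headers"
rm -rf "$dir"
```

Headers whose ID has no entry in matches.txt are left unchanged by this variant.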