Unix sed: replacing a pattern that occurs multiple times in the second column

unix sed

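I'm new to sed and completely stuck on the following: I'm trying to replace a pattern in the second column with sed. The pattern occurs multiple times per line.

I have: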

Gene1 GO:0000045^biological_process^autophagosome assembly`GO:0005737^cellular_component^cytoplasm
Gene2 GO:0000030^molecular_function^mannosyltransferase activity`GO:0006493^biological_process^protein O-linked glycosylation`GO:0016020^cellular_component^membrane

I want to get:

Gene1 GO:0000045,GO:0005737
Gene2 GO:0000030,GO:0006493,GO:0016020

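So: strip all the descriptive parts and use "," as the separator. I chose sed because I thought it would be easy to identify the pattern between ^ and `. Instead, it deletes all the preceding terms.

Code:

Can anyone help me?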

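Try this, shown in two steps:

$ # show how to delete from ^ to ` and replace it with ,
$ sed 's/\^[^`]*`/,/g' ip.txt
Gene1 GO:0000045,GO:0005737^cellular_component^cytoplasm
Gene2 GO:0000030,GO:0006493,GO:0016020^cellular_component^membrane
$ # also delete the remaining data from ^ to the end of the line
$ sed 's/\^[^`]*`/,/g; s/\^.*//' ip.txt
Gene1 GO:0000045,GO:0005737
Gene2 GO:0000030,GO:0006493,GO:0016020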

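Since `^` is a metacharacter, use `\^` to match it literally.

``[^`]*`` matches zero or more characters that are not a backtick.

Don't use `` \^.*` ``: because `.*` is greedy, it would match from the first `^` in the line all the way to the last backtick, deleting the terms in between.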
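The difference between the greedy pattern and the negated character class can be checked on a single sample line. This is a minimal sketch, assuming a POSIX shell and sed; the variable names `line`, `greedy` and `bounded` are made up for illustration.

```shell
# One sample line from the question, held in a variable (name is illustrative).
line='Gene2 GO:0000030^molecular_function^mannosyltransferase activity`GO:0006493^biological_process^protein O-linked glycosylation`GO:0016020^cellular_component^membrane'

# Greedy: .* runs from the first ^ to the LAST backtick on the line,
# so the middle GO term is swallowed along with the descriptions.
greedy=$(printf '%s\n' "$line" | sed 's/\^.*`/,/')

# Negated class: [^`]* stops at the NEXT backtick, so every GO term survives.
bounded=$(printf '%s\n' "$line" | sed 's/\^[^`]*`/,/g')

printf '%s\n' "$greedy" "$bounded"
```

The first line printed is missing GO:0006493, while the second keeps all three terms and only leaves the final description to be stripped.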

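Another option chains two sed expressions. The first, ``s/\^[^`]*//g``, deletes (replaces with nothing) each `^` together with every following character that is not a backtick; the second, `` s/`/,/g ``, then turns each remaining backtick into a comma.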

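Another alternative is to identify the individual fields and then operate on each of them, which may be more useful than just identifying parts of each line with a regexp: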

$ awk -F'^' -v OFS=',' '{print NR") "$0; for (i=1;i<=NF;i++) print "\t"i") "$i}' file
1) Gene1 GO:0000045^biological_process^autophagosome assembly`GO:0005737^cellular_component^cytoplasm
        1) Gene1 GO:0000045
        2) biological_process
        3) autophagosome assembly`GO:0005737
        4) cellular_component
        5) cytoplasm
2) Gene2 GO:0000030^molecular_function^mannosyltransferase activity`GO:0006493^biological_process^protein O-linked glycosylation`GO:0016020^cellular_component^membrane
        1) Gene2 GO:0000030
        2) molecular_function
        3) mannosyltransferase activity`GO:0006493
        4) biological_process
        5) protein O-linked glycosylation`GO:0016020
        6) cellular_component
        7) membrane

The two-expression sed command in full:

sed -e 's/\^[^`]*//g' -e 's/`/,/g' your_file

Operating on the ^-separated fields with awk then produces the desired output:

$ awk -F'^' -v OFS=',' '{out=$1; for (i=2;i<=NF;i++) if (sub(/.*`/,"",$i)) out=out OFS $i; print out}' file
Gene1 GO:0000045,GO:0005737
Gene2 GO:0000030,GO:0006493,GO:0016020
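
To confirm that the sed and awk approaches agree, the sample data can be written to a temporary file and the outputs compared. This is only a sketch, assuming a POSIX shell with `mktemp` available; the variable names (`tmp`, `out_sed1`, `out_sed2`, `out_awk`) are hypothetical.

```shell
# Write the two sample lines from the question to a temporary file.
# The quoted 'EOF' keeps the shell from interpreting the backticks.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Gene1 GO:0000045^biological_process^autophagosome assembly`GO:0005737^cellular_component^cytoplasm
Gene2 GO:0000030^molecular_function^mannosyltransferase activity`GO:0006493^biological_process^protein O-linked glycosylation`GO:0016020^cellular_component^membrane
EOF

# Approach 1: replace each ^...` span with a comma, then drop the trailing ^... part.
out_sed1=$(sed 's/\^[^`]*`/,/g; s/\^.*//' "$tmp")

# Approach 2: delete every ^... run, then turn the remaining backticks into commas.
out_sed2=$(sed -e 's/\^[^`]*//g' -e 's/`/,/g' "$tmp")

# Approach 3: split on ^ and keep only the text after a backtick in each field.
out_awk=$(awk -F'^' -v OFS=',' '{out=$1; for (i=2;i<=NF;i++) if (sub(/.*`/,"",$i)) out=out OFS $i; print out}' "$tmp")

rm -f "$tmp"

# Print the result once if all three approaches agree.
if [ "$out_sed1" = "$out_sed2" ] && [ "$out_sed2" = "$out_awk" ]; then
    printf '%s\n' "$out_awk"
else
    echo "outputs differ" >&2
fi
```

On the sample data all three commands yield the two lines shown above, so the choice between them is mostly a matter of which is easier to extend.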