Unix sed: multiple replacements of a pattern in the second column

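I'm new to sed and trying to get the following done, but I'm completely stuck: I want to replace a pattern in the second column with sed. The pattern occurs multiple times.

I have: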

Gene1 GO:0000045^biological_process^autophagosome assembly`GO:0005737^cellular_component^cytoplasm
Gene2 GO:0000030^molecular_function^mannosyltransferase activity`GO:0006493^biological_process^protein O-linked glycosylation`GO:0016020^cellular_component^membrane
I want to get:

Gene1 GO:0000045,GO:0005737
Gene2 GO:0000030,GO:0006493,GO:0016020
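So I want to get rid of all the descriptive parts and use "," as the separator. I chose sed because I thought the pattern between ^ and ` would be easy to identify. Instead, it deletes all the preceding terms.

Code:


Can anyone help me?

Try this, shown in two steps: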

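$ # show how to delete from ^ to ` and replace it with ,
$ sed 's/\^[^`]*`/,/g' ip.txt
Gene1 GO:0000045,GO:0005737^cellular_component^cytoplasm
Gene2 GO:0000030,GO:0006493,GO:0016020^cellular_component^membrane
$ # also delete the remaining data from ^ to the end of the line
$ sed 's/\^[^`]*`/,/g; s/\^.*//' ip.txt
Gene1 GO:0000045,GO:0005737
Gene2 GO:0000030,GO:0006493,GO:0016020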
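  • Since ^ is a metacharacter, use \^ to match it literally
  • [^`]* matches zero or more non-` characters
  • Don't use \^.*` because .* is greedy: it would match from the first ^ in the line all the way to the last `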
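To make the greedy-match pitfall concrete, here is a small sketch that runs both patterns on the Gene2 line from the question. The variable names line, greedy and bounded are only illustrative, not part of the original answer.

```shell
# The Gene2 line from the question, single-quoted so the backticks stay literal.
line='Gene2 GO:0000030^molecular_function^mannosyltransferase activity`GO:0006493^biological_process^protein O-linked glycosylation`GO:0016020^cellular_component^membrane'

# Greedy: .* runs from the first ^ to the LAST backtick, swallowing GO:0006493.
greedy=$(printf '%s\n' "$line" | sed 's/\^.*`/,/g')

# Bounded: [^`]* stops at the NEXT backtick, so every GO term survives.
bounded=$(printf '%s\n' "$line" | sed 's/\^[^`]*`/,/g')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

The greedy form loses the middle GO term, which matches the symptom described in the question (earlier terms disappearing).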
Another option is two sed commands. The first deletes (without replacing) everything from ^ (inclusive) up to, but not including, the following `. The second replaces each ` with a comma:

sed -e 's/\^[^`]*//g' -e 's/`/,/g' your_file

Alternatively, identify the individual fields and then operate on each one, which can be more useful than just matching parts of each line with a regexp:

$ awk -F'^' -v OFS=',' '{print NR") "$0; for (i=1;i<=NF;i++) print "\t"i") "$i}' file
1) Gene1 GO:0000045^biological_process^autophagosome assembly`GO:0005737^cellular_component^cytoplasm
        1) Gene1 GO:0000045
        2) biological_process
        3) autophagosome assembly`GO:0005737
        4) cellular_component
        5) cytoplasm
2) Gene2 GO:0000030^molecular_function^mannosyltransferase activity`GO:0006493^biological_process^protein O-linked glycosylation`GO:0016020^cellular_component^membrane
        1) Gene2 GO:0000030
        2) molecular_function
        3) mannosyltransferase activity`GO:0006493
        4) biological_process
        5) protein O-linked glycosylation`GO:0016020
        6) cellular_component
        7) membrane

$ awk -F'^' -v OFS=',' '{out=$1; for (i=2;i<=NF;i++) if (sub(/.*`/,"",$i)) out=out OFS $i; print out}' file
Gene1 GO:0000045,GO:0005737
Gene2 GO:0000030,GO:0006493,GO:0016020
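
As a quick sanity check, the sketch below recreates the sample input in a temporary file and runs the three commands from the answers side by side. The temporary file and the variable names (tmp, out_sed1, out_sed2, out_awk) are assumptions made for illustration. It also assumes mktemp is available, as it is on most Linux and macOS systems.

```shell
# Recreate the two sample lines from the question in a throwaway file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Gene1 GO:0000045^biological_process^autophagosome assembly`GO:0005737^cellular_component^cytoplasm
Gene2 GO:0000030^molecular_function^mannosyltransferase activity`GO:0006493^biological_process^protein O-linked glycosylation`GO:0016020^cellular_component^membrane
EOF

# 1) Replace each "^...`" run with a comma, then drop the trailing "^..." part.
out_sed1=$(sed 's/\^[^`]*`/,/g; s/\^.*//' "$tmp")

# 2) Delete every "^..." run up to (not including) the next backtick, then turn backticks into commas.
out_sed2=$(sed -e 's/\^[^`]*//g' -e 's/`/,/g' "$tmp")

# 3) Split on ^ and keep only the fields that contain a backtick, minus everything up to it.
out_awk=$(awk -F'^' -v OFS=',' '{out=$1; for (i=2;i<=NF;i++) if (sub(/.*`/,"",$i)) out=out OFS $i; print out}' "$tmp")

printf '%s\n' "$out_sed1"
rm -f "$tmp"
```

On this sample all three variables should hold the same two lines. On real data the approaches could diverge if a description itself contains ^ or `, so it is worth spot-checking a few lines.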