Bash: replace one structure with a different structure and keep some values


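I want to transform the input below into the output below in bash. I tried using sed, but it did not work; I have probably got something wrong. So far I have this (just to test whether I can extract the id), but it does not work: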

sed 's;id="([a-zA-Z:]+)";\\1;p' input

Input

<mediaobject>
    <imageobject id="fig:deployment">
        <caption>Application deployment</caption>
        <imagedata fileref="images/deployment.png" width="90%" />
    </imageobject>
</mediaobject>



Output

<img src="images/deployment.png" width="90%" id="fig:deployment" title="Application deployment" />
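A note on the attempt in the question, offered as a minimal sketch rather than a definitive diagnosis: the command probably fails because plain sed uses basic regular expressions (where `(`, `)` and `+` are literal characters), because the pattern replaces only the attribute and leaves the rest of the line in place, and because without `-n` every line is printed anyway. Assuming a sed that supports `-E` (GNU and BSD sed both do), the id alone can be extracted as follows. The scratch file and the variable names are illustrative only.

```shell
# Sample input from the question, written to a scratch file (illustrative path).
input=$(mktemp)
cat > "$input" <<'EOF'
<mediaobject>
    <imageobject id="fig:deployment">
        <caption>Application deployment</caption>
        <imagedata fileref="images/deployment.png" width="90%" />
    </imageobject>
</mediaobject>
EOF

# -n  suppresses automatic printing; only lines where s/// succeeded are printed by p
# -E  enables extended regular expressions, so ( ) and + act as operators
# .* on both sides makes the replacement discard everything except the captured id
id=$(sed -n -E 's;.*id="([a-zA-Z:]+)".*;\1;p' "$input")
echo "$id"    # prints: fig:deployment
rm -f "$input"
```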


With sed:

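 sed -n '\!<mediaobject>!{
  # move to the <imageobject ...> line and keep only id="..."
  n;
  s/ *[^ ]* \(id="[^"]*"\).*/\1/;
  # save it in the hold space; next line: turn the caption text into title="..."
  h; n;
  s/ *[^>]*>\([^<]*\).*/title="\1"/;
  # append to the hold space; next line: turn fileref and width into src="..." width="..."
  H; n;
  s/ *<[^ ]* *fileref=\("[^"]*"\) *\(width="[^"]*"\).*/src=\1 \2/;
  # append, read one more line, then swap the collected pieces into the pattern space
  H; n;
  x;
  # join the pieces on one line and wrap them in <img ... />
  s/\n/ /g;
  s/^/<img /;
  s/$/ \/>/;
  p
 }' input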
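The sed script above relies on the hold space: `h` copies the current line into it, `H` appends a newline plus the current line, and `x` swaps it back into the pattern space. A minimal sketch of that mechanism in isolation, using made-up three-line input rather than the XML from the question:

```shell
# 1h    : on the first line, copy the pattern space into the hold space
# 1!H   : on every other line, append newline + pattern space to the hold space
# ${..} : on the last line, swap the hold space in, join the lines with spaces, print
joined=$(printf 'a\nb\nc\n' | sed -n '1h; 1!H; ${x; s/\n/ /g; p;}')
echo "$joined"    # prints: a b c
```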

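awk is available almost everywhere bash is installed, and it avoids some of the pitfalls you may run into with sed (for example, when the attributes in the XML are not in a consistent order).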
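That would be nice, but I don't have xsh on the server, and it cannot be installed…
Nice, but it has a small flaw in the result. How can I get rid of these spaces? Could you briefly explain the code? I have never seen awk like this, so I don't understand it…
It should work as written (so it may be a cut-and-paste error). Specifically, the regular expression on the gsub line contains matches for "^ *" and " *$", and whichever of them matches is replaced with "" (that is, deleted).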
awk '
    ## set a variable to mark that we are in a mediaobject block
    $1=="<mediaobject>" { object=1 }

    ## mark that we have exited the object block
    $1=="</mediaobject>" { object=0 }

    ## if we are in a mediaobject block and we find an imageobject
    $1=="<imageobject" && object==1 { 
        iobject=1                          ## record that we are in an imageobject
        id = substr($2, 5, length($2) - 6) ## the id value, without id=" and the trailing ">
    }

    ## if we have a line with image data
    $1~/<imagedata/ && iobject==1 {
        fileref=substr($2,9,length($2)-8)  ## the path, including the quotations
        width=$3                           ## the width attribute, e.g. width="90%"
    }

    ## if we have a caption line
    $1~/<caption>/ && iobject==1 {
        gsub("(</?caption>|^ *| *$)", "")  ## remove xml and leading/trailing whitespace
        caption=$0                         ## record the modified line as the caption
    }

    ## when we arrive at the end of an imageobject
    $1=="</imageobject>" && object==1 {
        iobject=0                                                            ## record it
        ## print the record, including the id requested in the question
        printf("<img src=%s %s id=\"%s\" title=\"%s\" />\n", fileref, width, id, caption)
    }

' input
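Regarding the stray spaces raised in the comments: the trimming is done entirely by the `gsub` call, so it can be checked on its own. The sketch below is an assumption-laden variant, not the answer's exact code: it uses a regex literal and also strips tabs, and the sample line with trailing spaces is made up for the test.

```shell
# Strip the <caption> tags plus any leading and trailing blanks or tabs from one line.
caption=$(printf '%s\n' '        <caption>Application deployment</caption>   ' |
    awk '{ gsub(/<\/?caption>|^[ \t]+|[ \t]+$/, ""); print }')
echo "[$caption]"    # prints: [Application deployment]
```

If the brackets in the output still enclose spaces, the regular expression was most likely altered when it was copied (for example "^ *" becoming "^*").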

## alternative for the fileref line, for when fileref is not always the second field:
## use match() to find the beginning of the attribute, then a nested substr()
## to pull only the value of fileref (with quotations)
fileref = substr(substr($0, match($0, /fileref="[^"]*"/), RLENGTH), 9)
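To illustrate the order-independent extraction described above, here is a small self-contained sketch. The helper function `attr` and the sample line (with `width` placed before `fileref`) are hypothetical and not part of the original answer; the only awk features assumed are `match()`, `RSTART` and `RLENGTH`.

```shell
# One imagedata line with the attributes in swapped order (hypothetical variation).
line='        <imagedata width="90%" fileref="images/deployment.png" />'

attrs=$(printf '%s\n' "$line" | awk '
    # attr(s, name): return name="value" as found in s, or "" when it is absent
    function attr(s, name) {
        if (match(s, name "=\"[^\"]*\""))
            return substr(s, RSTART, RLENGTH)
        return ""
    }
    /<imagedata/ {
        fileref = attr($0, "fileref")
        sub(/^fileref=/, "src=", fileref)   # rename the attribute for the <img> tag
        print fileref, attr($0, "width")
    }
')
echo "$attrs"    # prints: src="images/deployment.png" width="90%"
```

Because each attribute is looked up by name rather than by field position, the same code works whichever attribute comes first. A real XML parser remains the more robust choice when one is available.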