Regex: grab a block of text between two markers and, if the block contains certain markers, append the lines that follow

regex sed grep

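I am trying to grab a block of text between two markers. The start marker will be identified by a regex match, and the end marker will be static.

I did look for existing approaches to this problem, but none of them solved it, because my case has some more specific conditions. Here is an example of the text file I have: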

<...text-to-ignore...>
tag_list_index
    tag1 ...................... 51
    tag2 .............. 54
    tagn ......... 243
    <...lots-of-text-to-ignore...>
        tag1
        headerA headerB headerC
        fieldx  description ...
        fieldy  description ... (a)
        fieldw  description ... 
        fieldz  description ... (c)
        fieldt  description ... (b)
                        Máx: 234+var
        (a) - Note1
        (b) - Note2
        (c) - Note3
        <...more-text-to-ignore...>
        tag2
        headerA headerB headerC
        fielda  description ...
        fieldj  description ... (a)
                        Max: 234+var
        (a) - Note1
        <...more-text-to-ignore...>
        tagn
        headerA headerB headerC
        fieldr  description ...
        fieldg  description ... 
                    Máx: 234+var
        <...more-text-to-ignore...>

Can you help me? There is no specific requirement on which tool to use.

sed -nr '/^ +tag[0-9n]+$/,/M[áa]x: /p;:A;s/^        \([a-z]\)/&/;tB;b;:B;p;n;bA' file.txt

Output:

    tag1
    headerA headerB headerC
    fieldx  description ...
    fieldy  description ... (a)
    fieldw  description ... 
    fieldz  description ... (c)
    fieldt  description ... (b)
                    Máx: 234+var
    (a) - Note1
    (b) - Note2
    (c) - Note3
    tag2
    headerA headerB headerC
    fielda  description ...
    fieldj  description ... (a)
                    Max: 234+var
    (a) - Note1
    tagn
    headerA headerB headerC
    fieldr  description ...
    fieldg  description ... 
                Máx: 234+var

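Limitation: if a block has one or more notes, at least one other line must come between the last note and the next tag.

Although the answer was accepted because it does the job, at first it was not clear to me how sed was getting it done. I studied it more closely and laid the command out again so that it is easier to follow when reading.
I am sharing it here, with a comment on each command, in case it is useful to someone else.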
sed -nr '
 # Print every line of the block delimited by START and END.
 # Only this p is tied to the range; the commands below run on every line,
 # so the notes that follow END can still be picked up.
 /START/,/END/p
 # Label A
 :A
 # Replace a note marker such as (a), (b), (c) with itself. The line is
 # left unchanged; the substitution only sets the flag tested by t below.
 s/^ *\([a-z]\)/&/
 # t jumps to label B if a substitution was made on the current line.
 # Labels are case-sensitive, so this must be tB, not tb.
 tB
 # A branch without a label jumps to the end of the script
 b
 # Label B
 :B
 # Print the note line
 p
 # Read the next input line (with -n the current line is not printed here)
 n
 # Go back to label A
 bA
' file.txt
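
The s/.../&/ plus t pair in the script above is how sed expresses an if-test: replacing a match with itself leaves the line as it was, but sets the flag that t checks. Below is a minimal sketch of just that idiom, assuming GNU sed as used in this thread. The function name print_notes and the sample input are made up for illustration.

```shell
# Hypothetical helper: print only the lines that begin with a note marker like (a).
print_notes() {
  sed -n -E '
    # Replace the marker with itself: the text is unchanged, the flag is set
    s/^ *\([a-z]\)/&/
    # Jump to label B only if the substitution above succeeded
    tB
    # Otherwise end this cycle; with -n nothing is printed
    b
    :B
    p
  '
}

printf '%s\n' '(a) - Note1' 'fieldx  description ... (a)' '    (b) - Note2' | print_notes
```

Only the first and third input lines are printed, because the second one does not start with a marker.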

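Comments:

How does A[0-9]{3} come into it? I don't see any line that starts with spaces followed by an A.

Well, the tags actually look like that. They will be something like "A123 - description" or "A999: another description". I did not mention this up front because I was able to match those with a regex myself. Still, I will edit the question to mention it. Thanks.

Does it work?

Yes, it works like a charm! I will accept this solution. The limitation is fine, because there is always text between the notes and the next tag. Thank you very much for your help, and sorry for the delay.

@everyone: if you downvote this question, please leave a comment. A downvote without any further explanation does not help to improve it.

Thanks for the solution, Cyrus. But there is another limitation: if a tag has no notes, the block is not grabbed, because you rely on those notes being present. You can see in the example I gave that tagn has no notes associated with it.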
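
The limitation mentioned above comes from n reading the line after the last note without checking it against the start pattern. One possible way around it is to track the state explicitly. This is only a sketch: it uses awk rather than sed, and it is not the method of the accepted answer. It assumes the tag, Máx:/Max: and note patterns from the example in the question; the function name grab_blocks is hypothetical.

```shell
# Hypothetical helper: print each tag block plus any note lines that follow it.
grab_blocks() {
  awk '
    # A line holding only a tag starts a block, even directly after a note
    /^ +tag[0-9n]+$/ { in_block = 1; in_notes = 0 }
    # Inside a block, print everything up to and including the Max line
    in_block {
      print
      if ($0 ~ /(Máx|Max): /) { in_block = 0; in_notes = 1 }
      next
    }
    # After the Max line, keep printing while lines look like notes
    in_notes && /^ *\([a-z]\) / { print; next }
    # Any other line ends the run of notes
    { in_notes = 0 }
  ' "$@"
}
```

It would be called as grab_blocks file.txt, or with the text piped in on standard input. Blocks without notes, such as tagn in the example, are handled by the same rules, since the note state simply ends at the first line that is not a note.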