Regex: selectively extracting lines from blocks of lines

regex linux awk

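Please help me with this regex. I need all the components of the first Meta Mapping for each phrase.

My pattern so far is roughly Phrase: .\nMeta Mapping** and I do not know what should come after it. I only started learning regular expressions today, and I am a bit stuck. Below are the document I have and the output I want.

Main file: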

Phrase: "is"

Phrase: "normal."
Meta Mapping (1000):
 1000   % Normal (Mean Percent of Normal) [Quantitative Concept]
Meta Mapping (1000):
 1000   Normal [Qualitative Concept]
Meta Mapping (1000):
 1000   % normal (Percent normal) [Quantitative Concept]
Processing 00000000.tx.8: The EKG shows nonspecific changes.

Phrase: "The EKG"
Meta Mapping (1000):
 1000   EKG (Electrocardiogram) [Finding]
Meta Mapping (1000):
 1000   EKG (Electrocardiography) [Diagnostic Procedure]

Phrase: "shows"
Meta Mapping (1000):
 1000   Show [Intellectual Product]

Phrase: "nonspecific changes."
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changed status) [Quantitative Concept]
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changing) [Functional Concept]
Meta Mapping (901):
 694   Non-specific (Unspecified) [Qualitative Concept]
 861   changes (Changed status) [Quantitative Concept]
Meta Mapping (901):
 694   Non-specific (Unspecified) [Qualitative Concept]
 861   changes (Changing) [Functional Concept]

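I want the result to contain only one Meta Mapping per phrase: the first one, together with all of its component lines.

Please help me with this regex. Thanks, everyone!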

I think this might work for you, although it does not involve awk.

GNU awk version:

cat your_data_file | awk  '
BEGIN {
    FS="\n"
    RS="\n\n"
    OFS="\n"
}
NF > 1 {
    print $1, $2
    for (i = 3; i <= NF; i++)
        if (match($i, "Meta Mapping")) {
            print ""
            next
        }
        else
            print $i
    print ""
}
'

An annotated, POSIX-compliant awk solution:
awk -v RS='' -F'\n' -v re='^Meta Mapping \\(' '
    # Only process non-empty records:
    # those that have at least 1 "Meta Mapping" line.
  $2 ~ re { 
    print $1 # print the "Phrase: " line
    print $2 # print the 1st "Meta Mapping" line.
      # Print the remaining lines, if any, up to but not including
      # the next "Meta Mapping" line.
    for (i=3;i<=NF;++i) {
      if ($i ~ re) break # next "Meta Mapping" found; ignore and terminate block.
      print $i
    }
    print "" # print empty line between output blocks
  }
' file
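
To make the behavior concrete, here is a minimal sketch that wraps the POSIX command above in a shell function and runs it on a shortened copy of the question's data. The function name first_mapping and the variable sample are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical wrapper around the POSIX awk command shown above.
# Reads the named files, or standard input when no file is given.
first_mapping() {
  awk -v RS='' -F'\n' -v re='^Meta Mapping \\(' '
    $2 ~ re {                      # skip blocks with no "Meta Mapping" line
      print $1                     # the "Phrase: " line
      print $2                     # the first "Meta Mapping" line
      for (i = 3; i <= NF; ++i) {
        if ($i ~ re) break         # stop at the next "Meta Mapping" line
        print $i
      }
      print ""                     # blank line between output blocks
    }
  ' "$@"
}

# Shortened sample taken from the question's data.
sample='Phrase: "is"

Phrase: "shows"
Meta Mapping (1000):
 1000   Show [Intellectual Product]

Phrase: "nonspecific changes."
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changed status) [Quantitative Concept]
Meta Mapping (901):
 694   Nonspecific [Idea or Concept]
 861   changes (Changing) [Functional Concept]'

printf '%s\n' "$sample" | first_mapping
```

On this sample, the block for "is" is dropped because it has no Meta Mapping line, the block for "shows" is printed unchanged, and the block for "nonspecific changes." keeps only its first Meta Mapping with the two lines under it.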

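Comments:

Can't you solve this with a simple regular expression? Something like this?

+1 for the awk solution. Also, to be explicit: your solution requires GNU awk or mawk, because it uses a multi-character RS. That is fine on Linux. If you want your solution to be POSIX-compliant, replace RS="\n\n" with the option -v RS='', which is an awk idiom for splitting the input into records at empty lines.

I am glad someone provided you with a solution. As for why you received downvotes: you showed very little effort to solve the problem yourself, so your question effectively amounts to "please solve this for me".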
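
The paragraph-mode idiom mentioned in the comments can be seen in isolation with a small sketch. The function name paragraph_summary is a hypothetical helper introduced only for this illustration.

```shell
# Hypothetical helper: report how many lines each blank-line-separated
# record contains. RS='' selects awk "paragraph mode"; -F'\n' makes
# every line of a record its own field.
paragraph_summary() {
  awk -v RS='' -F'\n' '{ print "record " NR ": " NF " line(s), first = " $1 }' "$@"
}

# Three lines, a run of blank lines, then one more line.
printf 'a\nb\nc\n\n\n\nd\n' | paragraph_summary
# prints:
#   record 1: 3 line(s), first = a
#   record 2: 1 line(s), first = d
```

In paragraph mode a run of several blank lines counts as a single record separator, so the input above still yields exactly two records.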