Regex: how to get all used IDs in a log file


I have a log file containing multiple IDs, like this:

INFO: [Synth 8-3491] module 'IOBUF' declared at 'C:/Xilinx/Vivado/2017.3/scripts/rt/data/unisim_comp.v:22655' bound to instance 'LED_tri_iobuf_7' of component 'IOBUF' [C:/.../block_design_wrapper.vhd:342]
INFO: [Synth 8-3491] module 'IOBUF' declared at 'C:/Xilinx/Vivado/2017.3/scripts/rt/data/unisim_comp.v:22655' bound to instance 'MAX11603_I2C_scl_iobuf' of component 'IOBUF' [C:/.../block_design_wrapper.vhd:349]
INFO: [Synth 8-3491] module 'IOBUF' declared at 'C:/Xilinx/Vivado/2017.3/scripts/rt/data/unisim_comp.v:22655' bound to instance 'MAX11603_I2C_sda_iobuf' of component 'IOBUF' [C:/.../block_design_wrapper.vhd:356]
INFO: [Synth 8-3491] module 'block_design' declared at 'C:/Projects/.../block_design.vhd:2346' bound to instance 'block_design_i' of component 'block_design' [C:/.../block_design_wrapper.vhd:363]
---------------------------------------------------------------------------------
Starting RTL Elaboration : Time (s): cpu = 00:00:07 ; elapsed = 00:00:07 . Memory (MB): peak = 491.250 ; gain = 104.844
---------------------------------------------------------------------------------
INFO: [Synth 8-638] synthesizing module 'block_design' [C:/Projects/.../block_design.vhd:2469]
INFO: [Synth 8-638] synthesizing module 'block_design_axi_mem_intercon_0' [C:/Projects/.../block_design.vhd:1403]
INFO: [Synth 8-638] synthesizing module 'm00_couplers_imp_1Y96WCF' [C:/Projects/.../block_design.vhd:81]
INFO: [Synth 8-3491] module 'block_design_auto_pc_0' declared at 'C:/.../block_design_auto_pc_0_stub.vhdl:5' bound to instance 'auto_pc' of component 'block_design_auto_pc_0' [C:/Projects/.../block_design.vhd:267]
INFO: [Synth 8-638] synthesizing module 'block_design_auto_pc_0' [C:/.../block_design_auto_pc_0_stub.vhdl:71]
INFO: [Synth 8-256] done synthesizing module 'm00_couplers_imp_1Y96WCF' (2#1) [C:/Projects/.../block_design.vhd:81]
INFO: [Synth 8-638] synthesizing module 'm01_couplers_imp_1SYGEF3' [C:/Projects/.../block_design.vhd:422]
INFO: [Synth 8-256] done synthesizing module 'm01_couplers_imp_1SYGEF3' (3#1) [C:/Projects/.../block_design.vhd:422]
INFO: [Synth 8-638] synthesizing module 'm02_couplers_imp_1MNJOVZ' [C:/Projects/.../block_design.vhd:611]
Each INFO line contains a message from a specific tool (here: Synth) and an ID of the form xx-yyyy.

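How can I extract all IDs from the log file using Bash and the tools available in Git Bash? In the end, I need to split the input log file into multiple files, each containing only the messages that belong to one ID. The list of IDs must therefore be unique.

I have already written the same script in PowerShell. That approach uses regex matching and an array of IDs that is extended for every new ID.

It processes the log file in 3 steps:

  • Split the log file by category
  • Collect a unique list of message IDs
  • Split each per-category log file into separate files, one per message ID

The main question is: how do I create this unique list of message IDs (step 2)?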
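
Step 2 on its own can be sketched as a short pipeline. This is only a minimal sketch: the function name `list_ids` is made up for illustration, and it assumes every message starts with `<CATEGORY>: [<Tool> <ID>]`, as in the sample log above.

```shell
#!/bin/bash

# list_ids CATEGORY FILE
# Hypothetical helper: prints the unique message IDs (e.g. 8-3491) that occur
# in lines of the given category. Assumes lines such as
#   INFO: [Synth 8-3491] module ...
list_ids() {
  local category=$1 file=$2
  # sed -n with the "p" flag prints only the lines where the substitution
  # matched; the whole line is replaced by the captured ID.
  sed -n -E "s/^${category}: \[[A-Za-z0-9_]+ ([0-9]+-[0-9]+)\].*/\1/p" "$file" \
    | sort -u   # sort -u is equivalent to sort | uniq
}
```

Calling `list_ids INFO synth.log` (the file name is a placeholder) prints one ID per line, which can then drive the per-ID split in step 3.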


PowerShell script:

    [CmdletBinding()]
    Param(
      [Parameter(Mandatory=$True,Position=1)]
      [string]$ReportFile
    )
    
    $SynthesisLogFile = Get-Item $ReportFile
    $SynthesisLog = $SynthesisLogFile.BaseName
    
    $Categories = @("INFO", "CRITICAL WARNING", "WARNING", "ERROR")
    
    foreach ($Cat in $Categories)
    { Write-Host "Extracting category $Cat from $ReportFile ..."
      $line=0
      cat ".\$ReportFile" | %{ $line=$line+1; if ($_ -match "^$Cat") { "$line`t$_"} } > ".\$SynthesisLog.$Cat.log"
    
      $IDs = @();
      cat ".\$SynthesisLog.$Cat.log" | %{ $m = $_ -match "^\d+\t$($Cat): \[(\w+) (\d+-\d+)\]"; $mm = $Matches[2]; if ($IDs -notcontains $mm) {$IDs = $IDs + $mm } }
      foreach ($ID in $IDs)
      { Write-Host "  Extracting ID: $ID ..."
        cat ".\$SynthesisLog.$Cat.log" | Select-String "$ID" > ".\$SynthesisLog.$Cat.$ID.log"
      }
    }
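
The "regex match plus growing ID array" idea of the PowerShell script can be carried over to Bash fairly directly. The following is a sketch, not a drop-in replacement: `collect_ids` is a hypothetical name, it reads the raw log (no line-number prefix), and it assumes Bash 4 or newer because it relies on an associative array.

```shell
#!/bin/bash

# collect_ids CATEGORY FILE
# Hypothetical helper: prints each message ID of the given category once,
# in the order of first appearance. Requires Bash 4+ (associative arrays).
collect_ids() {
  local category=$1 file=$2 line id
  local -A seen=()
  # Same shape as the PowerShell pattern, minus the line-number prefix.
  local re="^${category}: \[([[:alnum:]_]+) ([0-9]+-[0-9]+)\]"

  while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
      id=${BASH_REMATCH[2]}          # second capture group: the ID
      if [[ -z ${seen[$id]:-} ]]; then
        seen[$id]=1                  # remember the ID so it is printed only once
        printf '%s\n' "$id"
      fi
    fi
  done < "$file"
}
```

Unlike a `sort | uniq` pipeline, this keeps the IDs in log order, at the price of a slower line-by-line loop in the shell.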
    
    
This is my current Bash script:

    
    #! /bin/bash
    
    LOGFILE=$(basename "$1")
    PREFIX="${LOGFILE%.*}"
    
    for CATEGORY in INFO WARNING "CRITICAL WARNING" ERROR; do
      CATEGORY_FILE="$PREFIX.$CATEGORY.log"
      echo "Extracting category '$CATEGORY' from '$LOGFILE' ..."
      grep -n "^$CATEGORY: " "$1" > "$CATEGORY_FILE"
    
      if [[ -s $CATEGORY_FILE ]]; then
        echo "  File contains data"
      else
        echo "  Deleting empty output file for category '$CATEGORY'"
        rm "$CATEGORY_FILE"
      fi
    done
    
    

This is my final Bash script, with improvements:

    
    #! /bin/bash
    
    LOGFILE=$(basename "$1")
    PREFIX="${LOGFILE%.*}"
    
    for CATEGORY in INFO WARNING "CRITICAL WARNING" ERROR; do
      CATEGORY_FILE="$PREFIX.$CATEGORY.log"
      echo "Extracting category '$CATEGORY' from '$LOGFILE' ..."
      grep -n "^$CATEGORY: " "$1" > "$CATEGORY_FILE"
    
      if [[ -s $CATEGORY_FILE ]]; then
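        # Print only the "[Tool ID]" part (\K drops the prefix), so the ID is always
        # field 2, even for the two-word category "CRITICAL WARNING".
        for ID in $(grep -oP "^\d+:$CATEGORY: \K\[\w+ \d+-\d+\]" "$CATEGORY_FILE" | awk '{print $2}' | tr -d ']' | sort | uniq); do
          ID_FILE="$PREFIX.$CATEGORY.$ID.log"
          echo "  Extracting ID: $ID ..."
          # Match the ID as a fixed string including its closing bracket.
          grep -F " $ID]" "$CATEGORY_FILE" > "$ID_FILE"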
        done
      else
        echo "  Deleting empty output file for category '$CATEGORY'"
        rm "$CATEGORY_FILE"
      fi
    done
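
The script above re-reads each category file once per ID. As an alternative sketch, a single awk pass can route every message straight to its category/ID file. The function name `split_log` and the output naming are assumptions modelled on the script above; output lines keep a `line-number:` prefix like `grep -n` produces. Because the files are opened in append mode, stale output files from an earlier run should be removed first.

```shell
#!/bin/bash

# split_log FILE PREFIX
# Hypothetical helper: writes every message line of FILE to
#   PREFIX.<CATEGORY>.<ID>.log
# prefixed with its original line number.
split_log() {
  local file=$1 prefix=$2
  awk -v prefix="$prefix" '
    # A message line starts with "<CATEGORY>: [<Tool> <ID>]".
    match($0, /^(INFO|WARNING|CRITICAL WARNING|ERROR): \[[A-Za-z0-9_]+ [0-9]+-[0-9]+\]/) {
      head = substr($0, RSTART, RLENGTH)   # e.g. "INFO: [Synth 8-638]"

      category = head
      sub(/: .*/, "", category)            # keep the text before ": "

      id = head
      sub(/.* /, "", id)                   # keep the text after the last space
      sub(/\]$/, "", id)                   # drop the closing bracket

      out = prefix "." category "." id ".log"
      print NR ":" $0 >> out               # append, so close/reopen is safe
      close(out)                           # avoid running out of file handles
    }
  ' "$file"
}
```

For example, `split_log synth.log synth` (placeholder names) would create files such as `synth.INFO.8-638.log`, and non-message lines like the separator rules are skipped.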
    
Try this:

    $ cat file.sh 
    INFO: [Synth 8-3491] module 'IOBUF' declared at 'C:/Xilinx/Vivado/2017.3/scripts/rt/data/unisim_comp.v:22655' bound to instance 'MAX11603_I2C_sda_iobuf' of component 'IOBUF' [C:/.../block_design_wrapper.vhd:356]
    INFO: [Synth 8-3491] module 'block_design' declared at 'C:/Projects/.../block_design.vhd:2346' bound to instance 'block_design_i' of component 'block_design' [C:/.../block_design_wrapper.vhd:363]
    ---------------------------------------------------------------------------------
    Starting RTL Elaboration : Time (s): cpu = 00:00:07 ; elapsed = 00:00:07 . Memory (MB): peak = 491.250 ; gain = 104.844
    ---------------------------------------------------------------------------------
    INFO: [Synth 8-638] synthesizing module 'block_design' [C:/Projects/.../block_design.vhd:2469]
    INFO: [Synth 8-638] synthesizing module 'block_design_axi_mem_intercon_0' [C:/Projects/.../block_design.vhd:1403]
    INFO: [Synth 8-638] synthesizing module 'm00_couplers_imp_1Y96WCF' [C:/Projects/.../block_design.vhd:81]
    INFO: [Synth 8-3491] module 'block_design_auto_pc_0' declared at 'C:/.../block_design_auto_pc_0_stub.vhdl:5' bound to instance 'auto_pc' of component 'block_design_auto_pc_0' [C:/Projects/.../block_design.vhd:267]
    INFO: [Synth 8-638] synthesizing module 'block_design_auto_pc_0' [C:/.../block_design_auto_pc_0_stub.vhdl:71]
    INFO: [Synth 8-256] done synthesizing module 'm00_couplers_imp_1Y96WCF' (2#1) [C:/Projects/.../block_design.vhd:81]
    
    
    $ cat file.sh | grep -P "^INFO: \[\w+ \d+-\d+\]" | awk -F' ' '{print $3}' | tr -d ']' | sort | uniq
    8-256
    8-3491
    8-638
    

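I improved your grep pattern to exclude some false positives. The final version of the script, which now uses your command chain, has been added to my original question. Thanks.

I don't think the answer should be in the question.

@GregSchmit The real answer is the one marked as the solution. But people might be interested in seeing the assembled solution. I added it to my question because self-answers are not very popular on SO.

I think you should answer your own question, but keep the current answer accepted if yours is derived from it, since it is the one you used. I have never seen any negative consequences from answering your own question. If someone frowns on it, tell them to get the mods on Meta to ban self-answers. Since they are currently allowed, that is where the answer should go. Just my 2 cents.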