Optimizing grep (or using AWK) in a shell script

shell awk grep

In my shell script, I am trying to search the same $targetfile repeatedly, using terms found in $sourcefile.

My $sourcefile is formatted as follows:

pattern1
pattern2
etc...

The inefficient loop I currently use for the search is:

for line in $(< $sourcefile);do
    fgrep $line $targetfile | fgrep "RID" >> $outputfile
done

I understand this could be improved, either by loading the whole $targetfile into memory or by using AWK.

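A sed solution:

sed 's/\(.*\)/\/\1\/p/' "$sourcefile" | sed -nf - "$targetfile"

This converts each line of $sourcefile into a sed pattern-match command:

matchstring

to

/matchstring/p

However, you need to escape special characters to make it robust.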
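The escaping mentioned above could be sketched as follows. This is only an illustration under stated assumptions: the demo file names and contents are made up, the set of escaped characters covers basic-regex metacharacters plus the `/` delimiter, and reading the script with `-f -` is assumed to be supported (it is in GNU sed). The final `grep -F 'RID'` mirrors the filter from the question's loop.

```shell
#!/bin/sh
# Hypothetical demo files, created only for this illustration.
workdir=$(mktemp -d)
sourcefile=$workdir/source.txt
targetfile=$workdir/target.txt
outputfile=$workdir/output.txt

printf '%s\n' 'a.c' 'x/y' > "$sourcefile"
printf '%s\n' 'RID 1 a.c' 'RID 2 abc' 'RID 3 x/y' 'other a.c' > "$targetfile"

# First sed: drop empty lines (they would become //p, which reuses the
# last regex), backslash-escape ] [ \ / . * ^ $ so each line is matched
# literally, then wrap the line as /line/p.
# Second sed: run the generated script against the target file.
sed -e '/^$/d' -e 's|[][\/.*^$]|\\&|g' -e 's|.*|/&/p|' "$sourcefile" |
    sed -n -f - "$targetfile" |
    grep -F 'RID' > "$outputfile"

cat "$outputfile"
```

Without the escaping step, the pattern `a.c` would also match the line containing `abc`, because `.` matches any character in a regular expression. Like the original loop, this prints a target line once for every pattern it contains.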

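Use awk to read the source file, then search within the targetfile (untested):

nawk '
    NR == FNR {patterns[$0]++; next}
    /RID/ {
        for (pattern in patterns) {
            # since fgrep considers patterns as strings not regular expressions,
            # use string lookup and not pattern matching ("~" operator).
            if (index($0, pattern) > 0) {
                print
                break
            }
        }
    }
' "$sourcefile" "$targetfile" > "$outputfile"

This will also work with gawk.

Am I missing something, or why not just

fgrep -f "$sourcefile" "$targetfile"

?

Couldn't you just join the lines of sourcefile and egrep for (pattern1|pattern2...)?

Good idea... it would need to scale to 4000 alternatives... and the pattern would vary with the number of lines in the source file.

Thanks! Trying this now. It already looks faster than using grep, although the source file has about 4000 lines and I am searching a 300 MB target file, so I expect it will still take a while. Let's see what happens.

Thanks for the suggestion, I'll try that too.

Very fast, but the suggested fgrep -f fits my needs better.

Wow! This is faster than the other two. The results look good too. I mean, lightning fast. Amazing!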
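For reference, the fgrep -f suggestion can be combined with the "RID" filter from the question's loop in a single pipeline. The sketch below uses made-up demo files and writes `grep -F` in place of `fgrep` (the two are equivalent; `grep -F` is the POSIX spelling). It is an illustration, not a drop-in replacement for the asker's script.

```shell
#!/bin/sh
# Hypothetical demo files, created only for this illustration.
workdir=$(mktemp -d)
sourcefile=$workdir/source.txt
targetfile=$workdir/target.txt
outputfile=$workdir/output.txt

printf '%s\n' 'pattern1' 'pattern2' > "$sourcefile"
printf '%s\n' 'RID 10 pattern1' 'no-id pattern2' 'RID 11 nothing' \
    'RID 12 pattern2 pattern1' > "$targetfile"

# One pass over the target file: keep lines containing any of the fixed
# strings listed in $sourcefile, then keep only those that mention RID.
grep -F -f "$sourcefile" "$targetfile" | grep -F 'RID' > "$outputfile"

cat "$outputfile"
```

One behavioral difference from the loop is worth knowing: the loop appends a target line once per pattern it contains and groups output by pattern, whereas this pipeline prints each matching line once, in target-file order.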