How to use grep without changing the output order

I have two files (LIST.txt and FILE1.txt). I am trying to use grep in a script to get output in the same order as LIST.txt.

LIST.txt

rs201196551
rs8071824
rs74620303

FILE1.txt

rs201196551 red
rs74620303 blue
rs9000000 pink
rs8071824 purple

I used the following command:

grep -wFf LIST.txt FILE1.txt > OUTPUT.txt

I got this output (lines appear in the order of FILE1.txt):

rs201196551 red
rs74620303 blue
rs8071824 purple

But what I actually expected was:

rs201196551 red
rs8071824 purple
rs74620303 blue

(the same order as in LIST.txt).

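I don't think you can change the order of grep's output without other tools. However, here is an awk script that buffers the output in the order of the list file: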

$ awk '
NR==FNR {                                            # process list file
    a[$0]=++c                                        # map each list line to its position
    next                                             # process next list item
}
{                                                    # process file1
    for(i in a)                                      # for each list item
        if($1==i) {                                  # see if it is the first word
            b[a[i]]=b[a[i]] (b[a[i]]==""?"":ORS) $0  # store to output buffer
            next                                     # no more candidates after match
        }
}
END {                                                # in the end
    for(i=1;i<=c;i++)                                # start outputting
        if(b[i]!="")                                 # skip empties
            print b[i]
}' list file1
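
If staying with grep matters more than speed, one alternative (not the approach above, just a rough sketch) is a shell loop that runs grep once per line of the list, so the matches come out in list order. It assumes FILE1.txt is small enough that rescanning it for every list entry is acceptable. The temporary directory and the sample data below are made up for illustration and mirror the files in the question.

```shell
# Hypothetical sample data mirroring the question's files.
tmpdir=$(mktemp -d)
printf '%s\n' rs201196551 rs8071824 rs74620303 > "$tmpdir/LIST.txt"
printf '%s\n' 'rs201196551 red' 'rs74620303 blue' 'rs9000000 pink' 'rs8071824 purple' > "$tmpdir/FILE1.txt"

# Run grep once per list entry; output order follows LIST.txt.
while IFS= read -r id; do
    [ -n "$id" ] || continue                          # skip empty list lines
    grep -wF -- "$id" "$tmpdir/FILE1.txt" || true     # a missing id is not an error
done < "$tmpdir/LIST.txt" > "$tmpdir/OUTPUT.txt"

cat "$tmpdir/OUTPUT.txt"
```

This reads FILE1.txt once for every entry in the list, so for large inputs the awk versions are likely the better choice.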

Update: from the comments, thanks to @Sundeep:

$ awk '
NR==FNR {         # let's hash the haystack instead, i.e. file1
    a[$1]=$0
    next
} 
($0 in a) {       # now read the needles from the list and lookup from a
    print a[$0]
}' file1 list

Output:

rs201196551 red
rs8071824 purple
rs74620303 blue

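However, if file1 contains several entries with the same $1, all but one of them will be lost (because of a[$1]=$0). Only the last such entry in the file is kept.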

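Why not reverse the order of the input files and use awk 'NR==FNR{a[$1]=$0; next} ($0 in a){print a[$0]}'?

@Sundeep Yes, that works too. The haystack is usually bigger than the needles, so I wanted to hash the needles (although buffering the output creates a similar condition :).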