How do I extract lines in a specified order using sed?


linux bash unix sed


I have a file that is about 50,000 lines long, and I need to retrieve specific lines from it. I tried the following command:

sed -n 'Np;Np;Np' inputFile.txt > outputFile.txt

(where each 'N' is the number of a line I want to extract)

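This works, but the command extracts the lines in file order (that is, it reorders my selection). For example, if I try:

sed -n '200p;33p;40000p' inputFile.txt > outputFile.txt

I get a text file whose lines are in the order 33, 200, 40000 (which doesn't work for me). Is there a way to keep the order in which the lines appear in the command?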

Can you use other bash commands as well? If so, this works:

for i in 200 33 40000; do 
    sed -n "${i}p" inputFile.txt
done > outputFile.txt

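This is probably slower than doing everything in a single sed call, because the file is read once per line number, but it is more practical.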
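The loop can be wrapped in a small function so the line numbers are passed as arguments. This is only a sketch: the name `extract_in_order` is made up for illustration, and `{p;q;}` is used so that each sed call stops reading as soon as it has printed its line.

```shell
# Demo input matching the thread: 50000 numbered lines.
seq 50000 > inputFile.txt

# extract_in_order FILE N...   (hypothetical helper name)
# Prints the requested lines of FILE in the order the numbers are given.
extract_in_order() {
    file=$1
    shift
    for n in "$@"; do
        # Print line n, then quit so sed never reads past it.
        sed -n "${n}{p;q;}" "$file"
    done
}

extract_in_order inputFile.txt 200 33 40000 > outputFile.txt
cat outputFile.txt   # prints 200, 33 and 40000, one per line
```

The file is still opened once per requested line, so this suits a handful of line numbers better than thousands.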

With perl, save the input lines in a hash variable, using the line number as the key:

$ seq 12 20 | perl -nle '
@l = (5,2,3,1);
$a{$.} = $_ if( grep { $_ == $. } @l );
END { print $a{$_} foreach @l } '
16
13
14
12

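`$.` is the current line number. `grep { $_ == $. } @l` checks whether that line number is present in the array `@l`, which holds the wanted line numbers in the desired output order.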

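As a one-liner, with `@l` declared in `BEGIN` to avoid initializing it on every iteration, and with a check so that no empty line is printed when a line number is out of range: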

$ seq 50000 > inputFile.txt
$ perl -nle 'BEGIN{@l=(200,33,40000)} $a{$.}=$_ if(grep {$_ == $.} @l); END { $a{$_} and print $a{$_} foreach (@l) }' inputFile.txt > outputFile.txt
$ cat outputFile.txt
200
33
40000

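For small enough input, you can save the lines in an array and print the required indices. Note the adjustment made because array indices start at 0: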
$ seq 50000 | perl -e '$l[0]=0; push @l,<>; print @l[200,33,40000]'
200
33
40000



Performance comparison, with the input file created by seq 50000 > inputFile.txt:

$ time perl -nle 'BEGIN{@l=(200,33,40000)} $a{$.}=$_ if(grep {$_ == $.} @l); END { $a{$_} and print $a{$_} foreach (@l) }' inputFile.txt > outputFile.txt

real    0m0.044s
user    0m0.036s
sys 0m0.000s

$ time awk -v line_order="200 33 40000" '
    BEGIN {
        n = split(line_order, inorder)
        for (i=1; i<=n; i++) linenums[inorder[i]]
    }
    NR in linenums {cache[NR]=$0}
    END {for (i=1; i<=n; i++) print cache[inorder[i]]}
' inputFile.txt > outputFile.txt

real    0m0.019s
user    0m0.016s
sys 0m0.000s

$ time for i in 200 33 40000; do sed -n "${i}{p;q}" inputFile.txt ; done > outputFile.txt

real    0m0.011s
user    0m0.004s
sys 0m0.000s

$ time sed -n '33h; 200{p; g; p}; 40000p' inputFile.txt > outputFile.txt

real    0m0.009s
user    0m0.008s
sys 0m0.000s

$ time for i in 200 33 40000; do head -"${i}" inputFile.txt | tail -1 ; done > outputFile.txt

real    0m0.007s
user    0m0.000s
sys 0m0.000s

You have to hold on to line 33 until you have seen line 200:
sed -n '33h; 200{p; g; p}; 40000p' file
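To see what the script does, here is the same shape on a 10-line input, with lines 3, 5 and 8 standing in for lines 33, 200 and 40000. The file name `hold_demo.txt` is just a placeholder, and the extra `;` before `}` is there only so the script is also accepted by sed implementations that require it.

```shell
# 3h       : when line 3 arrives, copy it into the hold space, print nothing
# 5{p;g;p} : when line 5 arrives, print it, fetch the held line 3, print that
# 8p       : when line 8 arrives, print it
seq 10 | sed -n '3h; 5{p; g; p;}; 8p' > hold_demo.txt

cat hold_demo.txt   # prints 5, 3 and 8, one per line
```

The script has to be written for one particular ordering: every line that should be printed after a later line needs its own hold step.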

See the sed manual for a detailed explanation.

awk might be more readable:
awk '
    NR == 33    {line33 = $0} 
    NR == 200   {print; print line33} 
    NR == 40000 {print}
' file 

If you want to print an arbitrary number of lines in a specific order, you can generalize this as follows:
awk -v line_order="11 3 5 1" '
    BEGIN {
        n = split(line_order, inorder)
        for (i=1; i<=n; i++) linenums[inorder[i]]
    }
    NR in linenums {cache[NR]=$0}
    END {for (i=1; i<=n; i++) print cache[inorder[i]]}
' file
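As written, a requested number past the end of the file produces an empty line, because `cache` has no entry for it. A minimal sketch of one way to skip such numbers instead, assuming that is the behaviour you want, is to guard the `print` with an `in` test. The input here is a made-up 20-line sample and `awk_demo.txt` is a placeholder name.

```shell
# Demo input: 20 lines containing 101..120; line 99 does not exist.
seq 101 120 | awk -v line_order="11 3 5 1 99" '
    BEGIN {
        n = split(line_order, inorder)
        # Referencing an element creates it, so this builds a lookup set.
        for (i = 1; i <= n; i++) linenums[inorder[i]]
    }
    NR in linenums { cache[NR] = $0 }
    END {
        for (i = 1; i <= n; i++)
            # Only print lines that were actually seen in the input.
            if (inorder[i] in cache) print cache[inorder[i]]
    }
' > awk_demo.txt

cat awk_demo.txt   # prints 111, 103, 105 and 101, one per line
```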

I tested the sed and the first awk solutions, but line 33 was printed as an empty line. The last awk solution works fine.

If you are going to parse the file multiple times, at least quit after printing the wanted line: sed -n "${i}{p;q}"