sed: insert a line at a single match in a file (not on every line)

regex bash unix sed

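After more than eight hours of searching, I concede defeat and am asking a new question about this. The operation is simple, but I am having the hardest time getting it to work, even after going through all the other solutions out there. I need two things:

1.) Insert a line before the first line that matches PBS in the entire file. This should happen only once in the whole file. For some reason, every solution I have tried ends up repeating the insertion at every occurrence in the file; I suspect this is because sed works on a per-line basis.

So this needs to happen. The original file: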

stuff here  
stuff here  
PBS -N  
PBS -V  
stuff here

becomes:

stuff here  
stuff here  
**inserted line**  
PBS -N  
PBS -V  
stuff here

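2.) Append a line after the last line that matches "PBS" in the entire file. Same as before: this should happen only once in the whole file.

So this needs to happen: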

stuff here  
stuff here  
PBS -N  
PBS -V  
stuff here

becomes:

stuff here  
stuff here  
PBS -N  
PBS -V  
**inserted line**  
stuff here

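All the solutions I have seen online (I have about 20 tabs open at this point) suggest that this should be relatively easy. I am not ashamed to declare that sed is wrecking my self-esteem... Thanks to anyone who can help.

Here are three approaches, two using sed and one using awk.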

Using sed alone

To insert once before the first occurrence:

$ sed ':a;$!{N;ba}; s/PBS/inserted line\nPBS/' file
stuff here
stuff here
inserted line
PBS -N
PBS -V
stuff here

To insert once after the last occurrence:

$ tac file | sed ':a;$!{N;ba}; s/PBS/inserted line\nPBS/' | tac
stuff here
stuff here
PBS -N
PBS -V
inserted line
stuff here

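How it works

:a;$!{N;ba}
This reads the whole file into the pattern space at once. (If the file is very large, consider one of the other approaches.)

s/PBS/inserted line\nPBS/
This performs the substitution. Without the g flag, s replaces only the first match, and because the pattern space now holds the entire file, that match is the first PBS in the file.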

tac
Normally, there is no way to know which occurrence of PBS is the last one until we have read the whole file. tac, however, reverses the order of the lines, so the last match becomes the first.
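The sed commands above replace the text PBS itself, which works when PBS starts the matching line. As a hedged variation (not part of the answer), the sketch below matches from the start of the line that contains PBS, so lines such as `#PBS -N` would be handled too. It assumes GNU sed (for `\n` inside a bracket expression and in the replacement) and `tac`; the function names are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; they assume GNU sed and tac are available.

# Insert "inserted line" before the first line containing PBS.
insert_before_first() {
  # :a;$!{N;ba} slurps the file; [^\n]*PBS then matches from the start
  # of the first PBS line, and & puts that text back after the new line.
  sed ':a;$!{N;ba}; s/[^\n]*PBS/inserted line\n&/' "$1"
}

# Insert "inserted line" after the last line containing PBS.
insert_after_last() {
  # Reverse the lines, insert before the first match, reverse again.
  tac "$1" | sed ':a;$!{N;ba}; s/[^\n]*PBS/inserted line\n&/' | tac
}
```

Both helpers still hold the whole file in memory, so the same size caveat applies.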

Using awk

The key advantage of awk is that it makes variables easy to use. Here, we create a flag f, which is set to true once we reach the first occurrence of PBS:

$ awk '/PBS/ && !f {print "inserted line"; f=1} 1'  file
stuff here
stuff here
inserted line
PBS -N
PBS -V
stuff here

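To insert after the last occurrence, we could use the tac solution as above. For variety, this approach reads the file twice. On the first pass, it records the line number of the last PBS. On the second pass, it prints what needs to be printed: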

$ awk 'NR==FNR{if (/PBS/)n=FNR;next} 1{print} n==FNR {print "inserted line"}'  file file
stuff here
stuff here
PBS -N
PBS -V
inserted line
stuff here

These awk solutions process the file one line at a time. If the file is very large, this helps limit memory use.
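Because awk makes variables easy to use, the pattern and the inserted text can also be passed in with -v instead of being hard-coded. The following is a minimal sketch along those lines, not something from the answer: the function name insert_around and the variable names pat, before and after are made up here.

```shell
#!/bin/sh
# insert_around FILE PATTERN BEFORE AFTER   (hypothetical helper)
# Two passes over FILE: pass 1 records the first and last matching line
# numbers, pass 2 prints the file with the extra lines added.
insert_around() {
  awk -v pat="$2" -v before="$3" -v after="$4" '
    NR == FNR { if ($0 ~ pat) { if (!first) first = FNR; last = FNR }; next }
    FNR == first { print before }   # once, before the first match
    { print }                       # every line of the file
    FNR == last  { print after }    # once, after the last match
  ' "$1" "$1"
}
```

For example, `insert_around file PBS 'inserted line before' 'inserted line after'` handles both insertions in one call. Note that awk interprets backslash escapes in values given with -v, so text containing backslashes would need extra care.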

Using grep and sed

Another approach is to use grep to tell us the line number that we need to act on. This inserts before the first occurrence:

$ sed "$(grep -n PBS file | cut -d: -f1 | head -n1)"' s/PBS/inserted line\nPBS/' file
stuff here
stuff here
inserted line
PBS -N
PBS -V
stuff here

This inserts after the last occurrence:

$ sed  "$(grep -n PBS file | cut -d: -f1 | tail -n1)"' s/.*PBS.*/&\ninserted line/' file
stuff here
stuff here
PBS -N
PBS -V
inserted line
stuff here

This approach does not require reading the whole file into memory at once.
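Once grep has supplied the line numbers, sed's i (insert) and a (append) commands can add the new lines without rewriting the matched line at all. The sketch below is one possible way to combine both insertions; the function name insert_by_lineno and the inserted texts are placeholders, and the no-match branch is an assumption about the desired behaviour.

```shell
#!/bin/sh
# insert_by_lineno FILE   (hypothetical helper)
insert_by_lineno() {
  # Line numbers of the first and last lines that contain PBS.
  first=$(grep -n 'PBS' "$1" | head -n1 | cut -d: -f1)
  last=$(grep -n 'PBS' "$1" | tail -n1 | cut -d: -f1)
  if [ -z "$first" ]; then
    cat "$1"          # no match: pass the file through unchanged
    return
  fi
  # "Ni\" inserts text before line N, "Na\" appends text after line N.
  sed -e "${first}i\\
inserted before" -e "${last}a\\
inserted after" "$1"
}
```

If the file contains a single PBS line, both numbers are the same and that line ends up wrapped by the two inserted lines.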

@John1924答案是好的。在这种情况下，您也可以不以有效的方式执行任务，例如：

仅打印第一个PBS之前的行
回音
仅打印第一个PBS（包括）之后的行

例如，在

/pbsfile

line 1
line 2
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
line 4
line 5

the steps above can be done like this:

pbsfile="./pbsfile"

(
#delete the lines after the 1st PBS
#so remains only the lines before the 1st PBS
sed  '/PBS/,$d' "$pbsfile"

#echo the needed line
echo "THIS SOULD BE INSERTED BEFORE 1st PBS"

#print only the lines after the 1st PBS
sed -n '/PBS/,$p' "$pbsfile"

)

which produces:

line 1
line 2
THIS SOULD BE INSERTED BEFORE 1st PBS
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
line 4
line 5

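As above, you can do the same for the last PBS by simply reversing the file before and after sed (tail -r is the BSD/macOS way to reverse lines; on GNU systems use tac instead), e.g.: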

pbsfile="./pbsfile"

(
tail -r "$pbsfile" | sed -n '/PBS/,$p' | tail -r
echo "THIS SOULD BE INSERTED AFTER THE LAST PBS"
tail -r "$pbsfile" | sed  '/PBS/,$d' | tail -r
)

which produces:

line 1
line 2
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
THIS SOULD BE INSERTED AFTER THE LAST PBS
line 4
line 5

Again, this is offered only as an "alternative solution" (not an efficient one).
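Since tail -r and tac are not available on every system, a small wrapper can pick whichever one exists. This is only a sketch: reverse_lines and append_after_last are hypothetical names, and the second function simply restates the reversed-file approach above in terms of the wrapper.

```shell
#!/bin/sh
# reverse_lines [FILE...]   (hypothetical helper)
# Print lines in reverse order using whichever tool is installed.
reverse_lines() {
  if command -v tac >/dev/null 2>&1; then
    tac "$@"        # GNU coreutils
  else
    tail -r "$@"    # BSD / macOS
  fi
}

# Insert a line after the last PBS line, following the approach above.
append_after_last() {
  # Lines from the top of the file through the last PBS line.
  reverse_lines "$1" | sed -n '/PBS/,$p' | reverse_lines
  echo "inserted line"
  # Lines that follow the last PBS line.
  reverse_lines "$1" | sed '/PBS/,$d' | reverse_lines
}
```

If the file contains no PBS at all, this version prints the inserted line first and then the whole file, so a real script would probably want to check for a match beforehand.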

Another sed approach. For the first match:

sed '/PBS/ {
  # insert the new line
  i\
inserted line
  # then loop over the rest of the file, implicitly printing each line
  :a; n; ba
}' file
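With GNU sed specifically, the same "first match only" effect can also be had from the 0,/regex/ address range, which ends at the first matching line even when that line is line 1. The sketch below is an alternative to the loop above, not the answerer's method; the function name is invented, and the 0 address is a GNU extension that other sed implementations may reject.

```shell
#!/bin/sh
# insert_first_gnu FILE   (hypothetical helper; needs GNU sed)
insert_first_gnu() {
  # 0,/PBS/ covers everything up to and including the first PBS line.
  # Inside that range, /PBS/ can only match that one line, so the
  # i\ command fires exactly once.
  sed '0,/PBS/{
/PBS/i\
inserted line
}' "$1"
}
```

Unlike the slurping variants, this processes the file line by line, so memory use stays small.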

For the last match, this version does not require tac:

sed '
  # read the whole file into pattern space
  :a; $!{N;ba}
  # then, use greedy matching (.*) to get to the *last* PBS
  # and [^\n]* (which cannot cross a newline) to reach the end of that line.
  s/.*PBS[^\n]*/&\ninserted line/
' file

sed is the wrong tool for this kind of job; it is meant for simple substitutions on individual lines. Just use awk:

$ cat tst.awk
NR  == FNR { if (/PBS/) hits[++numHits] = NR; next }
FNR == hits[1] { print "inserted line before" }
{ print }
FNR == hits[numHits] { print "inserted line after" }

$ awk -f tst.awk file file
stuff here
stuff here
inserted line before
PBS -N
PBS -V
inserted line after
stuff here

Here is an awk that reads the file only once:

cat file
line 1
line 2
PBS -N first
PBS -N second
line 3
PBS -V last-1
PBS -V last
line 4
line 5

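awk '
  /PBS/ { last = NR; if (!f) { first = NR; f = 1 } }
  { a[NR] = $0 }
  # The source is cut off after "for (i=1;i"; the rest of this loop is a reconstruction.
  END { for (i = 1; i <= NR; i++) { if (i == first) print "inserted line"; print a[i]; if (i == last) print "inserted line" } }
' file

To do this properly with sed, you have to get around its line-by-line operation and then recreate the line anchoring with raw regular expressions. It is not hard, just a little fiddly: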
sed -E 'H;$!d;g
        s/\n[^\n]*PBS/\ninsert before first PBS-containing line&/
        s/.*PBS[^\n]*/&\ninsert after last PBS-containing line/;
        s/.//
'

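H;$!d;g slurps the entire file into the hold buffer with a leading newline: H appends the current line to the hold buffer, preceded by a \n; $!d deletes the pattern space if this is not the last line; g (and everything after it) runs only on the last line and retrieves the hold buffer.

So s/\n[^\n]*PBS will find the newline before the first PBS-containing line, because there is now always a newline before every line; s/.*PBS[^\n]*/ will find the last PBS and everything up to the next newline; and s/.// will strip the artificial newline that we stuck in to make the first-occurrence search work.

Note that you can do the first insert for an arbitrary n just by adding n as a flag to the search, e.g. s/\n[^\n]*PBS/\netc&/4 for the fourth matching line.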
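To make the numeric-flag idea concrete, here is a small sketch that inserts a line before the N-th PBS-containing line. The wrapper name insert_before_nth and the inserted text are made up, GNU sed is assumed (for `\n` in the bracket expression and the replacement), and N is spliced into the sed script without validation, so it should only ever be a plain positive integer.

```shell
#!/bin/sh
# insert_before_nth FILE N   (hypothetical helper; assumes GNU sed)
insert_before_nth() {
  # H;$!d;g collects the whole file with a leading newline.
  # The numeric flag on s selects the N-th PBS-containing line.
  # s/.// then removes the artificial leading newline.
  sed -E 'H;$!d;g
    s/\n[^\n]*PBS/\ninserted line&/'"$2"'
    s/.//' "$1"
}
```

Because [^\n]*PBS is greedy within a line, a line that mentions PBS more than once still counts as a single match.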
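I feel your pain. Single-character commands... not obvious. I really only use sed for simple search-and-replace, or for printing/deleting specific lines. For anything complicated, I use another, more readable language, or come here and wh*re for rep ;)

Thanks to everyone who answered, but I ended up using the grep+sed solution. Very elegant solution, thanks John.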