Bash: splitting a file on multiple newlines


Suppose you have the following input file:

Some text. It may contain line
breaks.

Some other part of the text

Yet an other part of
the text

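You want to iterate over each part of the text (parts are separated by two newline characters, \n\n), so that in the first iteration I would only get: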

Some text. It may contain line
breaks.

In the second iteration I would get:

Some other part of the text

And in the last iteration I would get:

Yet an other part of
the text

I tried this, but it doesn't seem to work, because IFS only supports single-character delimiters:

cat $inputfile | while IFS=$'\n\n' read part; do
  # do something with $part
done

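Use awk with an empty RS. An empty record separator puts awk in paragraph mode, where records are separated by one or more blank lines: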

awk '{print NR ":", $0}' RS= file
1: Some Text. It may contains line
breaks.
2: Some Other Part of the Text
3: Yet an other Part of
the Text

You can now clearly see that your input file has 3 records (each one is printed with its record number in the output).
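As a rough illustration of working with each record, the sketch below also sets FS to a newline, so that every line of a part becomes one field. The function name summarise_parts is made up for this example and is not part of the original answer.

```shell
#!/bin/bash

# Hypothetical helper: print one summary line per blank-line-separated part.
# $1 is the path of the input file.
summarise_parts() {
    # RS= selects paragraph mode; FS='\n' makes every line one field,
    # so NF is the number of lines in the part and $1 is its first line.
    awk -v RS= -v FS='\n' '{
        printf "part %d: %d line(s), first line: %s\n", NR, NF, $1
    }' "$1"
}
```

Calling summarise_parts "$inputfile" should print one line per part. Inside the awk program, $0 still holds the whole part, including its embedded newlines.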


Here is anubhava's solution in pure bash:

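#!/bin/bash

# Number each blank-line-separated part read from standard input.
COUNT=1; echo -n "$COUNT: "
# IFS= and -r keep leading whitespace and backslashes in each line intact.
while IFS= read -r LINE
do
    # Non-empty line: print it. Empty line: start the next numbered part.
    [ "$LINE" ] && echo "$LINE" || { (( ++COUNT )); echo -n "$COUNT: " ;}
done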
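The loop above prints lines as they arrive. To get each whole part into a single variable, as the question asks, the lines can be accumulated until an empty line is seen. The following is only a sketch under that assumption; each_part and handle_part are hypothetical names, and handle_part stands in for whatever should be done with a part.

```shell
#!/bin/bash

# Hypothetical placeholder: do something with one complete part ($1).
handle_part() {
    printf '[%s]\n' "$1"
}

# Read standard input line by line and collect the lines of one part
# in $part; an empty line (or end of input) closes the current part.
each_part() {
    local line part= nl=$'\n'
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -n "$line" ]; then
            # Append the line, with a newline between lines of one part.
            part+="${part:+$nl}$line"
        elif [ -n "$part" ]; then
            handle_part "$part"
            part=
        fi
    done
    # Flush the last part when the input does not end with an empty line.
    if [ -n "$part" ]; then
        handle_part "$part"
    fi
}
```

It can be run as each_part < "$inputfile". Because the part is passed as a quoted shell variable, quotes and other special characters in the text need no escaping, and several consecutive empty lines are treated as one separator.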


How do I iterate over it with a while or for loop, as shown in the question?

With awk you don't need to write the loop yourself, because awk processes the input record by record. You can do whatever you need with each record (available as $0), and awk takes care of the iteration.

Oops, didn't see your answer... posted a dup... :( +1 and deleted mine.

@Kent: Thanks a lot. It was the same as your answer, I just happened to post it a few minutes earlier.

I don't know awk very well. I need to run another shell script for each record. Can that be done with awk?

I ended up using a variant (without the counting), because I ran into escaping problems with the awk solution. My text parts contain a lot of quote characters, which would have to be escaped before being passed to the other script through a system() call.
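One possible way around the escaping problem, not taken from the thread, is to avoid system() and let awk write each record to the other script's standard input through an output pipe. The record text then never appears on a command line. The sketch below assumes the other script reads its input from stdin; run_per_part is a hypothetical name.

```shell
#!/bin/bash

# Hypothetical helper: run a command once per part, feeding the part on stdin.
# $1 is the input file, $2 the command line to run. The command string is
# handed to a shell by awk, so it should come from a trusted source, and
# awk -v interprets backslash escapes in it.
run_per_part() {
    awk -v RS= -v cmd="$2" '{
        print | cmd     # send the record to the command through a pipe
        close(cmd)      # wait for it, so the next record starts a new run
    }' "$1"
}
```

For example, run_per_part "$inputfile" ./process.sh would start the (hypothetical) script ./process.sh once for every part, with that part as its input.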