Regex: Trying to delete repetitions in a paragraph using Bash


regex linux bash shell


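Hello, I am writing a simple Bash script that deletes consecutive repeated occurrences of any word in a paragraph and redirects the output to stdout. Here is the progress I have made so far.

file1

Something is missing, because the repeated occurrences are not deleted correctly.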

You can use either of the following:

sed 's/\b\([a-z]\+\)\s\1\b/\1/g' file > file1
sed 's/\b\([a-z]\+\)[[:space:]]\1\b/\1/g' file > file1
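As a rough illustration of how the `\s` variant behaves, the sketch below runs it on the sample paragraph from the question. It assumes GNU sed, since `\b`, `\s` and `\+` are GNU extensions. The variable names `paragraph` and `dedup_output` are made up for this example, and the text is held in a variable instead of a file.

```shell
#!/usr/bin/env bash
# Assumption: GNU sed (\b, \s and \+ are GNU extensions to basic regular expressions).

# Sample paragraph from the question, stored in a variable instead of file1.
paragraph='double double toil and trouble
fire burn and cauldron bubble bubble
tomorrow and tomorrow and tomorrow
creeps in this this petty pace from day toto day'

# Collapse a lowercase word that is repeated right after one whitespace character.
# With no output redirection, sed writes its result to stdout.
dedup_output=$(printf '%s\n' "$paragraph" | sed 's/\b\([a-z]\+\)\s\1\b/\1/g')

printf '%s\n' "$dedup_output"
```

Here "tomorrow and tomorrow and tomorrow" and "toto" should be left untouched, because neither is a whole word followed by whitespace and the same whole word.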

The regex matches:

- `\b` - a word boundary
- `\([a-z]\+\)` - Group 1: one or more lowercase letters
- `[[:space:]]` / `\s` - a whitespace character
- `\1` - the same value as captured in Group 1
- `\b` - a word boundary
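To see what each part of the pattern contributes, the following sketch (assuming GNU sed; the inputs and variable names are invented for illustration) removes one piece at a time and compares the results.

```shell
#!/usr/bin/env bash
# Assumption: GNU sed. All inputs below are made-up test strings.

# Trailing \b: without it, "the then" is treated as "the" + "the" + "n".
with_trailing=$(printf '%s\n' 'the then' | sed 's/\b\([a-z]\+\)\s\1\b/\1/g')
without_trailing=$(printf '%s\n' 'the then' | sed 's/\b\([a-z]\+\)\s\1/\1/g')

# Leading \b: without it, the match may start in the middle of "bathe".
with_leading=$(printf '%s\n' 'bathe the' | sed 's/\b\([a-z]\+\)\s\1\b/\1/g')
without_leading=$(printf '%s\n' 'bathe the' | sed 's/\([a-z]\+\)\s\1\b/\1/g')

# Whitespace between the two copies: "toto" has none, so it is not a repeated word.
no_space=$(printf '%s\n' 'day toto day' | sed 's/\b\([a-z]\+\)\s\1\b/\1/g')

printf '%s\n' "$with_trailing" "$without_trailing" \
              "$with_leading" "$without_leading" "$no_space"
```

The versions with both boundaries should print their input unchanged, while the two weakened patterns corrupt it ("then" and "bathe").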

Just use

sed 's/\b\([a-z]\+\) \1/\1/g' file > file2
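One thing a single `s///g` pass does not cover is a word repeated three or more times: `g` never rescans text it has already replaced, so one duplicate is left behind. A possible workaround, sketched here under the assumption of GNU sed, is to loop with a label and the `t` command until no substitution happens. The input string and variable names are illustrative, and the pattern is the boundary-anchored one from the other answer, not the exact command above.

```shell
#!/usr/bin/env bash
# Assumption: GNU sed. The input string is a made-up example.

triple='no no no way'

# One pass with the g flag: only the first pair is collapsed.
single_pass=$(printf '%s\n' "$triple" | sed 's/\b\([a-z]\+\)\s\1\b/\1/g')

# Loop: ":a" defines a label, "ta" jumps back to it whenever the
# preceding substitution changed the line, so runs shrink one word at a time.
looped=$(printf '%s\n' "$triple" |
  sed -e ':a' -e 's/\b\([a-z]\+\)\s\1\b/\1/' -e 'ta')

printf '%s\n' "$single_pass" "$looped"
```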

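The problem is that "tomorrow" is deleted in the third line, which is incorrect because "and" follows "tomorrow". In the fourth line "toto" is altered, although it is not separated by a space and therefore should not be treated as a consecutive repetition.

Please read the tag descriptions before applying them. Half of them are misplaced.

@Abelisto Yes, GNU sed supports a flag that makes the match case-insensitive.

This works perfectly, thanks for the explanation. How can I output the modified text to stdout instead of a file?

I found it: /dev/stdout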
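The two points raised in these comments can be sketched together. GNU sed accepts an `I` modifier on the `s` command for case-insensitive matching, and sed already writes to stdout when its output is not redirected, so no explicit `/dev/stdout` is needed. The input string and variable names below are assumptions made for the example, and `LC_ALL=C` is set so that `[a-z]` means exactly the ASCII lowercase letters.

```shell
#!/usr/bin/env bash
# Assumption: GNU sed (the I modifier is a GNU extension).
export LC_ALL=C   # make [a-z] cover only ASCII lowercase letters

mixed_case='Double double toil and trouble'

# Without I, "Double" does not match [a-z]\+, so nothing is removed.
case_sensitive=$(printf '%s\n' "$mixed_case" | sed 's/\b\([a-z]\+\)\s\1\b/\1/g')

# With I, both the bracket expression and the backreference ignore case.
case_insensitive=$(printf '%s\n' "$mixed_case" | sed 's/\b\([a-z]\+\)\s\1\b/\1/gI')

# No redirection here, so the results simply go to stdout.
printf '%s\n' "$case_sensitive" "$case_insensitive"
```

The replacement keeps the first copy as it was written, so the case-insensitive run should yield "Double toil and trouble".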

Contents of file1 (the input paragraph from the question):

**double double toil and trouble
fire burn and cauldron bubble bubble
tomorrow and tomorrow and tomorrow
creeps in this this petty pace from day toto day**

The command from the question:

echo `<file1` | sed -e 's/\b\([a-z ]\+\)\1/\1/g' | cat > file2

Its output (file2):

double toil and trouble fire burn and cauldron bubble tomorrow and tomorrow creeps in this petty pace from day to day
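The output above can be reproduced and explained with a small sketch, assuming GNU sed. The unquoted expansion stands in for the backtick form and joins the lines with single spaces. Because the bracket expression `[a-z ]` also contains a space, the group can capture several words at once (such as " tomorrow and"), and because no whitespace or trailing boundary is required, "toto" counts as "to" twice. Variable names are illustrative.

```shell
#!/usr/bin/env bash
# Assumption: GNU sed. The paragraph is the sample text from the question.

paragraph='double double toil and trouble
fire burn and cauldron bubble bubble
tomorrow and tomorrow and tomorrow
creeps in this this petty pace from day toto day'

# Deliberately unquoted: word splitting turns the newlines into single spaces,
# which is why the original output is one long line.
flattened=$(echo $paragraph)

# The pattern from the question: the space inside [a-z ] lets the group span
# several words, and nothing forces the two copies to be separate words.
attempt_output=$(printf '%s\n' "$flattened" | sed -e 's/\b\([a-z ]\+\)\1/\1/g')

printf '%s\n' "$attempt_output"
```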