Regex to remove repeated strings that are not interrupted by other strings

regex string text

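I have recently been trying to write a regex that removes strings which follow each other directly, without being interrupted by a different string, and keeps only one copy of each run. It should work for every possible URL, whether or not it starts with www. and whatever ending it has, such as .com or .nl. The strings (a list of URLs) look like this: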

operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
operator.livrareflori.md
amazon.de
fonts.gstatic.com
fonts.gstatic.com
fonts.gstatic.com
erovoyeurism.net
tugtechnologyandbusiness.com

The end result should look like this:

operator.livrareflori.md
amazon.de
fonts.gstatic.com
erovoyeurism.net
tugtechnologyandbusiness.com

As you can see, the repeated strings that were not interrupted by another string are gone, and only one copy of each remains.

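You could match

^(.+)$(?:\n\1)+

This captures the first line and then matches the repeated lines that follow it. Replace everything that was matched with the first capture group:

\1

(or the equivalent keyword for the first group in your environment)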
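As a rough illustration of the match-and-replace idea above, here is a minimal Python sketch. The function name `collapse_consecutive` and the `strict` switch are hypothetical and not part of the answer. The stricter pattern adds a `$` after the backreference, on the assumption that a line such as `abc` followed by `abcd` should not be treated as a repeat.

```python
import re

# Pattern from the answer; re.MULTILINE lets ^ and $ match at every line.
CONSECUTIVE_DUPES = re.compile(r"^(.+)$(?:\n\1)+", re.MULTILINE)

# Hypothetical stricter variant (an assumption, not from the answer):
# the extra $ requires each repeated line to end right after the backreference.
STRICT_CONSECUTIVE_DUPES = re.compile(r"^(.+)$(?:\n\1$)+", re.MULTILINE)


def collapse_consecutive(text: str, strict: bool = True) -> str:
    """Replace each run of identical adjacent lines with a single copy."""
    pattern = STRICT_CONSECUTIVE_DUPES if strict else CONSECUTIVE_DUPES
    return pattern.sub(r"\1", text)


# Sample data taken from the question.
urls = "\n".join(
    ["operator.livrareflori.md"] * 12
    + ["amazon.de"]
    + ["fonts.gstatic.com"] * 3
    + ["erovoyeurism.net", "tugtechnologyandbusiness.com"]
)

if __name__ == "__main__":
    print(collapse_consecutive(urls))
```

Without the extra `$`, the input `abc` followed by `abcd` would be merged into the single line `abcd`, which is why the sketch defaults to the stricter pattern.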

With Notepad++, you can do the following:

Ctrl+H
Find what:
```
^(.+)$(?:\R\1)+
```
Replace with:
```
$1
```
Check Wrap around
Check Regular expression
DO NOT CHECK
```
. matches newline
```
Replace all

Explanation:

^(.+)$      : group 1, a whole line
(?:         : non capture group
    \R      : any kind of line break
    \1      : backreference to group 1
)+          : group must appear 1 or more times

Replacement:

$1          : content of group 1

Result for given example:

operator.livrareflori.md
amazon.de
fonts.gstatic.com
erovoyeurism.net
tugtechnologyandbusiness.com
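The \R escape used in the Notepad++ pattern is not available in Python's standard `re` module, so a port has to spell out the line breaks. The sketch below is one possible equivalent, assuming lines end in either `\n` or `\r\n`. The names `DUPES_ANY_EOL` and `collapse_any_eol` are hypothetical.

```python
import re

# Assumption: line breaks are "\n" or "\r\n" (a lone "\r" is not handled).
# [^\r\n]+ captures a whole line without its line break; the lookahead makes
# sure each repeated line ends right after the backreference.
DUPES_ANY_EOL = re.compile(
    r"^([^\r\n]+)(?:\r?\n\1(?=\r?\n|\Z))+",
    re.MULTILINE,
)


def collapse_any_eol(text: str) -> str:
    """Collapse runs of identical adjacent lines, keeping existing line endings."""
    return DUPES_ANY_EOL.sub(r"\1", text)


if __name__ == "__main__":
    windows_text = "fonts.gstatic.com\r\nfonts.gstatic.com\r\namazon.de\r\n"
    print(repr(collapse_any_eol(windows_text)))
```

The line break that follows the last copy in a run is left untouched, so a file with Windows line endings keeps them.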

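The trick is to capture the line and use a lookahead to verify that it occurs again later in the subject. This expression matches the duplicates, and replacing them with "" keeps the last occurrence of each. In engines where ^ and $ do not match at line boundaries by default, the multiline flag is also needed, for example (?sm):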

(?s)^((?:https?://)?(?:www\.)?\S+\.\S+)\n(?=.*^\1$)
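To see how the lookahead variant behaves, here is a small Python sketch. It assumes the multiline flag is added, written as `(?sm)`, because in Python `^` and `$` only match at line boundaries in multiline mode. The names `KEEP_LAST` and `keep_last_occurrence` are hypothetical. Note that this approach removes any line that reappears later, not only directly adjacent repeats, so it goes further than the question asks.

```python
import re

# The answer's expression with the multiline flag added (an assumption for Python):
# (?s) lets .* cross line breaks, (?m) lets ^ and $ match at every line.
KEEP_LAST = re.compile(
    r"(?sm)^((?:https?://)?(?:www\.)?\S+\.\S+)\n(?=.*^\1$)"
)


def keep_last_occurrence(text: str) -> str:
    """Delete every line that occurs again further down, keeping the last copy."""
    return KEEP_LAST.sub("", text)


if __name__ == "__main__":
    sample = "a.com\nb.com\na.com\nc.com"
    print(keep_last_occurrence(sample))
```

Because the lookahead scans the rest of the input for every candidate line, this is likely to be slow on large lists; the adjacent-line patterns from the other answers avoid that scan.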

You can try this. See the demo.

What language or tool are you using? What have you tried? What didn't work? What did you get?