Regex: using a regular expression to capture incorrectly formatted text patterns
I have empty folders whose names begin with an NA - prefix. The correct format for these names is supposed to be
"NA - FolderName"
Unfortunately, not everyone followed this naming convention. I am trying to write a regular expression that captures every NA prefix that is malformed or repeated.
The regular expression I have come up with so far is
N?A-((?![0-9.A-MO-Z][B-Z]|)N?A-)
Below are the folder names I am using to test it. The names containing "Incorrect" have a malformed NA - prefix, and that malformed prefix is what I want to capture:
NA -IncorrectFolderName1
N A-1. IncorrectFolderName2
N A- 1. IncorrectFolderName3
NA-IncorrectFolderName4
N A -1.IncorrectFolderName5
NA -NA -IncorrectFolderName6
NA - NA -IncorrectFolderName7
N A - N A - IncorrectFolderName8
N A - NA - IncorrectFolderName9
NA - CorrectFolderName1
NA - 1CorrectFolderName2
NA - 1. CorrectFolderName3
The only incorrect format my regular expression fails to capture is:
N A- 1. IncorrectFolderName3
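The regular expression should not capture folders whose "NA - " prefix is already correct, as shown below. These names should not be captured: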
RegularFolderName1
NA - CorrectFolderName1
NA - 1CorrectFolderName2
NA - 1. CorrectFolderName3
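I have been working on this regular expression and I am close, but I cannot figure out how to write it so that it finds every incorrectly formatted pattern. Any help would be greatly appreciated.
My guess is that something like this might work: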
(?:(?:N\s*A\s*)-\s*){2}|N\s+A\s*-\s*(?=\d+\.\s*)|NA-|NA\s+-(?=\S)
or some similar expression built from alternation, which makes it much simpler to write and debug.
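As a quick check of the alternation above, the following sketch runs it against the sample names from the question. The variable names ($pattern, $names, $flagged) are illustrative only, and -cmatch (the case-sensitive match operator) is my choice here, not something the answer specifies.

```powershell
# Alternation pattern quoted above, unchanged.
$pattern = '(?:(?:N\s*A\s*)-\s*){2}|N\s+A\s*-\s*(?=\d+\.\s*)|NA-|NA\s+-(?=\S)'

# Sample folder names taken from the question.
$names = @(
    'NA -IncorrectFolderName1'
    'N A-1. IncorrectFolderName2'
    'N A- 1. IncorrectFolderName3'
    'NA-IncorrectFolderName4'
    'N A -1.IncorrectFolderName5'
    'NA -NA -IncorrectFolderName6'
    'NA - NA -IncorrectFolderName7'
    'N A - N A - IncorrectFolderName8'
    'N A - NA - IncorrectFolderName9'
    'RegularFolderName1'
    'NA - CorrectFolderName1'
    'NA - 1CorrectFolderName2'
    'NA - 1. CorrectFolderName3'
)

# Keep only the names in which the pattern finds a malformed prefix.
$flagged = @($names | Where-Object { $_ -cmatch $pattern })
$flagged
```

On these samples the pattern flags the nine "Incorrect" names and leaves the other four alone.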
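I am not sure what should and should not be captured at the end of the match. Any trailing part that you do not want to consume or capture can simply be placed in a positive lookahead, (?=...), which is a zero-width assertion, for example: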
NA\s+-(?=\S)
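To illustrate what "zero-width" means here, this small sketch applies the pattern to one malformed and one well-formed name using the .NET [regex]::Match method. The variable names are hypothetical.

```powershell
$pattern = 'NA\s+-(?=\S)'

# Malformed: the hyphen is followed directly by a non-space character.
$bad = [regex]::Match('NA -IncorrectFolderName1', $pattern)

# Well-formed: the hyphen is followed by a space, so the lookahead fails.
$good = [regex]::Match('NA - CorrectFolderName1', $pattern)

$bad.Success    # True
$bad.Value      # NA -   (the "I" checked by the lookahead is not part of the match)
$good.Success   # False
```

The lookahead only inspects the next character; it does not add it to the matched text, so a later replacement would leave the folder name itself intact.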
Using your examples, this works:
# Sample folder names from the question
$foldernames = 'NA -IncorrectFolderName1',
               'N A-1. IncorrectFolderName2',
               'N A- 1. IncorrectFolderName3',
               'NA-IncorrectFolderName4',
               'N A -1.IncorrectFolderName5',
               'NA -NA -IncorrectFolderName6',
               'NA - NA -IncorrectFolderName7',
               'N A - N A - IncorrectFolderName8',
               'N A - NA - IncorrectFolderName9',
               'RegularFolderName1',
               'NA - CorrectFolderName1',
               'NA - 1CorrectFolderName2',
               'NA - 1. CorrectFolderName3'
# Collapse one or more leading NA- prefixes (in any spacing) into a single 'NA - '.
# Names without such a prefix do not match and are returned unchanged.
$newNames = $foldernames | ForEach-Object { $_ -replace '^(?:(N\s*A\s*-\s*))+(.+)', 'NA - $2' }
$newNames
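Result:
NA - IncorrectFolderName1
NA - 1. IncorrectFolderName2
NA - 1. IncorrectFolderName3
NA - IncorrectFolderName4
NA - 1.IncorrectFolderName5
NA - IncorrectFolderName6
NA - IncorrectFolderName7
NA - IncorrectFolderName8
NA - IncorrectFolderName9
RegularFolderName1
NA - CorrectFolderName1
NA - 1CorrectFolderName2
NA - 1. CorrectFolderName3
Regex details: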
(?: Match the regular expression below
( Match the regular expression below and capture its match into backreference number 1
N Match the character “N” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
A Match the character “A” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
- Match the character “-” literally
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
)+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
( Match the regular expression below and capture its match into backreference number 2
. Match any single character that is not a line break character
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
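Try replacing with '^(?:N\s*A\b\P{L}*)+', or with
^N\s*A(?:\s*(?:-\s*)?N\s*A)*(?:\s*-\s*(?:\d+\)*\s*,
It would help if you could describe the rest of the requirements precisely. Do you intend to rename these items to the pattern you want? If so, you may only need to get all files whose names start with N, followed by one or more A or space characters, and then a hyphen. Once you have that list, split each name on the hyphen, take the last item, and add the correct prefix.
Does '^NA-(?!NA-' not match?
Thank you for your help. I was racking my brain over this. Thanks also for the extra insight.
Sorry about this, I just noticed that it still captures the digits that follow the pattern in names such as 'N A-1. IncorrectFolderName2', 'N A- 1. IncorrectFolderName3', and 'N A -1.IncorrectFolderName5'. Is there any way around that?
Thanks again. It works as expected. I have a lot to learn about lookaround assertions.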
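The split-on-hyphen idea from the comments could be sketched roughly as follows. The function name Repair-NaPrefix is hypothetical, and the sketch assumes the real folder name contains no hyphen of its own, since everything before the last hyphen is discarded.

```powershell
function Repair-NaPrefix {
    param([string]$Name)

    # Names that do not start with some form of N...A...- are left untouched.
    if ($Name -notmatch '^N\s*A\s*-') { return $Name }

    # Split on hyphens, keep the last piece, and trim the surrounding spaces.
    $last = ($Name -split '-')[-1].Trim()

    # Re-attach the prefix in the expected format.
    return "NA - $last"
}

Repair-NaPrefix 'N A - NA - IncorrectFolderName9'   # NA - IncorrectFolderName9
Repair-NaPrefix 'N A -1.IncorrectFolderName5'       # NA - 1.IncorrectFolderName5
Repair-NaPrefix 'RegularFolderName1'                # RegularFolderName1
```

Unlike the regex replacement, this approach would truncate a name such as 'NA - Some-Name' to 'NA - Name', so it only suits folder names known to be hyphen-free.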