Regex: How can the plus sign be used with a character class as part of a regular expression?
In cygwin, this does not return a match:
$ echo "aaab" | grep '^[ab]+$'
But this does return a match:
$ echo "aaab" | grep '^[ab][ab]*$'
aaab
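Aren't the two expressions identical?
Is there any way to express "one or more characters of the character class" without typing the character class twice (as in the second example)?
By definition the two expressions should be equivalent, but perhaps Regular-expressions.info does not cover bash in cygwin.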
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
So use the backslashed version:
$ echo aaab | grep '^[ab]\+$'
aaab
Or activate the extended syntax:
$ echo aaab | egrep '^[ab]+$'
aaab
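The variants can be compared side by side. The following is a minimal sketch, assuming GNU grep, where `\+` in basic mode is a GNU extension; the interval `\{1,\}` is the POSIX basic-syntax spelling of "one or more". The variable names are chosen only for illustration.

```shell
input="aaab"

# Basic syntax: an unescaped + is an ordinary character, so nothing matches.
# "|| true" keeps grep's non-zero exit status from aborting a "set -e" script.
plain=$(printf '%s\n' "$input" | grep '^[ab]+$' || true)

# The same pattern does match a line that ends in a literal plus sign.
literal=$(printf '%s\n' 'a+' | grep '^[ab]+$' || true)

# Basic syntax with the backslashed operator (GNU extension).
escaped=$(printf '%s\n' "$input" | grep '^[ab]\+$' || true)

# Basic syntax with a POSIX interval: one or more repetitions.
interval=$(printf '%s\n' "$input" | grep '^[ab]\{1,\}$' || true)

# Extended syntax: + is a repetition operator without a backslash.
extended=$(printf '%s\n' "$input" | grep -E '^[ab]+$' || true)

printf 'plain=[%s] literal=[%s] escaped=[%s] interval=[%s] extended=[%s]\n' \
    "$plain" "$literal" "$escaped" "$interval" "$extended"
```

With GNU grep, `plain` should come back empty while the other four variables hold the matched line, which explains why the original command printed nothing.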
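grep has several matching "modes". By default it uses only the basic set, which does not recognize a number of metacharacters unless they are escaped. You can put grep into extended mode or Perl mode so that + is evaluated as a repetition operator.
From man grep: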
Matcher Selection
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
-P, --perl-regexp
Interpret PATTERN as a Perl regular expression. This is highly experimental and grep -P may warn of unimplemented features.
Basic vs Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
Traditional egrep did not support the { meta-character, and some egrep implementations support \{ instead, so portable scripts should avoid { in grep -E patterns and should use [{] to match a literal {.
GNU grep -E attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the command grep -E '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
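The portability advice about { in the excerpt above can be tried directly. This is a small sketch, assuming a grep that supports -E; the second command relies on the GNU-specific leniency the man page describes and may be rejected elsewhere, so its error output is discarded. The sample line and variable names are made up for illustration.

```shell
line='a{1b'

# Portable spelling: inside a bracket expression, { is always a literal.
portable=$(printf '%s\n' "$line" | grep -E '[{]1' || true)

# GNU-specific: a { that cannot start a valid interval is taken literally.
# Other implementations may report a syntax error instead.
lenient=$(printf '%s\n' "$line" | grep -E '{1' 2>/dev/null || true)

printf 'portable=[%s] lenient=[%s]\n' "$portable" "$lenient"
```

Only the bracket-expression form is safe to rely on in scripts that have to run against more than one grep implementation.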
Alternatively, you can use egrep instead of grep -E.
Either escape the plus sign with a backslash, or use egrep, the extended grep, which is an alias for grep -E:
$ echo "aaab" | egrep '^[ab]+$'
aaab