Regex RFC822: validating email addresses with a regular expression


As you know, this is how we validate an email address:

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\
r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
 \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
 \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\
]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\
[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\
r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \0
00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\
.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\
]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)
?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
 \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\
.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)
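Can you explain to me what is going on here?

Are we looking at a string and deciding whether it is an email address?

Can you at least explain the first line: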

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]

Can you give me an example of a rare but valid email address?

I saw this expression yesterday:

/^([\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+\.)*[\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,6})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)$/i

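I think you should forget about this. If you want to get better at regular expressions, that is one thing, and there are probably better ways to learn. Otherwise, validating email addresses is an extremely complex and error-prone activity, and I do not know of any cut-and-paste solution that fully covers all the corner cases. That way lies madness. If you have an application connected to the internet, it is better to verify the address by actually sending a confirmation email.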

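My policy on email validation is: forget it. Unicode domains aside, a user can still give you a syntactically valid fake address (e.g. test@mailinator.com).

Instead of validating the email address with that huge (and probably computationally expensive??) expression:

1) Send the user a confirmation link, so you know whether the email address actually exists.

2) Check the address against Mailinator & Co. domains.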
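
The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the blocklist contents, the `$SECRET` value, and the helper names `domain_of`, `is_disposable` and `confirmation_token` are all made up for this example, and actually sending the mail is left out.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical blocklist; a real one would be loaded from a maintained list.
my %DISPOSABLE = map { $_ => 1 } qw(mailinator.com);

# Hypothetical server-side secret used to sign confirmation tokens.
my $SECRET = 'change-me';

# Return the lower-cased text after the last "@", or undef if there is none.
sub domain_of {
    my ($address) = @_;
    return $address =~ /\@([^\@\s]+)\z/ ? lc($1) : undef;
}

# True (1) if the address uses a domain on the blocklist, otherwise 0.
sub is_disposable {
    my ($address) = @_;
    my $domain = domain_of($address);
    return (defined $domain && exists $DISPOSABLE{$domain}) ? 1 : 0;
}

# Token to embed in the confirmation link. The address only counts as
# verified once the user comes back with a token that matches.
sub confirmation_token {
    my ($address) = @_;
    return hmac_sha256_hex($address, $SECRET);
}

for my $address ('test@mailinator.com', 'jdoe@example.com') {
    if (is_disposable($address)) {
        print "$address: disposable domain, rejected\n";
    } else {
        print "$address: send a link containing ", confirmation_token($address), "\n";
    }
}
```

Note that no syntax check is involved at all: a malformed address simply never produces a confirmation click.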

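"Can you explain to me what is going on here?"

Get hold of the first edition of Mastering Regular Expressions (or perhaps the second; I know it is not in the third). It works through the RFC definition of a valid email address and builds up the required regex.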
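
As a rough illustration of that build-up, the first line of the pattern in the question can be split into three named pieces. The variable names and the `leading_atom` helper are invented for this sketch, and `@` is written as `\@` purely to rule out array interpolation; otherwise the pieces are copied from the pattern.

```perl
use strict;
use warnings;

# Optional folding whitespace: an optional CRLF followed by a space or tab.
my $fws = qr/(?:(?:\r\n)?[ \t])/;

# One character of an "atom": anything except the RFC 822 specials,
# the space, and the range \000-\031. Perl reads those escapes as
# octal, so the range covers characters 0 through 25.
my $atom_char = qr/[^()<>\@,;:\\".\[\] \000-\031]/;

# What may follow an atom: whitespace, the end of the string, or
# (as a lookahead, without consuming it) one of the special characters.
my $atom_end = qr/(?:$fws+|\Z|(?=[\["()<>\@,;:\\".\[\]]))/;

# First line of the big pattern: leading whitespace, then one atom.
# Returns the atom, or undef if the input does not start with one.
sub leading_atom {
    my ($input) = @_;
    return $input =~ /\A$fws*($atom_char+)$atom_end/ ? $1 : undef;
}

for my $input ('  jdoe@example.com', '(comment)jdoe') {
    my $atom = leading_atom($input);
    print defined $atom
        ? "'$input' starts with the atom '$atom'\n"
        : "'$input' does not start with an atom\n";
}
```

The rest of the pattern repeats the same pieces for quoted strings, domains, angle-bracket addresses and groups, which is why it grows so large once everything is expanded inline.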

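If you want to understand what is going on, you should look at a decent email address parsing module and note how the pattern is built up from its component parts: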

my $CTL            = q{\x00-\x1F\x7F};
my $special        = q{()<>\\[\\]:;@\\\\,."};

my $text           = qr/[^\x0A\x0D]/;

my $quoted_pair    = qr/\\$text/;

my $ctext          = qr/(?>[^()\\]+)/;
my ($ccontent, $comment) = (q{})x2;
for (1 .. $COMMENT_NEST_LEVEL) {
  $ccontent = qr/$ctext|$quoted_pair|$comment/;
  $comment  = qr/\s*\((?:\s*$ccontent)*\s*\)\s*/;
}
my $cfws           = qr/$comment|\s+/;

my $atext          = qq/[^$CTL$special\\s]/;
my $atom           = qr/$cfws*$atext+$cfws*/;
my $dot_atom_text  = qr/$atext+(?:\.$atext+)*/;
my $dot_atom       = qr/$cfws*$dot_atom_text$cfws*/;

my $qtext          = qr/[^\\"]/;
my $qcontent       = qr/$qtext|$quoted_pair/;
my $quoted_string  = qr/$cfws*"$qcontent+"$cfws*/;

my $word           = qr/$atom|$quoted_string/;

my $simple_word    = qr/$atom|\.|\s*"$qcontent+"\s*/;
my $obs_phrase     = qr/$simple_word+/;

my $phrase         = qr/$obs_phrase|(?:$word+)/;

my $local_part     = qr/$dot_atom|$quoted_string/;
my $dtext          = qr/[^\[\]\\]/;
my $dcontent       = qr/$dtext|$quoted_pair/;
my $domain_literal = qr/$cfws*\[(?:\s*$dcontent)*\s*\]$cfws*/;
my $domain         = qr/$dot_atom|$domain_literal/;

Where did you get this from?

\r\n is not allowed in an email address.

This is one possible regex. Technically, an email address cannot be validated with a regex (especially now that Unicode domains are available).

Have you seen it? Did you try reading it? Everything is explained there.

Wall of regex, we meet again!

What does that even mean?

Send them an email and let them verify it.
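
To show how components like those in the module excerpt combine into an address pattern, and to answer the request for rare but valid addresses, here is a much-simplified, self-contained sketch. It is not the module's code: comments and folding whitespace are ignored, the definitions are cut down, and `is_addr_spec` is a name made up for this example.

```perl
use strict;
use warnings;

# Simplified building blocks (no comments or folding whitespace).
my $atext          = qr/[^\x00-\x20\x7F()<>\[\]:;\@\\,."]/;
my $dot_atom       = qr/$atext+(?:\.$atext+)*/;
my $quoted_string  = qr/"(?:[^\\"\r\n]|\\[^\r\n])*"/;
my $domain_literal = qr/\[(?:[^\[\]\\\r\n]|\\[^\r\n])*\]/;

# The same composition steps as in the excerpt above.
my $local_part     = qr/$dot_atom|$quoted_string/;
my $domain         = qr/$dot_atom|$domain_literal/;
my $addr_spec      = qr/(?:$local_part)\@(?:$domain)/;

# True (1) if the whole string is a single local-part@domain, else 0.
sub is_addr_spec {
    my ($candidate) = @_;
    return $candidate =~ /\A$addr_spec\z/ ? 1 : 0;
}

my @candidates = (
    'jdoe@example.com',
    '"john smith"@example.com',   # quoted-string local part
    'user@[192.168.0.1]',         # domain literal
    'a..b@example.com',           # empty atom between the dots
);

for my $candidate (@candidates) {
    printf "%-28s %s\n", $candidate,
        is_addr_spec($candidate) ? 'matches' : 'no match';
}
```

Because each piece has a name, the quoted local part and the bracketed domain literal are easy to spot as separate alternatives, which is hard to see in the fully expanded pattern.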