
Java Python domain name regular expression pattern


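I want to be able to match domains according to the following rules:

  • The domain name may contain a-z | A-Z | 0-9 and the hyphen (-)
  • The domain name length should be between 1 and 63 characters
  • The last TLD must be at least two and at most 6 characters long
  • The domain name should not start or end with a hyphen (-) (e.g. -google.com or google-.com)
  • The domain name can be a subdomain (e.g. mkyong.blogspot.com)
I already have the Java-style regex; I just need the Python-style equivalent: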

^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$
List of valid domain names

  • www.google.com
  • google.com
  • mkyong123.com
  • mkyong-info.com
  • sub.mkyong.com
  • sub.mkyong-info.com
  • mkyong.com.au
  • g.co
  • mkyong.t.t.co
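List of invalid domain names, and why

  • mkyong.t.t.c - TLD must be between 2 and 6 characters long
  • mkyong,com - comma is not allowed
  • mkyong - no TLD
  • mkyong.123 - TLD may not contain digits
  • .com - must start with [A-Za-z0-9]
  • mkyong.com/users - no TLD
  • -mkyong.com - cannot begin with a hyphen (-)
  • mkyong-.com - cannot end with a hyphen (-)
  • sub.-mkyong.com - cannot begin with a hyphen (-)
  • sub.mkyong-.com - cannot end with a hyphen (-)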
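
A minimal Python 3 sketch of checking the rules and lists above, assuming the Java pattern can be reused unchanged once the doubled backslash (`\\.`, which is Java string-literal escaping) is written as `\.` in a Python raw string. The helper name `is_valid_domain` is illustrative and not part of the original post.

```python
import re

# Assumption: the Java pattern from the question, with "\\." written as "\."
# because a Python raw string needs no extra escaping.
DOMAIN_RE = re.compile(r"^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\.)+[A-Za-z]{2,6}$")


def is_valid_domain(name):
    """Return True if name matches the pattern from the question."""
    return DOMAIN_RE.match(name) is not None


valid = ["www.google.com", "google.com", "mkyong123.com", "mkyong-info.com",
         "sub.mkyong.com", "sub.mkyong-info.com", "mkyong.com.au", "g.co",
         "mkyong.t.t.co"]
invalid = ["mkyong.t.t.c", "mkyong,com", "mkyong", "mkyong.123", ".com",
           "mkyong.com/users", "-mkyong.com", "mkyong-.com",
           "sub.-mkyong.com", "sub.mkyong-.com"]

# Print each name next to the result of the check.
for name in valid + invalid:
    print(name.ljust(24), is_valid_domain(name))
```

In Python, `$` also matches just before a trailing newline, so names read from a file are best stripped before they are checked.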

I ran a test against the given list of domain names (Python 2.7.x):

Output:

checking valid domain names ============
www.google.com                                      True
google.com                                          True
mkyong123.com                                       True
mkyong-info.com                                     True
sub.mkyong.com                                      True
sub.mkyong-info.com                                 True
mkyong.com.au                                       True
g.co                                                True
mkyong.t.t.co                                       True

checking invalid domain names ============
mkyong.t.t.c                                       False
mkyong,com                                         False
mkyong                                             False
mkyong.123                                         False
.com                                               False
mkyong.com/users                                   False
-mkyong.com                                        False
mkyong-.com                                        False
sub.-mkyong.com                                    False
sub.mkyong-.com                                    False
[Edit] To get the same result as the provided expectedstring, I came up with the following approach (it does not check for "http(s)"):


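What happened when you tried this "Java-style regex" in Python? To me it looks like perfectly normal, standard regex syntax.

I'm doing: string=re.sub(r“^”((([A-Za-z0-9]+){1,63}\))|(([A-Za-z0-9]+(\-)+[A-Za-z0-9]+){1,63}\)+){1255}$, "XXX", string) and nothing changes.

That is a different regex from the one in your question. Also, what is string?

I messed up. I have updated my question to match the correct regex.

And is this one a valid domain: mkyong.t.t.t.co?

I tried your regex on my string, string = re.sub(r'^([A-Za-z0-9]\.|[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]\.){1,3}[A-Za-z]{2,6}$', "XXX", string), but still no replacement is made. I even tested your regex in an online tester, and it still did not match.

For the online tool, try a single-line string only (e.g. www.demo.com) and you will find a match.

@faceoff: I just updated the answer with my approach for getting the desired string.

Why split the string on "http"? How about this: string = re.sub(r"(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9]", "XXX", string). It does the same job and looks simpler, even though I'm new to Python.

@faceoff: I tried your regex on regexr.com. It matches mkyong.123, -mkyong.com, sub.-mkyong.com, sub.mkyong-.com, 3.141, foo@demo.net and mkyong.t.t.t.co, but it fails to match www.GOOGLE.com. That is completely wrong. Please try your regex yourself and see whether it solves your own problem. I know my regex is far from the simplest, but it does match domain names or replace them with "XXX", right?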
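
On the mkyong.t.t.t.co question raised in the comments: the pattern in the answer repeats the label group with `{1,3}`, so a name with four labels before the TLD is rejected, while the rules in the question set no limit on subdomain depth. The Python 3 sketch below compares the answer's pattern with a hypothetical variant that uses `+` instead. The names `LIMITED` and `UNLIMITED` are illustrative only.

```python
import re

# Pattern from the answer: at most three labels before the TLD.
LIMITED = re.compile(
    r"^([A-Za-z0-9]\.|[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]\.){1,3}"
    r"[A-Za-z]{2,6}$"
)

# Hypothetical variant (an assumption, not from the thread): same label rule,
# but any number of labels before the TLD.
UNLIMITED = re.compile(
    r"^([A-Za-z0-9]\.|[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]\.)+"
    r"[A-Za-z]{2,6}$"
)

# Print whether each pattern accepts a three-label and a four-label name.
for name in ["mkyong.t.t.co", "mkyong.t.t.t.co"]:
    print(name.ljust(18), bool(LIMITED.match(name)), bool(UNLIMITED.match(name)))
```

Whether four or more labels should count as valid depends on how strictly the rules are read; the `+` variant follows the quantifier used in the original Java pattern.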
string = "This is why this domain example.com will never be the same after some years, it might just be example.co.uk but will never get to example.-com. Documents could be located in this specific location http://en.example.com/documents/print.doc as you probably already know."

expectedstring = "This is why this domain XXX will never be the same after some years, it might just be XXX but will never get to example.-com. Documents could be located in this specific location http://XXX/documents/print.doc as you probably already know."
import re
valid_domains = """
www.google.com
google.com
mkyong123.com
mkyong-info.com
sub.mkyong.com
sub.mkyong-info.com
mkyong.com.au
g.co
mkyong.t.t.co
"""

invalid_domains = """
mkyong.t.t.c
mkyong,com
mkyong
mkyong.123
.com
mkyong.com/users
-mkyong.com
mkyong-.com
sub.-mkyong.com
sub.mkyong-.com
"""

valid_names = valid_domains.split()
invalid_names = invalid_domains.split()

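# each label is either a single alphanumeric character, or 2-63 characters
# that start and end with an alphanumeric; 1-3 labels, then a 2-6 letter TLD
pattern = r'^([A-Za-z0-9]\.|[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]\.){1,3}[A-Za-z]{2,6}$'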

print 'checking valid domain names ============'
for name in valid_names:
    print name.ljust(50), ('True' if re.match(pattern, name) else 'False').rjust(5)

print '\nchecking invalid domain names ============'
for name in invalid_names:
    print name.ljust(50), ('True' if re.match(pattern, name) else 'False').rjust(5)
import re

# the domain must follow "//", whitespace, or the start of the string (group 1);
# then 1-3 labels and a 2-6 letter TLD
pattern = r'(//|\s+|^)(\w\.|\w[A-Za-z0-9-]{0,61}\w\.){1,3}[A-Za-z]{2,6}'

string = "This is why this domain example.com will never be the same after some years, it might just be example.co.uk but will never get to example.-com. Documents could be located in this specific location http://en.example.com/documents/print.doc as you probably already know."
expectedstring = "This is why this domain XXX will never be the same after some years, it might just be XXX but will never get to example.-com. Documents could be located in this specific location http://XXX/documents/print.doc as you probably already know."

# \g<1> puts back the "//" or whitespace that was matched in front of the domain
resultstring = re.sub(pattern, r"\g<1>XXX", string)

print 'resultstring: \n', resultstring
print '\nare they equal? ', expectedstring == resultstring
resultstring: 
This is why this domain XXX will never be the same after some years, it might just be XXX but will never get to example.-com. Documents could be located in this specific location http://XXX/documents/print.doc as you probably already know.

are they equal?  True
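
The snippets above use Python 2 print statements. A rough Python 3 sketch of the same substitution follows, assuming the pattern from the edit is kept as-is; the function name `mask_domains` is illustrative and not from the original answer.

```python
import re

# Pattern from the edit above: group 1 captures the "//", whitespace, or
# start-of-string that must precede the domain.
PATTERN = re.compile(
    r"(//|\s+|^)(\w\.|\w[A-Za-z0-9-]{0,61}\w\.){1,3}[A-Za-z]{2,6}"
)


def mask_domains(text, replacement="XXX"):
    """Replace each matched domain, keeping the prefix captured in group 1."""
    return PATTERN.sub(r"\g<1>" + replacement, text)


string = "This is why this domain example.com will never be the same after some years, it might just be example.co.uk but will never get to example.-com. Documents could be located in this specific location http://en.example.com/documents/print.doc as you probably already know."
expectedstring = "This is why this domain XXX will never be the same after some years, it might just be XXX but will never get to example.-com. Documents could be located in this specific location http://XXX/documents/print.doc as you probably already know."

print(mask_domains(string) == expectedstring)
```

Because the pattern requires `//` or whitespace in front of a domain, a path component such as print.doc (preceded by a single `/`) is left alone.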