How to set "any string" as a regular expression in Python


Tags: python, regex, python-2.7

I have this condition:

if(url.startswith("http://country.domain.com/motors/used-cars/") and (url!="http://country.domain.com/motors/used-cars/")):
    if url.startswith("http://country.domain.com/motors/used-cars/?page="):
        return None
    else:
        return url
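It was working, but for some reason the company changed the URL, which used to start with:

http://country.domain

Because there are many cities, I now have many similar URLs: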

http://city1.domain
http://city2.domain
http://city3.domain
http://city4.domain
http://city20.domain
Going back to my condition, I would have to change it to add all 20 cities.

My question: is there a way to match something like this:

http://whateverthenamehere.domain

I think a regular expression in Python is what I need, but I don't know what the right code is.

I tried using \s, \s* and \s+, but nothing worked.


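Could you help?

Use a regular expression to test whether the URL starts with your path, with a variable first domain part, and has more text after it.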

The [a-zA-Z0-9-] character group matches all valid domain-name characters. \w is not enough, because it allows underscores (_) but not dashes (-).

The rest of the URL is captured in group 1, so you can inspect it further.
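To make the difference between the candidate character classes concrete, here is a minimal sketch. It assumes Python 3.4 or later for re.fullmatch (the question itself is tagged python-2.7), and the host names and the accepted helper are made up for illustration.

```python
import re

# Hypothetical first domain labels, used only for illustration.
hosts = ["city42", "new-york", "new_york"]

# Candidate patterns for the variable first part of the host name.
patterns = {
    "letters_only": r"[a-zA-Z]+",       # rejects digits and dashes
    "word_chars": r"\w+",               # allows underscores, rejects dashes
    "domain_class": r"[a-zA-Z0-9-]+",   # letters, digits and dashes
}

def accepted(pattern, names):
    """Return the names that the pattern matches in full."""
    return [name for name in names if re.fullmatch(pattern, name)]

results = {label: accepted(pattern, hosts) for label, pattern in patterns.items()}
for label, matched in results.items():
    print(label, matched)
```

Only the [a-zA-Z0-9-]+ class accepts both city42 and new-york while rejecting new_york, which is not a valid host name label.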

Demo:

>>> import re
>>> # Not enough text in the URL:
...
>>> re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)',
...           'http://city42.domain.com/motors/used-cars/') is None
True
>>> # Remainder of the URL is captured for inspection:
...
>>> re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)',
...           'http://city42.domain.com/motors/used-cars/?page=')
<_sre.SRE_Match object at 0x100621558>
>>> re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)',
...           'http://city42.domain.com/motors/used-cars/?page=').group(1)
'?page='
>>> # specific URL mentioned in the comments:
...
>>> re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)',
...           'http://testes.domain.com/motors/used-cars/jeep/wrangler/2014/6/5/jeep-wrangler-‌​2/?back=dWFlLmR1Yml6emxlLmNvbS9tb3RvcnMvdXNlZC1jYXJzLz9wYWdlPTM%3D&pos=8').group(1)
'jeep/wrangler/2014/6/5/jeep-wrangler-\xe2\x80\x8c\xe2\x80\x8b2/?back=dWFlLmR1Yml6emxlLmNvbS9tb3RvcnMvdXNlZC1jYXJzLz9wYWdlPTM%3D&pos=8'

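Assuming the city names are alphabetic, http://[a-zA-Z]+\.domain might work.

@thg435 No, that doesn't work. The condition always goes to false.

I love these "it doesn't work" comments.

Sure. Did you mean import re?

@FarhadAliNoo: Huh? What? *innocent whistle*

But I have two conditions in the first if, and you have one.

@MarcoDinatsoli: I combined them. That is why the regular expression ends with (.+), which requires at least one more character after the path.

Still not working. For example, http://testes.domain.com/motors/used-cars/jeep/wrangler/2014/6/5/jeep-wrangler-2/?back=dWFlLmR1Yml6emxlLmNvbS9tb3RvcnMvdXNlZC1jYXJzLz9wYWdlPTM%3D&pos=8 still goes to false, but it should go to return url.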
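import re

# [a-zA-Z0-9-]+ matches any first domain label; the dots are escaped so they only match a literal dot.
match = re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)', url)
if match:
    # group(1) holds everything after the used-cars/ path.
    if match.group(1).startswith('?page='):
        return None
    return url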
>>> import re
>>> # Not enough text in the URL:
...
>>> re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)',
...           'http://city42.domain.com/motors/used-cars/') is None
True
>>> # Remainder of the URL is captured for inspection:
...
>>> re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)',
...           'http://city42.domain.com/motors/used-cars/?page=')
<_sre.SRE_Match object at 0x100621558>
>>> re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)',
...           'http://city42.domain.com/motors/used-cars/?page=').group(1)
'?page='
>>> # specific URL mentioned in the comments:
...
>>> re.search(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)',
...           'http://testes.domain.com/motors/used-cars/jeep/wrangler/2014/6/5/jeep-wrangler-‌​2/?back=dWFlLmR1Yml6emxlLmNvbS9tb3RvcnMvdXNlZC1jYXJzLz9wYWdlPTM%3D&pos=8').group(1)
'jeep/wrangler/2014/6/5/jeep-wrangler-\xe2\x80\x8c\xe2\x80\x8b2/?back=dWFlLmR1Yml6emxlLmNvbS9tb3RvcnMvdXNlZC1jYXJzLz9wYWdlPTM%3D&pos=8'
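The check from the answer can also be wrapped in a function so it runs on its own. This is only a sketch under stated assumptions: the names LISTING and filter_url are hypothetical, domain.com is the placeholder host from the question, and the dots in the pattern are escaped so that they match only a literal dot.

```python
import re

# Any first domain label, then the fixed host and path, then at least one more character.
# "domain.com" is the placeholder host used in the question.
LISTING = re.compile(r'^http://[a-zA-Z0-9-]+\.domain\.com/motors/used-cars/(.+)')

def filter_url(url):
    """Return url for a listing page; return None for the index page,
    a ?page= pagination URL, or a URL on another host."""
    match = LISTING.search(url)
    if match is None:
        return None  # nothing after the path, or the prefix did not match
    if match.group(1).startswith('?page='):
        return None  # pagination page
    return url

print(filter_url('http://city1.domain.com/motors/used-cars/jeep/wrangler/'))
print(filter_url('http://city1.domain.com/motors/used-cars/?page=2'))
```

Compiling the pattern once keeps the per-URL check cheap when many URLs are filtered, and the single function replaces the two nested conditions from the question.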