Regex: the "(?!" regular expression combined with a variable
I am testing some code for web crawling:
    def getExternalLinks(bs, excludeUrl):
        externalLinks = []
        # Finds all links that start with "http" that do
        # not contain the current URL
        for link in bs.find_all('a',
                                href=re.compile('^(http|www)((?!'+excludeUrl+').)*$')):
            if link.attrs['href'] is not None:
                if link.attrs['href'] not in externalLinks:
                    externalLinks.append(link.attrs['href'])
        return externalLinks
I cannot work out what the ((?!'+excludeUrl+').) part of the regular expression in re.compile('^(http|www)((?!'+excludeUrl+').)*$') means.

Check the documentation:
(?!…)
Matches if ... doesn't match next. This is a negative lookahead assertion. For example, Isaac (?!Asimov) will match 'Isaac ' only if it's not followed by 'Asimov'.
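
To see the lookahead on its own, here is a small sketch that reuses the Isaac/Asimov example quoted above. The variable names are only illustrative.

```python
import re

# Negative lookahead: match 'Isaac ' only when 'Asimov' does not come next.
pattern = re.compile(r'Isaac (?!Asimov)')

followed_by_newton = pattern.search('Isaac Newton')
followed_by_asimov = pattern.search('Isaac Asimov')

print(followed_by_newton is not None)  # True
print(followed_by_asimov is not None)  # False

# The lookahead consumes no characters, so the match is just 'Isaac '.
print(repr(followed_by_newton.group()))  # 'Isaac '
```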
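In the ((?!'+excludeUrl+').) regular expression, excludeUrl is a variable: the function's parameter. I think the purpose of this expression is to exclude the URL held in that variable. Is the '+variable+' form allowed?

@cheolyong Yes. It is ordinary string concatenation: 'a string here' + ' another string' == 'a string here another string'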
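
As a rough illustration of what the concatenation produces, the sketch below builds the same pattern outside of BeautifulSoup. The value 'example.com' and the sample URLs are assumptions made up for the demo, and the re.escape variant at the end is a suggestion, not part of the original code.

```python
import re

exclude_url = 'example.com'  # hypothetical value for the excludeUrl parameter

# Plain string concatenation builds the pattern text before it is compiled.
pattern_text = '^(http|www)((?!' + exclude_url + ').)*$'
print(pattern_text)  # ^(http|www)((?!example.com).)*$

pattern = re.compile(pattern_text)

# ((?!X).)* consumes one character at a time, but only at positions
# where X does not start, so X cannot appear anywhere after the prefix.
external = pattern.search('http://other.org/page')
internal = pattern.search('http://example.com/page')
print(external is not None)  # True: 'example.com' never appears
print(internal is not None)  # False: 'example.com' appears in the URL

# Suggested variant (an assumption, not in the original): escape the
# variable so that '.' in the URL is a literal dot, not "any character".
safe_pattern = re.compile('^(http|www)((?!' + re.escape(exclude_url) + ').)*$')
lookalike = 'http://exampleXcom/page'
print(pattern.search(lookalike) is not None)       # False: '.' matched 'X'
print(safe_pattern.search(lookalike) is not None)  # True
```

Without escaping, any regex metacharacters inside the URL are interpreted as pattern syntax, which is why the two patterns disagree on the last URL.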