Python re: escaping potential parentheses in text

python regex

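This script takes a BeautifulSoup object and replaces links (usually shortened ones) with their final destination URL. It works well, but it fails when the hyperlink text contains "(", raising this error:

re.error: missing ), unterminated subpattern at position 0

Is there a simple fix I'm missing that doesn't alter the text?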

text = b.get_text()
for link in b.find_all('a'):
    if 'href' in link.attrs:
        repl = link.get_text()
        href = link.attrs['href']
        link.clear()
        link.attrs = {}
        link.attrs['href'] = unshorten_url(href)
        link.append(repl)
        # below fails if repl contains "("
        text = re.sub(repl+r"(?!= *?</a>)", str(link), text, count=1)

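The problem is that when a plain string is concatenated into a regular expression, any characters in that string that are also regex metacharacters (such as "(") are interpreted as part of the pattern and trigger errors. The solution is

    repl = re.escape(repl)

which adds escape characters in front of those metacharacters; the text can then be cleaned up afterwards. Thanks, JasonHarper!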
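As a minimal sketch of the fix, the escaping can be applied only where the pattern is built, so the link text itself never picks up backslashes and needs no later cleanup. The helper name `replace_first_literal` and the sample strings are hypothetical, chosen for illustration; only `re.escape` and `re.sub` come from the standard library. The lookahead from the question is left out here, but it could be appended after the escaped part, as in `re.escape(repl) + r"(?!= *?</a>)"`.

```python
import re


def replace_first_literal(text, literal, replacement):
    """Replace the first occurrence of `literal` in `text`, treating it as plain text."""
    # re.escape backslash-escapes regex metacharacters such as "(" and ")",
    # so the literal is matched character for character.
    pattern = re.escape(literal)
    # Passing a function as the replacement means backslashes in `replacement`
    # are inserted as-is instead of being read as group references.
    return re.sub(pattern, lambda match: replacement, text, count=1)


# Hypothetical sample data: link text containing parentheses.
text = "See the docs (v2) for details."
link_html = '<a href="https://example.com/docs">docs (v2)</a>'

result = replace_first_literal(text, "docs (v2)", link_html)
print(result)
```

Because the original `repl` is left untouched, it can still be appended to the link unchanged, and only the pattern passed to `re.sub` is escaped.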

Please edit your question to include two links: one that works and one that causes the problem.

re.escape() may be what you're looking for.