How to convert a Ruby regular expression to a Python regular expression
The code below is a Ruby regular expression. I want to convert it to Python code. How can I do that?
add_zzim\(\'(.*?)\',\'(.*?)\',\'(?<param>.*?)\',.*
Source:
<li class="num" onClick="add_zzim('BD_AD_08','14913089','helloooo','3586312774','test');" title="contents.">14913089</li>
<li class="num" onClick="add_zzim('BD_AD_08','14913012','helloooo','3586312774','test');" title="contents.">14913012</li>
<li class="num" onClick="add_zzim('BD_AD_08','14913041','helloooo','3586312774','test');" title="contents.">14913045</li>
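Here is a non-regex approach. To extract the onclick attribute values, we use an HTML parser (BeautifulSoup); to extract the add_zzim argument values, we use ast.literal_eval(). Complete working example: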
from ast import literal_eval
from bs4 import BeautifulSoup
data = """
<ul>
<li class="num" onClick="add_zzim('BD_AD_08','14913089','helloooo','3586312774','test');" title="contents.">14913089</li>
<li class="num" onClick="add_zzim('BD_AD_08','14913012','helloooo','3586312774','test');" title="contents.">14913012</li>
<li class="num" onClick="add_zzim('BD_AD_08','14913041','helloooo','3586312774','test');" title="contents.">14913045</li>
</ul>
"""
soup = BeautifulSoup(data, "html.parser")
for li in soup.select("li.num"):
    # Drop the function name and the trailing ";" so the rest parses as a tuple literal.
    args = literal_eval(li["onclick"].replace("add_zzim", "").rstrip(";"))
    print(args)
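The script prints the tuples below. In reply to the follow-up "Thanks, that works, but I only want to extract the helloooo text": take the third element of each tuple, which is the `param` value.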
('BD_AD_08', '14913089', 'helloooo', '3586312774', 'test')
('BD_AD_08', '14913012', 'helloooo', '3586312774', 'test')
('BD_AD_08', '14913041', 'helloooo', '3586312774', 'test')
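To pull out only the third argument, indexing the parsed tuple should be enough. The sketch below assumes the onclick value has already been read from the element and that every argument is a plain quoted string, as in the sample data. The helper name `third_arg` is hypothetical and not part of the original answer.

```python
from ast import literal_eval


def third_arg(onclick: str) -> str:
    """Return the third argument of an add_zzim(...) call string.

    Hypothetical helper: assumes every argument is a plain quoted
    string literal, as in the sample data.
    """
    # Strip the function name and the trailing ";" so that what
    # remains is a valid Python tuple literal.
    literal = onclick.replace("add_zzim", "").rstrip(";")
    return literal_eval(literal)[2]


onclick = "add_zzim('BD_AD_08','14913089','helloooo','3586312774','test');"
print(third_arg(onclick))  # helloooo
```

Because `literal_eval` only accepts Python literals, this fails with an exception on onclick values that contain anything else, such as JavaScript variables or nested calls.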
import re
# Regex approach. The ur'' prefix is Python 2 only; a plain raw string works in Python 3.
# Double quotes around the pattern avoid escaping the single quotes.
p = re.compile(r"add_zzim\('(.*?)','(.*?)','(.*?)',.*")
test_str = """<li class="num" onClick="add_zzim('BD_AD_08','14913089','helloooo','3586312774','test');" title="contents.">14913089</li>
<li class="num" onClick="add_zzim('BD_AD_08','14913012','helloooo','3586312774','test');" title="contents.">14913012</li>
<li class="num" onClick="add_zzim('BD_AD_08','14913041','helloooo','3586312774','test');" title="contents.">14913045</li>
"""
for groups in p.findall(test_str):
    print(groups[2])  # third capture group, the value the Ruby pattern names "param"
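For a direct translation of the Ruby pattern, the main change is the named group syntax: Python's `re` module writes it as `(?P<param>...)` rather than `(?<param>...)`. A minimal sketch, with the sample input shortened to two elements; the variable names are illustrative.

```python
import re

# Ruby:   add_zzim\(\'(.*?)\',\'(.*?)\',\'(?<param>.*?)\',.*
# Python: the named group becomes (?P<param>...).
pattern = re.compile(r"add_zzim\('(.*?)','(.*?)','(?P<param>.*?)',.*")

html = (
    "<li class=\"num\" onClick=\"add_zzim('BD_AD_08','14913089','helloooo',"
    "'3586312774','test');\" title=\"contents.\">14913089</li>\n"
    "<li class=\"num\" onClick=\"add_zzim('BD_AD_08','14913012','helloooo',"
    "'3586312774','test');\" title=\"contents.\">14913012</li>\n"
)

# finditer yields match objects, so the group can be read by name.
params = [m.group("param") for m in pattern.finditer(html)]
print(params)  # ['helloooo', 'helloooo']
```

Reading the group by name keeps the code working if capture groups are later added or reordered, which positional indexing into `findall` results does not.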