Python: extracting the content between <! and > with a regular expression
Tags: python, regex, python-2.7
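Help me extract the content between <! and >.
I think you want something like this. Use the DOTALL modifier (?s) so that the dot in your regex also matches line breaks.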
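>>> import re
>>> html = """text <!--//--><![CDATA[//><!--
... jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
... //--><!]]>"""
>>> for i in re.findall(r'(?s)<!(.*?)>', html):
...     print i
...
--//--
[CDATA[//
--
jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
//--
]]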
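To see what the (?s) modifier changes, here is a minimal sketch. The `sample` string is made up for illustration and is not the page from the question; the code is written to run on both Python 2.7 and Python 3.

```python
import re

# Hypothetical two-line sample, not the HTML from the question.
sample = "a <!first\nsecond> b"

# Without DOTALL, "." stops at the newline, so the match cannot reach ">".
without_dotall = re.findall(r'<!(.*?)>', sample)

# With the inline (?s) modifier, "." also matches "\n".
with_inline = re.findall(r'(?s)<!(.*?)>', sample)

# Passing re.DOTALL as a flag is equivalent to the inline (?s).
with_flag = re.findall(r'<!(.*?)>', sample, re.DOTALL)

print(without_dotall)
print(with_inline)
print(with_flag)
```

Whether to use the inline modifier or the flag is a matter of taste; the inline form keeps the behaviour visible inside the pattern itself.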
Or:
Use a non-greedy regex together with findall:
>>> for i in re.findall(r'(?s)<!--(.*?)-->', html):
...     print i
...
//
jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
//
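The reason the quantifier has to be non-greedy can be shown with a small sketch. The `sample` string below is an assumption chosen for illustration, with two comments on one line.

```python
import re

# Hypothetical sample containing two separate comments.
sample = "x <!--one--> y <!--two--> z"

# Greedy ".*" runs to the LAST "-->", swallowing everything in between.
greedy = re.findall(r'(?s)<!--(.*)-->', sample)

# Non-greedy ".*?" stops at the FIRST "-->" after each "<!--".
lazy = re.findall(r'(?s)<!--(.*?)-->', sample)

print(greedy)
print(lazy)
```

With the greedy form the two comments collapse into a single match, which is rarely what is wanted when a page contains several delimited blocks.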
matches = re.findall(r'<!.*?>', string)
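Note that this pattern has no capturing group, which changes what findall returns compared with the grouped patterns used earlier. A short sketch, using an illustrative `sample` string that is not from the question:

```python
import re

# Hypothetical one-line sample.
sample = "a <!--note--> b"

# No capturing group: findall returns each whole match, delimiters included.
whole = re.findall(r'<!.*?>', sample)

# One capturing group: findall returns only the text captured by the group.
inner = re.findall(r'<!(.*?)>', sample)

print(whole)
print(inner)
```

The ungrouped form is convenient when the matches are going to be removed from the text afterwards; the grouped form is convenient when only the inner content is needed.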
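Can you post the regex code?
Can you give me a regex for extracting the content between the delimiters from the HTML?
Welcome to StackOverflow. I suggest you read up on regular expressions.
If I get all of these strings in a list (matches), is there a way to use that list to remove everything in the HTML text that belongs to it?
Try re.sub: substitute an empty string for every match of the pattern and assign the result back to string.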
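The removal suggested in the last comment might look like the sketch below. The `sample` text and the variable names are hypothetical, and adding (?s) to the pattern is an assumption made here so that blocks spanning several lines are removed too.

```python
import re

# Assumed pattern: the findall pattern from above plus (?s) for multi-line blocks.
pattern = re.compile(r'(?s)<!.*?>')

# Hypothetical sample with a multi-line comment and a CDATA marker.
sample = 'keep <!--drop\nthis--> me <![CDATA[x]]> too'

# Collect the matches first, if they are still needed afterwards.
matches = pattern.findall(sample)

# Replace every match with the empty string.
cleaned = pattern.sub('', sample)

print(matches)
print(cleaned)
```

Running sub with the pattern directly avoids looping over the list of matches and calling str.replace for each one.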