Python: extract content inside < > using a regular expression

Tags: python, regex, python-2.7

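Help me extract the content between < and >.

I think you want something like this. Use the DOTALL modifier (?s) so that the dot in the regular expression also matches line breaks: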

>>> import re
>>> html="""text <!--//--><![CDATA[//><!--
    jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
    //--><!]]>"""
>>> for i in re.findall(r'(?s)<!(.*?)>', html):
        print i


--//--
[CDATA[//
--
    jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
    //--
]]
>>> for i in re.findall(r'(?s)<!--(.*?)-->', html):
        print i


//

    jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
    //
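
To see what the DOTALL modifier mentioned in this answer changes, here is a minimal sketch. The sample string is made up for illustration and is not from the question; the code uses print() with a single argument so it should run on both Python 2.7 and Python 3.

```python
import re

# Hypothetical sample: one HTML comment that spans two lines.
sample = "text <!-- first line\nsecond line --> tail"

# Without DOTALL, "." does not match "\n", so the comment is never closed
# within a single line and nothing is found.
without_dotall = re.findall(r'<!--(.*?)-->', sample)

# With the inline (?s) modifier, "." also matches line breaks.
with_inline_flag = re.findall(r'(?s)<!--(.*?)-->', sample)

# Passing re.DOTALL as the flags argument has the same effect as (?s).
with_flag_arg = re.findall(r'<!--(.*?)-->', sample, re.DOTALL)

print(without_dotall)    # []
print(with_inline_flag)  # [' first line\nsecond line ']
print(with_flag_arg)     # same list as with the inline modifier
```

Whether to write (?s) inside the pattern or pass re.DOTALL separately is mostly a matter of taste; the inline form keeps the pattern self-describing when it is copied around.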

Use a non-greedy regular expression with findall:

>>> for i in re.findall(r'(?s)<!--(.*?)-->', html):
        print i


//

    jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
    //
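
A small sketch of why the non-greedy quantifier matters in this answer. The sample string below is an assumption made up for illustration; it puts two comments on one line so that the difference between .* and .*? is visible.

```python
import re

# Hypothetical sample with two separate comments.
sample = "a <!-- one --> b <!-- two --> c"

# Greedy: ".*" runs to the LAST "-->", swallowing everything in between.
greedy = re.findall(r'<!--(.*)-->', sample)

# Non-greedy: ".*?" stops at the FIRST "-->" after each "<!--".
non_greedy = re.findall(r'<!--(.*?)-->', sample)

print(greedy)      # [' one --> b <!-- two ']
print(non_greedy)  # [' one ', ' two ']
```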
matches = re.findall(r'<!.*?>', string)

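Can you post your regex code?
Can you give me a regex for extracting the content between the quotes from the html?
Welcome to StackOverflow. I suggest you read up on regular expressions.
If I get all of these strings in a list (matches), is there a way to use that list to delete everything in the html text that belongs to the list (matches)?
Try:
string = re.sub(r'<!.*?>', '', string)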
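
As a rough sketch of the removal asked about in the comments: the sample string and the variable names stripped and cleaned are made up for illustration, and the (?s) modifier is an addition assumed here so that blocks spanning several lines are removed too.

```python
import re

# Hypothetical sample: one multi-line block and one single-line block.
html = "keep <!-- drop\nthis --> keep too <!x> end"

# One step: replace every <! ... > block with an empty string.
stripped = re.sub(r'(?s)<!.*?>', '', html)

# Alternative that reuses a list of matches, as asked in the comment.
matches = re.findall(r'(?s)<!.*?>', html)
cleaned = html
for m in matches:
    # str.replace removes every literal occurrence of the matched text.
    cleaned = cleaned.replace(m, '')

print(repr(stripped))  # 'keep  keep too  end'
print(cleaned == stripped)
```

The re.sub form is usually the simpler choice, since it avoids building the intermediate list; the loop is only useful if the list of matches is needed for something else as well.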
>>> html="""text <!--//--><![CDATA[//><!--
    jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
    //--><!]]>"""
>>> for i in re.findall(r'(?s)<!(.*?)>', html):
        print i


--//--
[CDATA[//
--
    jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
    //--
]]
>>> for i in re.findall(r'(?s)<!--(.*?)-->', html):
        print i


//

    jQuery.extend(Drupal.settings, { "basePath": "/", "googleanalytics": { "trackOutbound": 1, "trackMailto": 1, "trackDownload": 1, "trackDownloadExtensions": "7z|aac|arc|arj|asf|asx|avi|bin|csv|doc|exe|flv|gif|gz|gzip|hqx|jar|jpe?g|js|mp(2|3|4|e?g)|mov(ie)?|msi|msp|pdf|phps|png|ppt|qtm?|ra(m|r)?|sea|sit|tar|tgz|torrent|txt|wav|wma|wmv|wpd|xls|xml|z|zip" }, "spamspan": { "m": "spamspan", "u": "u", "d": "d", "h": "h", "t": "t" } });
    //
matches = re.findall(r'<!.*?>', string)