Python: Why doesn't my regular expression with re.findall() work? - Fatal编程技术网


python regex


I'm trying to extract text from HTML code. This is my code:

import re
Luna = open('D:\Python\Luna.txt','r+')
text=Luna.read()
txt=re.findall('<p>\s+(.*)</p>',text)
print txt


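However, it only removes the part before the first <p> and keeps everything after the first <p>. How can I improve my code so that it returns only the parts between <p> and </p>?

Here is part of the original HTML code: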
src="/advjs/gg728x90.js"></script></td>  </tr></table><div class="text" align="justify"></p><p> Sure. Eye of newt. Tongue of snake.</p><p>  She added, &ldquo;Since you&rsquo;re taking Skills for Living, it&rsquo;ll be good practice.&rdquo;</p><p>  For what? I wondered. Poisoning my family? &ldquo;I have to baby-sit,&rdquo; I said, a little too gleefully.</p>

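I strongly recommend using a proper HTML parser, such as BeautifulSoup:

from bs4 import BeautifulSoup

# Naming the parser explicitly avoids bs4's "no parser specified" warning
soup = BeautifulSoup(Luna.read(), 'html.parser')
# Text content of every <p> element
para_strings = (p.get_text() for p in soup.find_all('p'))
# Keep paragraphs that start with a space (mirrors the \s+ in the regex), then trim them
txt = [p.strip() for p in para_strings if p.startswith(' ')]

You can fix your regex with the non-greedy operator (append a question mark, ?, to the * operator):

txt=re.findall(r'<p>\s+(.*?)</p>',text)

However, since HTML is not a regular language, you will likely run into other problems when parsing it with regular expressions, so the obligatory warning about parsing HTML with regex applies.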
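To see why the question mark matters, here is a minimal sketch that runs both patterns on a shortened version of the HTML from the question. The variable names (html, greedy, lazy) are made up for illustration, and the sample is assumed to sit on a single line, as it does in the original excerpt.

```python
import re

# Shortened, single-line sample based on the HTML excerpt in the question.
html = ('<div class="text" align="justify"></p>'
        '<p> Sure. Eye of newt. Tongue of snake.</p>'
        '<p>  For what? I wondered.</p>')

# Greedy: .* runs to the end of the line, then backtracks to the LAST </p>.
greedy = re.findall(r'<p>\s+(.*)</p>', html)

# Non-greedy: .*? stops at the FIRST </p> after each <p>.
lazy = re.findall(r'<p>\s+(.*?)</p>', html)

print(greedy)  # ['Sure. Eye of newt. Tongue of snake.</p><p>  For what? I wondered.']
print(lazy)    # ['Sure. Eye of newt. Tongue of snake.', 'For what? I wondered.']
```

The greedy version returns one long match that still contains the inner tags, which is the behaviour described in the question. Note that . does not match newlines by default, so paragraphs that span several lines would need re.DOTALL (or a parser).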

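If installing BeautifulSoup is not an option, the parser-based approach can also be sketched with Python 3's standard-library html.parser module. This is a swapped-in alternative, not the code from the answer; the class name ParagraphText and its attributes are hypothetical, and the sample HTML is shortened from the question.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Hypothetical helper: collects the stripped text of each <p> element."""

    def __init__(self):
        super().__init__()      # convert_charrefs=True decodes entities like &ldquo;
        self.paragraphs = []
        self._buf = None        # None while we are outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == 'p':
            self._buf = []

    def handle_endtag(self, tag):
        # A stray </p> with no open <p> (as in the sample) is simply ignored.
        if tag == 'p' and self._buf is not None:
            self.paragraphs.append(''.join(self._buf).strip())
            self._buf = None

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)


html = ('<div class="text" align="justify"></p>'
        '<p> Sure. Eye of newt. Tongue of snake.</p>'
        '<p>  She added, &ldquo;it&rsquo;ll be good practice.&rdquo;</p>')

parser = ParagraphText()
parser.feed(html)
parser.close()
print(parser.paragraphs)
```

Unlike the regex, this also decodes the character entities in the text, but it does not handle nested or unclosed <p> elements any more carefully than shown.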


