Python: extracting text that contains a specific word from HTML tables - Fatal编程技术网


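Python beginner here. There may be a command I don't know about, but I could not find a solution on the web. I have an HTML file stored as a string in Python. The file looks like this: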

<table>
This is Table 1
</table>

<table>
This is Table 2
</table>

<table>
This is Table 3
</table>



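I want to extract the text between <table> and </table>, but only if it matches some string inside the table. So here I only want the table that says Table 2.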

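I tried splitting the document on <table>, but because the result also contains the parts between </table> and the next <table>, it got messy. I know about the re.search command, but I don't know how to combine it with an if statement:
re.search((*))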
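
One way the regex idea from the question could be combined with an `if` test is sketched below. This is a minimal illustration, not a general HTML parser: it assumes the tables are not nested and that the opening tag is written exactly as `<table>` with no attributes. The names `doc` and `tables_containing` are hypothetical and only used for this example.

```python
import re

# Sample document in the same shape as the one in the question.
doc = """<table>
This is Table 1
</table>

<table>
This is Table 2
</table>

<table>
This is Table 3
</table>"""


def tables_containing(html_text, keyword):
    """Return the stripped text of each <table>...</table> block that contains keyword."""
    # Non-greedy (.*?) pairs each <table> with the nearest </table>;
    # re.DOTALL lets '.' match the newlines inside a table.
    bodies = re.findall(r"<table>(.*?)</table>", html_text, flags=re.DOTALL)
    matches = []
    for body in bodies:
        if keyword in body:  # the "if" test the question asks about
            matches.append(body.strip())
    return matches


print(tables_containing(doc, "Table 2"))  # prints ['This is Table 2']
```

Using `re.findall` instead of `re.search` returns every table body at once, so the filtering can be done with an ordinary `if` inside a loop.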
So, one idea is to parse the HTML with BeautifulSoup. You can then simply access tags like this:
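from bs4 import BeautifulSoup

soup = BeautifulSoup(html_text, 'html.parser')  # html_text: your HTML string (name assumed)
row = soup.find('tr') # Extract and return first occurrence of tr (use 'table' for the sample above, which has no tr)
print(row)            # Print row with HTML formatting
print("=========Text Result==========")
print(row.get_text()) # Print row as text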

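You can then get the inner HTML and compare it with your string. This assumes you can use BeautifulSoup to access the HTML. The snippet above was taken from an external source.

Use the lxml parser to solve this problem: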
from lxml import html

text = '''<table>This is Table 1</table>

<table>This is Table 2</table>

<table>This is Table 3</table>'''

parser = html.fromstring(text)
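print(parser.xpath("//table[contains(text(), 'Table 2')]/text()"))

Output:
['This is Table 2']

Use BeautifulSoup to read the HTML.

You can split the document by checking substrings of length 7. For each character in the document, check whether it is the start of "<table>". If it is, take whatever comes after "<table>" and before the next "</table>"; otherwise move on to the next character.

Brilliant! Thanks a lot. Do you know whether it is possible to match several strings, not just Table 2?

Yes, you can use an "or" condition to include multiple strings.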
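
The scanning approach and the "or" condition mentioned in the comments above could be sketched roughly as follows, using only built-in string methods. This is an illustration under stated assumptions: tables are not nested, tags are written exactly as `<table>` and `</table>`, and the names `extract_tables` and `sample` are hypothetical. Instead of testing every character, it uses `str.find` to jump to the next opening tag, and `any()` plays the role of a chain of `or` tests.

```python
def extract_tables(html_text, keywords):
    """Return the text of each <table>...</table> block that contains any keyword."""
    open_tag, close_tag = "<table>", "</table>"
    results = []
    pos = 0
    while True:
        start = html_text.find(open_tag, pos)
        if start == -1:
            break  # no more opening tags
        body_start = start + len(open_tag)  # len("<table>") is 7
        end = html_text.find(close_tag, body_start)
        if end == -1:
            break  # unclosed table: stop scanning
        body = html_text[body_start:end]
        # Equivalent to: keywords[0] in body or keywords[1] in body or ...
        if any(word in body for word in keywords):
            results.append(body.strip())
        pos = end + len(close_tag)  # continue after this table
    return results


sample = (
    "<table>This is Table 1</table>\n\n"
    "<table>This is Table 2</table>\n\n"
    "<table>This is Table 3</table>"
)

print(extract_tables(sample, ["Table 2", "Table 3"]))
# prints ['This is Table 2', 'This is Table 3']
```

For real-world HTML with attributes or nested tables, a proper parser such as the lxml or BeautifulSoup answers above is the safer choice.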



