Python 3.x: Finding an item in Beautiful Soup by text rather than by tag - Fatal编程技术网

python-3.x web-scraping beautifulsoup python-requests

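So I'm trying to scrape the areas of certain locations from their Wikipedia pages. Taking Cumbria as an example (https://en.wikipedia.org/wiki/Cumbria), I can get the infobox with: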

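import requests
from bs4 import BeautifulSoup

url = 'https://en.wikipedia.org/wiki/Cumbria'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
# Text of the first "mergedrow" row inside the infobox table
value = soup.find('table', {"class": "infobox geography vcard"}) \
            .find('tr', {"class": "mergedrow"}).text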

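However, the infobox geography vcard table has multiple subsets, and each one contains a mergedrow. The one I want is Area, and I was wondering whether I could get the text from the Area subset by searching for "Area" rather than by tag, since everything else under infobox geography vcard is ubiquitous.

You can simply search for all th elements with scope=row. Then iterate over them, check which ones have Area as their text, and use find_next_sibling() to get the next sibling (which will be the td containing the data you want).

Note that this table has 2 Area entries, one for "Ceremonial county" and the other for "Non-metropolitan county", whatever that means ;)

# All row-header cells in the page
ths = soup.find_all('th', {'scope': 'row'})

for th in ths:
    if th.text == 'Area':
        # The next sibling of the header is the td holding the value
        area = th.find_next_sibling().text
        print(area)

# Output:
#  6,768 km2 (2,613 sq mi)
#  6,768 km2 (2,613 sq mi)

Thank you, I didn't know how to find the next sibling. Very helpful!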
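The loop from the answer can also be wrapped in a small reusable helper. The sketch below is only an illustration, not the answerer's code: the function name values_for_label and the inline SAMPLE_HTML are hypothetical, and the sample is a heavily simplified stand-in for the real Wikipedia infobox, whose markup is more complex and may change. It compares whitespace-stripped header text, so stray spaces around the label do not prevent a match, and it skips headers that have no td sibling.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for an infobox table (not real Wikipedia markup).
SAMPLE_HTML = """
<table class="infobox geography vcard">
  <tr class="mergedrow"><th scope="row">Region</th><td>placeholder</td></tr>
  <tr class="mergedrow"><th scope="row"> Area </th><td>6,768 km2 (2,613 sq mi)</td></tr>
  <tr class="mergedrow"><th scope="row">Other</th><td>placeholder</td></tr>
</table>
"""


def values_for_label(soup, label):
    """Return the text of the td that follows each row header equal to `label`."""
    values = []
    for th in soup.find_all('th', {'scope': 'row'}):
        # Compare stripped text so surrounding whitespace does not matter
        if th.get_text(strip=True) != label:
            continue
        td = th.find_next_sibling('td')
        if td is not None:
            values.append(td.get_text(strip=True))
    return values


# The built-in html.parser is used here so the sketch does not depend on lxml
soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
areas = values_for_label(soup, 'Area')
print(areas)
```

Returning a list rather than a single string keeps both values when a table has more than one matching row, as with the two Area entries mentioned in the answer.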



