Web scraping with Python BeautifulSoup

Tags: python, html, beautifulsoup

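How do I extract the data inside the <p> (paragraph) and <li> tags located under a <div> with a named class?

Use the find() and find_all() functions: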
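import requests
from bs4 import BeautifulSoup

url = '...'

# fetch the page and fail early on an HTTP error status
r = requests.get(url)
r.raise_for_status()
data = r.text
soup = BeautifulSoup(data, 'html.parser')

# find() returns the first matching <div>, or None if there is no match
div = soup.find('div', {'class': 'class-name'})
if div is None:
    raise SystemExit('no <div class="class-name"> found on the page')

# find_all() returns every matching descendant of that <div>
ps = div.find_all('p')
lis = div.find_all('li')

# print the content of all <p> tags
for p in ps:
    print(p.text)

# print the content of all <li> tags
for li in lis:
    print(li.text)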
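
To try the same approach without a network request, here is a minimal sketch that runs find() and find_all() against an inline HTML string. The sample markup, the `SAMPLE_HTML` name, and the `extract_texts` helper are assumptions made for illustration; they are not part of the original answer, and the real page behind the elided URL may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the page behind the elided URL.
SAMPLE_HTML = """
<html><body>
  <div class="other"><p>ignored paragraph</p></div>
  <div class="class-name">
    <p>first paragraph</p>
    <ul>
      <li>item one</li>
      <li>item two</li>
    </ul>
    <p>second paragraph</p>
  </div>
</body></html>
"""


def extract_texts(html, class_name):
    """Return (paragraph_texts, list_item_texts) from the first <div> with class_name.

    Both lists are empty when no such <div> exists.
    """
    soup = BeautifulSoup(html, "html.parser")
    # class_ is the keyword form of the {'class': ...} attribute filter
    div = soup.find("div", class_=class_name)
    if div is None:
        return [], []
    # find_all() searches all descendants, so nested <li> tags are included
    paragraphs = [p.get_text(strip=True) for p in div.find_all("p")]
    items = [li.get_text(strip=True) for li in div.find_all("li")]
    return paragraphs, items


if __name__ == "__main__":
    paragraphs, items = extract_texts(SAMPLE_HTML, "class-name")
    for text in paragraphs:
        print(text)
    for text in items:
        print(text)
```

Returning empty lists for a missing <div> is one possible design choice; raising an exception instead may suit scripts that should stop when the page layout changes.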




                                                                Copyright © 2024. All Rights Reserved by  - Fatal编程技术网