Python: inconsistent scraping with BeautifulSoup

python html web-scraping

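I have downloaded the "About" section of some Facebook pages with requests.get, and I use the following code to extract the coordinates from the mini map on the page: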

from urllib.parse import urlparse, parse_qs
from bs4 import BeautifulSoup

with open(file, "r", encoding="utf-8") as f:          # Open the saved HTML (file is its path)
    html = f.read()
soup = BeautifulSoup(html, "html.parser")             # Parse the HTML

map_url = soup.find_all(class_="_a3f img")[0]         # First element whose class attribute is "_a3f img"

parsed = urlparse(map_url["src"])                     # Split the map URL into its parts
coordinate_marker = parse_qs(parsed.query)["markers"] # Coordinates from the "markers" parameter

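This works well except in a few cases. In those cases the saved HTML still contains the map, and 'a3f img' in str(soup) returns True, yet find_all returns nothing.
Strangely, when I open the saved HTML in Chrome, copy the whole HTML, and paste it into a new file, that file works fine and returns the coordinates (both files are the same size).
I have tried changing the parser, with no luck. Any explanation would be much appreciated.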
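The page gives no answer, so the following is only a diagnostic sketch, not a confirmed explanation. It checks two ways a string can appear in str(soup) while a class search finds nothing. First, class_="_a3f img" matches only when the class attribute is exactly that string, so a different class order or an extra class defeats it; a CSS selector such as "._a3f.img" does not depend on order. Second, markup that sits inside an HTML comment is kept as text, not as tags. The sample HTML, the example.com URL, and the helper names are all made up for illustration.

```python
from urllib.parse import urlparse, parse_qs
from bs4 import BeautifulSoup, Comment

# Hypothetical sample: the map markup is wrapped in an HTML comment,
# so the parser stores it as text instead of building tags from it.
SAMPLE_HTML = (
    '<html><body><code class="hidden_elem"><!-- '
    '<div><img class="_a3f img" '
    'src="https://example.com/staticmap?markers=12.34%2C56.78&amp;zoom=13"></div>'
    ' --></code></body></html>'
)


def find_map_images(soup):
    """Return elements that carry both classes, whatever their order."""
    return soup.select("._a3f.img")


def find_map_images_in_comments(soup):
    """Re-parse each HTML comment and search the result for the map image."""
    found = []
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        inner = BeautifulSoup(str(comment), "html.parser")
        found.extend(find_map_images(inner))
    return found


def extract_markers(img):
    """Read the 'markers' query parameter from the image URL."""
    return parse_qs(urlparse(img["src"]).query)["markers"]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
direct = find_map_images(soup)                 # empty: the tag is inside a comment
commented = find_map_images_in_comments(soup)  # found after re-parsing the comment
markers = [extract_markers(img) for img in direct + commented]
print("_a3f img" in str(soup), len(direct), len(commented), markers)
```

If the direct search is empty on a failing file but the comment search finds the image, the second cause applies. If neither finds it, comparing the failing file with the copy pasted from Chrome would be the next step.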



