Selenium with Python web crawler

python selenium web-crawler


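I want to screen-scrape a website that has multiple pages. The pages are loaded dynamically without the URL changing, so I am using Selenium to scrape it. But this simple program raises an exception: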

import re
from contextlib import closing
from selenium.webdriver import Firefox 

url="http://www.samsung.com/in/consumer/mobile-phone/mobile-phone/smartphone/"

with closing(Firefox()) as browser:
    n = 2
    link = browser.find_element_by_link_text(str(n))
    link.click()
    #web_page=browser.page_source
    #print type(web_page)

The error is as follows:

raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: u'Unable to locate element: {"method":"link text","selector":"2"}' ; Stacktrace: Method FirefoxDriver.prototype.findElementInternal_ threw an error in file:///tmp/tmpMJeeTr/extensions/fxdriver@googlecode.com/components/driver_component.js

Is this a problem with the given URL or with the Firefox browser?

Any help would be much appreciated.

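I think your main problem is that the page itself takes a while to load, and you are immediately trying to access the link (which has probably not been rendered yet, hence the stack trace). One thing you can try is using an implicit wait with your browser, which tells the browser to wait up to a specified period for elements to appear before timing out. In your case, you could try the following, which waits up to 10 seconds when polling the DOM for a particular item (in this case, the link text 2):

browser.implicitly_wait(10)
n = 2
link = browser.find_element_by_link_text(str(n))
link.click()
#web_page=browser.page_source
#print type(web_page)

I am developing a Python module that might cover your (or someone else's) use case:

It converts recorded Selenium scripts into crawl functions, which avoids having to write any of the code above. It works with pages that load content dynamically. I hope someone finds this useful.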
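
The implicit wait suggested in the first answer boils down to polling: retry the lookup until it succeeds or the timeout expires. Below is a minimal sketch of that idea in plain Python, not Selenium's actual implementation. The names `wait_for` and `FakePage` are hypothetical and used only for illustration; the fake page simply simulates a link that becomes available on the third lookup.

```python
import time


def wait_for(find, timeout=10.0, poll=0.5, ignored=(Exception,)):
    """Call find() until it succeeds or `timeout` seconds have elapsed.

    Exceptions listed in `ignored` mean "not there yet" and trigger another
    attempt; once the deadline has passed, the last one is re-raised.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return find()
        except ignored:
            if time.monotonic() >= deadline:
                raise
            time.sleep(poll)


class FakePage:
    """Hypothetical stand-in for a page whose pagination link renders late."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def find_link(self, text):
        # Fail until the lookup has been attempted `ready_after` times.
        self.calls += 1
        if self.calls < self.ready_after:
            raise LookupError("Unable to locate element: " + text)
        return "link:" + text


page = FakePage(ready_after=3)
link = wait_for(lambda: page.find_link("2"), timeout=2.0, poll=0.01,
                ignored=(LookupError,))
print(link, "found after", page.calls, "attempts")
```

With a real browser, `find` could be something like `lambda: browser.find_element(By.LINK_TEXT, "2")` and `ignored` could be `NoSuchElementException`. Selenium's own `WebDriverWait` class provides the same retry-until-timeout pattern, so a hand-written helper like this is mainly useful for understanding what the wait does.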



