Python: Robust DOM parsing with getElementsByTagName - Fatal编程技术网


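The following approach (taken from "Dive Into Python"), which parses the page with minidom.parse('/path/to/index.html'), fails with: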

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/to/htmlToNumEmbedded.py", line 2, in <module>
    xmldoc = minidom.parse('/path/to/index.html')
  File "/usr/lib/python2.7/xml/dom/minidom.py", line 1918, in parse
    return expatbuilder.parse(file)
  File "/usr/lib/python2.7/xml/dom/expatbuilder.py", line 924, in parse
    result = builder.parseFile(fp)
  File "/usr/lib/python2.7/xml/dom/expatbuilder.py", line 207, in parseFile
    parser.Parse(buffer, 0)
xml.parsers.expat.ExpatError: mismatched tag: line 12, column 4
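
The error can be reproduced without a file. The sketch below is only an illustration, written for Python 3 (the traceback above comes from Python 2.7). The markup in `BROKEN_HTML` and the helper `try_minidom` are hypothetical and not part of the original question. The `<img>` tag is left unclosed, which is acceptable in HTML but not in XML, so the expat parser behind minidom rejects it.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical snippet: <img> is never closed, which HTML allows but XML does not.
BROKEN_HTML = "<html><body><p><img src='a.png'></p></body></html>"


def try_minidom(markup):
    """Return the <img> elements, or a message if the markup is not well-formed XML."""
    try:
        doc = minidom.parseString(markup)
    except ExpatError as exc:
        # minidom delegates to expat, a strict XML parser.
        return "parse failed: %s" % exc
    return doc.getElementsByTagName("img")


result = try_minidom(BROKEN_HTML)
print(result)
```

With well-formed input such as `<a><img/></a>` the same helper returns the matching elements, which suggests the problem lies in the strictness of the parser rather than in getElementsByTagName itself.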

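However, working around this seems somewhat clumsy: is there a built-in function I have overlooked, or another, more elegant way to do robust DOM parsing with getElementsByTagName?

You can use BeautifulSoup for this: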

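from bs4 import BeautifulSoup

with open('/path/to/index.html') as f:
    # Name the parser explicitly; html.parser ships with Python and tolerates malformed HTML.
    soup = BeautifulSoup(f, 'html.parser')

reflist = soup.find_all("img")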

Alternatively, with lxml: if you need a list of elements rather than the iterator returned by iter, call list on it:

from lxml import html

# html.parse tolerates malformed HTML; iter('img') yields every <img> element in document order.
reflist = list(html.parse('/path/to/index.html').iter('img'))
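
If installing a third-party package is not an option, the standard library's event-based HTML parser may be enough for simply collecting tags. The following is a minimal sketch and not a DOM implementation. It assumes Python 3, where the module is named `html.parser` (in Python 2.7 it is `HTMLParser`). The class `TagCollector`, the function `get_elements_by_tag_name`, and the sample markup are hypothetical names introduced for illustration. It returns attribute dictionaries rather than DOM nodes.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the attributes of every start tag with a given name (hypothetical helper)."""

    def __init__(self, tag_name):
        super().__init__()
        self.tag_name = tag_name.lower()
        self.found = []  # one attribute dict per matching tag

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this method.
        if tag == self.tag_name:
            self.found.append(dict(attrs))


def get_elements_by_tag_name(markup, tag_name):
    """Return a list of attribute dicts for each <tag_name> in possibly malformed HTML."""
    collector = TagCollector(tag_name)
    collector.feed(markup)
    collector.close()
    return collector.found


# Hypothetical sample: unclosed <img> tags and a missing </html>.
SAMPLE = "<html><body><p><img src='a.png'><IMG SRC=\"b.png\" alt=x></p></body>"
images = get_elements_by_tag_name(SAMPLE, "img")
print(images)
```

This trades the tree structure for tolerance: there is no parent or child navigation, so BeautifulSoup or lxml remain the better fit when the surrounding document structure matters.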

