Python lxml changes the tag hierarchy? (Python, HTML, XML, lxml) - Fatal编程技术网


Python lxml changes the tag hierarchy?

python html xml

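I have a small problem with lxml. I am converting an XML document into an HTML document. The original XML contains one paragraph nested inside another (it looks like HTML, but it lives in an XML document).

After parsing and serializing it, I get this: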

<div><p>Localization - Eiffel tower? Paris or Vegas </p><p>Bayes theorem p(A|B)</p></div>


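I have no problem with the <div>s, but the fact that the "Bayes theorem" paragraph is no longer nested inside the outer paragraph is a problem.

Does anyone know why lxml does this, and how to stop it? Thanks.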

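lxml does this because it does not store invalid HTML, and <p> elements cannot be nested in HTML:

"The P element represents a paragraph. It cannot contain block-level elements (including P itself)."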
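
To see this behavior directly, here is a minimal sketch that feeds nested paragraphs to lxml's HTML parser and checks whether any <p> still has a <p> parent. The sample string is borrowed from the example in this thread and the variable names are illustrative only; the exact wrapper element around the result may differ between lxml versions.

```python
import lxml.html

# Sample input borrowed from this thread: a <p> nested inside another <p>.
item = '<p>Eiffel tower? Paris or Vegas <p>Bayes theorem p(A|B)</p></p>'

# The HTML parser applies HTML content rules, so the inner <p>
# implicitly closes the outer one instead of nesting inside it.
root = lxml.html.fromstring(item)

# Collect every <p> whose parent is also a <p>; none are expected.
paragraphs = list(root.iter('p'))
nested = [p for p in paragraphs
          if p.getparent() is not None and p.getparent().tag == 'p']

print(lxml.html.tostring(root))
print('nested paragraphs:', len(nested))
```

If the nesting has to survive, the input needs to go through an XML parser rather than the HTML one.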
You are using lxml's HTML parser, not an XML parser. Try this instead:
>>> from lxml import etree
>>> item = '<p>Eiffel tower? Paris or Vegas <p>Bayes theorem p(A|B)</p></p>'
>>> root = etree.fromstring(item)
>>> etree.tostring(root, pretty_print=True)
'<p>Eiffel tower? Paris or Vegas <p>Bayes theorem p(A|B)</p></p>\n'

Hm. That's something I didn't know. Thanks.
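
The interactive session above shows Python 2 style output. Under Python 3, etree.tostring returns bytes by default, so the same XML-parser approach might be sketched as follows. Passing encoding='unicode' to get a str back is an addition of this sketch and not part of the original answer; the variable names are illustrative.

```python
from lxml import etree

item = '<p>Eiffel tower? Paris or Vegas <p>Bayes theorem p(A|B)</p></p>'

# The XML parser knows nothing about HTML content rules,
# so the nesting is kept exactly as written.
root = etree.fromstring(item)
inner = root[0]  # the nested <p>

# encoding='unicode' makes tostring return str instead of bytes.
serialized = etree.tostring(root, encoding='unicode')

print(inner.tag, inner.getparent().tag)
print(serialized)
```

Checking inner.getparent() is a quick way to confirm that the "Bayes theorem" paragraph is still a child of the outer paragraph.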