R: how to extract information from XML tags

python xml r perl


I have a question about XML parsing. I have tags that contain spaces, for example:

<item1 id=rt name ="th">
<point1>1254</point1>
<point2>1254</point2>
</item>



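How can I extract the id and name from these tags?

For the subsequent analysis I need to use R, but I could also do the file parsing in Perl or Python.

What would be the best solution?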

You can use the XML package:

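tt <- '<?xml version="1.0" encoding="utf-8"?>
<item id="rt" name ="th">
  <point1>1254</point1>
  <point2>1254</point2>
</item>
'

library(XML)
# Parse the string into a document; asText = TRUE marks tt as XML content, not a file name
doc <- xmlParse(tt, asText = TRUE)
# Read the id attribute of every <item> node
xpathSApply(doc, '//item', xmlGetAttr, 'id')
# [1] "rt"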
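
Since the question also allows Python for the parsing step, here is a rough equivalent of the R snippet above using the standard library's xml.etree.ElementTree. It is only a sketch and, like the R answer, it assumes the corrected, well-formed input (quoted id value, matching item tags), not the original data. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the input, as used in the R answer above (an assumption:
# the original data has an unquoted id value and mismatched tag names).
xml_text = b'''<?xml version="1.0" encoding="utf-8"?>
<item id="rt" name ="th">
  <point1>1254</point1>
  <point2>1254</point2>
</item>
'''

root = ET.fromstring(xml_text)  # root is the <item> element

# Attributes are read with Element.get(); it returns None if the attribute is absent.
item_id = root.get("id")
item_name = root.get("name")

# Child elements and their text content, keyed by tag name.
points = {child.tag: child.text for child in root}

print(item_id, item_name)  # rt th
print(points)              # {'point1': '1254', 'point2': '1254'}
```

The values could then be written to a CSV file and read into R for the rest of the analysis.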

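How about a regular expression?

/=\K\W?\K\w+/g

=\K
finds, but does not keep, the =

\W?\K
finds, but does not keep, an optional quote in front of the value

\w+
is your tag, i.e. the attribute value

You can read the file line by line and save the matches into an array, like this:

my @matches = $line =~ /=\K\W?\K\w+/g;

Then use $matches[] with an index to access the individual elements.

If you want to experiment with the regex further, an online regex tester is a convenient place to do so (the link from the original answer is not preserved).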
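
For comparison, the same idea can be sketched in Python. Python's re module does not support \K, so this version uses capture groups instead and also keeps the attribute names. The pattern, the helper name extract_attrs and the sample line are illustrative assumptions; like the Perl regex, it only handles values made of word characters.

```python
import re

# First line of the malformed input from the question.
line = '<item1 id=rt name ="th">'

# attribute name, optional spaces around '=', optional double quote, value
ATTR_RE = re.compile(r'(\w+)\s*=\s*"?(\w+)"?')


def extract_attrs(tag_line):
    """Return a dict mapping attribute names to values for one tag line."""
    return dict(ATTR_RE.findall(tag_line))


attrs = extract_attrs(line)
print(attrs)  # {'id': 'rt', 'name': 'th'}
```

Applied line by line to the file, this gives one dictionary per opening tag; lines without attributes yield an empty dictionary.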
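Is this the real data? The first tag is not well-formed XML: the quotes around the id value are missing. If this is the data you have, you may not be able to use XML tools.

You changed the input. As @mirod pointed out above, the input does not have id="rt", but id=rt.

@simbabque Yes, I know the XML is very badly formed (see also the item1 tag, which is closed as item). My answer is meant to show beginners how to use the XML package.

That's fair. In that case I would suggest pointing that out (which you have now done). Otherwise they might complain that it doesn't work.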
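
The point made in these comments can be checked directly. The sketch below, in Python, feeds the original input to a strict XML parser, which rejects it, and then to the lenient html.parser module from the standard library, which tolerates unquoted attribute values. Using an HTML parser here is a workaround chosen for illustration, not something proposed in the thread, and the class name AttrCollector is hypothetical. Note that html.parser lowercases tag and attribute names.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The input exactly as posted in the question (not well-formed XML).
raw = '''<item1 id=rt name ="th">
<point1>1254</point1>
<point2>1254</point2>
</item>
'''

# A strict XML parser refuses the unquoted attribute value.
try:
    ET.fromstring(raw)
    strict_ok = True
except ET.ParseError as err:
    strict_ok = False
    print("strict parser rejected the input:", err)


class AttrCollector(HTMLParser):
    """Collect the attributes of every start tag that has any."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with quotes already removed
        if attrs:
            self.found[tag] = dict(attrs)


collector = AttrCollector()
collector.feed(raw)
collector.close()
print(collector.found)
```

If the data can be regenerated with quoted attribute values and matching tags, a real XML parser remains the more robust choice.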
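For the original, malformed input, a regular expression on the first line works:

tt <- '<item1 id=rt name ="th">
<point1>1254</point1>
<point2>1254</point2>
</item>
'

ll <- readLines(textConnection(tt))
# Capture everything between "id=" and the space before "name" on the first line
gsub('.*id=(.*)[ ]name.*', '\\1', ll[1])
# [1] "rt"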



