Extracting data from a large multi-level XML file in R
I need to extract the data values from an XML file. I tried xmlToList and xmlToDataFrame, but both failed: I get an empty list(). I need help, because an existing solution works but does not apply to my case.

My XML file looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
<queryInfo>
<creationTime>2015-07-14T10:35:39.452+00:00</creationTime>
<criteria MethodCalled="GetValues">
<parameter value="S:F006875" name="site"/>
<parameter value="S:3047695" name="variable"/>
<parameter value="2014-08-25T00:00:00" name="startDate"/>
<parameter value="2014-08-29T00:00:00" name="endDate"/>
</criteria>
</queryInfo>
<timeSeries>
<sourceInfo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="SiteInfoType">
<siteName>Central - Ca l'Espona (A)</siteName>
<siteCode network="STR">F006875</siteCode>
</sourceInfo>
<variable>
<variableCode default="true" vocabulary="STR">3047695</variableCode>
<variableName>Potència T1-BOBITÈCNIC</variableName>
</variable>
<values>
<value qualityControlLevelCode="0" sourceCode="1" methodCode="0" dateTimeUTC="2014-08-26T18:15:00+00:00" timeOffset="+01:00" dateTime="2014-08-26T18:15:00+00:00" censorCode="nc">452</value>
<value qualityControlLevelCode="0" sourceCode="1" methodCode="0" dateTimeUTC="2014-08-26T18:45:00+00:00" timeOffset="+01:00" dateTime="2014-08-26T18:45:00+00:00" censorCode="nc">456</value>
<value qualityControlLevelCode="0" sourceCode="1" methodCode="0" dateTimeUTC="2014-08-26T19:15:00+00:00" timeOffset="+01:00" dateTime="2014-08-26T19:15:00+00:00" censorCode="nc">460</value>
<value qualityControlLevelCode="0" sourceCode="1" methodCode="0" dateTimeUTC="2014-08-27T02:30:00+00:00" timeOffset="+01:00" dateTime="2014-08-27T02:30:00+00:00" censorCode="nc">464</value>
<value qualityControlLevelCode="0" sourceCode="1" methodCode="0" dateTimeUTC="2014-08-27T02:45:00+00:00" timeOffset="+01:00" dateTime="2014-08-27T02:45:00+00:00" censorCode="nc">460</value>
<value qualityControlLevelCode="0" sourceCode="1" methodCode="0" dateTimeUTC="2014-08-27T03:00:00+00:00" timeOffset="+01:00" dateTime="2014-08-27T03:00:00+00:00" censorCode="nc">460</value>
<value qualityControlLevelCode="0" sourceCode="1" methodCode="0" dateTimeUTC="2014-08-25T13:30:00+00:00" timeOffset="+01:00" dateTime="2014-08-25T13:30:00+00:00" censorCode="nc">468</value>
<value qualityControlLevelCode="0" sourceCode="1" methodCode="0" dateTimeUTC="2014-08-25T13:45:00+00:00" timeOffset="+01:00" dateTime="2014-08-25T13:45:00+00:00" censorCode="nc">472</value>
<qualityControlLevel qualityControlLevelID="0">
<qualityControlLevelCode>0</qualityControlLevelCode>
<definition>Raw data</definition>
<explanation/>
</qualityControlLevel>
<method methodID="0">
<methodCode>0</methodCode>
<methodDescription>Not defined</methodDescription>
</method>
</values>
</timeSeries>
</timeSeriesResponse>
Thanks.

Here is a solution using the rvest package:
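library(rvest)
# Parse the file with the HTML parser; replace the path with your own file
res <- read_html("yourxmldoc.xml")
# Select every <value> element and return its text content
res %>%
  html_nodes("value") %>%
  html_text()
[1] "452" "456" "460" "464" "460" "460" "468" "472"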
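
If the timestamps are needed alongside the readings, a similar extraction can be sketched with the xml2 package, which rvest builds on. This is only a sketch under stated assumptions: the XML is inlined as a shortened copy of the file from the question, the names `xml_string`, `doc`, `value_nodes` and `readings` are illustrative, and `xml_ns_strip()` needs a reasonably recent xml2 release. Stripping the default namespace is what lets a plain `//value` XPath match.

```r
library(xml2)

# Shortened copy of the question's document, inlined so the sketch is self-contained
xml_string <- '<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
  <timeSeries>
    <values>
      <value dateTimeUTC="2014-08-26T18:15:00+00:00" censorCode="nc">452</value>
      <value dateTimeUTC="2014-08-26T18:45:00+00:00" censorCode="nc">456</value>
    </values>
  </timeSeries>
</timeSeriesResponse>'

# read_xml() accepts literal XML as well as a file path
doc <- read_xml(xml_string)

# Remove the default namespace in place so unprefixed XPath names match
xml_ns_strip(doc)

# Collect every <value> element, then pull out its attribute and text
value_nodes <- xml_find_all(doc, "//value")
readings <- data.frame(
  dateTimeUTC = xml_attr(value_nodes, "dateTimeUTC"),
  value = as.numeric(xml_text(value_nodes)),
  stringsAsFactors = FALSE
)
```

For the real file, passing its path to `read_xml()` instead of `xml_string` should be the only change needed.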
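What are the data values? There are many values here. Please be specific.
@user227710 The number at the end of each value element, such as **452**.
Do you mean 452, 456, 460 ... 472? If so, please check my answer.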
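
For readers who prefer to stay with the XML package used in the question, the following sketch shows one possible way to reach the same numbers. It rests on an assumption the question does not confirm: that the empty result came from the document's default namespace, which unprefixed XPath names do not match. The prefix `w` and the names `xml_string`, `ns`, `no_prefix` and `values` are illustrative, and the inlined XML is a shortened copy of the original file.

```r
library(XML)

# Shortened copy of the question's document, inlined so the sketch is self-contained
xml_string <- '<timeSeriesResponse xmlns="http://www.cuahsi.org/waterML/1.1/">
  <timeSeries>
    <values>
      <value dateTimeUTC="2014-08-26T18:15:00+00:00" censorCode="nc">452</value>
      <value dateTimeUTC="2014-08-26T18:45:00+00:00" censorCode="nc">456</value>
    </values>
  </timeSeries>
</timeSeriesResponse>'

doc <- xmlParse(xml_string, asText = TRUE)

# An unprefixed name refers to "no namespace", so this finds nothing
no_prefix <- getNodeSet(doc, "//value")

# Bind an arbitrary prefix (here "w") to the document's default namespace URI
ns <- c(w = "http://www.cuahsi.org/waterML/1.1/")

# Text content of every <value> element, converted to numbers
values <- as.numeric(xpathSApply(doc, "//w:value", xmlValue, namespaces = ns))

# Attributes can be read the same way by passing the attribute name to xmlGetAttr
timestamps <- xpathSApply(doc, "//w:value", xmlGetAttr, "dateTimeUTC",
                          namespaces = ns)
```

With the full file, `xmlParse("yourxmldoc.xml")` would replace the inlined string and the XPath expressions stay the same.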