R: Parsing XML in R

I have an XML file like the following:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<t:Forecast xmlns:t="http://example.com">
      <Sender Abbreviation="abc" Name="xyz"/>
      <Recipient Abbreviation="efg" Name="cba"/>
      <createdUTC>2017-11-24T10:41:11Z</createdUTC>
      <MessageID>bcjs</MessageID>
      <SystemState>test</SystemState>
      <ForecastData>
          <DataHeader GroupKey="rkolo"/>

          <Timeseries ID="abc123">
              <TimeInt ISTUTC="2017-11-24T10:45:00Z" Out="858"/>
              <TimeInt ISTUTC="2017-11-24T11:45:00Z" Out="868"/>
          </Timeseries>

          <Timeseries ID="xyz">
              <TimeInt ISTUTC="2017-11-24T10:45:00Z" Out="870"/>
              <TimeInt ISTUTC="2017-11-24T11:45:00Z" Out="890"/>
          </Timeseries>
      </ForecastData>
</t:Forecast>
I want to extract data frames from it, including another data frame as shown below:

TimeInt                 out
2017-11-24T10:45:00Z    870
2017-11-24T11:45:00Z    890
So far, I have done the following:

require(XML)

temp = xmlParse("datafile.xml")
data = xmlToList(temp)
But the output of data contains many nested lists. How can I get the data frames?


Edit 1: changed out

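Consider the triple-colon method XML:::xmlAttrsToDataFrame, but loop over each Timeseries node index, and name each list element after the corresponding Timeseries ID.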


Thanks again. Out will not pair up like that as a group indicator.
library(XML)

txt='<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
     <t:Forecast xmlns:t="http://example.com">
        <Sender Abbreviation="abc" Name="xyz"/>
        <Recipient Abbreviation="efg" Name="cba"/>
        <createdUTC>2017-11-24T10:41:11Z</createdUTC>
        <MessageID>bcjs</MessageID>
        <SystemState>test</SystemState>
        <ForecastData>
           <DataHeader GroupKey="rkolo"/>
             <Timeseries ID="abc123">
                <TimeInt ISTUTC="2017-11-24T10:45:00Z" Out="858"/>
                <TimeInt ISTUTC="2017-11-24T11:45:00Z" Out="858"/>
             </Timeseries>

             <Timeseries ID="xyz">
                <TimeInt ISTUTC="2017-11-24T10:45:00Z" Out="870"/>
                <TimeInt ISTUTC="2017-11-24T11:45:00Z" Out="870"/>
             </Timeseries>
        </ForecastData>
     </t:Forecast>'

doc <- xmlParse(txt)

# Timeseries IDs, in document order
ids <- xpathSApply(doc, "//Timeseries", xmlAttrs)

# One data frame per Timeseries, built from the attributes of its TimeInt children
dfList <- lapply(seq_along(ids), function(i)
    XML:::xmlAttrsToDataFrame(getNodeSet(doc, path=paste0('//Timeseries[',i,']/TimeInt')))
)

# Name each list element after its Timeseries ID
dfList <- setNames(dfList, ids)
dfList
dfList$abc123
#                 ISTUTC Out
# 1 2017-11-24T10:45:00Z 858
# 2 2017-11-24T11:45:00Z 858

dfList$xyz
#                 ISTUTC Out
# 3 2017-11-24T10:45:00Z 870
# 4 2017-11-24T11:45:00Z 870
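
The answer above reaches xmlAttrsToDataFrame through the triple colon, which is how internal package functions are accessed. As a possible alternative, here is a minimal sketch that builds the same kind of list using only getNodeSet and xmlGetAttr. The shortened XML string, the name dfList2, and the conversion of Out to numeric are assumptions made for illustration; they are not part of the original answer.

```r
library(XML)

# Hypothetical, shortened input with the same shape as the XML in the question
txt <- '<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<t:Forecast xmlns:t="http://example.com">
  <ForecastData>
    <Timeseries ID="abc123">
      <TimeInt ISTUTC="2017-11-24T10:45:00Z" Out="858"/>
      <TimeInt ISTUTC="2017-11-24T11:45:00Z" Out="868"/>
    </Timeseries>
    <Timeseries ID="xyz">
      <TimeInt ISTUTC="2017-11-24T10:45:00Z" Out="870"/>
      <TimeInt ISTUTC="2017-11-24T11:45:00Z" Out="890"/>
    </Timeseries>
  </ForecastData>
</t:Forecast>'

doc <- xmlParse(txt, asText = TRUE)

# One node per <Timeseries> element
series <- getNodeSet(doc, "//Timeseries")

# Build one data frame per series from the attributes of its <TimeInt> children
dfList2 <- lapply(series, function(node) {
  ints <- getNodeSet(node, "./TimeInt")
  data.frame(
    ISTUTC = vapply(ints, xmlGetAttr, character(1), name = "ISTUTC"),
    # Assumption: Out is always numeric, so it is converted here
    Out = as.numeric(vapply(ints, xmlGetAttr, character(1), name = "Out")),
    stringsAsFactors = FALSE
  )
})

# Name each list element after its Timeseries ID
names(dfList2) <- vapply(series, xmlGetAttr, character(1), name = "ID")
```

With this sketch, dfList2$xyz would hold the two TimeInt rows of the second Timeseries. Querying relative to each node ("./TimeInt") avoids building an index-based XPath string for every series.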