Converting XML with multiple roots to a DataFrame


I have an XML file that I am trying to convert into a data frame by selecting a specific root. My XML:

<?xml version="1.0" encoding="ISO-8859-1" ?>


<test:TASS xmlns="http://www.vvv.com/schemas"  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  xsi:schemaLocation="http://www.vvv.com/schemas http://www.vvv.com/schemas/testV2_02_03.xsd"  xmlns:test="http://www.vvv.com/schemas" >
    <test:house>
                <test:billing>
                    <test:proceduresummary>
                        <test:guidenumber>X2030</test:guidenumber>
                            <test:diagnosis>
                                <test:table>ICD-10</test:table>
                                <test:diagnosiscod>J441</test:diagnosiscod>
                                <test:description>CHRONIC OBSTRUCTIVE PULMONARY DISEASE WITH (ACUTE) EXACERBATION</test:description>
                            </test:diagnosis>
                            <test:procedure>
                                <test:procedure>
                                    <test:description>HOSPITAL</test:description>
                                </test:procedure>
                                <test:amount>12</test:amount>
                            </test:procedure>
                    </test:proceduresummary>
                </test:billing>
                    <test:billing>
                    <test:proceduresummary>
                        <test:guidenumber>Y6055</test:guidenumber>
                            <test:diagnosis>
                                <test:table>ICD-10</test:table>
                                <test:diagnosiscod>I21</test:diagnosiscod>
                                <test:description>ACUTE MYOCARDIAL INFARCTION</test:description>
                            </test:diagnosis>
                            <test:procedure>
                                <test:procedure>
                                    <test:description>HOSPITAL</test:description>
                                </test:procedure>
                                <test:amount>8</test:amount>
                            </test:procedure>
                    </test:proceduresummary>
                </test:billing>
                    <test:billing>
                    <test:proceduresummary>
                        <test:guidenumber>Z9088</test:guidenumber>
                            <test:diagnosis>
                                <test:table>ICD-10</test:table>
                                <test:diagnosiscod>F20</test:diagnosiscod>
                                <test:description>SCHIZOPHRENIA</test:description>
                            </test:diagnosis>
                            <test:procedure>
                                <test:procedure>
                                    <test:description>HOSPITAL</test:description>
                                </test:procedure>
                                <test:amount>1</test:amount>
                            </test:procedure>
                    </test:proceduresummary>
                </test:billing>
    </test:house>
</test:TASS>
What I tried:

require(tidyverse)
require(xml2)
setwd("D:/")
myxml<- read_xml("base.xml")
house <- myxml %>% xml_find_all("//house")

Answer: You are on the right track with this problem; the issue is that the node names were identified incorrectly. In this example every element name begins with the "test:" prefix, so the XPath expressions must include it (for example ".//test:proceduresummary" rather than ".//proceduresummary"). The code for this answer is given further down, after the comments.
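
To see why the unprefixed query comes back empty, here is a minimal sketch on a trimmed-down copy of the XML. The trimming is an assumption made for illustration: one billing record, and only the `test` prefix declaration kept. `xml_ns()` lists the prefixes xml2 accepts in XPath expressions, and the two queries show the difference the prefix makes:

```r
library(xml2)

# Trimmed-down copy of the XML from the question (illustrative only)
doc <- read_xml('<test:TASS xmlns:test="http://www.vvv.com/schemas">
  <test:house>
    <test:billing>
      <test:proceduresummary>
        <test:guidenumber>X2030</test:guidenumber>
      </test:proceduresummary>
    </test:billing>
  </test:house>
</test:TASS>')

# Show which prefixes can be used in XPath expressions for this document
print(xml_ns(doc))

# Unprefixed name: only matches elements in no namespace, so nothing is found
no_prefix <- xml_find_all(doc, "//house")
length(no_prefix)    # 0

# Prefixed name: matches the namespaced element
with_prefix <- xml_find_all(doc, "//test:house")
length(with_prefix)  # 1

# The same prefix is needed for every step further down the tree
guidenumber <- xml_text(xml_find_first(doc, ".//test:guidenumber"))
```

An empty result from `xml_find_all()` is not an error, which is why the original attempt appeared to run yet produced nothing to build a data frame from.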

Comments:

I tried for several days, including posting a question here, but it was considered too broad. That is why I made this post more detailed. I tried using both XML and xml2. I also used this example, but without success. That is why I am asking for help here. @Parfait I have improved my question again with an example of what I tried.

Do you know how to change the node names? I would like to change "test:" to "test2:". Thanks.

Code from the answer:

library(xml2)
library(magrittr)  #provides the %>% pipe used below

myxml<-read_xml(' **.... Reading file from above.....** ')

#strip namespaces (not needed in this case)
#xml_ns_strip(myxml)

#find high level node containing all of the requested information
procedures<-myxml %>% xml_find_all(".//test:proceduresummary")

#extract the requested information from each node
#assumes only 1 subnode per parent
guidenumber <- procedures %>% xml_find_first(".//test:guidenumber") %>% xml_text()
#there are 2 description sub-subnodes, so select the diagnosis subnode first
#and read both diagnosiscod and description from it
diagnosis <- procedures %>% xml_find_first(".//test:diagnosis")
diagnosiscod <- diagnosis %>% xml_find_first(".//test:diagnosiscod") %>% xml_text()
description <- diagnosis %>% xml_find_first(".//test:description") %>% xml_text()

answer<-data.frame(guidenumber, diagnosiscod, description)
head(answer)
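
A possible variant, sketched under assumptions: spelling out the full relative path to each field removes any ambiguity between the two description nodes and makes it easy to add further columns. The helper name `first_text`, the extra `procedure` and `amount` columns, and the two shortened records are illustrative choices, not part of the original answer.

```r
library(xml2)

# Two billing records in the same layout as the XML in the question
xml_string <- '<test:TASS xmlns:test="http://www.vvv.com/schemas">
  <test:house>
    <test:billing>
      <test:proceduresummary>
        <test:guidenumber>X2030</test:guidenumber>
        <test:diagnosis>
          <test:diagnosiscod>J441</test:diagnosiscod>
          <test:description>CHRONIC OBSTRUCTIVE PULMONARY DISEASE WITH (ACUTE) EXACERBATION</test:description>
        </test:diagnosis>
        <test:procedure>
          <test:procedure>
            <test:description>HOSPITAL</test:description>
          </test:procedure>
          <test:amount>12</test:amount>
        </test:procedure>
      </test:proceduresummary>
    </test:billing>
    <test:billing>
      <test:proceduresummary>
        <test:guidenumber>Y6055</test:guidenumber>
        <test:diagnosis>
          <test:diagnosiscod>I21</test:diagnosiscod>
          <test:description>ACUTE MYOCARDIAL INFARCTION</test:description>
        </test:diagnosis>
        <test:procedure>
          <test:procedure>
            <test:description>HOSPITAL</test:description>
          </test:procedure>
          <test:amount>8</test:amount>
        </test:procedure>
      </test:proceduresummary>
    </test:billing>
  </test:house>
</test:TASS>'

doc <- read_xml(xml_string)

# One node per record; every column below is read relative to these nodes
summaries <- xml_find_all(doc, ".//test:proceduresummary")

# Hypothetical helper: text of the first match per node (NA when nothing matches)
first_text <- function(nodes, path) {
  xml_text(xml_find_first(nodes, path))
}

result <- data.frame(
  guidenumber  = first_text(summaries, "./test:guidenumber"),
  diagnosiscod = first_text(summaries, "./test:diagnosis/test:diagnosiscod"),
  description  = first_text(summaries, "./test:diagnosis/test:description"),
  procedure    = first_text(summaries, "./test:procedure/test:procedure/test:description"),
  amount       = as.numeric(first_text(summaries, "./test:procedure/test:amount")),
  stringsAsFactors = FALSE
)

print(result)
```

Because `xml_find_first()` returns exactly one entry per input node, the columns stay aligned even if a record lacks one of the fields; the missing value simply shows up as `NA`.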