R: Converting XML files to a data frame
I want to read 860 XML files. All of the XML files have this structure:
<ip id_pac="48">
<rodcis>48</rodcis>
<jmeno>Andrew</jmeno>
<prijmeni>Mazal</prijmeni>
<titul_pred></titul_pred>
<titul_za></titul_za>
<dat_dn format="D">1999-06-21</dat_dn>
<dat_de format="D"></dat_de>
<sex>M</sex>
<rod_prijm></rod_prijm>
<a typ="1">
<dat_od format="D">2020-09-17</dat_od>
</a>
</ip>
I want to get a data frame in which one column is "rodcis" (48 in this example) and the second column is "dat_od" (2020-09-17 in this example).
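This is what I am trying: parsing every file with xmlParse and then combining the pieces with do.call(rbind, ...).

Below is a possible solution using the xml2 package. See the comments in the code for a step-by-step guide.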
library(xml2)
library(dplyr)
# loop through the file list with lapply
dfs <- lapply(files, function(file) {
  # read the file
  page <- read_xml(file)
  # get the id for each file
  id <- xml_find_first(page, "//ip") %>% xml_attr("id_pac")
  # get the text of each requested node
  rodcis <- xml_find_first(page, ".//rodcis") %>% xml_text()
  datod <- xml_find_first(page, ".//dat_od") %>% xml_text()
  # make a one-row data frame of the results for this file
  data.frame(id, rodcis, datod)
})
# combine all results into one data frame
answer <- bind_rows(dfs)
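The code above relies on a vector called files that is not defined in the part of the question shown here. The following is a minimal, self-contained sketch of the same idea, assuming all XML files sit in a single directory. The directory xml_demo, the file name patient_48.xml, and the helper read_one are hypothetical names introduced only for illustration; the sample file is a shortened copy of the structure from the question.

```r
library(xml2)

# Hypothetical sample file mirroring the structure shown in the question
xml_txt <- '<ip id_pac="48">
<rodcis>48</rodcis>
<jmeno>Andrew</jmeno>
<a typ="1">
<dat_od format="D">2020-09-17</dat_od>
</a>
</ip>'

# Assumed location: a scratch directory inside the session's temp folder
xml_dir <- file.path(tempdir(), "xml_demo")
dir.create(xml_dir, showWarnings = FALSE)
writeLines(xml_txt, file.path(xml_dir, "patient_48.xml"))

# Assumption: every file to read ends in .xml and lives in xml_dir
files <- list.files(xml_dir, pattern = "\\.xml$", full.names = TRUE)

# Hypothetical helper: one file in, one-row data frame out
read_one <- function(file) {
  page <- read_xml(file)
  data.frame(
    rodcis = xml_text(xml_find_first(page, ".//rodcis")),
    dat_od = xml_text(xml_find_first(page, ".//dat_od")),
    stringsAsFactors = FALSE
  )
}

# Stack the one-row data frames into a single data frame
answer <- do.call(rbind, lapply(files, read_one))
print(answer)
```

If a node is absent from a file, xml_find_first returns a missing node and xml_text yields NA for it, so the row is still created. Files with several dat_od entries would need xml_find_all instead, which this sketch does not cover.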
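For reference, the attempt from the question, which uses xmlParse from the XML package:
out <- lapply(files, xmlParse)
# rootnode is never defined and the anonymous function ignores x, so this call fails
dataframe <- do.call(rbind, lapply(out, function(x) rootnode[[1]][[2]], rootnode[[2]][[1]]))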