Parsing XML in R, getting a differing-rows error


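I suspect this has been asked before, but after searching I couldn't find anything. I'm new to parsing XML documents. I'm trying to parse an XML page that looks like this: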

library(XML)  # provides xmlParse, xpathSApply, getNodeSet, xmlGetAttr
schedule = xmlParse("MYXML.XML")

# here's what schedule looks like
<all-games>
  <game-schedule>
    <team name="Knicks"/>
    <outcome winner="OtherTeam"/>
  </game-schedule>
  <game-schedule>
    <team name="Lakers"/>
    <outcome winner="HomeTeam"/>
  </game-schedule>
  <game-schedule>
    <team name="Celtics"/>
  </game-schedule>
</all-games>


# here's my code to parse the XML
my_df = data.frame(
  team = sapply(schedule["//game-schedule/team/@name"], as, "character"),
  winner = sapply(schedule["//game-schedule/outcome/@winner"], as, "character")
)

I want to build the data frame so that a missing child element is simply filled in as NA. That is, I'm trying to get the following data frame:

my_df
      team      winner
1   Knicks   OtherTeam
2   Lakers    HomeTeam
3  Celtics          NA

The NA reflects that, according to the XML document, the game has not been played yet.

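If a tag is missing, you need a wrapper around xpathSApply that returns NA, such as xpath2 below. Then get the game-schedule nodes and apply xpath2 to each one, using ".//" to search anywhere beneath the current node.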

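# Wrapper around xpathSApply: returns NA when the XPath matches nothing,
# otherwise the matches collapsed into a single string.
xpath2 <- function(x, ...) {
    y <- xpathSApply(x, ...)
    # length(y) is a scalar, so a plain if/else is enough here
    if (length(y) == 0) NA_character_ else paste(y, collapse = ", ")
}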
nd <- getNodeSet(schedule, "//game-schedule")   
data.frame(
   team = sapply(nd, xpath2, ".//team", xmlGetAttr, "name"),
 winner = sapply(nd, xpath2, ".//outcome", xmlGetAttr, "winner")
)   
     team    winner
1  Knicks OtherTeam
2  Lakers  HomeTeam
3 Celtics      <NA>
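
To see why the original attempt fails, and to try the per-node approach without a file on disk, here is a minimal self-contained sketch. It assumes the XML package is installed and that the sample is written with self-closing `<team/>` and `<outcome/>` tags so it parses. The helper name `attr_or_na` is hypothetical and just a variant of `xpath2` above.

```r
library(XML)

# Inline copy of the sample. Self-closing tags are an assumption made so
# that the document is well-formed.
xml_text <- '<all-games>
  <game-schedule>
    <team name="Knicks"/>
    <outcome winner="OtherTeam"/>
  </game-schedule>
  <game-schedule>
    <team name="Lakers"/>
    <outcome winner="HomeTeam"/>
  </game-schedule>
  <game-schedule>
    <team name="Celtics"/>
  </game-schedule>
</all-games>'

schedule <- xmlParse(xml_text, asText = TRUE)

# Document-wide queries return one value per match, so the two vectors
# have different lengths and data.frame() cannot combine them.
teams   <- xpathSApply(schedule, "//game-schedule/team", xmlGetAttr, "name")
winners <- xpathSApply(schedule, "//game-schedule/outcome", xmlGetAttr, "winner")
length(teams)    # 3
length(winners)  # 2

# Per-node query: exactly one value (or NA) for each <game-schedule>.
# attr_or_na is a hypothetical helper, not part of the XML package.
attr_or_na <- function(node, path, attr) {
  y <- xpathSApply(node, path, xmlGetAttr, attr)
  if (length(y) == 0) NA_character_ else paste(y, collapse = ", ")
}

nodes <- getNodeSet(schedule, "//game-schedule")
my_df <- data.frame(
  team   = sapply(nodes, attr_or_na, "./team", "name"),
  winner = sapply(nodes, attr_or_na, "./outcome", "winner"),
  stringsAsFactors = FALSE
)
print(my_df)
```

Because the XPath is evaluated once per `<game-schedule>` node, every column ends up with one entry per game, which is what keeps the rows aligned.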