XML: Using XPath to retrieve an attribute when the node and attribute names contain a colon
Using xpathSApply in R, I am trying to retrieve the URL stored in the edgar:url attribute:
<edgar:xbrlFile edgar:sequence="3" edgar:file="edgr-2004_10k.xml" edgar:type="EX-100.INS" edgar:size="25257" edgar:description="" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-2004_10k.xml" />
I have tried several variants of the following:
library(RCurl)  # getURL()
library(XML)    # xmlParse(), xpathSApply()
url <- "http://www.sec.gov/Archives/edgar/monthly/xbrlrss-2005-04.xml"
data <- getURL(url)
doc <- xmlParse(data)
url <- xpathSApply(doc, "//item/*[name()='edgar:xbrlFiling']", xmlValue)
url
This is straightforward with both XML and xml2 (the latter can currently only be installed from GitHub).
XML:
xpathSApply(doc, "//edgar:xbrlFile", xmlGetAttr, "edgar:url", namespaces="edgar")
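The XML call above can be tried without fetching the feed. The following is a minimal sketch, assuming a small inline document that stands in for the real feed; the `feed` string and the namespace URI bound to `edgar` are illustrative assumptions, not taken from the question. It passes `namespaces` as a named prefix-to-URI vector, which `xpathSApply` also accepts in place of the bare prefix.

```r
library(XML)

# Inline stand-in for the feed. The namespace URI is an assumption made
# for illustration; the real feed declares its own URI for "edgar".
feed <- '<rss version="2.0" xmlns:edgar="http://www.sec.gov/Archives/edgar">
  <channel><item>
    <edgar:xbrlFiling>
      <edgar:xbrlFile edgar:sequence="3" edgar:file="edgr-2004_10k.xml" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-2004_10k.xml" />
    </edgar:xbrlFiling>
  </item></channel>
</rss>'

doc <- xmlParse(feed, asText = TRUE)

# Map the prefix used in the XPath expression to the namespace URI.
ns <- c(edgar = "http://www.sec.gov/Archives/edgar")

# Select every edgar:xbrlFile node and read its edgar:url attribute.
urls <- xpathSApply(doc, "//edgar:xbrlFile", xmlGetAttr, "edgar:url",
                    namespaces = ns)
print(urls)
```

The colon is only a separator between prefix and local name, so the prefix has to be resolvable to a namespace URI before the XPath expression can match anything.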
xml2:
library(xml2)
library(magrittr)  # provides the %>% pipe
dat <- read_xml(url)  # url is the feed URL string
dat %>%
  xml_find_all("//edgar:xbrlFile", ns=xml_ns(dat)) %>%
  xml_attr("edgar:url", ns=xml_ns(dat))
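I know it must be simpler than what I am doing. Thanks.
Possible duplicate:
Not a duplicate. The issue here is that the node and attribute names contain colons.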
The xml2 version returns:
## [1] "http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/eo2425.txt"
## [2] "http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/eo2425ex991.txt"
## [3] "http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-2004_10k.xml"
## [4] "http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-20050228.xsd"
## [5] "http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-20050228_cal.xml"
## [6] "http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-20050228_lab.xml"
## [7] "http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-20050228_pre.xml"
## [8] "http://www.sec.gov/Archives/edgar/data/29669/000119312505068717/d8k.htm"
## [9] "http://www.sec.gov/Archives/edgar/data/29669/000119312505068717/xrrd-20050331.xml"
## [10] "http://www.sec.gov/Archives/edgar/data/29669/000119312505068717/xrrd-20050331.xsd"
## [11] "http://www.sec.gov/Archives/edgar/data/29669/000119312505068717/xrrd-20050331_cal.xml"
## [12] "http://www.sec.gov/Archives/edgar/data/29669/000119312505068717/xrrd-20050331_lab.xml"
## [13] "http://www.sec.gov/Archives/edgar/data/29669/000119312505068717/xrrd-20050331_pre.xml"
## [14] "http://www.sec.gov/Archives/edgar/data/13610/000095012305004029/bne-20050404_8kfinal.htm"
## [15] "http://www.sec.gov/Archives/edgar/data/13610/000095012305004029/bne-20041231er.xml"
## [16] "http://www.sec.gov/Archives/edgar/data/13610/000095012305004029/bne-20050307er.xsd"
## [17] "http://www.sec.gov/Archives/edgar/data/13610/000095012305004029/bne-20050307er_pre.xml"
## [18] "http://www.sec.gov/Archives/edgar/data/13610/000095012305004029/bne-20050307er_lab.xml"
## [19] "http://www.sec.gov/Archives/edgar/data/13610/000095012305004029/bne-20050307er_cal.xml"
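For anyone who wants to reproduce the xml2 approach offline, here is a minimal sketch under the same idea. The inline `feed` document and its namespace URI are assumptions made for illustration. `xml_ns()` reads the prefix-to-URI map from the document itself, so the URI never has to be typed into the query.

```r
library(xml2)

# Inline stand-in for the feed. The namespace URI is an assumption made
# for illustration; the real feed declares its own URI for "edgar".
feed <- '<rss version="2.0" xmlns:edgar="http://www.sec.gov/Archives/edgar">
  <channel><item>
    <edgar:xbrlFiling>
      <edgar:xbrlFile edgar:sequence="1" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/eo2425.txt" />
      <edgar:xbrlFile edgar:sequence="3" edgar:url="http://www.sec.gov/Archives/edgar/data/1080224/000127528705001434/edgr-2004_10k.xml" />
    </edgar:xbrlFiling>
  </item></channel>
</rss>'

doc <- read_xml(feed)

# Namespace map taken from the document (prefix "edgar" -> its URI).
ns <- xml_ns(doc)

# Find the namespaced nodes, then read the namespaced attribute.
files <- xml_find_all(doc, "//edgar:xbrlFile", ns = ns)
urls  <- xml_attr(files, "edgar:url", ns = ns)
print(urls)
```

Passing `ns` to `xml_attr()` as well is what lets the attribute be requested by its prefixed name.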