List and description of all packages in CRAN from within R


I can get a list of all available packages with the function:

ap <- available.packages()

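I actually think you want "Package" and "Title", since "Description" can run to several lines. So here is the former; if you really want "Description", just add "Description" to the final subset:

R> ## from http://developer.r-project.org/CRAN/Scripts/depends.R and adapted
R>
R> require("tools")
R>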
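R> getPackagesWithTitle

Dirk has provided a terrific answer. After finishing my solution and then seeing his, I debated for a while whether to post mine, for fear of looking silly. But I decided to post it anyway for two reasons:

1. it is informative for beginners like me
2. it took me a while to do, so why not :)

I approached this thinking I would need to do some web scraping, and chose crantastic as the site to scrape. First I will provide the code, and then two resources that have been very helpful to me:
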
library(RCurl)
library(XML)

# Package names come from the first column of the CRAN check summary table
URL <- "http://cran.r-project.org/web/checks/check_summary.html#summary_by_package"
packs <- na.omit(XML::readHTMLTable(doc = URL, which = 2, header = TRUE, 
    strip.white = TRUE, as.is = FALSE, sep = ",", na.strings = c("999", 
        "NA", " "))[, 1])
Trim <- function(x) { #strip leading and trailing whitespace
    gsub("^\\s+|\\s+$", "", x)
}
packs <- unique(Trim(packs))
u1 <- "http://crantastic.org/packages/"
len.samps <- 10 #for demo purpose; use:
#len.samps <- length(packs) # for all of them
URL2 <- paste0(u1, packs[seq_len(len.samps)]) 
scraper <- function(urls){ #function to grab description
    doc   <- htmlTreeParse(urls, useInternalNodes=TRUE)
    nodes <- getNodeSet(doc, "//p")[[3]] #third <p> node in the document
    return(nodes)
}
# one list element per URL; a failed request leaves a "try-error" object
info <- lapply(URL2, function(u) try(scraper(u), silent = TRUE))
info2 <- vapply(info, function(x) { #replace errors with NA
        if(!inherits(x, "XMLInternalElementNode")){
            NA_character_
        } else {
            Trim(gsub("\\s+", " ", xmlValue(x)))
        }
    }, character(1)
)
pack_n_desc <- data.frame(package=packs[seq_len(len.samps)], 
    description=info2, stringsAsFactors=FALSE) #make a dataframe of it all
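
For comparison with the scraping code above, the first answer's idea of reading the package database that CRAN itself publishes can be sketched without any HTML parsing. This is only a minimal sketch and not that answer's own function, whose body is not shown here. It assumes the chosen mirror serves `web/packages/packages.rds`; the helper names `select_fields` and `fetch_cran_db`, the default mirror URL, and the two-row demo matrix are all hypothetical.

```r
# Keep only the requested columns of a package database.
# `db` is expected to be a character matrix (or data frame) with named columns.
select_fields <- function(db, fields = c("Package", "Title")) {
  absent <- setdiff(fields, colnames(db))
  if (length(absent) > 0) {
    stop("fields not in database: ", paste(absent, collapse = ", "))
  }
  out <- as.data.frame(db[, fields, drop = FALSE], stringsAsFactors = FALSE)
  rownames(out) <- NULL
  out
}

# Hypothetical helper; assumes <mirror>/web/packages/packages.rds exists.
fetch_cran_db <- function(mirror = "https://cloud.r-project.org") {
  rds_url <- sprintf("%s/web/packages/packages.rds", mirror)
  con <- gzcon(url(rds_url, open = "rb"))
  on.exit(close(con))
  readRDS(con)
}

# Offline demonstration on a made-up two-row database
demo_db <- cbind(Package     = c("pkgA", "pkgB"),
                 Title       = c("First Example", "Second Example"),
                 Description = c("Longer text.", "More text."))
demo_titles <- select_fields(demo_db)
print(demo_titles)

# With network access, something like this should give the real table:
# titles <- select_fields(fetch_cran_db())
```

Passing `fields = c("Package", "Description")` to `select_fields` would return the longer descriptions instead of the titles.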

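I wanted to try doing this with an HTML scraper as an exercise, since available.packages() in the OP does not include the package descriptions.

library('rvest')

I seem to remember this being asked before. What searches have you tried?

Some similar questions have been asked before (e.g. sorting by name length, or by the date a package was added). I also looked at the available.packages() function to see whether it has an optional argument for descriptions, but I have not found a solution.

The solution would involve adding the argument fields="Description", or better fields="Title", but that does not seem to return anything from the CRAN mirror I use. It does work with installed.packages(), though.

Unfortunately, trying to print or view the whole df runs into the same column-by-column display problem. I describe this in my own answer.

It is 2018 now, and things have become easier. Try starting with db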
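
The two suggestions in these comments can be sketched as follows. The first part uses the `fields` argument of `installed.packages()`, which reads locally installed packages and therefore runs offline. The second part is an assumption about what the last comment may have in mind: recent versions of R ship `tools::CRAN_package_db()`, which returns the CRAN package database as a data frame. The names `local_titles` and `cran_titles` are hypothetical.

```r
# Locally installed packages: ask for the extra "Title" field.
ip <- installed.packages(fields = "Title")
local_titles <- data.frame(package = ip[, "Package"],
                           title   = ip[, "Title"],
                           row.names = NULL,
                           stringsAsFactors = FALSE)
print(head(local_titles, 3))

# All of CRAN (needs network access, so it is wrapped in a function and
# not called here). Assumes tools::CRAN_package_db() is available.
cran_titles <- function(fields = c("Package", "Title", "Description")) {
  db <- tools::CRAN_package_db()
  db[, fields]
}

# Example call with network access:
# str(cran_titles(c("Package", "Title")))
```

Selecting only "Package" and "Title" keeps each row short, which may help with the display problem mentioned above when viewing the whole data frame.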