Rbind txt files in an online directory (R)


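I am trying to concatenate text files from a URL, but I don't know how to deal with the HTML and the different folders.

This is the code I tried, but it only lists the text files, and the listing contains a lot of HTML code. How can I fix this so that I can combine the text files into a single CSV file?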

library(RCurl)
url <- "http://weather.ggy.uga.edu/data/daily/"
dir <- getURL(url, dirlistonly = T)
filenames <- unlist(strsplit(dir,"\n")) #split into filenames
#append the files one after another
for (i in 1:length(filenames)) {
file <- past(url,filenames[i],delim='') #concatenate for urly 
if (i==1){
cp <- read_delim(file, header=F, delim=',')
}
else{
temp <- read_delim(file,header=F,delim=',')
cp <- rbind(cp,temp) #append to existing file
rm(temp)# remove the temporary file
}
}
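
A note on where the HTML comes from: as far as I know, `dirlistonly` is aimed at FTP-style listings, so over HTTP `getURL` simply returns the server's HTML index page, and splitting that on newlines yields lines of markup rather than file names. The sketch below shows one way to pull the `.txt` links out of such a page using only base R. The `html` string is a made-up fragment for illustration, not the real page at the URL above, and the variable names are hypothetical.

```r
# Hypothetical fragment of an HTML directory index (made up for illustration)
html <- paste(
  '<tr><td><a href="/data/">Parent Directory</a></td></tr>',
  '<tr><td><a href="AAA.txt">AAA.txt</a></td></tr>',
  '<tr><td><a href="BBB.txt">BBB.txt</a></td></tr>',
  '<tr><td><a href="notes.html">notes.html</a></td></tr>',
  sep = "\n"
)

# Find every href="..." attribute whose value ends in .txt
pattern <- 'href="[^"]*\\.txt"'
hits <- regmatches(html, gregexpr(pattern, html))[[1]]

# Strip the closing quote and the href=" prefix to leave bare file names
txt_files <- sub('^href="', "", sub('"$', "", hits))
print(txt_files)
```

A regular expression is enough for a simple index page like this one; a real HTML parser is the more robust choice when the markup is less regular.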

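This is a snippet I wrote for myself. I prefer rvest over RCurl because that is what I learned. In this case, I was able to use the html_nodes function to isolate each file ending in .txt. The resulting table stores the time as a string, but you can fix that later. Let me know if you have any questions.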
library(rvest)
library(readr)

url <- "http://weather.ggy.uga.edu/data/daily/"

doc <- xml2::read_html(url)
text <- rvest::html_text(rvest::html_nodes(doc, "tr td a:contains('.txt')"))


# define column types of fwf data ("c" = character, "n" = number)
ctypes <- paste0("c", paste0(rep("n",11), collapse = ""))
data <- data.frame()

for (i in 1:2){  # first two files only; use seq_along(text) to read them all
  file <- paste0(url, text[i])

  date <- as.Date(read_lines(file, n_max = 1), "%m/%d/%y")

  # Read file to determine widths
  columns <- fwf_empty(file, skip = 3)

  # Manually expand `solar` column to be 3 spaces wider
  columns$begin[8] <- columns$begin[8] - 3

  data <- rbind(data, cbind(date,read_fwf(file, columns, 
                                          skip = 3, col_types = ctypes)))
}
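
The question asked for a single CSV file, which the loop above stops short of writing. A minimal sketch of that last step follows, assuming the per-file tables have already been read. It uses toy data frames with invented values in place of the real station files, and it collects the pieces in a list and binds them once rather than growing a data frame inside the loop. The names `pieces`, `combined` and `out_file` are hypothetical.

```r
# Toy stand-ins for the per-file tables (hypothetical values, not real station data)
pieces <- list(
  data.frame(date = as.Date("2017-01-01"), time = c("0015", "0030"), temp = c(10.1, 10.3)),
  data.frame(date = as.Date("2017-01-02"), time = c("0015", "0030"), temp = c(8.7, 8.9))
)

# Bind all pieces in one call instead of growing a data frame inside the loop
combined <- do.call(rbind, pieces)

# Write the result to a single CSV file (a temp file here; any path would do)
out_file <- tempfile(fileext = ".csv")
write.csv(combined, out_file, row.names = FALSE)

# Read it back to confirm the round trip, keeping time as a string
check <- read.csv(out_file, colClasses = c("Date", "character", "numeric"))
print(nrow(check))
```

With the real data, `pieces` would be filled by something like `lapply()` over the file names, with each element being the `cbind(date, read_fwf(...))` result from the loop above.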

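Thank you so much for your help!

Hi, for some reason the values in column 9 (solar) are incorrect. Only the last digit is kept. This is the only column affected. Is there a way to fix this?

This seems to be an issue with fwf_empty, the function that decides where each column begins and ends. I created a manual fix; it is not ideal, but it should work for you.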
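
If adjusting the guessed boundaries by hand feels fragile, another option is to skip the guessing and declare the column positions explicitly with readr's `fwf_positions()`. The sketch below is only an illustration: the three-column layout, the sample lines and the start/end positions are all assumptions, not the real layout of the weather files, which would need to be measured from an actual file.

```r
library(readr)

# Hypothetical fixed-width lines: a time, a temperature and a solar reading
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "0015   10.1      0",
  "0030   10.3      5",
  "1200   15.8    912"
), tmp)

# Explicit 1-based start/end positions for each column (assumed layout)
cols <- fwf_positions(
  start     = c(1, 5, 12),
  end       = c(4, 11, 18),
  col_names = c("time", "temp", "solar")
)

# "c" = character, "n" = number, matching the convention used in the answer
obs <- read_fwf(tmp, col_positions = cols, col_types = "cnn")
print(obs$solar)
```

Because the solar field is declared as wide as its largest possible value, multi-digit readings are no longer cut down to their last digit, which is the symptom described in the comment above.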