R: Download files with a specific extension from a website [r, netcdf, rcurl]


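How can I download the contents of a web page, find all the files listed on it that have a specific extension, and then download them? For example, I want to download all netCDF files (extension *.nc4) from the following web page: https://data.giss.nasa.gov/impacts/agmipcf/agmerra/

I was advised to look at the RCurl package, but I could not find out how to do this with it.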

library(stringr)

# Get the content of the page, one element per line of HTML
thepage = readLines('https://data.giss.nasa.gov/impacts/agmipcf/agmerra/')

# Find the lines that contain the names for netcdf files
nc4.lines <- grep('\\.nc4', thepage)  # escape the dot so it matches a literal "."

# Subset the original dataset leaving only those lines
thepage <- thepage[nc4.lines]

#extract the file names
str.loc <- str_locate(thepage,'A.*nc4?"')

#substring
file.list <- substring(thepage,str.loc[,1], str.loc[,2]-1)

# download all files
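for (ifile in file.list) {
  # mode = "wb" writes in binary mode, which binary files such as netCDF need on Windows
  download.file(paste0("https://data.giss.nasa.gov/impacts/agmipcf/agmerra/",
                       ifile),
                destfile = ifile, method = "libcurl", mode = "wb")
}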

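As a variation on the extraction step above, here is a minimal sketch that pulls the file names out of the href attributes using base R only. The helper name extract_nc4_links and the sample HTML lines are made up for illustration; the real directory listing may be formatted differently, and the pattern assumes double-quoted href values.

```r
# Hypothetical helper: return the .nc4 file names found in href attributes.
extract_nc4_links <- function(html_lines) {
  # Match href="something.nc4" (assumes double-quoted href values)
  pattern <- 'href="[^"]*\\.nc4"'
  hits <- unlist(regmatches(html_lines,
                            gregexpr(pattern, html_lines, ignore.case = TRUE)))
  # Strip the leading href=" and the trailing quote
  files <- sub('^href="', "", hits, ignore.case = TRUE)
  files <- sub('"$', "", files)
  unique(files)
}

# Made-up sample lines resembling a directory listing
sample_page <- c(
  '<a href="AgMERRA_1980_prate.nc4">AgMERRA_1980_prate.nc4</a>',
  '<a href="readme.txt">readme.txt</a>',
  '<a href="AgMERRA_1981_prate.nc4">AgMERRA_1981_prate.nc4</a>'
)
nc4_files <- extract_nc4_links(sample_page)
print(nc4_files)
```

With the real page, the same helper could be applied to the output of readLines(), and the resulting names passed to the download loop.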
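Have you tried writing any code? Please read the linked guidance and update your post. For example, looking at the web page, your question might be about scraping the page to retrieve all the file names, or about how to download a set of files, or both, or neither. A reproducible example would make it easier for people to help you. Here are a couple of references you might want to look at. After you have tried some code, if it still does not work, let us know. Thanks.

Those are useful references. I will read them.

I get the following error message:

trying URL 'https://data.giss.nasa.gov/impacts/agmipcf/agmerra/AgMERRA_2010_wndspd.nc'
Error in download.file(paste0("https://data.giss.nasa.gov/impacts/agmipcf/agmerra/", :
  cannot open URL 'https://data.giss.nasa.gov/impacts/agmipcf/agmerra/AgMERRA_2010_wndspd.nc'
In addition: Warning message:
In download.file(paste0("https://data.giss.nasa.gov/impacts/agmipcf/agmerra/", :
  URL 'https://data.giss.nasa.gov/impacts/agmipcf/agmerra/AgMERRA_2010_wndspd.nc': status was 'SSL connect error'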
Can you try file.list?

Hmm, I still have this problem. But I now know how to use this function to download files from the internet, so I will accept this answer.
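An error such as the SSL connect error reported above stops the loop at the first file. Here is a minimal sketch, assuming you would rather skip a failed file and continue with the rest. It does not fix the SSL problem itself. The helper name safe_download is hypothetical, and the usage example reads from a local file:// URL only so that it runs without network access.

```r
# Hypothetical helper: download one file and report success as TRUE/FALSE
# instead of stopping the whole loop on the first error.
safe_download <- function(url, destfile) {
  tryCatch({
    status <- download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
    status == 0  # download.file returns 0 on success
  },
  error = function(e) {
    message("Failed: ", url, " (", conditionMessage(e), ")")
    FALSE
  },
  warning = function(w) {
    message("Warning for ", url, ": ", conditionMessage(w))
    FALSE
  })
}

# Usage with a local file so the sketch runs offline
src <- tempfile(fileext = ".nc4")
writeBin(as.raw(1:10), src)
src_url <- paste0("file://",
                  if (.Platform$OS.type == "windows") "/" else "",
                  normalizePath(src, winslash = "/"))
dest <- tempfile(fileext = ".nc4")
ok <- safe_download(src_url, dest)
```

In the download loop, each call to download.file could be replaced with safe_download, and the returned flags collected to see which files still need another attempt.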