R: How do I build a search query to get my data through Google Search?

Tags: r, google-chrome, search, google-search

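I have a project I'm working on for which I need to pull data about specific parks in the state of Florida. For this post, my question is how I can program R to run a search query through Google to get each park's area. When I type "wekiva springs state park area in hectares" into Google Search, I get an actual value, "2833 hectares", at the top of the page. Right now I have a list of 52 parks: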

structure(list(`unique(df$ParkName)` = structure(c(14L, 47L, 
39L, 12L, 9L, 20L, 5L, 10L, 25L, 28L, 36L, 30L, 31L, 43L, 4L, 
35L, 44L, 48L, 51L, 6L, 21L, 32L, 38L, 42L, 1L, 41L, 27L, 45L, 
46L, 50L, 18L, 37L, 24L, 26L, 13L, 52L, 15L, 2L, 17L, 11L, 22L, 
34L, 49L, 16L, 40L, 7L, 8L, 29L, 33L, 3L, 23L, 19L), .Label = c("Alafia River State Park", 
"Amelia Island State Park", "Big Cypress National Park", "Big Talbot Island State Park", 
"Bill Baggs Cape Florida State Park", "Blue Spring State Park", 
"Caladesi Island State Park", "Cayo Costa State Park", "Collier-Seminole State Park", 
"Curry Hammock State Park", "Dade Battlefield Historic State Park", 
"De Leon Springs State Park", "Delanor-Wiggins Pass State Park", 
"Fakahatchee Strand Preserve State Park", "Faver-Dykes State Park", 
"Fort Cooper State Park", "Fort George Island Cultural State Park", 
"Fort Pierce Inlet State Park/Avalon State Park", "Fort Zachary Taylor Historic State Park", 
"Highlands Hammock State Park", "Hillsborough River State Park", 
"Honeymoon Island State Park", "Hugh Taylor Birch State Park", 
"John D. MacArthur Beach State Park", "John Pennekamp Coral Reef State Park/Key Largo Hammocks", 
"John U. Lloyd Beach State Park", "Jonathan Dickinson State Park", 
"Key Largo Hammocks", "Koreshan State Historic Site", "Lake Griffin State Park", 
"Lake Kissimmee State Park", "Lake Manatee State Park", "Lake Wales Ridge Geopark", 
"Little Manatee River State Park", "Little Talbot Island State Park", 
"Long Key State Park", "Lovers Key State Park", "Myakka River State Park", 
"Ocala National Forest", "Oleta River State Park", "Oscar Scherer State Park", 
"Paynes Creek Historic State Park", "Paynes Prairie Preserve State Park", 
"Pumpkin Hill Creek Preserve State Park", "Savannas Preserve State Park", 
"Seabranch Preserve State Park", "Sebastian Inlet State Park", 
"Talbot Islands State Parks", "Terra Ceia Preserve State Park", 
"Tosohatchee Wildlife Management Area", "Washington Oaks Gardens State Park", 
"Werner-Boyce Salt Springs State Park"), class = "factor")), .Names = "unique(df$ParkName)", row.names = c(NA, 
-52L), class = "data.frame")
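I could type every park's name into the Google search bar by hand, but I'd really like to learn how to build a search query so that I can apply it to future projects. The problem is that I'm a bit lost when it comes to building anything this complex. I've only recently started learning what an "API" is, and so on.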


Any help would be greatly appreciated.

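You can do this kind of web scraping with the rvest package. The results depend heavily on each query, because not every query returns a value at the top of the page.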

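library(rvest)

# Written against rvest < 1.0. In rvest >= 1.0 these functions were renamed:
# html_session() -> session(), set_values() -> html_form_set(),
# submit_form() -> session_submit().
parks <- data.frame(name = c("wekiva springs state park",
                             "cayo costa state park"),
                    stringsAsFactors = FALSE)
parks$area <- NA_character_   # create the column before filling it row by row

url <- "http://www.google.com"

s <- html_session(url)        # open a session on the Google home page
search <- html_form(s)[[1]]   # assumes the first form on the page is the search box
for (i in seq_len(nrow(parks))) {
  query <- paste("area of", parks$name[i], "in hectares")
  a <- set_values(search, q = query)

  session <- submit_form(s, a)
  s1 <- html_nodes(session, "#res")   # "#res" is assumed to hold the results
  result <- html_text(s1)

  # keep everything up to and including the first word, e.g. "2.833 ha"
  parks$area[i] <- gsub("([A-Za-z]+).*", "\\1", result)
}

parks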

                       name     area
1 wekiva springs state park 2.833 ha
2     cayo costa state park 1.014 ha
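
Since not every query puts a value at the top of the page, it may help to separate building the queries from parsing the result text, so that a park with no match ends up as NA rather than an arbitrary snippet. The following is only a minimal sketch in base R, not part of the answer's code: the helper names build_query, build_search_url and extract_area are hypothetical, the search URL layout is an assumption, and the pattern only recognises figures written as "<number> ha" or "<number> hectares".

```r
# Hypothetical helpers; base R only, nothing here touches the network.
park_names <- c("wekiva springs state park", "cayo costa state park")

# Build one query string per park (paste() is vectorised, so no loop is needed).
build_query <- function(park) {
  paste("area of", park, "in hectares")
}

# Turn queries into search URLs. The "q" parameter mirrors the form field set
# in the answer; the URL layout itself is an assumption.
build_search_url <- function(query) {
  encoded <- vapply(query, utils::URLencode, character(1),
                    reserved = TRUE, USE.NAMES = FALSE)
  paste0("https://www.google.com/search?q=", encoded)
}

# Pull the first "<number> ha" or "<number> hectares" out of some text.
# Returns NA_character_ when nothing matches (or when text is empty).
extract_area <- function(text) {
  pattern <- "[0-9][0-9.,]*\\s*(ha|hectares)\\b"
  hit <- regmatches(text, regexpr(pattern, text, ignore.case = TRUE, perl = TRUE))
  if (length(hit) == 0) NA_character_ else hit[1]
}

queries <- build_query(park_names)
urls <- build_search_url(queries)

# Example with made-up result text standing in for html_text() output.
sample_text <- c("Area: 2.833 ha Wekiva Springs State Park", "no figure on this page")
areas <- vapply(sample_text, extract_area, character(1), USE.NAMES = FALSE)
```

Inside the loop above, extract_area(result) could take the place of the gsub() call, so that parks whose area is not shown are left as NA and can be looked up by hand afterwards.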
Thank you, and good point that not all parks will have their area shown at the top. Thanks also for linking to Wickham's blog.

You're welcome. I'm at your disposal if you have any questions.