How to access HTTPS pages in R

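Coding in R, I am trying to parse some information from a website, such as a LinkedIn page. The LinkedIn URL used to be

url = ""

and I could use readLines and the XML package to collect the information I needed. However, the URL has since changed to

url = ""

and readLines now fails:

readLines(url)
Error in file(con, "r") : cannot open the connection
In file(con, "r") : unsupported URL scheme

If the URL is https, do you know how to read the web page in R? Thanks a lot.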

Just call setInternet2(TRUE) before readLines:

setInternet2(TRUE)
web_page <- readLines("https://www.linkedin.com/in/lillyzhu")
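
For R versions where setInternet2 is no longer available, a minimal sketch of the same readLines approach is shown below. It assumes an R build whose default download method can handle https, which is not guaranteed on every installation. The helper name read_page is hypothetical and used only for illustration.

```r
# Hypothetical helper: read all lines of a page through an explicit connection.
read_page <- function(page_url) {
  con <- url(page_url)          # url() chooses a download method for the scheme
  on.exit(close(con))           # release the connection even if reading fails
  readLines(con, warn = FALSE)  # warn = FALSE silences "incomplete final line"
}

# Only fetch over the network when run interactively.
if (interactive()) {
  web_page <- read_page("https://www.linkedin.com/in/lillyzhu")
  head(web_page)
}
```

Wrapping the connection in a function with on.exit keeps it from being left open if the request fails.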

Have you tried the R package httr? It is as simple as:
library('httr')    
content(GET('https://www.linkedin.com/in/lillyzhu'))
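
If the printed result looks cut off, it may help to ask httr for the body as plain text and inspect it yourself. The sketch below assumes the httr package is installed and that the page is UTF-8 encoded. The helper name fetch_html is hypothetical.

```r
# Hypothetical helper: return the whole response body as one character string.
fetch_html <- function(page_url) {
  resp <- httr::GET(page_url)
  httr::stop_for_status(resp)   # turn HTTP errors (4xx/5xx) into R errors
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Only fetch over the network when run interactively.
if (interactive()) {
  html <- fetch_html("https://www.linkedin.com/in/lillyzhu")
  nchar(html)                   # size of the full body, not a shortened preview
}
```

The returned string can then be passed to an HTML parser such as htmlParse(html, asText = TRUE) from the XML package.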

readLines() does not support "https". However, you can do this easily by using the RCurl package to fetch the URL contents and the XML package to parse the HTML text:
library(XML)
library(RCurl)

content <- getURL("https://www.linkedin.com/in/lillyzhu")
doc <- htmlParse(content, asText = TRUE)

summary(doc)    
# $nameCounts
# 
#    span      div       li   script        a       br     meta        p 
#     104       92       79       77       73       22       19       14 
#    time       h5      img     link       h3       ul       h4   header 
#      14       13       13       10        9        8        7        7 
#      h2   strong       ol       td       th       tr    input noscript 
#       5        5        4        4        4        4        3        3 
#  button       dd       dt   iframe    label     body       dl       em 
#       2        2        2        2        2        1        1        1 
#    form       h1     head       hr     html    table    title 
#       1        1        1        1        1        1        1 
# 
# $numNodes
# [1] 613
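
Once the document is parsed, specific pieces can be pulled out with XPath. The sketch below is only an illustration: it parses a small inline HTML string (made up for this example, not the LinkedIn page) so that it runs without network access, and the XPath expressions are assumptions about what you might want to extract.

```r
library(XML)

# Made-up HTML standing in for the downloaded page content.
sample_html <- paste0(
  "<html><head><title>Example page</title></head><body>",
  "<a href='https://example.com/a'>first</a>",
  "<a href='https://example.com/b'>second</a>",
  "</body></html>"
)

sample_doc <- htmlParse(sample_html, asText = TRUE)

# Text of the <title> element.
page_title <- xpathSApply(sample_doc, "//title", xmlValue)

# href attribute of every <a> element.
page_links <- xpathSApply(sample_doc, "//a", xmlGetAttr, "href")

print(page_title)
print(page_links)
```

The same xpathSApply calls can be applied to the doc object built from getURL above.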

Smart. This is by far the simplest solution I have found. Thanks a lot.

This only works for Windows users on R versions before 3.3.0. setInternet2 is now defunct, so this solution is no longer viable for R 3.3.0 or later.

Thanks. However, with this method I only get a truncated response rather than the full content.

@Savvy, you do get the content; the response is just truncated for display.