SSL verification causes RCurl and httr to break on a website that should be legit


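I'm trying to automate the login to the UK Data Archive service. That website is obviously trustworthy. Unfortunately, both RCurl and httr break at SSL verification. My web browser doesn't give any warnings. I can work around the problem by using ssl.verifypeer = FALSE in RCurl, but I'd like to understand what is going on.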

# breaks
library(httr)
GET( "https://www.esds.ac.uk/secure/UKDSRegister_start.asp" )

# breaks
library(RCurl)
cert <- system.file("CurlSSL/cacert.pem", package = "RCurl")
getURL("https://www.esds.ac.uk/secure/UKDSRegister_start.asp",cainfo = cert)

# works
library(RCurl)
getURL(
    "https://www.esds.ac.uk/secure/UKDSRegister_start.asp" , 
    .opts = list(ssl.verifypeer = FALSE)
) # note: use list(ssl.verifypeer = FALSE,followlocation=TRUE) to see content
TL;DR

Get the certificate from TERENA and use that file as your cainfo parameter.
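As a minimal sketch of what passing that file might look like, assuming the RCurl package is installed and the certificate has been saved locally as TERENA.pem (the helper name fetch_with_ca is hypothetical, not part of RCurl or httr):

```r
# Hypothetical helper: fetch a URL while verifying the server against
# the CA certificate(s) stored in `ca_file`.
fetch_with_ca <- function(url, ca_file) {
  # Fail early with a clear message instead of an opaque libcurl error.
  if (!file.exists(ca_file)) {
    stop("CA file not found: ", ca_file)
  }
  # cainfo and followlocation are passed through to libcurl as options.
  RCurl::getURL(url, cainfo = ca_file, followlocation = TRUE)
}

# Usage (needs network access and a local TERENA.pem):
# fetch_with_ca("https://www.esds.ac.uk/secure/UKDSRegister_start.asp", "TERENA.pem")
#
# With httr, the same option can be passed through config():
# httr::GET("https://www.esds.ac.uk/secure/UKDSRegister_start.asp",
#           httr::config(cainfo = "TERENA.pem"))
```

Unlike ssl.verifypeer = FALSE, this keeps peer verification switched on and only changes which certificate authorities are trusted.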

Edit: You might need to add two lines at the start of that file. The code works for me with the following TERENA.pem file:

TERENA
======
-----BEGIN CERTIFICATE-----
MIIEmDCCA4CgAwIBAgIQS8gUAy8H+mqk8Nop32F5ujANBgkqhkiG9w0BAQUFADCB
lzELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAlVUMRcwFQYDVQQHEw5TYWx0IExha2Ug
Q2l0eTEeMBwGA1UEChMVVGhlIFVTRVJUUlVTVCBOZXR3b3JrMSEwHwYDVQQLExho
dHRwOi8vd3d3LnVzZXJ0cnVzdC5jb20xHzAdBgNVBAMTFlVUTi1VU0VSRmlyc3Qt
SGFyZHdhcmUwHhcNMDkwNTE4MDAwMDAwWhcNMjAwNTMwMTA0ODM4WjA2MQswCQYD
VQQGEwJOTDEPMA0GA1UEChMGVEVSRU5BMRYwFAYDVQQDEw1URVJFTkEgU1NMIENB
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAw+NIxC9cwcupmf0booNd
ij2tOtDipEMfTQ7+NSUwpWkbxOjlwY9UfuFqoppcXN49/ALOlrhfj4NbzGBAkPjk
tjolnF8UUeyx56+eUKExVccCvaxSin81joL6hK0V/qJ/gxA6VVOULAEWdJRUYyij
8lspPZSIgCDiFFkhGbSkmOFg5vLrooCDQ+CtaPN5GYtoQ1E/iptBhQw1jF218bbl
p8ODtWsjb9Sl61DllPFKX+4nSxQSFSRMDc9ijbcAIa06Mg9YC18em9HfnY6pGTVQ
L0GprTvG4EWyUzl/Ib8iGodcNK5Sbwd9ogtOnyt5pn0T3fV/g3wvWl13eHiRoBS/
fQIDAQABo4IBPjCCATowHwYDVR0jBBgwFoAUoXJfJhsomEOVXQc31YWWnUvSw0Uw
HQYDVR0OBBYEFAy9k2gM896ro0lrKzdXR+qQ47ntMA4GA1UdDwEB/wQEAwIBBjAS
BgNVHRMBAf8ECDAGAQH/AgEAMBgGA1UdIAQRMA8wDQYLKwYBBAGyMQECAh0wRAYD
VR0fBD0wOzA5oDegNYYzaHR0cDovL2NybC51c2VydHJ1c3QuY29tL1VUTi1VU0VS
Rmlyc3QtSGFyZHdhcmUuY3JsMHQGCCsGAQUFBwEBBGgwZjA9BggrBgEFBQcwAoYx
aHR0cDovL2NydC51c2VydHJ1c3QuY29tL1VUTkFkZFRydXN0U2VydmVyX0NBLmNy
dDAlBggrBgEFBQcwAYYZaHR0cDovL29jc3AudXNlcnRydXN0LmNvbTANBgkqhkiG
9w0BAQUFAAOCAQEATiPuSJz2hYtxxApuc5NywDqOgIrZs8qy1AGcKM/yXA4hRJML
thoh45gBlA5nSYEevj0NTmDa76AxTpXv8916WoIgQ7ahY0OzUGlDYktWYrA0irkT
Q1mT7BR5iPNIk+idyfqHcgxrVqDDFY1opYcfcS3mWm08aXFABFXcoEOUIEU4eNe9
itg5xt8Jt1qaqQO4KBB4zb8BG1oRPjj02Bs0ec8z0gH9rJjNbUcRkEy7uVvYcOfV
r7bMxIbmdcCeKbYrDyqlaQIN4+mitF3A884saoU4dmHGSYKrUbOCprlBmCiY+2v+
ihb/MX5UR6g83EMmqZsFt57ANEORMNQywxFa4Q==
-----END CERTIFICATE-----
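The two extra lines appear to be a label and an underline of equals signs, as in the file above. A minimal base R sketch of adding them, where write_labelled_pem is a hypothetical helper and the PEM body is a placeholder rather than real certificate data:

```r
# Hypothetical helper: prepend a label line and an "=" underline of the
# same width to the PEM text, then write everything to `path`.
write_labelled_pem <- function(pem_lines, label, path) {
  header <- c(label, strrep("=", nchar(label)))
  writeLines(c(header, pem_lines), path)
  invisible(path)
}

# Placeholder body for illustration only; a real file holds the full
# base64 block of the downloaded certificate.
pem <- c(
  "-----BEGIN CERTIFICATE-----",
  "...base64 data...",
  "-----END CERTIFICATE-----"
)

out <- write_labelled_pem(pem, "TERENA", file.path(tempdir(), "TERENA.pem"))
readLines(out)[1:2]  # the two header lines that were added
```

The resulting path can then be supplied as the cainfo value.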
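Why?

httr's GET method uses RCurl::curlPerform internally, just like RCurl::getURL, so the observed behavior is not surprising. The curl command-line tool with the verbose switch -v gives the following additional hints: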

$ curl -v  "https://www.esds.ac.uk/secure/UKDSRegister_start.asp"
* About to connect() to www.esds.ac.uk port 443 (#0)
*   Trying 155.245.69.4...
* Connected to www.esds.ac.uk (155.245.69.4) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: none
    CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
* SSLv3, TLS handshake, Server hello (2):
* SSLv3, TLS handshake, CERT (11):
* SSLv3, TLS alert, Server hello (2):
* SSL certificate problem: unable to get local issuer certificate
* Closing connection 0
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.
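
A similar diagnosis can be attempted from within R. This is a minimal sketch, assuming the RCurl package is installed; diagnose_ssl is a hypothetical helper name. It turns on libcurl's verbose option, optionally points cainfo at a CA file, and returns the error message as a string instead of stopping:

```r
# Hypothetical helper: perform a request with verbose libcurl output and
# report the outcome as a character string.
diagnose_ssl <- function(url, ca_file = NULL) {
  opts <- list(verbose = TRUE)
  if (!is.null(ca_file)) {
    opts$cainfo <- ca_file
  }
  tryCatch(
    {
      RCurl::getURL(url, .opts = opts)
      "request succeeded"
    },
    # Any failure (including a certificate problem) ends up here.
    error = function(e) conditionMessage(e)
  )
}

# Usage (needs network access):
# diagnose_ssl("https://www.esds.ac.uk/secure/UKDSRegister_start.asp")
# diagnose_ssl("https://www.esds.ac.uk/secure/UKDSRegister_start.asp", "TERENA.pem")
```

Comparing the two calls should show whether supplying the extra CA file changes the verification result.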
The page linked in the error message above contains, under item 3, instructions for retrieving the server's certificate:

$ openssl s_client -connect "www.esds.ac.uk:443"
CONNECTED(00000003)
depth=0 C = GB, ST = Essex, L = Colchester, O = University of Essex, OU = UK Data Archive, CN = www.esds.ac.uk
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 C = GB, ST = Essex, L = Colchester, O = University of Essex, OU = UK Data Archive, CN = www.esds.ac.uk
verify error:num=27:certificate not trusted
verify return:1
depth=0 C = GB, ST = Essex, L = Colchester, O = University of Essex, OU = UK Data Archive, CN = www.esds.ac.uk
verify error:num=21:unable to verify the first certificate
verify return:1
---
Certificate chain
 0 s:/C=GB/ST=Essex/L=Colchester/O=University of Essex/OU=UK Data Archive/CN=www.esds.ac.uk
     i:/C=NL/O=TERENA/CN=TERENA SSL CA
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIEIzCCAwugAwIBAgIQO9FPWbAYKDAuFHq61U3gDDANBgkqhkiG9w0BAQUFADA2
MQswCQYDVQQGEwJOTDEPMA0GA1UEChMGVEVSRU5BMRYwFAYDVQQDEw1URVJFTkEg
U1NMIENBMB4XDTEwMTIwNjAwMDAwMFoXDTEzMTIwNTIzNTk1OVowgYMxCzAJBgNV
......
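
The "Certificate chain" section above lists a single entry (number 0) whose issuer is TERENA SSL CA. The following base R sketch extracts the subject and issuer from that excerpt to make this explicit; parse_chain is a hypothetical helper written for this illustration and only handles the line layout shown here:

```r
# Excerpt of the `openssl s_client` output shown above.
s_client_output <- c(
  "Certificate chain",
  " 0 s:/C=GB/ST=Essex/L=Colchester/O=University of Essex/OU=UK Data Archive/CN=www.esds.ac.uk",
  "     i:/C=NL/O=TERENA/CN=TERENA SSL CA",
  "---"
)

# Hypothetical helper: collect subject ("s:") and issuer ("i:") lines
# from the chain listing into a data frame, one row per certificate.
parse_chain <- function(lines) {
  subject_pat <- "^[[:space:]]*[0-9]+[[:space:]]+s:"
  issuer_pat  <- "^[[:space:]]*i:"
  subjects <- sub(subject_pat, "", grep(subject_pat, lines, value = TRUE))
  issuers  <- sub(issuer_pat, "", grep(issuer_pat, lines, value = TRUE))
  data.frame(subject = subjects, issuer = issuers, stringsAsFactors = FALSE)
}

chain <- parse_chain(s_client_output)
nrow(chain)    # number of certificates listed in the chain
chain$issuer   # issuer of the server certificate
```

With only the server certificate in the listing, the verifying client has to find the issuer's certificate in its own CA store, which is consistent with the "unable to get local issuer certificate" error.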
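To me, this reads as if the certificate is not trusted. A quick search for "TERENA SSL root certificate" turned up a page from the University of Helsinki, which states:

Unfortunately, the root certificates of these authorities are not always present on the devices being used; instead, we need to install these root certificates ourselves.

That site also contains a link to the certificate repository.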

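You may need to include cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl") in getURL.

...but that doesn't work for me :/

Oddly enough, I had the same error with RCurl the other day. The usual solution normally fixes it, but it didn't work for me either. In my case it turned out to be a problem with a new website certificate, but that doesn't seem to be the issue with the site you're looking at (the certificate dates from 2010).

Sorry, this still doesn't work. library(RCurl); cap

@AnthonyDamico: Your code works for me, unmodified, with the following packages: RCurl_1.95-4.1 bitops_1.0-5. I also posted the certificate file I'm using, but your code looks correct to me. Of course, I need to say getURL("https://www.esds.ac.uk/secure/UKDSRegister_start.asp", cainfo = tf, .opts = c(followLocation = T)) to actually follow the redirect.

Just updated and re-ran it, and it still doesn't work. R version 3.0.0 (2013-04-03), Platform: x86_64-w64-mingw32/x64 (64-bit), and the error is: Error in function (type, msg, asError = TRUE) : SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed. blah.

@AnthonyDamico: I'm on Ubuntu Linux. Could this be a platform issue? Do you have the curl and openssl executables installed, so that you can check from the command line?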